The Definition of Augmented Reality

Submitted by SuperSayu on Thu, 08/10/2017 - 15:30

Having listened for some time to tech commentators talk about the potential of augmented reality (AR) glasses, I'd like to offer a deceptively simple litmus test to help us determine when that technology has finally arrived. Because, spoiler alert, I don't believe that what we have *is* AR; it's simply a heads-up display or a head-mounted display that shows content that we already have. For real augmented reality, you must generate a new reality based on the real world around you. In order to be as fair as possible, I have chosen a test that is achievable today, but so technically difficult that even if someone were to make such a display, it would not be consumer-ready.

That litmus test is this: a heads-up display that tells you, no matter what you are looking at within your field of view, how far away that object is.

The test is both technically possible and deceptive. Arguably the easiest way to answer the question would be to simply determine the distance at which a person's gaze is focused, but that serves only as a first-order estimate. Range finding with lasers is the most precise way to get distance, but using that at an arbitrary angle relative to the orientation of the viewer's head (I said no matter where in your field of view you looked, so I assume the viewer does not need to turn their head) requires additional mechanical parts. Determining distance from a composite of several cameras is a well-known technique, but it requires image processing from several cameras (including the cameras necessary to track eye movements), and may require the ability to change the focal length of those cameras to deal with very close or very distant objects.
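To make two of those approaches concrete, here is a minimal sketch of the geometry behind them: triangulating from how far the eyes turn inward, and from the pixel offset between two cameras. The function names, the 0.063 metre eye spacing, and the simplifying assumptions (target straight ahead, parallel calibrated pinhole cameras) are illustrative choices of mine, not part of any real headset's software.

```python
import math

def gaze_distance(vergence_deg, eye_spacing=0.063):
    """Estimate the distance to the point both eyes converge on.

    vergence_deg: total angle between the two lines of sight, in degrees.
    eye_spacing: distance between the pupils in metres (0.063 is an
    assumed typical adult value, not a measured one).
    Assumes the target is straight ahead, so each eye turns in equally.
    """
    if vergence_deg <= 0:
        return math.inf  # parallel lines of sight never meet
    half_angle = math.radians(vergence_deg) / 2
    return (eye_spacing / 2) / math.tan(half_angle)

def stereo_depth(focal_px, baseline, disparity_px):
    """Depth from two parallel cameras under the pinhole model.

    focal_px: focal length in pixels; baseline: camera separation;
    disparity_px: horizontal pixel shift of the target between images.
    The result is in the same unit as baseline.
    """
    if disparity_px <= 0:
        return math.inf  # no measurable shift: too far away to resolve
    return focal_px * baseline / disparity_px
```

Under these assumptions, a target 10 metres away corresponds to a vergence angle of only about 0.36 degrees, and an error of a tenth of a degree shifts the estimate by several metres. That is one way to see why gaze alone gives only a first-order estimate.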

I call this test "The Definition of Augmented Reality" because until you have a wearable that both interacts responsively with the user and adapts to changing real-world conditions (monitoring the distance to a car as it drives towards or away from you), anything else you might have can be done as well or better with non-AR technology. For example, the much-vaunted Microsoft HoloLens Minecraft demonstration from E3 2015, which shows a virtual world on a tabletop, clearly shows what the technology is capable of. And yet, that demonstration is only pretending to be another technology, one that projects 3D images into thin air. It is interesting, but it serves only as a display; you could as easily do the same demonstration in virtual reality (VR), where you cannot see the real world behind the virtual model, and you can much MORE easily see the same model with existing user interface paradigms, on a flat display with a controller or mouse. Until a wearable display creates a reality that cannot exist without it, you are not creating an "augmented reality," you are just filling reality *with* things.

Consider the ramifications of my test; presume that the technology was created and worked perfectly. Knowing the distance to something is incredibly important to military and police, to construction workers, inspectors, surveyors, explorers, as well as people who are amateurs at doing each of these things. After someone gets used to this technology, they will have an intuitive understanding of how far away things are, not because they guessed and checked it out later, but because they have gotten used to actually being able to tell *by looking* how far away something is; this was always calculable given the information the human body has, but we were never in a position to train ourselves that precisely. A generation of people might grow up being able to tell at a glance that a street corner is two hundred feet away, whereas today we turn to technology far beyond ourselves to do what we have always been capable of.

Now consider that once we pass the test, we really begin to be *able* to augment reality where we could not before. Now that you know the distance to one thing, you can know the relative speed between you and it, or the distance between two points, or the relative height of something based on your head and eye angle. Now that you have a platform that knows what object you are looking at, it can parse information about the target. If the system is tied into GPS or other positioning systems, you can show what you are looking at to other people, or see exactly where that thing is on a map. You are no longer limited to displaying information about the user; you can display information about the *target* in a way that was not previously possible.
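Once a range reading exists, the derived quantities mentioned above are simple arithmetic. The following is a rough sketch only; the function names and the inputs (two timed range readings, the angle between two gaze directions, the gaze elevation above horizontal) are hypothetical and assume the device can report them.

```python
import math

def closing_speed(range_then, range_now, seconds):
    """Relative speed along the line of sight.

    Positive when the target is getting closer, negative when receding.
    """
    return (range_then - range_now) / seconds

def separation(range_a, range_b, angle_between_deg):
    """Distance between two sighted points, by the law of cosines.

    angle_between_deg is the angle between the two gaze directions.
    """
    angle = math.radians(angle_between_deg)
    return math.sqrt(range_a ** 2 + range_b ** 2
                     - 2 * range_a * range_b * math.cos(angle))

def height_above_eye(range_to_target, elevation_deg):
    """Height of the target relative to eye level.

    elevation_deg is the gaze angle above horizontal (negative if below).
    """
    return range_to_target * math.sin(math.radians(elevation_deg))
```

Any consistent unit works. For instance, a car measured at 50 feet and then at 40 feet two seconds later is closing at 5 feet per second.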

It all starts with seeing what the user sees, the way the user sees it--with their own two eyes. Only then can you really begin to be part of the user's thoughts, be part of their reality. Until then, reality is what it is, and technology is only a simple part of it.