The 4 key User Experience modes of Augmented Reality
AR is already a very diverse field, and new tools and examples seem to spring up every day. However, within all of this chaos and rapid change there are 4 key physical modes that can be used to classify these user experiences. Some notable examples don't fit into these 4 modes, but they are more exploratory, fringe examples that have not yet been widely distributed or adopted. The modes outlined below are the dominant forms that appear again and again throughout the AR landscape. While these modes closely relate to the differences between computer vision based AR and locative AR, we believe this location/vision divide is dissolving rapidly and that the spatial relationships of the AR user experience (ARUX) modes outlined below will remain. We will revisit this post on a regular basis and look forward to your feedback and discussion, so please add your comments below.
Overview
These 4 modes are based on the physical relationship between the user, the camera, the camera's field of view (FOV) and the visual/auditory display. These relationships are primarily based upon the visual and auditory sensory channels, as these dominate current AR experiences, with visual leading by far. NOTE: Experimentation across the other sensory channels is an interesting area that is definitely open for new UX research, especially in terms of accessibility. But of all the structural aspects of these modes, the relationship between the user and other people is the most important. It is this relationship that has been used to define the name of each mode.
Public
The public ARUX mode is commonly presented within a retail, installation art, public kiosk or family lounge environment. The Kinect game platform is an example of the new breed of AR that's quickly moving into our lounges. The camera is often encased within a pedestal, tower, wall mounted box, set top box or point of sale display. This camera (or sometimes multiple cameras) has a wide field of view that the user can interact with, often expanding to the user's whole body. The AR image output can be presented using a projector, large flat screen display or TV. These displays are often on or up against a wall and tend to naturally create a sense of "experience space" in front of them.
This set of relationships creates what tends to feel like a public experience, which often draws a crowd or in some cases even extra participants. It is often used to gain people's attention and create a buzz, or at home it can create a sense of group participation. It is at all levels a shared experience. The public ARUX mode is dominated by computer vision based AR and, because of its installation format, has a fairly limited distribution. However, gaming platforms are quite likely to change this.
Intimate
The intimate ARUX mode is the dominant form seen on YouTube or Vimeo and commonly involves a desktop or laptop PC with a webcam. This really exploded with the release of the Adobe Flash based FLARToolKit software, which uses fiducial markers and is based upon the seminal ARToolKit. The experiences themselves are usually used by a single individual sitting in front of the PC, with one or two spectators sitting next to the user or standing behind them. The camera is often the built-in device at the top of the laptop screen or an external device designed to sit on top of the monitor. Many examples also show users moving these external cameras around, although they are still usually tethered by a cable. It will be interesting to see how the spread of wireless cameras changes this spatial dynamic.
The display is usually the built-in laptop screen or a desktop monitor, so it is commonly between 15 and 24 inches, which can limit the image quality and size. This obviously has an impact on the immersiveness of the experience, but it is an environment that most internet users are now very comfortable with. The structure of this experience is quite similar to web browsing, although it would probably benefit from leaning closer to the PC gaming experience. It is often a single user experience, and there are very few if any examples that create long or ongoing engagement.
This mode is almost exclusively based upon fiducial markers and computer vision software. With the release of browser based applications using FLARToolKit or similar, it has benefited from the massive distribution delivered by the web. The big challenge is that it often requires the user to print out a marker or image, or to already have a magazine or other printed promotional material.
While a large number of internet users are now aware of, or even comfortable with, this ARUX mode, the majority of them seem to have simply watched videos of these applications rather than printing out the markers and using them themselves.
Personal
The personal ARUX mode has spread rapidly along with the recent rapid adoption of camera, compass and GPS enabled smartphones. According to some market assessments the number of smartphones has now surpassed the number of PCs and looks set to become the most widely distributed internet access method. The modern smartphone is a very personal device and the small screen format makes for a very personal media experience.
Smartphones now come with both front and rear facing cameras. By far the most broadly used experience is using the rear facing camera in Mobile AR browsers like Layar, Junaio and Wikitude. Tablets like the iPad 2 now come with 2 cameras and are likely to impact this mode heavily. All of the major Mobile AR browsers support locative AR utilising the GPS and orientation/compass sensors in the phone. Junaio Glue is also notable in its use of computer vision based Natural Feature Tracking, which uses the camera to recognise relatively natural images. Some applications are now making good use of the front facing camera to augment the view of your face, or even your hands for gestural input.
Smartphones currently provide a "keyhole" AR experience with limited image quality. However, the iPhone 4 Retina display has pushed image quality to a whole new level and this trend will only continue. The rapidly growing new segment of tablet based computers is also helping to stretch this "keyhole" into a "window" and may soon even justify adding a new ARUX mode of its own.
The structure of this experience creates a very personal and pervasive feeling. The phone is always in your pocket, and with 3G and wifi constantly available it provides an "always on" platform. However, the need to constantly hold the phone (and now the tablet) up can be quite tiring for some users. It can also create a sense of social stigma in public places or large crowds.
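As a rough illustration of the locative AR described above, the sketch below shows one way a mobile AR browser could decide where to draw a point of interest from a GPS fix and a compass heading. It is not taken from Layar, Junaio or Wikitude. The function names, the 60 degree field of view, the 640 pixel screen width and the linear angle-to-pixel mapping are all assumptions made for the example.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north (0 <= bearing < 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def screen_x(user_lat, user_lon, heading_deg, poi_lat, poi_lon,
             fov_deg=60.0, screen_width_px=640):
    """Horizontal pixel position of a point of interest, or None when it
    lies outside the camera's horizontal field of view.

    fov_deg and screen_width_px are hypothetical device values, and the
    linear angle-to-pixel mapping is a simplification."""
    poi_bearing = bearing_deg(user_lat, user_lon, poi_lat, poi_lon)
    # Signed angle between the camera direction and the POI, in [-180, 180).
    offset = (poi_bearing - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2.0:
        return None
    return (offset / fov_deg + 0.5) * screen_width_px


if __name__ == "__main__":
    # User at the origin facing east, POI one degree of longitude to the east.
    print(screen_x(0.0, 0.0, 90.0, 0.0, 1.0))
```

The "keyhole" effect mentioned above shows up here directly: anything more than half the field of view away from the compass heading returns None and is simply not drawn.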
With Layar now being used by millions of users, alongside Junaio, Wikitude and a wide range of other developing Mobile AR apps, the personal ARUX mode may currently be the most broadly adopted of all the modes. The combination of rapidly improving built-in sensors, faster mobile broadband speeds, bigger and better quality screens (including the new tablets) and lower priced devices means that this trend is likely to continue for some time.
Private
The private ARUX mode is based upon the newly developing wearable or head mounted displays from companies such as Vuzix. While these devices are now at the "early adopter consumer" price point, there are still a number of building blocks and integration steps required to really achieve this type of experience. This largely limits it to R&D labs and academic institutions. However, it is possible that this could change quite quickly over the next 1-2 years. Combining these displays with smartphones is likely to accelerate the rate of adoption once the price is reduced to a suitable level.
The camera or cameras are usually mounted on the head as part of, or attached to, the wearable display. There is a wide range of issues in terms of smoothly capturing images, encoding them and sending them to the smartphone or PC, combining them with the augmented information and then returning them to the display without creating a significant lag.
The displays currently come in two main formats: "video see through", where the images from the camera are presented on small displays in front of the user's eyes, and "optical see through", where the user looks directly at the real world environment and images are presented upon transparent/translucent lenses. Video see through devices currently have issues in terms of field of view, resolution, energy requirements and refresh rate. Optical see through devices face challenges with occlusion (i.e. completely blocking elements of the real world) and image registration. However, both fields are making rapid advances. On top of this there is a range of other technologies that are currently further out, including contact lens based displays and retinal projectors. All of these devices are usually used in conjunction with "6 degrees of freedom" or orientation tracking sensors.
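To make the lag issue above more concrete, here is a minimal sketch of a latency budget for the capture, encode, send, combine and return loop. The stage names and every millisecond figure are hypothetical placeholders chosen for illustration, not measurements of any real headset.

```python
# Hypothetical per-stage latencies in milliseconds. Real figures depend
# entirely on the camera, link and display hardware involved.
STAGES_MS = {
    "capture": 16.7,
    "encode": 5.0,
    "send_to_host": 8.0,
    "combine_with_augmentation": 10.0,
    "return_to_display": 8.0,
}


def total_latency_ms(stages):
    """Sum the latency of every stage in the loop."""
    return sum(stages.values())


def slowest_stage(stages):
    """Name of the stage that contributes the most latency."""
    return max(stages, key=stages.get)


def frames_behind(latency_ms, refresh_hz=60.0):
    """How many display refreshes elapse while one frame travels the loop.

    refresh_hz is an assumed display refresh rate."""
    return latency_ms * refresh_hz / 1000.0


if __name__ == "__main__":
    total = total_latency_ms(STAGES_MS)
    print(f"total: {total:.1f} ms, slowest stage: {slowest_stage(STAGES_MS)}")
    print(f"display refreshes behind: {frames_behind(total):.2f}")
```

Laying the stages out like this makes it easier to see which step to attack first when the augmented image visibly trails the wearer's head movement.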
The structure of this experience has the potential to be the most immersive of all the ARUX modes. However, it is currently the most invasive and often involves many cables and attachments. This too is rapidly changing and improving. Because of the cost, the technical skills required to integrate all the elements, and the physical size and nature of these wearable displays, this is currently the least distributed of all the ARUX modes. However, it is likely to be the one that most fully integrates locative AR and computer vision based AR into a truly engaging and immersive experience. It's also certainly the most fun to experiment with.
Summary
This analysis is designed to be a foundation resource for our community and we hope it will stimulate some interesting and ongoing discussion. Based on that discussion we'll review the structure and content, and update it as the market evolves. So let us know what you think by sharing your perspective in the comments below.
