
Virtual reality: what to expect in the next 20 years?

Immersive technologies have entered many professional sectors. Beyond a few leisure applications, will they reinvent our daily lives? Over the next twenty years, a number of technical developments could accelerate their adoption. But we should not imagine that virtual reality will spread at the speed of the digital wave, because its uses deeply involve human sensorimotor behavior.

April 2018
Executive Summary

With the democratization of virtual reality (VR), augmented reality (AR) and mixed reality, these technologies are no longer reserved for professionals. New applications are emerging for recreational and artistic activities. Personal, everyday applications are also possible, but less obvious. Forecasts for them seem very optimistic, especially those that assume people will wear AR glasses or an AR headset. Recreational activities, on the other hand, will readily exploit VR+ techniques in amusement parks and arcades (VR+: virtual reality, augmented reality and mixed reality). But VR games will have to compete with traditional video games, even once headsets are cheap and accessible. The same could be said of immersive social networks, which will compete with standard social networks.

The uses of VR+ techniques in professional fields (industry, health, training, education, urban planning, architecture, etc.) have been well documented and established for at least ten years. The media frenzy will encourage all companies in these sectors, and those in new sectors (marketing, retail, leisure, media, etc.), to at least consider exploiting these techniques, which now require less investment than they did ten years ago.

What technological developments can we expect over the next ten or twenty years that would broaden the spectrum of uses?

Here are the main technical advances we can hope for within 20 years. They would improve existing applications or open up uses that are not yet possible.

Visual interfacing

Future headsets need improvements in five main areas:

. a better visual quality of the screens, with a much higher resolution (the main current defect);

. horizontal and vertical fields of vision identical to human vision;

. eye tracking;

. adaptive accommodation at pixel level;

. improved freedom of movement by removing the cables connected to the computer.

It will take at least a quarter of a century to have headsets that provide all these improvements simultaneously. We have already waited that long for headset prices to come down! For AR headsets, improvements can be expected from Magic Leap and HoloLens products. Their main advance is not adaptive accommodation in visual rendering, but the 3D scanner, which performs real-time 3D reconstruction and analysis of the surrounding environment. The purpose of real-time scanning is to allow virtual entities (the so-called "holograms") to be integrated visually and correctly into the real world.

Currently, there are understandable restrictions: 3D reconstruction is limited to simple, geometrically structured environments composed of simple geometric objects such as planes and cylinders. We are not close to being able to simulate, in augmented reality, a virtual sea urchin coming into contact with a real hedgehog... According to the latest information, the Magic Leap glasses will offer adaptive accommodation on three planes at different depths. Their horizontal field of view for displaying virtual entities should be a little larger than that of the HoloLens, but it remains too narrow for certain applications. Unlike the HoloLens, which is a visor headset, the Magic Leap glasses have the disadvantage of frames that completely block peripheral vision of the real world.

There are projects for high-resolution headsets. Google has a project targeting 20 megapixels per eye. The Finnish start-up Varjo Technologies is developing a 70-megapixel display technology using Full HD OLED microdisplays. The headset would have two types of display per eye: a conventional background display of 1080×1200 pixels at 90 Hz, and a 1920×1080-pixel microdisplay showing a high-resolution image at the point where the optical axes of the moving eyes are directed. The optical solution is not known and may be based on micromirrors. The difficulty is to achieve this visual rendering with very low latency, because eye movements are extremely fast. Headsets built around smartphones have no medium-term future.
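To give a sense of scale for the resolution gap described above, here is a minimal Python sketch that converts a display's horizontal pixel count into an average number of pixels per degree of visual angle. The pixel counts come from the text; the two fields of view (100° for the background display, 20° for the microdisplay) and the target of 60 pixels per degree (roughly one arcminute per pixel, a common approximation of normal visual acuity) are illustrative assumptions, not figures published by Varjo or given in this article. The calculation is a simple average that ignores lens distortion.

```python
# Rough angular-resolution arithmetic for a two-display headset.
# ASSUMPTIONS (not from the article): both fields of view and the
# 60 pixels-per-degree acuity target are illustrative values only.

ACUITY_TARGET_PPD = 60.0  # assumed target: about one arcminute per pixel


def pixels_per_degree(horizontal_pixels: int, horizontal_fov_deg: float) -> float:
    """Average pixel density across the horizontal field of view."""
    if horizontal_fov_deg <= 0:
        raise ValueError("field of view must be positive")
    return horizontal_pixels / horizontal_fov_deg


# Pixel counts quoted in the text; the fields of view are hypothetical.
background_ppd = pixels_per_degree(1080, 100.0)  # background display
foveal_ppd = pixels_per_degree(1920, 20.0)       # high-resolution inset

if __name__ == "__main__":
    for name, ppd in (("background", background_ppd), ("foveal inset", foveal_ppd)):
        verdict = "meets" if ppd >= ACUITY_TARGET_PPD else "falls short of"
        print(f"{name}: {ppd:.1f} px/deg, {verdict} the assumed target")
```

Under these assumptions the background display alone stays far below the acuity target, which is why a small high-density inset that follows the gaze is attractive, provided the latency problem can be solved.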

Regarding components, research is under way on optical lenses that are more effective yet thinner. We know that the main difficulty in designing a headset lies in the optical system, more than in the quality of the screens. Lenses with flat surfaces use nanostructures to concentrate light; a team at the Harvard John A. Paulson School of Engineering and Applied Sciences has developed prototypes. Future headsets will also need more powerful graphics cards to display images at higher resolution. Graphics card manufacturers such as Nvidia are working on this and should be able to keep pace with the increase in screen resolution without too much difficulty. Manufacturers are developing stand-alone headsets, or at least wireless ones with sufficiently capable wireless links. Other manufacturers, such as Royole Corporation, are developing headsets for watching conventional films, using flexible screens of quite high resolution (a density of 3000 pixels per inch). This use can be justified for people on the move, as with the service offered to airline passengers by the Skylight company, which lets them watch movies individually during the flight.
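The link between screen resolution and graphics power can be illustrated with a back-of-the-envelope calculation. The sketch below, in Python, compares the number of pixels per second to be rendered for the 1080×1200 display at 90 Hz quoted earlier with a 20-megapixel-per-eye display. It assumes that the 20-megapixel display would also run at 90 Hz, which the article does not state, and it counts raw pixels only, ignoring any rendering optimization.

```python
# Back-of-the-envelope pixel throughput for a stereo headset.
# ASSUMPTION: the 20-megapixel display is refreshed at 90 Hz as well;
# the article gives no refresh rate for it.


def pixels_per_second(pixels_per_eye: int, refresh_hz: int, eyes: int = 2) -> int:
    """Raw number of pixels to render each second for all eyes."""
    return pixels_per_eye * refresh_hz * eyes


current = pixels_per_second(1080 * 1200, 90)   # display quoted in the text
future = pixels_per_second(20_000_000, 90)     # 20 megapixels per eye
ratio = future / current

if __name__ == "__main__":
    print(f"current: {current:,} pixels/s")
    print(f"future:  {future:,} pixels/s")
    print(f"ratio:   about {ratio:.1f}x")
```

Under this assumption the rendering load grows by a factor of roughly fifteen, which gives an order of magnitude for the extra graphics power involved.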

Regarding smartphones and augmented reality, the camera manufacturer Red has used nanotechnology components to develop a smartphone with a holographic display. Compared with a conventional smartphone, the observer perceives objects better in three dimensions, thanks to binocular vision and, above all, to the change in the observer's point of view when the screen is rotated (a monoscopic cue: motion parallax).

Regarding visual interfaces for augmented reality, it is illusory to imagine contact lenses that would dynamically display synthetic images from the right point of view, as in the science fiction movie Minority Report. Even though a company has filed patents with this aim, there are too many insurmountable technological difficulties, compounded by the very strong constraints imposed by the eyes, which move in very fast jerks (saccades) and make continuous micromovements when they fixate on a precise area.

Regarding CAVEs (3D immersive cubes) and other large-screen devices, we should not expect a rapid drop in prices, even though some companies market CAVEs for tens of thousands of euros, which allows SMEs to invest in such equipment. A significant fall in prices may have to wait, in the medium or long term, for video projection to be replaced by large flat screens, rigid or possibly flexible.

Capturing a real environment

There is research and development on volumetric 360° cameras that allow partial 3D reconstruction. This type of capture lets you see the scene better by moving slightly, from left to right or from bottom to top, relative to your original position. If you want the same freedom to move around and manipulate objects as in a computer-generated virtual environment, you need a 3D reconstruction of the real environment filmed by the 360° camera. There are several solutions, one of which is to couple a 3D laser scanner with the 360° camera. But this only makes it possible to move through a 3D environment that is not populated with filmed actors or 3D-modeled virtual characters, which is another great difficulty (see the next section).

Modeling and animation of avatars or virtual characters

In virtual reality, the term avatar refers only to the representation of the user immersed in the virtual environment. The role of an avatar can be exploited to allow the user to:

. interact more easily in the environment, because users who see their avatar get visual feedback on their sensorimotor actions. Seeing the avatar can therefore help users carry out their actions better when they are not co-located with it;

. communicate better with other people who are also immersed and represented by their avatars;

. have a representation of their body, of their body schema, for its psychological effects, in particular in virtual therapies, where the body schema has an important psychological impact on rehabilitation.

Modeling an avatar or a virtual character is not simple. There is still a lot of research and development work to do before any developer has access to a “ready to use” virtual character library.

The difficulties of modeling and animating a virtual character lie at four levels: biomechanical (sensorimotor) modeling; cognitive modeling, which provides the AI needed to understand a situation; behavioral modeling, which allows an autonomous character to act on its own (movement, manipulation and communication through artificial dialogue); and the modeling of emotions, expressed through the face and gestures. Given such a list, it is easy to understand that the difficulties are very great and that it will take years before characters can be modeled and animated in real time with all the characteristics required for any VR application. They will also have to act autonomously in response to user behavior, thanks to algorithms based on AI techniques.

For biomechanical modeling, effective solutions now exist, but they require considerable resources and time to produce the avatars of one or two people at a time, as Microsoft has done. For the modeling and animation of a face, the French company Eisko, with its rig made up of a large number of cameras, shows what can be achieved so far: real-time facial expressions are now photorealistic. A character can then be animated in real time in virtual reality software. With limited technical means, it is possible to model a face or simple objects using only a smartphone. The result is obviously of poorer visual quality, due among other things to the lack of lighting control. This solution is offered on some smartphones, including Sony's, for use in augmented reality.

Behavioral interfaces

Beyond headsets, the evolution of other sensory, motor or sensorimotor interfaces will be slow, as it has been for a quarter of a century. No, technology is not evolving faster and faster... The technological barriers are now well identified. Prices for certain types of interfaces will gradually decrease, except for force feedback interfaces and motion stimulation interfaces, which will remain out of reach for personal purchase and home use.

Force feedback interfaces, which can exert forces on the user's body (mainly the hands) and so block movements, should not be confused with the “haptic suits” worn by the user, which create sensations of effort in the muscles by electrostimulation. A suit of this type, such as the Teslasuit under development, applies no external force to the body and therefore does not block movement.

Manipulating a virtual object is easy with any controller. But manipulating it with the fingers while producing realistic haptic sensations (touch, plus feedback of internal and external forces) poses huge mechatronic difficulties: the glove would need realistic tactile and force feedback on every finger. It will probably never be possible to make a perfect glove, however the techniques evolve. Technical compromises will always have to be made. Tactile gloves that provide only mechanical touch feedback, and possibly thermal variation, are on the other hand easy to design and effective, even if the tactile stimuli are approximate and unrealistic. They still have to find a market in order to sell in greater numbers, which is not a given because their actual uses are limited.

Some dream of exploiting brain-machine interfaces (BMIs), which produce a spectacular effect by letting users move virtually or manipulate virtual objects by thought. But beyond this surprising effect, BMIs will be of little use for VR+ applications, as they impose an inappropriate and counterproductive cognitive overload on the user.

Olfactory interfaces have already been developed, but they will not see significant commercial development, because in many consumer applications they serve no purpose. Moreover, they are very cumbersome to manage, since the odor chemicals have to be stored.

Anthropo-technico-economic conditions and limitations

The time it takes for a large number of consumers to adopt a technological innovation has grown steadily shorter over the last century. It took more than fifty years for air travel and the telephone to become widespread, forty years for the radio, twenty years for most people to own a television, fifteen years for daily use of the laptop, less than ten years for surfing the Internet and about three years for the spread of the iPod. But it will not be the same with virtual reality, because its uses deeply involve human sensorimotor behavior, which was not the case for the other technological innovations mentioned above.

On the other hand, this does not mean that it will take half a century for VR+ techniques to be used by the greatest number of people. Some VR+ applications will be used daily, while others will be used occasionally in special situations. Techno-economic change will be slower than some predict, but it will happen gradually, come what may. VR+ techniques are partly based on computer technology, so it is not the evolution of computing, always very fast, that will hold back the development of VR+ applications. As we have said, it is not just a question of processing data. It is a question of acting, of living, in an artificial environment with all its anthropo-technico-economic constraints and limits.

The transition from analog to digital is a real upheaval for sectors that operated with analog devices: television, radio, cinema, photography, etc. The upheaval lies not only in the technical transformation of the equipment and working methods of these industries, but also in the opening of new uses. One of the most emblematic is the move from the conventional camera to the 360° camera (impossible to build in analog), leading to a new art: VR movies! For the moment, not all the potential of digital technology has been exploited in VR+, given the difficulties of modeling and animating the artificial environment and virtual characters that must have a certain autonomy of action. Another difficulty is to model in real time the user's behavior and the actions they undertake, in order to assist them or to offer an artistic work adapted to their attitude. This will only be possible with the development of AI algorithms.

Regarding the user's behavior, the risk of discomfort, up to and including kinetosis (motion sickness), is not an economic brake for any designer who knows how to control all the sensorimotor inconsistencies of their application, and not just the single one commonly labeled cybersickness.

The three main inconsistencies are the latency induced by the headset, the oculomotor inconsistency and the visuovestibular inconsistency. Removing the negative health impacts is the responsibility of headset manufacturers for the first, and of the developer for the third; the second is also the developer's responsibility for as long as headset manufacturers do not offer adaptive accommodation displays. If developers create other sensorimotor inconsistencies, especially for unrealistic LCAs, they must control those inconsistencies as well. Every user will be more or less sensitive to these inconsistencies. Developers will have to test their application on at least a panel of users (and not on themselves!). For health and economic reasons, it will be preferable for the VR application to adapt to each individual's sensitivity. It is illusory to think that, in the medium or long term, all individuals will adapt to all types of sensorimotor inconsistencies. Let us note that humans have been sailing for 130,000 years and some of us still get seasick!

All areas of media, arts, culture and communication are affected by the transition to digital. Before it, sectors of activity were separated technically and therefore economically. This is no longer the case, and every sector can encroach on the others. Since all media are digital, there are no longer any impassable boundaries, at least at the technical level, between television, video games, social networks, cinema and future VR+ applications for the general public: the Internet offers films and videos, social networks and search engines have captured the majority of advertising, and television can be watched on a smartphone or a computer, on which one can also read newspapers or launch a video game, some of which provide training or interactive artistic experiences, etc.

Overall, all sensorimotor activities, not just the two-sense activity of “seeing and listening”, can be performed on many digital devices accessible to and used by all. In this new technico-economic context, how will VR+ applications, professional or not, develop, given that the large international companies, GAFAM and others, now have the financial power to serve all these uses (media, arts, culture, recreational activities, education and communication), and even to impose them worldwide? Can economic ecosystems join forces, or will some absorb others? It is at least certain that the uses of VR+ will be as varied as the uses of computers, even if this is not yet the case.

This text is taken from the new book by Philippe Fuchs, Théorie de la réalité virtuelle. Les véritables usages (A Theory of Virtual Reality: The Real Uses), published by Presses des Mines in April 2018.

Philippe Fuchs
Professor, chair Robotics and VR, PSA Peugeot-Citroën / Mines ParisTech - PSL