Topic: Non-visual virtual reality output devices

Good afternoon. For the next half hour I will be serving you (non-visual) output devices that could be used in a virtual reality environment, so please relax and enjoy...

I will talk about three types of devices:

1) For the Main course I will talk about Hearing and sound in three dimensions

2) For Dessert I will serve you Force feedback and Tactile sensations

3) And to finish off I will get up your nose with the sense of smell

        ==========================

1) 3-D Audio

A quote from Silicon Mirage: The Art and Science of Virtual Reality, published in 1992:

  "It has been demonstrated that using sound to supply alternative or supplementary information
   to a computer user can greatly increase the amount of information they can ingest"
   
   Aukstakalnis, S., and Blatner, D. 1992. Silicon Mirage: The Art and Science of Virtual Reality. Berkeley, California: Peachpit Press, Inc.

o but, you know that, already ...
      o We live in a world with a constant influx of sounds that tells us much about our surroundings
              o slight echo / reverb gives cues about the direction / distance of objects
              o small rooms echo less than ones with cathedral ceilings
              o out-of-sight objects can be heard, acting as cue when looking for object
              o sound offers information on material, surface texture, hardness of surface, walls and floors.
              o the squelch of mud, boots on hard tiles, polished floors, thick carpets conveys much about the surfaces ...
              
      o The ability to synthesise spatial sound would greatly add to immersive virtual reality.
   
   Just as in the real world, sounds in a Virtual Reality must incorporate within them:
           o location of sound (where the sound "appears to be")
           o the path the sound has to travel to reach you
           o interaction of single/multiple sound sources,
           o background noise
           
   o Sounds are constantly around us, presenting cues about our surroundings.
   
How do we localize sound?
   
   o There are eight Localisation Cues (according to [BURGESS92])
           o Interaural time difference - the time delay between sounds arriving at the left and right ears
                   o it is the primary localisation cue for the lateral position of a sound
                   o front / behind: ~0 ms
                   o far left / right: ~0.63 ms
                   o frequency (and distance) factor in as well
                   o (a small sketch of this cue follows this list)
           
           o Head shadow - filtering of sound as it goes around or through head to reach other ear
                   o can significantly attenuate sound and filter frequencies
                   o with a static head, can cause difficulties in determining distance and direction.
                   
           o Pinna response
                   o pinna = a genus of bivalve molluscs OR the fleshy cartilage of the outer ear in humans (good word, eh?)
                   o higher frequencies are filtered, which affects the ability to perceive the lateral position (azimuth) and elevation of a sound.
                   o the pinna's "filtering" response is highly dependent on the direction of the sound source (good)
                   
           o Shoulder echo (!)
                   o Frequencies 1 - 3 kHz are reflected from upper torso of humans
                   o the reflection produces echoes that the ears perceive as a time delay, partly dependent on elevation;
                     reflectivity depends on frequency
                   o not a primary auditory cue, others are more significant for sound localisation.
                   
           o Head movement
                   o natural movement of the head is a key feature in determining a sound's source
                   o its importance increases as frequency increases (high frequencies do not "bend" around objects as much)
                   o required since (few) humans have muscles capable of moving the pinna
                   
           o Early echo response / reverb
                   o Sounds are combination of original sound and reflections off surfaces ...
                   o Early echo response occurs in first 50-100ms of sound
                   o the combination of early echo and the reverberation following seems to affect both distance and direction judgement.
                   o note the deadness of sounds in an echo-absorbing (anechoic) chamber
                   
           o Vision
                   o Seeing the sound source quickly locates it and confirms the direction ...
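
   For the curious, here is a minimal Python sketch of the first and strongest of those cues, using Woodworth's
   classic rigid-sphere approximation of the interaural time difference. The head radius is a typical textbook
   value, an assumption of mine rather than a figure from the sources above.

       import math

       def interaural_time_difference(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
           # Woodworth's rigid-sphere approximation: ITD = (r/c) * (theta + sin(theta)).
           # azimuth_deg: 0 = straight ahead, 90 = far right.
           # head_radius_m is an assumed typical value, not from the cited sources.
           theta = math.radians(azimuth_deg)
           return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

       print(interaural_time_difference(0))    # straight ahead: 0 ms
       print(interaural_time_difference(90))   # far right: ~0.66 ms, close to the ~0.63 ms quoted above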

                   
OK, now that you see that "getting the sound to come from the right direction" is no mean feat, I'm sure you understand why finding the correct relative weighting of each characteristic is still an area of research ...

Aside - sound recording ...
=======================
   o Monaural - with one microphone simulating one ear, you get no position information
   o Stereo - with two microphones separated by air simulates ears several feet apart with air between them <grin>
   o Surround sound - adds more microphones, more speakers, gives better illusion until the plane taking off goes past
                     the listener's elbow instead of overhead ... (recording =/= playing environment => problem.)
   o Binaural - recording microphones embedded in a "dummy" head - closer to what humans hear (if the dummy head is realistic)

OK, now to prepare a mapping of the sound to the perceived sound at the user's head

   One way of modelling the human acoustic system is to take binaural recordings with probe microphones in the ears
   of real people, and compare these with the original sound. A head-related transfer function (HRTF) can then be
   computed for each ear that, based on the sound source's position and frequency, attempts to account for many of
   the cues mentioned above. The HRTF can in turn be used to derive a set of finite impulse response (FIR) filters,
   one per ear, for each specific position.
   Finally, to place a sound at a certain position in virtual space, one feeds the sound through the FIR filters for
   that location to produce spatial sound (from that location); a small sketch follows below.
   o HeadZap can take the 168 HRTF measurements (in a spherical pattern) in less than 30 minutes in an unprepared room!
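
   For concreteness, a minimal sketch (in Python, using NumPy) of that final step: convolving a mono signal with a
   pair of per-ear FIR filters. The filter taps below are made-up placeholders, not real HRTF data.

       import numpy as np

       def spatialize(mono, fir_left, fir_right):
           # Convolve a mono signal with per-ear FIR filters (derived from an
           # HRTF) to place it at the position the filters were measured for.
           left = np.convolve(mono, fir_left)
           right = np.convolve(mono, fir_right)
           return np.stack([left, right], axis=1)   # stereo: one column per ear

       # Toy example: an impulse "click" and made-up 4-tap filters (NOT real HRTF data).
       click = np.zeros(64); click[0] = 1.0
       fir_l = np.array([0.9, 0.2, 0.05, 0.0])      # near ear: louder, earlier
       fir_r = np.array([0.0, 0.0, 0.5, 0.1])       # far ear: delayed, attenuated
       stereo = spatialize(click, fir_l, fir_r)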

<HeadZAP>

To put it politely, the processing required to convolve the sound signal is quite demanding; ten years ago, in 1992,
   it was considered impossible to do in real time without specialised hardware.
   In 1992 Crystal River Engineering implemented these convolving operations on a digital signal processing chip
   called the Convolvotron ("low end" two-channel version ~$1,800, four-source model ~$15,000, circa 1992).
   (The company seems to have died.)

<Convolution>

   o further support / enhancement taken over by AuSIM (which is alive)
   
   http://www.ausim3d.com/products/ausim3d.html
   o AuSIM - offers hardware and software solutions that "offer a set of mathematical algorithms, implemented in
     robust, object-oriented software that simulate the propagation of audio from an imaginary emitter (or "sound
     source") through a modeled environment to a human listener (or "sound sink"). AuSIM3D™ is the most
     physically-based 3D audio technology that exists."
   
   
   Sound rendering

<SoundRendering>

   o this starts to sound a lot like ray tracing .... <see nice graphic> - sound is reflected, refracted and absorbed
   in a similar manner to light; however, sound is also modulated over time, and multipath transmission
   adds overhead to the creation of a realistic representation.
   
   o two passes - first, calculate the propagation path from each object / source to each microphone, producing the
   transformation for each transmission path: the delay and attenuation.
                - second, create the sound, then sum all sounds to generate the final sound track.
                
   o actual sound rendering is a four-step pipelined process (sketched below):
           o create / generate object's sound (recorded / synthesised)
           o attach sound to object's movement (and listener's movement)
           o calculate necessary convolutions to describe interactions with environment and other sounds
           o apply convolution to sound sources to produce final result.
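
   As a rough illustration (not [TAKALA92]'s actual algorithm), here is a minimal Python sketch of the two-pass
   idea above under very simple assumptions: direct paths only, inverse-distance attenuation, and no filtering
   or reverberation.

       import numpy as np

       SPEED_OF_SOUND = 343.0   # m/s
       SAMPLE_RATE = 44100      # samples/s

       def render(sources, listener_pos, n_samples):
           # Pass 1: per-source propagation delay and attenuation (direct path only).
           # Pass 2: delay, scale and sum every source into the final track.
           # sources: list of (position ndarray, signal ndarray) pairs.
           mix = np.zeros(n_samples)
           for pos, signal in sources:
               distance = np.linalg.norm(pos - listener_pos)
               delay = int(round(distance / SPEED_OF_SOUND * SAMPLE_RATE))
               if delay >= n_samples:
                   continue                         # arrives after the track ends
               gain = 1.0 / max(distance, 1.0)      # crude inverse-distance rolloff
               end = min(n_samples, delay + len(signal))
               mix[delay:end] += gain * signal[:end - delay]
           return mix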
           
   o reverberation is an important spatial cue, "a convolution with continuous weighting function"
           o really just multiple echoes within the sound environment producing diffuse reflections
           o Diffraction of sound allows sound to go "around" an object - frequency dependent - and has
             a smoothing effect on the sound.
           o allows a simplified sound-tracing algorithm - which is beyond the scope of this paper <hehehe>,
             see [TAKALA92]
                   (why does this sound so much like a recent lecture on light source modelling??)
           o "This method handles the simplicity of an animated world that is not necessarily
              real-time; it is unclear how this method would work in a real-time virtual reality application.
              However, its similarity to ray-tracing and its unique approach to handling reverberation are
              noteworthy aspects." [FOSTER92]
              
   Simple (not very effective) methods
   o "Visual Synthesis Incorporated's Audio Image Sound Cube" - ignore most of the sound cues, focus instead
      on attaching sounds to objects in 3D space - create a cube of (eight) speakers (any size).
      Use pitch and volume to simulate sound location. This method is fast because it avoids the need to
      convolute the sound - much less computationally demanding - allows for much less expensive real-time
      spatial sound. (~$8,000 in 1993)
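   A minimal sketch of the volume half of that trick, assuming eight speakers on the corners of a unit cube;
   gains simply fall off with distance from the virtual source, and no convolution happens anywhere (the pitch
   manipulation is omitted). The layout and gain law are my assumptions, not the product's.

       import numpy as np
       from itertools import product

       SPEAKERS = np.array(list(product([0.0, 1.0], repeat=3)))   # the 8 cube corners (assumed layout)

       def cube_gains(source_pos):
           # Per-speaker gains from inverse distance, normalised to sum to 1.
           d = np.linalg.norm(SPEAKERS - source_pos, axis=1)
           g = 1.0 / np.maximum(d, 1e-3)            # louder when nearer the source
           return g / g.sum()

       print(cube_gains(np.array([0.1, 0.1, 0.1])))  # dominated by corner (0,0,0)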
      
   Problems with spatial sound (and some solutions)
   
   o Front-to-back reversals - sounds like it is behind, but is in front (and vice versa)
           o diminished by accurate tracking of head movement and modelling of pinna response
           o diminished by use of an additional auditory cue such as early echo response
           o inclusion of first-order echo response allowed front-to-back differentiation for most test subjects
                    [BURGESS92]
                    
   o intracranially heard sounds - "the train ran through my brain"
           o diminished by adding reverberation cues
   
   
   o HRTF measurement problems
           o probes are non-linear and prone to noise
           o speakers used to generate the test sound lack bass response. [BEGAULT90]
           o HRTF profile contains element specific to the test victim^H^H^H^H^H^H^H subject
           o diminished by using several primary auditory cues, which can result in an HRTF good enough
             for most of the population [BURGESS92]
  
   Background SoundScape
   o Less computationally challenging, but important for making the VE "believable"; it is, after all, background.
   o Non-directional (ambient?); however, in the real world a person can still pick out threads
     of conversation (the "cocktail party" effect), which requires a 3D acoustic field ...
   o Attempts to use prerecorded sounds do not work when placed in a 3D sound field: the user cannot
     interact, only observe ...
   
   
   
   Applications
   o Significant increase in immersion and aid to orientation
   o Substitution for other sensory feedback, e.g. without haptic feedback a button press could "click"
     when using a "wired glove".
   o Compensation for sensory impairment of the user. The "Mercator" project uses sound as a non-visual
     interface to X-Windows for visually impaired software developers [BURGESS92], attempting to map
     visual objects' behaviour into audio space.
   o increase "bandwidth" of communication to user.
   
                           ======================
   
   
   ==================== Spare sound stuff =================

1) 3-D Audio
        Why?
        What's in a sound?
                Eight Localization cues
                Sound recording
                Head mapping (HRTF)
                        Problems
                Convolution and rendering
                        Problems
                        Non-VR Applications
                A few words on Background sound
                
2) Tactile and Force Feedback devices
        What and Why?
        Motion Platforms
        VR (2D) mouse
        Gloves
        ExoSkeletons
        GROPE, PER handcontroller
        MagLev Haptic
        Phantom
        Butlers
        SandPaper system
        Teletact Command
        
3) Olfactory Feedback
        What is it?
        Early interest
        Difficulties
        Commercial products (are they real?)
        
        
Finis - Question time?

      
      o Background sound
              
      o requires massive computational power and speed because hearing is complex
              o it relates to the shape of the outer ear
              o microsecond delays between the ears determine the position and location of the source
              
      o to simulate a virtual sound environment:
              o determine position of sound source relative to listener
              o calculate effects of environment between source and ear
                       for example, an echo from a wall requires placement of a source apparently on the other side of the wall (sketched below)
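
       That wall example is the classic image-source trick; here is a minimal Python sketch of it, assuming a
       flat wall described by a point and a normal vector (my own formulation, not from the sources).

           import numpy as np

           def image_source(source, wall_point, wall_normal):
               # Reflect the source position through the wall plane; the first-order
               # echo sounds as if it were emitted from this mirrored position.
               n = wall_normal / np.linalg.norm(wall_normal)
               d = np.dot(source - wall_point, n)       # signed distance to the wall
               return source - 2.0 * d * n

           # A source 2 m in front of a wall through the origin facing +x:
           print(image_source(np.array([2.0, 0.0, 0.0]),
                              np.array([0.0, 0.0, 0.0]),
                              np.array([1.0, 0.0, 0.0])))   # -> [-2, 0, 0]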
                      
 
   Problems that arise:
   
   o tuning at an individual level: when sound reaches the outer ear, the outer ear bends the sound wavefront and channels it down the ear
   canal ... the sound that reaches the eardrum is different for each person.
   To get around this one can place probes in the ear and produce an HRTF; one problem is that the probe itself
   changes the acoustical path.
   o the HRTF does not take the middle or inner ear into consideration.
   
We have so far covered sound from a particular point in the user's sphere of hearing
                   
   
   
   ========================================================
   

2) Tactile and Force Feedback devices

o Lack of tangibility - no single interface currently simulates all the interactions of
        o shape, size
        o texture
        o temperature
        o firmness
        o force, mass
        
o The area of touch has two main components:
(a) Force feedback deals with resistance - walking into a wall should be, umm, noticeable
(b) Tactile feedback deals with how the object feels

o Force Feedback

        Motion Platforms
        
        o originally designed for flight simulators
        o needs to be well synchronised with visual display to be effective
        o consists of a platform with hydraulic arms capable of (limited) tilting
                eg. inverted flying is hard to simulate!
        o however, if well synchronised with visual can add significantly to immersion.
        
        
        Virtual Reality Mouse (1993) - (NOT really VR)
        
        o A US$1,305 motorised, force-feedback, talking mouse for use on Windows 95 (wow!!!)
        o When the mouse cursor contacts a (virtual) wall, extra force is required to pass through it
        o Can produce "gravity well" effect
        o Can provide damping proportional to its velocity
        o Can produce variable friction effect
        o can read any text off the screen for you
        o on the internet, you feel the dialog box, while it reads the prompt out to you!
        o Resolution 2222 dpi
        o Maximum force: 9.1 N
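
        A minimal Python sketch of what three of those effects (wall, gravity well, damping) might look like as a
        per-update 1-D force computation; the constants and the interface are illustrative guesses, not the
        product's actual API.

            def mouse_force(x, v, wall_x=100.0, well_x=50.0,
                            k_wall=2.0, k_well=0.05, damping=0.01):
                # Illustrative 1-D force on the mouse (units arbitrary, constants made up).
                force = -damping * v                    # damping proportional to velocity
                if x > wall_x:                          # stiff spring resists passing a wall
                    force += -k_wall * (x - wall_x)
                if abs(x - well_x) < 20.0:              # "gravity well" snaps toward a target
                    force += -k_well * (x - well_x)
                return force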
        
        Gloves
        
        o allow interaction with small objects and provide (some) feedback,
        e.g. holding a virtual object in your hand.
        o Rutgers Master II has pneumatic pistons in the palm of the glove
          to increase the resistance as your fingers close around it.
          
        o Begej Glove controller provides tactile and force feedback to three fingers (NFS)
          o an exoskeleton mechanism provides the necessary resistance
          o "taxels" (arrays of small tactile display elements) sit on the pads of the fingers
          
        o TiNi Alloy Tactile Feedback System (~$7,000 in 1993)
          o uses shape-memory alloys to provide temperature-based tactile feedback
          o feedback is "displayed" by heating the memory-metal elements positioned on the hand (glove)
          
        o In 2000 experiments were conducted to produce controlled resistance to movement
          o use of electro-rheological fluids (electric-field-controlled viscosity)
          o required up to 4.5 kV (low current), generated from a 9 V battery
          o the concept is to have cylinders in the glove with controllable resistance to opening/closing the fingers
            (ECS - Electrically Controlled Stiffness; a rough sketch follows)
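
        A minimal sketch of the control idea behind ECS, assuming the common first-order model in which an
        ER fluid's yield stress grows with the square of the applied field; the constant alpha is purely
        illustrative, not from the cited paper.

            def required_field_kv_per_mm(target_yield_stress_pa, alpha=100.0):
                # Invert tau_y = alpha * E^2 (a common first-order ER-fluid model)
                # to get the field needed for a target resistance.
                # alpha is fluid-dependent and illustrative only (Pa per (kV/mm)^2).
                return (target_yield_stress_pa / alpha) ** 0.5

            # Stiffer finger resistance needs a quadratically growing field:
            for tau in (100.0, 400.0, 900.0):
                print(tau, "Pa ->", required_field_kv_per_mm(tau), "kV/mm")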
          
        
        Exoskeletons
        
        o Simulate resistance of objects in a virtual world.
        o basically a robotic arm strapped onto a person,
        o An example from the Uni of Utah has a 10-DOF robot constantly updating the force to each of
          its ten joints so that the 50-pound arm appears weightless (until you wish for some "virtual force")
        o When the user touches something they feel actual forces through the exoskeleton: the user
          really can feel that they have "walked into a wall", or feel the weight of an object.
        o (Probably a good idea to calibrate the maximum force?? A gravity-compensation sketch follows.)
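
        A minimal Python sketch of the weightlessness trick for a single joint: the motor supplies exactly the
        torque gravity exerts about that joint. The real 10-DOF arm does this for every joint with full
        kinematics; the numbers here are illustrative assumptions.

            import math

            G = 9.81  # m/s^2

            def gravity_compensation_torque(mass_kg, com_dist_m, joint_angle_rad):
                # Torque the joint motor must supply so the link feels weightless:
                # gravity's torque about the joint is m*g*l*cos(theta), with theta
                # measured from horizontal and l the distance to the centre of mass.
                return mass_kg * G * com_dist_m * math.cos(joint_angle_rad)

            # A ~23 kg (50 lb) link with its centre of mass an assumed 0.3 m out, held horizontal:
            print(gravity_compensation_torque(23.0, 0.3, 0.0))   # ~67.7 N·m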
        
        GROPE-III (UNC)
        o a modified arm, previously used for radioactive-material handling, that feeds forces back to the controller
          to reflect virtual forces in the VE.
        o used to help chemists "feel" attractive and resistive forces of molecules reacting and bonding
          to each other ...
        o working volume about one cubic metre
          
        PER Force Handcontroller (Cybernet Systems)
        o a small, desk-mounted (or other flat-surface-mounted) 6-DOF force-reflection/feedback device
        o user grabs handle grip to use system (light and small)
        o forces generated by six DC servo motors.
        
        
        Magnetic Levitation Haptic Interfaces
        o provides 6DOF controller with one moving part
        o user feels the motion, shape, resistance, and surface texture of simulated objects.
        o Noncontact actuation and sensing
        o High control bandwidths
        o High position resolution and sensitivity
        o Motion range: 15-20 degrees rotation, translational 25 mm
        o Maximum stiffness of 25 N/mm at 1500 Hz servo
        o Maximum force and torque of 55 N and 6 Nm
        o Levitated flotor: 550 g, 2.5 W to levitate (a virtual-wall sketch follows)
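
        A minimal sketch of the kind of computation one tick of such a device's servo loop performs for a 1-D
        virtual wall, reusing the 25 N/mm and 55 N figures above; a real controller adds damping and runs this
        at kHz rates.

            def wall_force(position_mm, wall_mm=0.0, stiffness_n_per_mm=25.0, max_force_n=55.0):
                # One servo tick of a 1-D virtual wall: force proportional to
                # penetration depth, clamped to the device's maximum force.
                penetration = wall_mm - position_mm      # > 0 once inside the wall
                if penetration <= 0.0:
                    return 0.0                           # free space: no force
                return min(stiffness_n_per_mm * penetration, max_force_n)

            print(wall_force(-1.5))   # 1.5 mm into the wall -> 37.5 N (under the 55 N cap)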

Phantom - the one we all met upstairs at CSIRO ...



                        
        
        
        
        Butlers (Proposed at 1995 IEEE conference)
        
        o A robot that gets in the way whenever you try to go through a (virtual) object.
        o If the user reaches out to touch a wall, desk, other virtual object, a real object is
          put in the correct position.
        o claimed to be able to simulate inertia, viscosity and stiffness ...
        o able to present these properties for a single point at a time
                
                
Texture
        o Sandpaper system of MIT/UNC
        o able to accurately simulate several grades of sandpaper
        o a small mouse-like device providing sensation via perceived
          changes to the underside of the mouse, depending on the virtual surface passed over.

        o Teletact Command
           o uses air-filled bladders sewn into a glove or piezo-electric transducers
             to provide sensation of pressure or vibration
           o (piezo-electric messes up electromagnetic trackers [like Polhemus tracking system]).

   ==================== Spare Parts for tactile ==========================

Difficult because of the variety of nerve functions, i.e. temperature sensors, pressure sensors,
rapidly-varying pressure sensors, sensors that detect force exerted by muscles, and sensors that
detect hair movements on the skin...

Fingers:
   o up to 135 sensors per sq cm at finger tip
   o sensitivity up to 10,000 Hz when sensing texture, most sensitive at 230 Hz
   o forces varying at above 320 Hz are sensed as vibration
   o Force on individual fingers should be less than 30-50 N total
   o average user can exert index 7 N, middle 6 N, ring 4.5 N without fatigue
   
   ===================================================================================
   

3) Olfactory Feedback

o Olfactory interface is the least mature of all the technologies.

o Early interest was shown back in 1994 by the US military, for telemedicine, but it prompted no development at that time.
        o odour becomes important in a few simulations - most notably surgical simulations
        o they asked for research on "special" odours required for particular applications,
          like the human blood or liver smells required for surgical simulation.

Difficulties:

o humans are able to detect some odours at one part per million
o we are much better at detecting an increase than a decrease in odour
o however, it has been reported that only a third of odours can be detected
  without some other sensory input
o it becomes very important to ensure venting allows odour removal ...

Storage and delivery of odorants
o The technology for storing odorants as liquids, gels or waxes has been around for a while.
o Systems have been designed to deliver the smell to the user via air streams,
  or via a solvent gas such as carbon dioxide, through a hose.
o With the development of ink-jet printer technology, low-power systems suitable for head-mounted devices could be
  developed, allowing precise control and delivery of odorants.

  
Two commercial products
o virtual olfactory displays, also called "odour generators", for personal computers
  include AromaJet and TriSenx. (There was a third, DigiScents, but it went kaput in 4/2001.)
o The AromaJet Pinoke is planned to sell in the US$40 - $80 range and makes use of sixteen base aromas
  at 1% intervals, mixed on command (a rough mixing sketch follows). Demonstrated Dec 2000.
o AromaJet - game uses include simulating the smell of burning rubber, a hot engine, exhaust fumes.
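
  What "sixteen base aromas at 1% intervals, mixed on command" implies on the control side is just a 16-channel
  recipe vector quantised to whole-percent steps; here is a hypothetical Python sketch (this is my illustration,
  not AromaJet's actual interface).

      def make_recipe(percentages):
          # Quantise a 16-channel aroma recipe to whole-percent steps.
          # percentages: dict {channel_index: percent}, channels 0-15.
          # Hypothetical interface, purely illustrative.
          recipe = [0] * 16
          for channel, pct in percentages.items():
              if not 0 <= channel < 16:
                  raise ValueError("channel out of range")
              recipe[channel] = max(0, min(100, round(pct)))   # 1% resolution
          return recipe

      # A made-up "burning rubber" blend: mostly channel 3 with a touch of channel 7.
      print(make_recipe({3: 12.4, 7: 2.8}))   # -> 12% and 3% after quantising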

o Trisenx (live 8/2002) offers software kits and the ability to preorder the scentStation (or buy the hardware
  and get the software free!) for around US$270.
o An air draft blows the required odour towards the user within seconds ...
***Can produce flavoured wafers as well***

Problems:
o lack of standardisation and agreement on "base odours", as well as the limited range available
  in any one device, may limit wide use. (We need something like Pantone for colours, but for odours.)
  

===========================================================================


That concludes a brief summary of non-visual output devices. I expect a lot of development of
these devices and technologies, as this area seems relatively immature compared with,
say, visual output devices.

--- Any questions?

  
============================================================================
                   

References

An Investigation of Current Virtual Reality Interfaces
(http://www.acm.org/crossroads/xrds3-3/vrhci.html)
                   
Tonnesen, C., and Steinmetz, J. 1993. 3D Sound Synthesis. Encyclopedia of Virtual Environments. World Wide Web
URL: http://www.hitl.washington.edu/scivw/EVE/I.B.1.3DSoundSynthesis.html


[BURGESS92]: Burgess, David A. "Techniques for Low Cost Spatial Audio", UIST 1992.

Steinmetz, J., and Lee, G. 1993. Auditory Environment. Encyclopedia of Virtual Environments. World Wide Web
URL: http://www.hitl.washington.edu/scivw/EVE/III.A.2.Auditory.html

[TAKALA92]: Takala, Tapio and James Hahn. "Sound Rendering". Computer Graphics, 26, 2, July 1992.

[FOSTER92]: Foster, Wenzel, and Taylor. "Real-Time Synthesis of Complex Acoustic Environments". Crystal River Engineering, Groveland, CA.

[BEGAULT90]: Begault, Durand R. "Challenges to the Successful Implementation of 3-D Sound", NASA-Ames Research Center, Moffett Field, CA, 1990.

Force and Tactile Feedback - Corde Lane and Jerry Smith
http://www.hitl.washington.edu/scivw/EVE/I.C.ForceTactile.html

Magnetic Levitation Haptic Interfaces - Peter J. Berkelman and Ralph L. Hollis
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/msl/www/haptic/haptic_desc.html

Controlled Compliance Haptic Interface Using Electro-rheological Fluids
http://cronos.rutgers.edu/~mavro/papers/EAPAD_erf_5.PDF

Virtual Reality: Past, Present, and Future - Enrico Gobbetti and Riccardo Scateni
http://www.crs4.it/vvr/bib/papers/vr-report98.pdf

Virtual olfactory interfaces: electronic noses and olfactory displays - Fabrizio Davide, Martin Holmberg, Ingemar Lundström
Communications Through Virtual Technology: Identity, Community and Technology in the Internet Age,
edited by G. Riva and F. Davide, IOS Press: Amsterdam, 2001
http://www.psicologia.net/vrbook/chapter_12.pdf