Canadian Network for Inclusive Cultural Exchange

5: Representations of Visual Geo-Spatial Information

5.1 Introduction

There is a strong visual bias to web content. This document aims to provide some solutions to barriers in accessing online geo-spatial content, particularly to those who are blind or have low vision. It discusses sensory modality translation techniques, considerations and guidelines, resources, and the design of recommended digital media formats capable of conveying spatial information. Some of the content involves relatively new technologies, such as tactile interfaces. It is hoped that government on-line content designers can benefit from this document, and in turn, a larger spectrum of their clientele.

This discussion is primarily about tactile and audio cues in electronic map and web3D environments: how to enhance detail, give context, and ease navigation and way-finding. It builds on previous work done at the Adaptive Technology Resource Centre of the University of Toronto. The CANARIE Health Education Network project "Adding Feeling, Audio and Equal Access to Distance Education" had the technical objective of facilitating the creation of accessible environments, and included a model courseware module designed to assist in teaching geography to students who are blind. Major research influences on this document were:

  • Carleton University's Drs. Taylor and Siekierska et al., whose work on audio-tactile mapping on the web with voice annotations and haptic feedback can be found on the Natural Resources Canada site http://tactile.nrcan.gc.ca.
  • Drs. Bulatov and Gardner at Oregon State's Science Access Project, who gave several new paradigms for tactile information display.
  • Dr. Calle Sjöström at Certec, Lund Institute of Technology, Sweden, on non-visual haptic interaction design.
  • Dr. Norm Vinson of the National Research Council's Institute for Information Technology, who has kindly provided his research on navigation in virtual environments.

This document starts with the more familiar, computer-based 2D geo-spatial context, usually in the form of maps, and explores transformations from visual media to audio modes, progressing to the sense of touch. The on-line graphical format Scalable Vector Graphics (SVG) is a primary focus, with resources and application techniques provided. The Web-based 3D technologies of Virtual Reality Modeling Language (VRML) and its offspring X3D are in turn explored, with more resources and application techniques on the issues. A discussion of sensory modality complementarity and supplementary guidelines on designing for navigation and way-finding in virtual environments wraps up the document.

5.2 2D Content

Online visuals no longer entail simple static imagery. The complexities of information display call for more intuitive ways of accessing meaning through alternate or concurrent channels. Most online geo-spatial content takes the form of maps, which are not only essential but extremely valuable tools. Digital maps on the Web are often provided through increasingly flexible Geographic Information Systems (GIS) and allow people from different sectors and disciplines to make and use maps to analyze data in sophisticated ways.

5.2.1 Translations: Visual to Sound

In cartographic documents, symbols are defined as "the graphic elements shown on a map designed to represent geographic features or communicate a message, at a given scale" (The Atlas of Canada, Natural Resources, atlas.gc.ca/site/english/learning_resources/carto). Map symbols can be pictorial or abstract in nature. Similarly, aural symbols can be either metaphorical or abstract in nature. In a cartographic environment there is also the possibility of providing the user with sound actually recorded at the site being represented. While this possibility remains untested, possible uses for this type of representation are worth exploring. In this section, text will also be discussed as a possibility for representing elements of a cartographic document.

Aural Text (Voices)
Textual descriptions of objects are the most intuitive way to describe elements of a map. For example, having a voice say "school" easily and informatively does the task of identifying a school symbol. Aural text can also provide additional information about the school without a steep learning curve for the user (provided the user speaks the language being spoken in the aural text).

There are two ways aural text can be integrated into a user interface. Recording a person's voice saying the text is perhaps the most obvious way to include aural text. The sound files generated in the recording can then be attached to the appropriate objects to be played according to the parameters of the interface (activating the sound on mouse over, on focus, on keystroke).

Consideration: Currently Adobe's SVG viewer will play wav and au files. The SVG standard has yet to implement a standardized way of attaching sounds to objects in an SVG environment. The delay in this implementation was caused by the lack of an open source sound format. Currently the only open source sound format is ogg. Future SVG browsers will probably be capable of playing ogg files.

The second way to include aural text as part of an interface is to implement a text to speech engine, which will allow the computer to speak any electronic text out loud in a synthesized voice. When a text to speech engine is implemented all that need be done is attach the electronic text version of what is to be said to the object it is to represent. Aural interfaces, such as screen readers, implement such text to speech engines.

Consideration: A significant advantage of this method of including aural text is the saving of file size. Electronic text takes up very little space and leaving the creation of the audio to the interface is very efficient. However, the trick to successfully making electronic text available to a text to speech engine is knowing where to put it. In most well developed technologies an aural interface such as a screen reader will be able to flexibly access any electronic text contained in a document. In the case of SVG, a technology that is still early in its development, a screen reader may not be able to find electronic text unless it is put in a very specific place in the file. So far we have been able to show that a screen reader can access electronic text contained in the "title" tag, but not the "label" or "text" tags.
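The placement described above can be illustrated with a minimal Python sketch that builds a map symbol whose first child is a "title" element carrying the text to be spoken. The function name, the element id and the circle used as the symbol are hypothetical, chosen only for illustration; whether a particular screen reader actually reads the title still depends on the SVG viewer in use.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)


def labelled_symbol(symbol_id, x, y, spoken_text):
    """Return an SVG <g> whose first child is a <title> holding the aural text.

    The id, coordinates and circle shape are illustrative assumptions.
    """
    group = ET.Element(f"{{{SVG_NS}}}g", {"id": symbol_id})
    title = ET.SubElement(group, f"{{{SVG_NS}}}title")
    title.text = spoken_text  # text a screen reader may pick up
    ET.SubElement(group, f"{{{SVG_NS}}}circle",
                  {"cx": str(x), "cy": str(y), "r": "5"})
    return group


svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "200", "height": "200"})
svg.append(labelled_symbol("school-1", 40, 60, "School"))
markup = ET.tostring(svg, encoding="unicode")
```

Keeping the text in one predictable element also means the same string can be reused by a text to speech engine without any recorded audio.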

When a designer is creating aural text for an interface there are several things that should be kept in mind. The text should be as clear and concise as possible. Just as map symbols visually communicate information in clear and concise ways the aural text that is attached to a symbol must be equivalently clear at delivering the information in an aural format. If recorded sound is being used the quality of sound must be such that the voice is easily comprehended.

If more than one voice is being used to disseminate the information there are some additional factors the designer must consider. When a person listens to disembodied voices, the natural thing for the person's mind to do is to establish context. The easiest way to do this is by establishing what relationship the voices have. If two voices are present, the listener will try to establish what role each voice is playing (teacher/student, boss/employee, buyer/seller) and which voice is dominant. As the number of voices increases, the number of possible relationships between the voices grows rapidly. As it is not desirable to have the user preoccupied with trying to determine which voice is speaking in what context, it is best to keep the number of voices being used to a minimum. While it is important to minimize the number of voices being used, multiple voices can assist the user in quickly comprehending information if different voices are used to delineate types of information. For instance, one voice may be used to give directional information, while another is used to voice basic symbol titles, and perhaps yet another voice gives geographic information.

Consideration: Use two different voices in the description of an object, one voice to give the type of data, another voice to deliver the specific data itself. For example:
Voice 1: School
Voice 2: University of Toronto, St George Campus

Abstract sounds
Sounds of an abstract nature comprise a category that encompasses all the beeps, whistles and buzzes computer applications make to convey information. The sounds may also be musical in nature. It is very easy to make this sort of sound short with respect to time and small with respect to storage space. Sounds are very good for giving cause and effect related feedback to users and to indicate the state of an element. The other value of abstract sounds is that it is easy to sonify data about the object being represented. For instance, a sound representing a topographical line might be higher or lower depending on the height the line represents. This gives the user information about what map symbol they are viewing and the approximate value it represents. This sound variation is an excellent way to communicate a relatively large amount of information in a very short amount of time. There is a long history of aural interfaces being used to sonify data as in the case of hospital heart monitors or Geiger counters.

Consideration: Of all the attributes a sound can have (pitch, duration, volume) pitch or change in pitch is the most effective at communicating the state or change in state of an object (Walker and Ehrenstein, 2000).
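As a rough illustration of sonifying a topographical line by pitch, the following Python sketch maps an elevation onto a tone frequency. The elevation range, the frequency range (220 Hz to 880 Hz) and the exponential mapping are assumptions made for this example, not values from the research cited above.

```python
def elevation_to_frequency(elevation_m, min_elev_m=0.0, max_elev_m=1000.0,
                           low_hz=220.0, high_hz=880.0):
    """Map an elevation to a pitch: higher contour lines give higher tones.

    All default ranges are illustrative assumptions. The mapping is
    exponential so that equal steps in elevation are heard as equal
    musical intervals.
    """
    if max_elev_m <= min_elev_m:
        raise ValueError("max_elev_m must be greater than min_elev_m")
    t = (elevation_m - min_elev_m) / (max_elev_m - min_elev_m)
    t = min(1.0, max(0.0, t))  # clamp elevations outside the range
    return low_hz * (high_hz / low_hz) ** t
```

With these defaults a contour line halfway up the range sounds one octave above the lowest line and one octave below the highest.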

While abstract sounds can be used to identify objects, it quickly becomes difficult for a user to memorize a large list of objects represented with abstract sounds. Abstract sounds may also have musical qualities (melody, harmony, rhythm) that can be used to enhance their ability to contain information. This possibility is useful when expressing more complex data sets. A set of abstract sounds with musical attributes could be used to give a user a summary of the type and scope of the elements contained in a geographic area (Joseph and Lodha, 2002).

Consideration: Rhythm is the most memorable element that a group of sounds can have. A list of abstract sounds that employ rhythm as one of their defining characteristics will be easier for a user to remember.

Aural metaphors
Aural metaphors are sounds that are reminiscent of sounds in the real world. It would seem that these types of sounds would be a natural fit for representing symbols on a map, as a map is a representation of the real world. These types of sounds work well for representing objects as the sound itself assists the user in identifying or recalling what the associated object is. This basis in the real-world can greatly reduce the learning curve a user goes through as they learn to read the map, and reduce reliance on the legend.

Consideration: Aural metaphors work very well for representing real world objects, such as types of buildings (school, bank, museum), areas (park or quarry), or infrastructure (roads, highways, transit or rail). There are some exceptions to this general rule of thumb. There are some real world objects that do not make sound or are not intuitively associated with a sound, such as a mountain range or even a generic building. Maps also represent abstractions of the real world such as political boundaries or topographical measurements. Aural metaphors don't naturally represent these mapping elements.

Most objects in the real world that easily have sounds associated with them do not make a single type of sound. For instance, a park can easily be associated with the sound of children playing, dogs barking, playground equipment squeaking, or people talking. A designer must approach most objects that will be represented with an aural metaphor with the intention of using synecdoche, where the part will represent the whole. For example, a park may be represented by the sound of a child's laugh.

It is difficult to control the length of aural metaphors as they must endure long enough to allow the metaphor to be recognized, and many real world sounds have an absolute length. Lengthy representations of symbols will slow the user down while looking for the information they want, and file sizes will be larger. In many cases this length may be unavoidable, in which case the effect on the ability of the user to efficiently read the map must be considered.

Consideration: If a real world sound is to be edited for length there are some things to keep in mind. The human ear identifies a sound based on its attack (the beginning of a sound) and its decay (the end of a sound). Any changes to the attack or decay of a sound will affect the listener's ability to identify the sound. The body of a sound (the middle part) is the most generic to the human ear. The body tells the listener how loud the sound is and how long it lasts, but its quality is not used to identify the sound. If a sound is to be changed in length, the changes must be made to the body of the sound if it is to remain identifiable to the human ear.
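The editing rule above can be sketched in Python as follows. The function is hypothetical and treats a sound as a plain sequence of samples; the attack and decay lengths are supplied by the designer, since deciding where they end is a judgment made by ear. A real editor would also crossfade at the splice point to avoid an audible click, which this sketch leaves out.

```python
def shorten_sound(samples, attack_len, decay_len, target_len):
    """Shorten a sound by cutting only its body, keeping attack and decay intact.

    samples: sequence of audio samples.
    attack_len, decay_len: number of samples to protect at the start and end.
    target_len: desired total length in samples.
    """
    if attack_len < 0 or decay_len < 0:
        raise ValueError("attack_len and decay_len must not be negative")
    if attack_len + decay_len > len(samples):
        raise ValueError("attack and decay are longer than the sound itself")
    if target_len < attack_len + decay_len:
        raise ValueError("target is too short to keep the attack and decay")
    if target_len >= len(samples):
        return list(samples)  # nothing to cut

    decay_start = len(samples) - decay_len
    attack = list(samples[:attack_len])
    body = list(samples[attack_len:decay_start])
    decay = list(samples[decay_start:])

    body_keep = target_len - attack_len - decay_len
    return attack + body[:body_keep] + decay
```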

Location based sounds
There is a fourth possibility for representing elements on a map with sound that could be referred to as location based sounds. These sounds would be used in the representation of actual places and would consist of a recording taken from the place being represented. For instance a location based sound for a specific park would involve recording a short (and hopefully representative) amount of time of the sounds from the park. Such a representation could give much information about the element being represented to the listener, who upon listening to the sound could potentially determine whether there was a playground in the park, whether there was a body of water in the park, or whether dogs were commonly found in the park. To the best of the writer's knowledge, this sort of representation has never been implemented in a geo-spatial context, and the practical usefulness of such representations is not known.

Spatialized Audio

Maps express geo-spatial data in a two dimensional, visual, symbolic format. When considering an aural alternative to visual geo-spatial maps it seems natural to take advantage of our ability to hear in three dimensions.

Aural displays (generally, acoustic speakers) are capable of creating the illusion that a sound is coming from a specific location. A stereo auditory display uses two speakers to generate the illusion of a sound coming from a specific spot, which can be located anywhere between the two speakers. This illusion, called the stereo image, is created by playing the sound louder on one speaker, creating the impression that the sound is located between the two speakers, though closer to the louder. The successful creation of a stereo image depends on the correct placement of the speakers, preferably at ear level and at the correct distance apart. This placement and method can provide the illusion of sound coming from a very specific location. A sound designer attaches panning or EQ effects to a sound to allow it to be displayed in three dimensions by a stereo system.

Panning is a feature of an electronic sound that has been authored for spatialized display systems, such as stereo or surround speakers. The panning effect is measured in percentages of left or right; centered referring to the state of being 0% left and 0% right. If panning is set to 100% right, the stereo image will give the illusion of a sound source coming from the right speaker. Likewise, panning 100% left gives the illusion of sound coming from the position of the left speaker. This is not to say that when panning is set to 100% right, there is no sound coming from the left speaker. If a person hears a sound that comes from their front-right, their left ear will still pick up some of that sound. To create a true illusion of sound coming from the far right, some sound would still have to come from the left speaker. The further an object is, the more sound a person will hear in the ear furthest from the source. The closer an object, the less sound a person will hear in the ear furthest from the source, until the source is so close that the sound is only heard in one ear. (Thus, the effect of something whispering in one ear.) We call this attribute of panning divergence, which can be manipulated by a sound designer to further create the illusion that a sound is coming from a specific location. If divergence is set to 100%, the sound source will seem to be extremely close to the listener. If divergence is set to 0%, sound will always be equal in both speakers, creating the effect of a distant sound coming from beyond the horizon.
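One way to picture how panning and divergence interact is the Python sketch below, which turns the two settings into a pair of speaker gains. The linear blend is an assumption made for illustration and is not taken from any particular audio tool; pan is expressed here as a value from -1.0 (100% left) to 1.0 (100% right) and divergence as a value from 0.0 to 1.0.

```python
def stereo_gains(pan, divergence):
    """Return (left_gain, right_gain) for a pan and divergence setting.

    Assumed model: at divergence 0.0 both speakers play equally (a distant
    sound); at divergence 1.0 the sound follows the pan fully (a very close
    sound). Values in between blend the two linearly.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be between 0.0 and 1.0")

    right_share = (pan + 1.0) / 2.0  # 0.0 = fully left, 1.0 = fully right
    left_share = 1.0 - right_share

    left = (1.0 - divergence) * 0.5 + divergence * left_share
    right = (1.0 - divergence) * 0.5 + divergence * right_share
    return left, right
```

Under this model a sound panned 100% right with divergence at 50% still sends a quarter of its level to the left speaker, matching the observation that the far ear continues to hear part of a sound coming from the right.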

Consideration: Remember that panning and divergence have relative values, displayed differently on different sets of stereo speakers. It is impossible to express absolute proximity or position using panning. The position of the sound source will stay relative to the position of the two speakers, with the absolute position varying from one set of speakers to another.

EQ (equalization) is the determination of the dominant frequencies in a sound. The effect of EQ on the spatial effect is related to how distinctly a sound stands out when played in the context of other sounds. If the frequencies that are distinguished in a sound are unique in the soundscape, the human ear will easily hear it as being in the foreground and distinct. A sound that shares its dominant frequencies with other sounds in a soundscape will "blend," causing the sound to recede into the aural background.

If a designer creates a group of sounds playing at the same time in order to depict a complete set of geo-spatial data, distinctness is vital. Ensuring that each sound has its own space in the spectrum of frequencies will help users distinguish different sounds from one another. The process of consciously doing this is called "carving EQ holes."

Consideration: When carving EQ holes, determine whether the sound that is to be predominant (or made "present") is in the low (20Hz-100Hz), mid-low (100Hz-500Hz), middle (500Hz-2kHz), mid-high (2kHz-8kHz) or high (8kHz-20kHz) frequency range. Next determine what the central frequency of the sound is, and boost it. Take all other sounds in the group sharing the same frequency range with the sound in question and reduce that same frequency.
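The procedure above can be sketched in Python as a small planning function. The band limits come from the text; the boost and cut amounts (3 dB) and the decision to place a boundary frequency in the upper band are assumptions for illustration only.

```python
# Frequency ranges named in the text, as (name, lower Hz, upper Hz).
BANDS = [
    ("low", 20, 100),
    ("mid-low", 100, 500),
    ("middle", 500, 2000),
    ("mid-high", 2000, 8000),
    ("high", 8000, 20000),
]


def band_of(freq_hz):
    """Return the name of the range containing freq_hz.

    A frequency on a shared boundary is assigned to the upper range
    (an assumption; the text does not say).
    """
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    if freq_hz == 20000:
        return "high"
    raise ValueError("frequency is outside the 20Hz-20kHz range")


def carve_eq_hole(featured, others, boost_db=3.0, cut_db=-3.0):
    """Plan EQ changes that make one sound stand out.

    featured: (name, central_frequency_hz) of the sound to make present.
    others: list of (name, central_frequency_hz) for the remaining sounds.
    Returns a list of (sound_name, frequency_hz, gain_db) adjustments.
    """
    featured_name, featured_freq = featured
    featured_band = band_of(featured_freq)
    plan = [(featured_name, featured_freq, boost_db)]
    for name, freq in others:
        if band_of(freq) == featured_band:
            # Reduce the featured sound's central frequency in each competitor.
            plan.append((name, featured_freq, cut_db))
    return plan
```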

Consideration: Sibilance is the hissing sound people make when they say the letter "s." When a recorded sound has sibilance, there is a hissing accompanying the actual sound. If the sound designer notices that a sound has strong sibilance, they can lower the EQ of the sound at 6-8kHz to eliminate the hiss.

The Aural Legend

On a traditional map the reader will find a legend detailing the meaning of symbols found on the map and explaining the mapping method used (such as scale or measurement systems). The reader will typically use the strategy of glancing back and forth between the legend and the area being viewed. The reader of a map who uses an aural interface will need access to the same information. Due to the limited scanning ability of the aural sense, this strategy would not work well for the latter, especially if the number of items represented in the legend were extensive.

Consideration: It becomes much more important to the user of an aural interface that objects represented on a map have their characteristics intrinsically associated with their representation.

One of the greatest differences between visual and aural interfaces with respect to geo-spatial maps is the issue of scale. A necessity of visual geo-spatial maps, scale becomes irrelevant to the user of an aural interface.

Consideration: All measurements of distance can be given in real terms without any encumbrance on the usability of the map.

Information Scalability

In the study "Multichannel Auditory Search: Toward Understanding Control Processes in Polychotic Auditory Listening" (Lee, 2001), Mark Lee observes that our aural attentional systems experience overload faster than our visual attentional systems when presented with data sets of increasing size. That is to say, when a person is presented with a large amount of concurrent aural data, his/her ability to synthesize the data is quickly compromised. Lee found that our ability to pick information out of multiple data sets (that are spatially differentiated) is far more limited than comparative tasks using the visual sense.

Consideration: When displaying geo-spatial mapping data with sound, avoid overwhelming users with too much concurrent information.

A single map can give information about topography, political boundaries, infrastructure (roads, rail, shipping), population (cities, towns, regions), or structures for an area potentially covering the entire earth. Geo-spatial maps are potentially among the most complex data sets possible, simultaneously giving great quantities of many different types of information. The limited ability of the auditory sense to scan complex sets of data must be considered when an aural interface is designed, especially when the information to be disseminated is known to be a complex set.

Consideration: In the case of geo-spatial mapping data it becomes very important for the different types of data to be contained in different layers that can be hidden and revealed by the user. Providing this ability to hide and reveal information enables the user to create a simple data stream that can be more easily scanned with the aural sense while containing the most relevant information.
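A minimal Python sketch of this hide-and-reveal idea follows. The Feature type, its field names and the layer names are hypothetical. The optional rectangular area is an additional assumption, included to show how the same filter could also limit output to a small region of the map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A map object with a layer name and a position (illustrative fields)."""
    name: str
    layer: str
    x: float
    y: float


def audible_features(features, visible_layers, area=None):
    """Return only the features the aural interface should present.

    visible_layers: set of layer names the user has revealed.
    area: optional (min_x, min_y, max_x, max_y) rectangle; None means
          the whole map.
    """
    selected = []
    for feature in features:
        if feature.layer not in visible_layers:
            continue  # layer is hidden by the user
        if area is not None:
            min_x, min_y, max_x, max_y = area
            inside = min_x <= feature.x <= max_x and min_y <= feature.y <= max_y
            if not inside:
                continue  # outside the region being displayed
        selected.append(feature)
    return selected
```

For example, revealing only a "roads" layer reduces the output to street names, leaving buildings and topography silent until the user asks for them.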

However, the user may require many types of information delivered at once in a related context. This would require many layers to be revealed to the user and interwoven in the output of the aural interface. To compensate for the complexity of this output, the user should have another way to scale the delivered information.

Consideration: Another way geo-spatial mapping information can be scaled is by restricting the size of the area being displayed. Just as an ideal aural interface for geo-spatial information should have different types of information separated into layers, the total area of a geo-spatial map should be divided so as to allow the display to be restricted to a single area. For instance, if a user is able to restrict the displayed area to a single intersection, many different types of information can be displayed (streets, buildings, topography) without overwhelming the scanning ability of the aural sense.

Translation Algorithms

A translation algorithm suggests aural equivalents to the visual elements of a geo-spatial map. Keep in mind that across modalities, there is simply no one-to-one full translation equivalent. In choosing approximate meaning equivalents for visual mapping data we have three types of sound to choose from: aural text, real world sound (relating to object type), and abstract sound effect.

To make an aural map useful, a method of navigation that is independent of hand-eye coordination is very important. For a small and simple map section, such as would be used for the following example, we believe that using the arrow keys to navigate from object to object will allow users to orient themselves to the geo-spatial content. When a user moves cursor focus onto a map object there is an initial sonification consisting of an audio icon (sound) and short descriptive text. The user may then use the tab key to prompt for even more audio information about the object, such as further description or directional information.
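The keyboard interaction described above might be modelled as in the following Python sketch. The class, the key names and the ("play", ...) and ("speak", ...) output events are hypothetical, and the order in which arrow keys visit objects (simple list order, wrapping at the ends) is an assumption, since the text does not specify one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapObject:
    """One map object with its three aural components."""
    name: str
    audio_icon: str
    descriptive_text: str
    directional_text: str


class AuralMapNavigator:
    """Moves focus between map objects and reports what should be heard."""

    def __init__(self, objects):
        if not objects:
            raise ValueError("at least one map object is required")
        self.objects = list(objects)
        self.index = 0

    def press(self, key):
        """Return the list of output events produced by a key press."""
        if key in ("ArrowRight", "ArrowDown"):
            self.index = (self.index + 1) % len(self.objects)
            return self._announce()
        if key in ("ArrowLeft", "ArrowUp"):
            self.index = (self.index - 1) % len(self.objects)
            return self._announce()
        if key == "Tab":
            # Extra detail on request: here, the directional text.
            return [("speak", self.objects[self.index].directional_text)]
        return []  # other keys produce no sound

    def _announce(self):
        current = self.objects[self.index]
        # Initial sonification: audio icon followed by short descriptive text.
        return [("play", current.audio_icon),
                ("speak", current.descriptive_text)]
```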

Naturally, in a larger or more complicated geo-spatial map, such simple navigation would not suffice. More elaborate methods of filtering, selecting and navigating through information are suggested in the geo-political translation algorithm to follow.

Table 5 and Table 6 represent aural equivalents for a mapped intersection consisting of two streets, four buildings/structures, and a park. The two charts, representing the same intersection, test the representation of the mapped elements using aural metaphors in the first chart, and abstract sounds in the second chart.

Urban Intersection Scenario
The following translation matrix conveys the sounds in an aural interface for an urban intersection map.

Map element | Audio icon | Descriptive Text | Directional Text
Street 1 | Traffic | James Street | North/south
Street 2 | Traffic | Helen Avenue | East/west
Intersection | Toronto Intersection sound | James and Helen | Map coordinates x, y
Building 1 (school) | Chalk on board | School; Saint George Elementary | North west corner; James and Helen
Building 2 (Bus Shelter) | Bus doors opening | James street bus stop; bus # 54 | South east corner; James and Helen
Building 3 (Bank) | Coin | First national bank | North west corner; James and Helen
Building 4 (Library) | Book pages | Brook View public library | North west corner; James and Helen
Area 1 (Park) | Children playing/cheering | Begrundi Memorial park | North of Helen on James; west side

Table 5: Real world or aural metaphor sounds for an intersection map.

Map element | Audio icon | Descriptive Text | Directional Text
Street 1 | Abstract 1 | James Street | North/south
Street 2 | Abstract 2 | Helen Avenue | East/west
Intersection | Abstract 3 | James and Helen | Map coordinates x, y
Building 1 (school) | Abstract 4 | School; Saint George Elementary | North west corner; James and Helen
Building 2 (Bus Shelter) | Abstract 5 | James street bus stop; bus # 54 | South east corner; James and Helen
Building 3 (Bank) | Abstract 6 | First national bank | North west corner; James and Helen
Building 4 (Library) | Abstract 7 | Brook View public library | North west corner; James and Helen
Area 1 (Park) | Abstract 8 | Begrundi Memorial park | North of Helen on James; west side

Table 6: Abstract sounds for an intersection map.

Geo-Political Map Scenario
The following translation matrix conveys the sounds in an aural interface for a geo-political map.

Geographic Element | User Action | Non-speech Audio | Speech
Latitude and Longitude Lines | Proximity |  |
  | Over | Elastic, Boing |
  | Mouse Click |  | "X Longitude"
Provincial Boundaries | Proximity | Rapid clicking |
  | Over | Sucking in and then out sound | "Provincial Boundary between x and y"
  | Mouse Click |  |
Provinces | Proximity |  |
  | Mouse Double Click | Three note ascending scale with clarinet, continuous | Shape of Ontario
Regions, Maritimes, Central Canada, Prairies, Arctic, West Coast | Proximity |  |
  | F1 key | Three note ascending scale with oboe, continuous | X Region of Canada
Cities | Proximity | Busy City Street Sound, cars honking, people talking, traffic |
  | Over | Increase volume upon entering zone, interrupted by speech | Name of City spoken
  | Mouse Click |  | City and Population Size spoken
Provincial Capital | Proximity | Busy City Street Sound, cars honking, people talking, traffic |
  | Over | Increase volume upon entering zone, interrupted by speech | Name of City spoken
  | Mouse Click |  | City, Capital of X, Population Y
Canadian Capital (Ottawa) | Proximity | National Anthem |
  | Over | Louder interrupted by speech | Speak "Ottawa, Canada's Capital"
  | Mouse Click |  | City and Size spoken
Lakes | Proximity | Small wave lapping with loon sound |
  | Over | Louder interrupted by speech | Lake Y
  | Mouse Click |  | Lake Y, size in Square miles
Rivers | Proximity | Whitewater river sound |
  | Over | Louder interrupted by speech | Y River
  | Mouse Click |  | Y River, x miles long, flowing into y
Waterways | Proximity | Harbour sound, boats chugging, tooting |
  | Over | Louder interrupted by speech | St. Lawrence Seaway
  | Mouse Click |  | X waterway, route from x to y
Ocean | Proximity | Waves, Seagulls |
  | Over | Louder, Interrupted by speech | X Ocean
  | Mouse Click |  |
Towns | Proximity | Playground sounds |
  | Over | Louder interrupted by Speech | Name of Town
  | Mouse Click |  | Name of Town, Population X
Major roadways | Proximity | Highway, Cars zooming past |
  | Over | Louder interrupted by speech | Highway x
  | Mouse Click |  |
Railways | Proximity | Train sounds |
  | Over | Louder interrupted by speech | X Rail
  | Mouse Click |  | X Rail between Y and Z
International Boundaries | Proximity | American Anthem |
  | Over | Louder interrupted by speech | "International Border with the United States"
  | Mouse Click |  |
  | Mouse Double Click |  |

Table 7: Geopolitical map translation matrix.
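A translation matrix of this kind lends itself to a simple lookup keyed by geographic element and user action. The Python sketch below encodes only a few rows of Table 7 as an illustration; the dictionary layout and the function name are assumptions, and the placeholder strings (such as "Lake Y") stand for values a real interface would fill in from the map data.

```python
# (geographic element, user action) -> (non-speech audio, speech)
# None means the matrix leaves that cell empty.
TRANSLATION_MATRIX = {
    ("Cities", "Proximity"): ("Busy city street sound", None),
    ("Cities", "Over"): ("Increase volume, interrupted by speech", "Name of City"),
    ("Cities", "Mouse Click"): (None, "City and Population Size"),
    ("Lakes", "Proximity"): ("Small wave lapping with loon sound", None),
    ("Lakes", "Over"): ("Louder, interrupted by speech", "Lake Y"),
    ("Lakes", "Mouse Click"): (None, "Lake Y, size in square miles"),
}


def respond(element, action):
    """Return (non_speech_audio, speech) for an element and a user action.

    Combinations that are not in the matrix produce no output.
    """
    return TRANSLATION_MATRIX.get((element, action), (None, None))
```

Keeping the matrix as data rather than code lets a designer revise sounds or add elements without touching the interaction logic.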

5.2.2 Translations: Visual to Touch

For general discussions on types of Haptic Devices see Haptic Devices and Visuals to Haptics. This document focuses on tactile effect authoring software for the consumer level haptic device technology (called TouchSense) licensed by Immersion Corporation.

Full Force and Tactile Feedback Distinctions

Immersion Corporation defines the difference between their force and tactile feedback devices as follows (www.immersion.com/developer/technology/devices):

The term full force feedback is used to refer to devices that apply sideways forces to your hand to resist your motion or give the impression of turbulence, recoil, impact, G-forces, or countless other phenomena. If the device can physically push on you, it's probably a full force feedback device. Many force feedback gaming devices fall into this category.
The term tactile feedback is used to describe devices that play high-fidelity tactile sensations, but generally won't move or inhibit the movement of either the device or the hand holding it. A tactile feedback device can play a wide variety of distinguishable taps, textures, and vibration effects to communicate with the user and greatly enrich the computing or gaming experience. A number of pointing and gaming devices fall into this category.

Haptic Effects Provided by Immersion Corporation

Following are pre-set, editable effects offered from the Immersion Studio library. While there is a large selection for force feedback devices, there is a small selection for tactile feedback devices, due to their limited nature.

Effects for Immersion Full-Force Feedback Devices:

  • Angle Wall Effect - feel like a wall at an arbitrary angle
  • Axis Wall Effect - feel like vertical or horizontal wall
  • Damper Effect - feel like dragging a stick through water
  • Ellipse Effect - used to attract the cursor to the inside of the ellipse, keep the cursor outside of the ellipse, or attract the cursor to the border surrounding the ellipse
  • Enclosure Effect - used to attract the cursor to the inside of the rectangle, keep the cursor outside of the rectangle, or attract the cursor to the border surrounding the rectangle
  • Friction Effect - feel like pushing a mass across a rough surface
  • Grid Effect - creates a 2-dimensional array of snap points or snap lines
  • Inertia Effect - feel like pushing a mass on wheels
  • Slope Effect - feel like rolling a ball up and/or down a hill
  • Spring Effect - feel like compressing a spring
  • Texture Effect - causes mouse to feel as if it were traveling over a series of bumps
  • Periodic Effect - feel like a simple back and forth motion or a high frequency vibration
  • Pulse Effect - identical to Vector Force but its default duration is much shorter to create a "pop" sensation
  • Ramp Effect - creates a force that either steadily increases or steadily decreases over a discrete time
  • Vector Force Effect - constant force over a discrete time

Effects for Immersion Tactile-Feedback Devices

  • Ellipse Effect - a small "pulse" will be emitted when cursor is over the border of the ellipse
  • Enclosure Effect - a small "pulse" will be emitted when cursor is over the border of the rectangle
  • Grid Effect - creates a 2-dimensional array of lines which emits a "pulse" when cursor is over it.
  • Texture Effect - causes mouse to feel as if it were traveling over a series of bumps by pulses as the mouse moves.
  • Periodic Effect - high or low frequency vibrations
  • Pulse Effect - creates a "pop" sensation

Tactile Effects Applied to Geospatial Content

Each effect has parameters (such as duration, gain, magnitude, phase, waveform) that can be finessed for specific results. The user action "proximity" refers to the cursor being outside the element but within a predefined number of pixels of it.
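The distinction between the "over" and "proximity" user actions can be sketched in Python as below. The function is hypothetical, assumes the element is an axis-aligned rectangle given in pixel coordinates, and uses the label "away" (not a term from this document) for a cursor that is neither over nor near the element.

```python
import math


def cursor_state(cursor, rect, proximity_px):
    """Classify the cursor relative to a rectangular map element.

    cursor: (x, y) position in pixels.
    rect: (left, top, right, bottom) bounds of the element in pixels.
    proximity_px: the predefined number of pixels that counts as "near".
    Returns "over", "proximity" or "away".
    """
    x, y = cursor
    left, top, right, bottom = rect

    # Distance from the cursor to the nearest point of the rectangle;
    # both components are zero when the cursor is inside it.
    dx = max(left - x, 0, x - right)
    dy = max(top - y, 0, y - bottom)
    distance = math.hypot(dx, dy)

    if distance == 0:
        return "over"
    if distance <= proximity_px:
        return "proximity"
    return "away"
```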

Following is a translation algorithm for representing typical map elements with haptic effects using a full-force feedback mouse or trackball.

Geographic Element | User Action | Possible Haptic Effects | Description of Effect
Latitude and Longitude Lines | Over | Spring | Elastic
Provincial Boundaries | Proximity | Slope | Thin trench on both sides of curtain
Provincial Boundaries | Over | Spring | Elastic curtain
Provinces | Mouse Double Click | Ramp | Relief of province rises/Alternatively sinks
Regions (Maritimes, Central Canada, Prairies, Arctic, West Coast) | F1 key | Ramp | Relief of region rises/falls (toggled)
Cities | Proximity | Ellipse | Slight gravity well around sphere
Cities | Over | Slope | 1/2 sphere
Provincial Capital | Proximity | Enclosure | Slight gravity well around pyramid
Provincial Capital | Over | Slope | Pyramid
Canadian Capital Ottawa | Proximity | Ellipse | Slight gravity well around stars
Canadian Capital Ottawa | Over | Slope | 1/2 sphere
Lakes | Proximity | Slope | Depression around lake
Lakes | Over | Periodic | Pulsing waves of resistance
Rivers | Over | Slope | Trench
Waterways | Proximity | Slope | Trench along sides
Waterways | Over | Periodic | Washboard in direction of flow
Ocean | Proximity | Slope | Depression around coast
Ocean | Over | Periodic | Pronounced waves
Towns | Over | Slope | Small slightly raised square
Major roadways | Over | Slope | Raised ribbon with indentation in middle
Railways | Over | Texture | Raised ribbon with horizontally running indentations
International Boundaries | Over | Slope | Very high edge

Table 8: Translation algorithm for representing typical map elements with haptic effects.
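A translation table of this kind can be held in a script as a simple lookup structure. The sketch below encodes a few rows of Table 8; the object layout, key names and the `effectFor` function are hypothetical and only illustrate one way to organize the mapping.

```javascript
// A few rows of Table 8 as a lookup: element type -> user action -> effect.
// Key names are illustrative assumptions, not part of any plug-in API.
const HAPTIC_MAP = {
  "provincial boundary": { proximity: "Slope", over: "Spring" },
  "city": { proximity: "Ellipse", over: "Slope" },
  "lake": { proximity: "Slope", over: "Periodic" },
  "river": { over: "Slope" },
  "railway": { over: "Texture" },
};

// Return the effect name for an element type and user action,
// or null when the table defines no effect for that combination.
function effectFor(elementType, userAction) {
  if (!Object.prototype.hasOwnProperty.call(HAPTIC_MAP, elementType)) {
    return null;
  }
  return HAPTIC_MAP[elementType][userAction] || null;
}
```

Keeping the mapping in one structure means a change to the table (for example, a different effect for lakes) does not require touching the event-handling code.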

Unlike Full-Force feedback devices, which can create the effect of resistance when the pointing device is over an object, tactile feedback pointing devices are limited to variations of pulses and vibrations. As a result, very few geographic elements can be represented by tactile effects alone, and these effects should be accompanied by other modalities of representation (see 5.4 Modality Complementarity).

Consideration: Pulses experienced through a tactile feedback device may not be very informative since a single "bump" may indicate that either a user has passed over or is currently on an object. Consider, for example, an image of a thin line such that when the cursor is over the image, a single pulse is emitted. However, when a user is moving the cursor about and passes over the image, the user will also feel a pulse. If the goal for the user was to move the cursor along the line, it will be very difficult for them to distinguish between being currently over the image or having just passed over it.

Periodic effects can best be utilized in geo-spatial applications as follows:

Vibrations with a constant frequency or amplitude can be used as indications for a single geographic element, such as roads or relief. Using the example of roads, varying the frequency can indicate the type of road when the cursor is over the object. A high frequency might be used for highways, low frequency for side streets.

Vibrations with varying frequencies or amplitudes can be used as indications for specific objects, such as towns, buildings or points of interest. For example, a building can be indicated by a "buzzing" effect where vibrations cycle from large to small amplitudes when the cursor is over the object.
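The two uses of periodic effects described above can be sketched as parameter sets. The frequencies, magnitudes and function names below are invented for illustration; actual values would need to be tuned on the target device and expressed through the effect parameters it supports.

```javascript
// Constant-frequency vibration per road type (values are assumptions).
const ROAD_FREQUENCY_HZ = { "highway": 60, "arterial": 30, "side street": 10 };

function roadVibration(roadType) {
  const frequency = ROAD_FREQUENCY_HZ[roadType];
  if (frequency === undefined) return null;
  return { waveform: "sine", frequencyHz: frequency, magnitude: 0.5 };
}

// "Buzzing" for a point object: magnitudes step linearly from a large
// to a small amplitude; the caller would repeat the cycle while the
// cursor stays over the object.
function buzzEnvelope(steps, maxMagnitude, minMagnitude) {
  const envelope = [];
  for (let i = 0; i < steps; i++) {
    const t = steps === 1 ? 0 : i / (steps - 1);
    envelope.push(maxMagnitude + (minMagnitude - maxMagnitude) * t);
  }
  return envelope;
}
```

Here a highway vibrates at a higher frequency than a side street, and a building is distinguished from both by its changing amplitude.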

5.2.3 Format: Scalable Vector Graphics (SVG)

Scalable Vector Graphics (SVG) is an XML-compliant graphics format being developed by the World Wide Web Consortium (W3C, www.w3.org) to be the dominant method of rendering graphics on the Web. An open standard, it is being adopted by conventional web browsers. Full SVG specifications can be found at http://www.w3.org/TR/SVG. While this technology continues to undergo development, it already shows great promise in terms of creating accessible, interactive graphical environments.

Access to information can be greatly enhanced through interactive features made possible by the SVG format. By allowing a user to hide different elements, information can be simplified and made easier to read in an SVG document. A strength of the SVG format is the ability of the creator to layer information. Grouping similar elements in a layer within the context of a larger document allows the user to more easily locate the element, or group of elements, that contains the information they are searching for.

Geo-spatial content is typically rendered in a graphical medium with an iconic language that is very difficult to translate into another medium. SVG gives the map creator the ability to associate electronic text with graphical elements and layers of graphical elements. This electronic text, unlike the iconic language of geo-spatial maps, is easily transformed into another mode, such as audio. An audio interface such as a screen reader can render the electronic text in the form of a spoken voice giving the aural user access to a graphical element. Of course, the creator of a map in the SVG format will have to include descriptive text for the aural interface to read. SVG authoring programs in the future should contain prompts allowing content creators to easily include text information about the elements and layers they are creating.
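As a small illustration of associating text with a layer, the sketch below builds an SVG group that carries the standard `title` and `desc` child elements. The helper names are hypothetical; only the `g`, `title` and `desc` elements themselves come from the SVG specification.

```javascript
// Escape text so that it is safe inside XML element content or attributes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap SVG markup in a group with a title and a longer description.
// title and desc are placed first so that a reader meets them before
// the graphical content of the group.
function describedGroup(id, title, description, innerSvg) {
  return [
    `<g id="${escapeXml(id)}">`,
    `  <title>${escapeXml(title)}</title>`,
    `  <desc>${escapeXml(description)}</desc>`,
    `  ${innerSvg}`,
    `</g>`,
  ].join("\n");
}
```

For example, `describedGroup("lakes", "Lakes", "The five Great Lakes", '<circle cx="25" cy="25" r="20"/>')` yields a labeled layer whose title an aural interface can announce.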

Earlier iterations of the SVG standard had some limitations that affected the accessibility of SVG documents. JavaScript handlers for keyboard control, vital to the success of an aural interface, could not be used because the Document Object Model (DOM) lacked keyboard support before level 3. This lack of keyboard support prevented testing aural interfaces built with SVG for the duration of this project. With the inception of DOM level 3 validation on January 27, 2004 and the continued development of SVG 1.2 (still in working draft as of March 18, 2004) this problem has been solved. It is not known how many SVG browsers have implemented support for the working version of SVG 1.2.

In testing SVG documents with the JAWS screen reader it was shown that JAWS could read "title" tags, but we were not able to get "desc," "label," or "text" tags to be read. The scripting group at Freedom Scientific (maker of the JAWS screen reader) has not yet addressed JAWS' ability to work with this relatively new standard.

Resources

Accessibility features of SVG are discussed in detail at the W3C's Web Accessibility Initiative (WAI) site www.w3.org/TR/SVG-access.

The Mapping for the Visually Impaired portal (MVI, tactile.nrcan.gc.ca) at Natural Resources Canada is a forerunner in attempting to make maps and geo-spatial data accessible on-line. Associated MVI members presented a seminal paper at the summer 2003 SVG OPEN conference, "SVG Maps for People with Visual Impairment" (www.svgopen.org/2003/papers/svgmappingforpeoplewithvisualimpairments), describing techniques for creation of audio-tactile maps in SVG and Flash formats. Some of this work bootstrapped off of trailblazing by Dr. John Gardner (himself blind), Vladimir Bulatov and team at Oregon State University's Science Access Project (dots.physics.orst.edu). A paper describing their development and application of the Accessible SVG Viewer, incorporating haptic feedback and object description via a speech engine, was presented at the 2001 CSUN conference, "Smart Figures, SVG, And Accessible Web Graphics"(www.csun.edu/cod/conf/2001/proceedings/0103gardner.htm).

Not relevant to SVG so much as to Cartography and Map Symbols, Natural Resources Canada's online Atlas of Canada (atlas.gc.ca/site/english/learning_resources/carto) has a highly readable section (intended for students and teachers) entitled The Fundamentals of Cartography, which includes subsections Map Content and Design for the Web and Cartographic Symbology.

Methodology

Our recommendation for associating haptic effects (using Immersion TouchSense) with web-based visual information using SVG has been previously described in 5.2.3 Visuals to Haptics. The following procedure differs in that it adds detail on sound and provides code samples. User requirements are:

  1. An SVG viewer plug-in associated with their browser, such as the Adobe SVG Viewer, found at www.adobe.com/svg/viewer/install
  2. Immersion Web Plug-in, found at www.immersion.com/plugins

The general procedure3 for adding accessible graphics to HTML (in this context, a map, with audio and tactility (using Immersion-licensed TouchSense devices)) is as follows:

1. Construct the SVG
Any text editor can be used. Third-party software, such as Adobe Illustrator 10 or 11 (www.adobe.com/products/illustrator), can greatly speed up development.

Code Sample 1: An SVG that draws a filled red circle.

		<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
		<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"    
			"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd" >
		<svg xmlns="http://www.w3.org/2000/svg"
			xmlns:a="http://ns.adobe.com/AdobeSVGViewerExtensions/3.0/" >
			<circle cx="25" cy="25" r="20" fill="red"/>
		</svg>

2. Organize Objects in Layers for Interactivity
Group all elements in appropriate layers and label both elements and layers. For example, in a map setting, put symbols for train station, bus station, and airport into one group, primary, secondary and tertiary roads into a second group, mountains and bodies of water forming two more groups. All interactive elements need to be grouped, which creates a group tag in the SVG document; in Illustrator, use the layers palette. Each element identifiable by screen readers needs a label, which will in turn be spoken.

3. Assign Haptic and Sound Effects
Assign JavaScript functions to layers. Associate each haptic effect with a sound effect, which makes map features more evident to people with visual impairment, by placing an audio anchor before the opening of the group tag, with the attribute xlink:href set to the file name of the sound to be played. The event handler must also be set in the audio anchor with the "begin" attribute. Each category of map elements gets assigned a separate haptic effect with JavaScript, e.g. one for roads, one for built-up areas, one for water and occasionally one for borders. JavaScript functions detect user input and trigger the events; in Illustrator, these are assigned with the interactivity palette. Otherwise, for those versed in JavaScript, code can be added to the SVG with a simple text editor.

Code Sample 2: Using the Sample 1 code, group objects together, add sound and assign JavaScript functions as needed.

		<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
		<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"    
		"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd" >
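		<svg xmlns="http://www.w3.org/2000/svg"
			  xmlns:xlink="http://www.w3.org/1999/xlink"
			  xmlns:a="http://ns.adobe.com/AdobeSVGViewerExtensions/3.0/" >
		  <!-- Play the sound when the pointer enters the group below. -->
		  <a:audio xlink:href="button.wav" begin="mybutton.mouseover"/>
		  <!-- startHaptic() is defined in the HTML page that embeds this SVG. -->
		  <g id="mybutton" onmouseover='parent.startHaptic()'>
			  <circle cx="25" cy="25" r="20" fill="red"/>
		  </g>
		</svg>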

4. Incorporate SVG and haptic effects into HTML
Use an EMBED tag to place the SVG graphic in a webpage. An object reference to the Immersion Web Plug-in must be made, along with JavaScript referring to the tactile effects.

Code Sample 3: The minimum HTML code required to implement haptics with SVG.

("Sports.ifr" is a resource file containing haptic effect "Punch" created with Immersion Studios. The previous sample provides "circleWithJavaScript.svg".)

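		<BODY onload="IMM_openIFR();">
		<!-- Object reference to the Immersion Web Plugin must be included in the HTML.
		     The plug-in's classid attribute is missing from this listing. -->
		<OBJECT id="ImmWeb" height="0" width="0">
		</OBJECT>
		<SCRIPT LANGUAGE="JavaScript">
		var imm_eref;
		// Load the force resource file once the page has loaded.
		function IMM_openIFR( ) {
			imm_eref = document.ImmWeb.OpenForceResource("Sports.ifr");
		}
		// Called from the onmouseover handler of the SVG group.
		function startHaptic( ) {
			document.ImmWeb.StartEffect("Punch", imm_eref);
		}
		</SCRIPT>
		<EMBED NAME="circle" SRC="circleWithJavaScript.svg" WIDTH="700" HEIGHT="400">
		</BODY>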

5. Add both Immersion Web plug-in and user environment detection to HTML
Use an OBJECT tag and JavaScript to detect the presence of the Immersion plug-in associated with the user's web browser. (Code supplied in Appendix. It is available from the Immersion Developer site www.immersion.com/developer/technology/tutorials/index.php, log in required, within the tutorial "TouchSense in JavaScript - Web Tutorials: Detecting Immersion Web Plugin".)

GIS Applications and SVG for Government On-Line (GOL) Content

Maps are not only essential but powerful tools. Digital maps on the Web and in Geographic Information Systems (GIS) allow more and more people to make and use maps to analyze data in sophisticated ways. GIS applications are found in various government sectors, business, the natural and social sciences, urban planning and management, and many other fields. The two most prevalent GIS software applications are ESRI's ArcView and MapInfo, neither of which has native SVG output. Fortunately, two third parties have recognized the promise and utility of the SVG format, and have developed extensions to give ArcView and MapInfo better web deliverability with SVG export.

ESRI's ArcView (www.esri.com/software/arcview) is perhaps the most popular desktop GIS and mapping software, with more than 500,000 copies in use worldwide. ArcView provides data visualization, query, analysis, and integration capabilities along with the ability to create and edit geographic data. MapViewSVG (www.uismedia.de/mapview/eng) is a third party extension for ArcView that converts maps within ArcView to the SVG format for dissemination on the Web. All vector based objects and text objects are converted into SVG, which can be panned and are infinitely zoomable without losing cartographic quality. Various elements can be turned on and off in the display, queries made, and many other features including web and email links, scale bars and overview maps, coordinate read-outs and interactive measurement tools.

MapInfo (www.mapinfo.com) is another leading business mapping solution that allows users to create detailed maps, reveal patterns and trends in data that may otherwise be hard to discern, perform data analysis, manage geographically based assets, etc.

DBx Geomatics' SVGmapmaker (www.dbxgeomatics.com) is a program that exports MapInfo content to SVG in such a way that the result looks as much as possible like the original document in MapInfo. Several options are also available directly in MapInfo that take advantage of SVG's advanced graphic capabilities, such as filter effects, color gradients, opacity, hyperlinking, etc. DBx Geomatics is a Canadian company, with a strong clientele ranging from federal departments (Public Works and Government Services Canada, Environment Canada, Canadian Coast Guard, Fisheries and Oceans Canada, Treasury Board of Canada Secretariat) through to municipal governments.

5.3 3D Content

3D content is usually an object, collection of objects, or a virtual environment. This document focuses on web3D, which entails an on-line, bandwidth efficient technology. Panoramic technologies are not discussed herein. Web3D is a navigable, interactive experience involving more than the spin and zoom nature of panoramas. This format allows for a large variety of time and user triggered events, with cause and effect through scripting and potential interface capabilities such as query and filter.

The term "Web3D" generally refers to a spectrum of on-line interactive 3D technologies, many of them proprietary. While we are concerned with the predominant open standards of Virtual Reality Modeling Language (VRML) and X3D, readers are encouraged to extend accessibility and usability considerations herein to other formats.

5.3.1 Transformations: Visual to Sound

For individuals who are blind, environmental sounds can reveal much more perceptual and navigational information about a modern urban landscape than the sighted would fathom. The following scenario is used by Massof (2003, p.271) to describe the level of aural acuity involved:

After vision, hearing has the broadest band for acquiring information. Blind people rely almost entirely on hearing to perceive the environment beyond their reach. A highly skilled blind pedestrian can approach an intersection, listen to traffic, and on the basis of auditory information alone, judge the number and spatial layout of intersecting streets, the width of the street, the number of lanes of traffic in each direction, the presence of pedestrian islands or medians, whether or not the intersection is signalized, the nature of the signalizations, if there are turning vehicles, and the location of the street crossing destination.

Massof concedes that "not all blind people demonstrate this skill" (2003, p.271); however, this example is still useful in illustrating how a 3D soundscape may be used to convey geo-spatial information to individuals who cannot view a map, or see their virtual environment surroundings.

Not only man made, but natural sound sources are important perceptual cues, though, as Potard (2003) states, they are rarely implemented or properly controlled in virtual 3D sound scenes. Sound sources such as a beachfront or waterfall usually exhibit a particular spatial extent which provides information on the physical dimensions, geometry and distance of the sound emitting object.

In way-finding tasks, auditory beacons can prove extremely useful in guiding a user to locations that are not visible due to buildings or walls. In observing the navigation paths of subjects, Grohn et al (2003) found that users used auditory cues to define approximate target locations, preferring visual cues only in the final approach.

Many of the considerations made in Spatialized Audio can be applied to 3D contexts. However, more sophisticated 3D sound playback technologies can be found in almost all current computer sound hardware, which supports not only two channel stereo, but also surround speaker systems that are both widely available and affordable. Web3D technologies have usually supported spatialized audio playback, but newer browsers are pushing the sound envelope with streaming, multi-channel aural content.
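To make the idea of an auditory beacon concrete, the following sketch computes a simple two channel cue (left/right pan and volume) for a beacon relative to a listener in a flat 2D scene. It is an illustration under stated assumptions, not how any particular Web3D browser spatializes sound: the coordinate convention (heading 0 faces the positive y axis, angles increase clockwise), the linear distance fall-off and all names are hypothetical.

```javascript
// Compute a stereo cue for a beacon as heard by a listener.
// listener: {x, y, heading} with heading in radians; beacon: {x, y}.
// Returns pan in [-1, 1] (-1 = fully left, +1 = fully right) and
// gain in [0, 1] that falls off linearly up to maxDistance.
function beaconCue(listener, beacon, maxDistance) {
  const dx = beacon.x - listener.x;
  const dy = beacon.y - listener.y;
  const distance = Math.hypot(dx, dy);

  // Direction to the beacon, measured clockwise from the +y axis.
  const absoluteBearing = Math.atan2(dx, dy);
  // Bearing relative to where the listener is facing.
  const relative = absoluteBearing - listener.heading;

  const pan = Math.sin(relative);
  const gain = Math.max(0, 1 - distance / maxDistance);
  return { pan, gain };
}
```

Note that a pan value alone cannot distinguish a beacon directly ahead from one directly behind (both give a pan of 0), which is one reason surround playback or additional cues are useful in way-finding tasks.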

Vision, in these implementations, can be combined with different types of sound, like speech, real-life sounds (auditory icons) or abstract musical sounds (earcons). Each of these types of sounds has advantages and disadvantages. The benefit of speech, for instance, is that most of the time, the meaning of the message is relatively unambiguous. However, users need to listen to the whole message to understand the meaning.

Visual to Touch

Calle Sjöström, working with the Haptics Group at Certec (http://www.english.certec.lth.se), Lund Institute of Technology, Sweden, wrote the doctoral dissertation "Non-Visual Haptic Interaction Design: Guidelines and Applications" (www.english.certec.lth.se/doc/hapticinteraction). It presents five guidelines for researchers, designers, testers, developers and users of non-visual haptic devices:

  • Elaborate a virtual object design of its own
    • Avoid objects with small and scattered surfaces. Objects with large connected surfaces are easier to find and explore
    • Use rounded corners rather than sharp ones.
    • Virtual objects in virtual worlds can be given virtual properties. Utilize them.
    • Optimize your haptic interface widgets as well. Think about affordance.
    • Make sure that the models are haptically accurate and work without vision.
    • Be aware that orientation of the object matters.
    • Consider different representations to enhance different properties (negative relief emphasizes the line whereas positive relief emphasizes the contained surface).
  • Facilitate navigation and overview
    • Provide well defined and easy-to-find reference points in the environment.
    • Avoid changing the reference system.
    • Make any added reference points easy to find and to get back to. They should also provide an efficient pointer to whatever they are referring to.
    • Utilize constraints and paths, but do so with care.
    • Virtual search tools can also be used.
  • Provide contextual information (from different starting points)
    • Present the haptic model or environment in its natural context.
    • Provide information about the purpose of the program.
    • Provide information about possibilities and pitfalls in the environment
    • Use a short text message such as a caption to an image or model, provided as speech or Braille. This can make a significant difference.
    • Idea: Consider using an agent or virtual guide that introduces the user to the object and also gives additional information if requested.
  • Support the user in learning the interaction method and the specific environments and programs
    • Be consistent; limit the number of rules to remember.
    • Give clear and timely feedback on the user's actions.
    • Facilitate imitating of other users and situations if possible.
    • Develop elaborated exercises to make the handling of the interaction tools and methods automatic in the user.
    • Idea: Consider using a virtual guide or remote users to help when a user comes to a new environment.

These guidelines reflect what the research group learned while testing blind subjects using haptic devices in tasks such as the recognition of both geometrical and real life objects (with texture) and navigation in a traffic environment. Their work showed strong promise for effectively using haptic technology to translate both 2D and 3D graphical information and make it comprehensible to people who are blind. Not only did they demonstrate that a blind person can orient and navigate in a virtual haptic environment, but also that these tasks can be more efficient with complementary information via sound or Braille. They ascertained that a blind person can use virtual world familiarization for real life scenarios.

For virtual environments, the researchers also provide five basic prerequisites for being able to work efficiently:

  • To be able to explore, understand and manipulate the objects in the environment
  • To navigate and to gain an overview
  • To understand the context
  • To use all modalities that are normally used
  • To learn the interaction method and the specific environments/programs

Visual to multimodal

Sjöström's work emphasizes the need to utilize multiple sensory modalities. His guidelines for including sound are as follows:

  • Utilize all available modalities
  • Combine haptics with sound labels, a Braille display and/or synthetic speech for text output.
  • Try environmental sound to aid in getting an overview.
  • Use audio (both sound labels and environmental sound) to provide a context.
  • Provide feedback to the user via any available sense.

5.3.2 Format : Web3D - Virtual Reality Modeling Language (VRML)/X3D

Virtual Reality Modeling Language (VRML) is the ISO standard for interactive 3D web content. While there are many proprietary and open-source types of online, interactive, three-dimensional technologies, VRML has been the primary Web3D format. Various working groups have developed the next generation called X3D (X for eXtensible), which is both backward compatible with VRML and integrated into XML. Specifications of VRML and X3D can be found at www.web3d.org/x3d/spec.

VRML provides spatialized audio capabilities, but text-to-speech (i.e. passing text to a speech synthesizer) is not supported. The benefit of speech, in contrast to abstract sound, is that the meaning of the signal is relatively unambiguous. However, users generally have to listen to the full duration to comprehend. The Adding Feeling, Audio and Equal Access to Distance Education Project undertaken by the Adaptive Technology Resource Centre (University of Toronto) was able to take advantage of the VRML PROTO node to enable such a capability in the form of SpeechClip nodes. Features included the ability to set or modify voice, volume, speech rate, and pitch, as well as some ability to control the "play back" of the speech: interrupt a speech, pause/resume a speech, and loop a speech. However, the ideal implementation of text-to-speech in VRML could not be fully realized at the time due to JSAPI (Java Speech Application Programming Interface) and browser security issues.

As X3D is an XML based technology, the best hope for seamless speech processing in Web3D may be in the parallel developing markup language efforts of HumanML (www.humanmarkup.org) and the Humanoid Animation Working Group (H-ANIM, www.h-anim.org). As for tactile interface support, there is very little in VRML. The Adding Feeling, Audio and Equal Access to Distance Education Project rendered VRML scenes haptically, though it was specific to a force feedback device no longer commercially available. An obvious choice of haptic device technology with which to pursue VRML/X3D compatibility is the most widespread and affordable: Immersion Corporation's TouchSense line (www.immersion.com/consumer_electronics/applications/mice_trackballs.php).

Methodologies

The following two procedures implement VRML with haptic effects readable by Immersion-licensed TouchSense devices.

Method 1: HTML plus VRML plus a Java Applet using EAI

Approach: Traditional VRML with Java using the External Authoring Interface (EAI), which was designed to allow an external program (the applet) to access nodes in the VRML scene.

User requirements:

  1. Web3D Viewer with complete VRML support and EAI (External Authoring Interface) support. (Cortona VRML Client from Parallel Graphics (www.parallelgraphics.com/products/cortona) works well).
  2. Immersion Web Plug-in, found at www.immersion.com/plugins
  3. Required additional java classes:

*Cortona places these classes in corteai.zip ("old" and "new" External Authoring Interface (EAI) for Internet Explorer) and plywood.jar (SAI and old EAI for Netscape). The default locations are C:\Program Files\Common Files\ParallelGraphics\Cortona\corteai.zip and C:\Program Files\Netscape\Netscape\Plugins\plywood.jar

General procedure:

1. Create HTML with the necessary JavaScript code needed to add touch sensations to web pages.

Code Sample 1: Initial HTML with JavaScript to enable Immersion effects


	<BODY onload="IMM_openIFR();">
	<!-- An object reference to the Immersion Web Plugin must be included in the HTML.
	     The plug-in's classid attribute is missing from this listing. -->
	<OBJECT id="ImmWeb" height="0" width="0">
	</OBJECT>
	<SCRIPT LANGUAGE="JavaScript">
	var imm_eref;
	function IMM_openIFR() {
		imm_eref = document.ImmWeb.OpenForceResource("Sports.ifr");
	}
	function startHaptic() {
		document.ImmWeb.StartEffect("Punch", imm_eref);
	}
	</SCRIPT>
	</BODY>

2. Create VRML content with events to be detected (in our case it was collision).

Code Sample 2: VRML generating a red sphere that will detect collisions

	#VRML V2.0 utf8
	DEF COLLIDER Collision {
	  children [
	    Transform {
	      children Shape {
	        appearance Appearance {
	          material Material { diffuseColor 1.0 0 0 }
	        }
	        geometry Sphere { }
	      }
	    }
	  ]
	  collide TRUE
	}

3. Create a small applet to detect events from VRML using EAI. Detected events will call the JavaScript functions in the HTML to start effects.

Code Sample 3: Java applet used with VRML from Step 2 to detect collisions.

	import java.awt.*;
	import java.applet.Applet;
	import netscape.javascript.*;
	import netscape.javascript.JSObject;
	// Needed for External Authoring Interface (EAI)
	import vrml.external.*;
	import vrml.external.field.*;
	import vrml.external.exception.*;
	import vrml.external.Node;
	import vrml.external.Browser;

	public class EAI extends Applet implements EventOutObserver {
	  Browser browser = null;
	  JSObject win;
	  int counter = 0;

	  public void init() {
	    // Get a reference to the Navigator Window:
	    try {
	      win = JSObject.getWindow(this);
	    } catch (Exception e) {
	      // Don't throw exception information away, print it.
	      e.printStackTrace();
	    }
	    // Get the VRML Browser:
	    browser = Browser.getBrowser(this);
	    try {
	      // Get a handle on the node that generates the events
	      // (in this case it is the COLLIDER node):
	      Node ts = browser.getNode("COLLIDER");
	      // Get collision time from the event collideTime
	      EventOutSFTime bump = (EventOutSFTime) ts.getEventOut("collideTime");
	      // Register this applet as the observer, so that callback()
	      // is invoked on every collideTime event.
	      bump.advise(this, null);
	    } catch (Exception e) {
	      e.printStackTrace();
	    }
	  }

	  // What to do when an event has occurred
	  public void callback(EventOut event, double timestamp, Object key) {
	    // Muffle the number of calls made to JavaScript
	    // (and consequently the haptic device as well) by only
	    // passing on every 20th collision event.
	    if (counter % 20 == 0) {
	      callJS();
	    }
	    counter++;
	  }

	  // Call JavaScript in HTML
	  public void callJS() {
	    win.call("startHaptic", new String[1]);
	  }
	}

4. Embed the VRML and Applet into the HTML.

Code Sample 4: Embedding VRML and applet into the HTML from Step 1

("Sports.ifr" is a resource file containing haptic effect "Punch" created with Immersion Studios.)

	<BODY onload="IMM_openIFR();">
	<!-- An object reference to the Immersion Web Plugin must be included in the HTML.
	     The plug-in's classid attribute is missing from this listing. -->
	<OBJECT id="ImmWeb" height="0" width="0">
	</OBJECT>
	<SCRIPT LANGUAGE="JavaScript">
	var imm_eref;
	function IMM_openIFR() {
		imm_eref = document.ImmWeb.OpenForceResource("Sports.ifr");
	}
	// Called by the applet whenever it passes on a collision event.
	function startHaptic() {
		document.ImmWeb.StartEffect("Punch", imm_eref);
	}
	</SCRIPT>
	<EMBED NAME="vrml" SRC="sphereWithCollision.wrl" WIDTH="700" HEIGHT="400">
	<applet name="EAI" code="EAI.class" MAYSCRIPT align="baseline" width="500" height="100">
		<PARAM NAME="name" VALUE="EAI">
	</applet>
	</BODY>

Additional Note: This method only works with the JVM from Microsoft, which will no longer be supported after September 30, 2004. In future, Cortona's VRML client will support the latest versions of Java and Java Virtual Machine from Sun. In the meantime, this version of Java can be used with a solution offered by J-Integra from Intrinsyc (j-integra.intrinsyc.com). See details at www.parallelgraphics.com/developer/kb/article/162
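The applet in step 3 limits how often the haptic effect is triggered by acting only on every 20th collision event. The same idea can be sketched on the JavaScript side of the page. The `everyNth` helper below is hypothetical and not part of the Immersion Web Plug-in; `startHaptic` refers to the function defined in the HTML samples above.

```javascript
// Wrap an action so that it runs only on the 1st, (n+1)th, (2n+1)th ...
// call, mirroring the counter logic of the applet's callback method.
function everyNth(n, action) {
  let counter = 0;
  return function () {
    const fire = counter % n === 0;
    counter++;
    if (fire) action();
  };
}

// Usage sketch: the applet would call throttledHaptic instead of
// startHaptic directly.
// const throttledHaptic = everyNth(20, startHaptic);
```

Throttling in one place, either in the applet or in the page, keeps a burst of collision events from flooding the haptic device with effect requests.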

Method 2: HTML plus VRML plus a Java Applet using Xj3D

Approach: This method uses Xj3D, a toolkit written in Java for VRML97 and X3D content. It was a project initiated by Sun Microsystems in 1999 and handed over to the Web3D Consortium. The toolkit allows users to incorporate VRML97 and X3D content into their Java applications, such as using it to load VRML/X3D content.

However, it is important to note that Xj3D is still in development. Although the toolkit in its present state can be utilized effectively for some applications, it may be inadequate for visual-to-haptic applications. The current status of Xj3D (www.xj3d.org/status.html) indicates that many of the nodes/features, such as collision, have been implemented or partially implemented and tested locally, but have yet to be proven compliant with all Web3D published conformance tests.

Theoretically, when Xj3D is fully compliant, it may be used effectively in visual-to-haptic applications. This would eliminate the need for additional plug-ins, such as Parallel Graphics' Cortona. Furthermore, since the toolkit was written with Java 1.4 it is fully compatible with Sun Microsystems' JVM. The Xj3D toolkit can be obtained from www.xj3d.org/download.html.

User requirements:

  1. Immersion Web Plug-in, found at www.immersion.com/plugins
  2. Required additional java classes:
    Download the latest stable release of the toolkit from www.xj3d.org/download.html to implement an xj3d browser capable of viewing VRML.

General procedure summary:

  1. Create HTML with the necessary JavaScript code needed to add touch sensations to web pages.

  2. Create VRML content with events to be detected (in our case it was collision).

  3. Create a small applet to load the VRML file to an Xj3D browser and to detect events from VRML using EAI. Detected events will call the JavaScript functions in the html to start effects.

  4. Embed the Applet into the HTML.

VRML/X3D and GIS


GeoVRML, an amendment to the Virtual Reality Modeling Language specification (and incorporated in VRML's offspring X3D), gives web3D a geographic coordinate specification. The GeoVRML Working Group of the Web3D Consortium has developed tools and recommended practices for the representation of geographical data using web3D. The group's goal is to enable geo-referenced data, such as maps and 3D terrain models, to be viewed over the web with conventional browsers.


Geo-spatial example scenes in VRML, X3D and XHTML formats are available at the Web3D Consortium's site www.web3d.org/x3d/content/examples/GeoSpatial.

The 3D Metanet Atlas Platform ("3MAP", www.ping.com.au) is an initiative to represent the Earth in its entirety, giving users the ability to navigate (or zoom down) to the planet's surface for finer detail of any geographic region or, further still, any rural setting, given existing data linked on the Web for that specific setting. Standards used by the platform include those of the Open GIS Consortium, Java and X3D/VRML. Because unconstrained movement in 3D environments often disorients the user, navigation is limited to four modes:

  • Orbital - examine or rotate the globe from space
  • Satellite/map - "bird's eye view" looking down
  • Flight - above land or cityscape with view angle control
  • Walk - across land or through cityscape, in an enhanced "target lock facility"

An added user interface component allows the user to dynamically query content or filter out unwanted information.

5.3.3 Navigation and Way-finding Guidelines

Norm Vinson of the National Research Council's Institute for Information Technology published the notable work "Design Guidelines for Landmarks to Support Navigation in Virtual Environments."4 In it, he addresses users' difficulty in gaining and maintaining their bearings and context by promoting proper landmark provision.

The guidelines are not only applicable to aiding user navigation of geographic content. Vinson writes, "As computers become more powerful and 3D visualization more common, fly-throughs of networks (electronic or biological) and complex molecular structures, simulations, and data visualizations will also pose navigational problems." Following are only excerpts of core elements from the paper, where individual guidelines may not be clearly elaborated; refer to the original for clarity. Vinson makes constant use of the acronym VE, which refers to Virtual Environment.

Learning about an Environment

Newcomers to an environment rely heavily on landmarks as points of reference. As experience with the environment increases, navigators acquire route knowledge that allows them to navigate from one point in the environment to another. Route knowledge is acquired and expanded by associating navigational actions to landmarks.

Guideline 1: It is essential that the VE contain several landmarks.
Generally, additional experience with the environment increases the representational precision of route distances, and of the relative orientations and positions of landmarks. Additional experience may also transform the representation from route knowledge to survey knowledge. Survey knowledge is analogous to a map of the environment, except that it does not encode a typical map's top-down or bird's-eye-view perspective. Rather, survey knowledge allows the navigator to adopt the most convenient perspective on the environment for a particular task. Survey knowledge acquired through navigational experience also incorporates route knowledge. In comparison to route knowledge, survey knowledge more precisely encodes the spatial properties of the environment and its objects.

Landmark Types and Functions

To include landmarks in a VE, one must know what constitutes a landmark. In his seminal work on urban planning and cognitive maps, Kevin Lynch found that people's cognitive maps generally contained five types of elements: paths, edges, districts, nodes, and landmarks.

Guideline 2: Include all five types of landmarks in your VE:

Types        Examples                      Functions
Paths        Street, canal, transit line   Channel for navigator movement
Edges        Fence, river                  Indicate district limits
Districts    Neighborhood                  Reference point
Nodes        Town square, public bldg.     Focal point for travel
Landmarks5   Statue                        Reference point into which one does not enter

Table 9: Five landmark types.

Landmark Composition

It is important to include objects intended to serve as landmarks in a VE. However, it is also important that those objects be designed so that navigators will choose them as landmarks. There are two issues regarding the way in which landmarks should be constructed. One issue relates to the landmark's physical features. The other issue relates to the ways in which landmarks should be distinctive.

Guideline 3: Make your (manmade) landmarks distinctive with features contributing to memorability of both object and location:

  • Significant height
  • Expensive building materials & good maintenance
  • Complex shape
  • Free standing (visible)
  • Bright exterior
  • Surrounded by landscaping
  • Large, visible signs
  • Unique exterior color, texture

Guideline 4: Use concrete objects, not abstract ones, for landmarks.
A study of VE landmarks also suggests that memorable landmarks increase navigability. Landmarks consisting of familiar 3D objects, like a model car and a fork, made the VE easier to navigate. In contrast, landmarks consisting of colorful abstract paintings were of no help. It was felt that the 3D objects were easier to remember than the abstract art and that this accounted for the difference in navigability.

In a natural environment, any large manmade object stands out. Accordingly, experts in orienteering (natural environment navigation) relied most on manmade objects as cues when navigating. They also used land contours and water features. However, they tried not to rely on vegetation since it is a rapidly changing, and therefore unreliable, feature in natural environments.

Landmarks in Natural Environments

Manmade Items   Land Contours   Water Features
roads           hills           lakes
sheds           slopes          streams
fences          cliff faces     rivers

Table 10: Sample landmark items.

Guideline 5: Landmarks should be visible at all navigable scales.
Consider environment scales that differ from that of a city. For instance, on a larger scale, a cognitive map of a country could have cities themselves as landmarks. It is not unusual for a user to have the ability to view a VE at different scales by "zooming in" or "zooming out". In such cases, it is important for the designer to provide landmarks at all the scales in which navigation takes place.
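Guideline 5 can be checked mechanically during authoring. The sketch below assumes a hypothetical data model in which each landmark records the zoom scales at which it is rendered (the visibleAt field), and reports any navigable scale that falls short of a chosen minimum number of landmarks. The scale names and the function scalesLackingLandmarks are illustrative and do not come from Vinson's paper.

```javascript
// Hypothetical authoring check: list the navigable scales that show
// fewer than minCount landmarks.
function scalesLackingLandmarks(landmarks, navigableScales, minCount = 1) {
  return navigableScales.filter((scale) => {
    const visible = landmarks.filter((lm) => lm.visibleAt.includes(scale));
    return visible.length < minCount;
  });
}

// Illustrative data: cities serve as landmarks at coarse scales,
// individual structures at fine scales.
const sampleLandmarks = [
  { name: "City marker", visibleAt: ["country", "province"] },
  { name: "Tower", visibleAt: ["city", "street"] },
  { name: "Statue", visibleAt: ["street"] },
];

console.log(
  scalesLackingLandmarks(sampleLandmarks, ["country", "province", "city", "street"], 2)
);
```

Any scale returned by the check is one at which a user who zooms in or out could lose all points of reference.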

It is important to remember that the distinctiveness of an object is a crucial factor in its serving as a landmark.

Guideline 6: A landmark must be easy to distinguish from nearby objects and other landmarks.

Guideline 7: The sides of a landmark must differ from each other.
Objects intended to serve as landmarks must be distinctive in several ways. First, they must be distinctive in regard to nearby objects. Second, a landmark must be easy to distinguish from other landmarks, especially nearby ones. Otherwise, a navigator could confuse one landmark with another, and, as a result, select the wrong navigational action (e.g. make a wrong turn). Third, the sides of each landmark should differ from one another. These differences can help navigators determine their orientation.
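One way to screen a scene against Guideline 6 is to look for pairs of objects that are both close together and alike in appearance. The sketch below is a deliberately crude illustration: it assumes each object carries hypothetical id, x, y, shape and color fields, and treats two objects as confusable when they lie within a given radius and share both shape and color. A real distinctiveness measure would need to be richer than this.

```javascript
// Hypothetical screening helper: return id pairs of objects that are
// near each other and share the same shape and color.
function confusablePairs(objects, radius) {
  const pairs = [];
  for (let i = 0; i < objects.length; i++) {
    for (let j = i + 1; j < objects.length; j++) {
      const a = objects[i];
      const b = objects[j];
      const near = Math.hypot(a.x - b.x, a.y - b.y) <= radius;
      const alike = a.shape === b.shape && a.color === b.color;
      if (near && alike) pairs.push([a.id, b.id]);
    }
  }
  return pairs;
}

// Illustrative scene: two identical towers close together, one far away.
const sceneObjects = [
  { id: "tower-1", x: 0, y: 0, shape: "tower", color: "grey" },
  { id: "tower-2", x: 30, y: 40, shape: "tower", color: "grey" },
  { id: "tower-3", x: 900, y: 0, shape: "tower", color: "grey" },
  { id: "statue", x: 10, y: 10, shape: "figure", color: "bronze" },
];

console.log(confusablePairs(sceneObjects, 100));
```

Each reported pair is a candidate for redesign, for example by changing one object's color or, following Guideline 8, by placing another object nearby.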

Guideline 8: Landmark distinctiveness can be increased by placing other objects nearby.

Guideline 9: Landmarks must carry a common element to distinguish them, as a group, from data objects.
Consider VEs whose features are constrained by the underlying data, such as the human circulatory system. Although some of the objects in these VEs can serve as landmarks, it is possible to further assist navigation by augmenting the VE with additional objects that only function as landmarks. However, navigators must easily recognize these objects as landmarks and realize that they are only landmarks.

Combining Paths and Landmarks: Landmark Placement

Guideline 10: Place landmarks on major paths and at path junctions.
Memorability of a building and its position was also affected by the building's location in the environment. Memorability is enhanced when the building is located on a major path or at a path junction.

Building Positions Contributing to Memorability:

  • Located on major path
  • Visible from major road
  • Direct access from street (esp. no plaza or porch)
  • Located at important choice points in circulation pattern
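Candidate sites for Guideline 10 can be found automatically when the paths of a VE are stored as a graph. The sketch below assumes a hypothetical representation in which each path segment is a pair of node names, and treats any node where three or more segments meet as a junction, that is, a choice point in the circulation pattern. The function name pathJunctions and the threshold of three are assumptions made for illustration.

```javascript
// Hypothetical helper: find nodes where three or more path segments meet.
// Each segment is a two-element array of node names.
function pathJunctions(segments) {
  const degree = new Map();
  for (const [a, b] of segments) {
    degree.set(a, (degree.get(a) || 0) + 1);
    degree.set(b, (degree.get(b) || 0) + 1);
  }
  return [...degree]
    .filter(([, count]) => count >= 3)
    .map(([node]) => node);
}

// Illustrative path network: B is the only node with three connections.
const pathSegments = [
  ["A", "B"],
  ["B", "C"],
  ["B", "D"],
  ["D", "E"],
];

console.log(pathJunctions(pathSegments));
```

The nodes returned are places where a navigator must choose a direction, so they are natural positions for landmarks.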

Minimize distortions in cognitive maps, while being aware that the term "cognitive map" is misleading in that it suggests that mental representations of environments are very much like images. In reality, our cognitive maps contain many features that are not image-like. Cognitive maps contain categorical and hierarchical structures and many spatial distortions, some of which cannot even be represented in an image.

Using a Grid

Guideline 11: Arrange paths and edges (see Guideline 2) to form a grid.

Guideline 12: Align the landmarks' main axes with the path/edge grid's main axes.

Guideline 13: Align each landmark's main axes with those of the other landmarks.
To minimize the distortions, the designer must create a VE that induces a hierarchical representation whose districts form a grid. A consequence of the grid's spatial regularity is that the spatial relationships between districts are a good approximation of the spatial relationships between objects in those districts.
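Guidelines 12 and 13 amount to snapping each landmark's orientation to the axes of the grid. The sketch below assumes landmark headings and the grid's rotation are both given in degrees in the same frame, and rounds a heading to the nearest grid axis (a multiple of 90 degrees from the grid angle). The function alignToGrid is hypothetical and ignores three-dimensional orientation.

```javascript
// Hypothetical helper: snap a landmark heading (degrees) to the nearest
// axis of a grid whose main axis is rotated by gridAngleDeg.
function alignToGrid(headingDeg, gridAngleDeg = 0) {
  const relative = headingDeg - gridAngleDeg;
  const snapped = Math.round(relative / 90) * 90 + gridAngleDeg;
  // Normalize the result into the range [0, 360).
  return ((snapped % 360) + 360) % 360;
}

// Usage: headings near a grid axis are pulled onto that axis.
console.log(alignToGrid(37));       // grid aligned with north
console.log(alignToGrid(100, 30));  // grid rotated by 30 degrees
```

Because every landmark is snapped to the same grid, the landmarks also end up aligned with one another, which is what Guideline 13 asks for.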

5.4 Modality Complementarity

Cross-modal interactions are characteristic of normal perception. Experiencing information in one sensory modality while having congruent, supporting information in another helps us carry out even our most mundane tasks effectively. Few would dispute this. What might surprise people are some findings by cognitive scientists and neuroscientists about sensory “processing”, to borrow an expression from the computational paradigm. While the sensory systems are thought of as being distinct, there are indications that we have our “wires crossed,” though in a beneficial way. For example, PET scans have revealed to researchers that visual processing is not only involved in tactile perception, but is necessary for optimal performance in tactile orientation discrimination. The visual cortex is intimately involved in processing certain kinds of tactile information (Sathian, 2001).

Regardless, there are strong arguments against a seemingly synesthetic mixing, or a synergistic “whole greater than its parts” sensory processing phenomenon, at work in the human mind. Whether full meaning equivalence can be translated from one sensory modality to another is debatable. It is unlikely that the combination of hearing and touching can ever replace the sensation of seeing something. Loomis (2003) observes:

If the sensory bandwidth of the substituting sense (or senses) is grossly inadequate, it is simply not possible to carry out the desired function. For example, the informational demands of driving a car are not likely to be met through a combination of audition and touch.

When attempting to optimize the interaction of user with interface, online media content designers have an increasing variety of modalities they can apply. Information is presentable through vision, touch, sound, or preferably, a combination of these sensory modalities. We are challenged with the task of choosing the most apt combination of information channels to communicate a specific message, or, the context for users to forage for what they consider relevant, in the manner they are best suited to. Sheer efficiency, of course, is not the only measure of success. Short of established guidelines for sensory modality transformation, we urge the designer of any kind of online content to employ reiterative, inclusive user testing.

5.5 Appendix: Code Sample for adding both Immersion Web plug-in and user environment detection to HTML.

(Available from Immersion Developer site www.immersion.com/developer/technology/tutorials/index.php, log in required, within the tutorial "TouchSense in JavaScript - Web Tutorials: Detecting Immersion Web Plugin".)

	height="0" width="0">
	 <EMBED type="application/x-Immersion-Web-Plugin" 
	 name="ImmWeb" hidden="true" src="myfile.txt">
	<SCRIPT language="JavaScript">
	<!--hide this script from non-JavaScript-enabled browsers
	var imm_onTouch = false;
	var imm_usrAgt = navigator.userAgent.toLowerCase();
	var imm_isIE = (navigator.appName == "Microsoft Internet Explorer");
	var imm_isNN = (navigator.appName == "Netscape");
	var imm_isIE4plus = imm_isIE && (parseInt(navigator.appVersion) >= 4);
	var imm_isNN45plus = imm_isNN && (parseFloat(navigator.appVersion) >= 4.5);
	var imm_isNN6plus = imm_isNN && (parseFloat(navigator.appVersion) >= 5);
	var imm_isNN7plus = imm_isNN6plus && (navigator.vendor == "Netscape");
	var imm_isWin = (imm_usrAgt.indexOf("win")!=-1);
	var imm_supportsImm = imm_isWin;
	var imm_NNMinPlugin = new Array(3, 2, 3, 0);
	var imm_IEDontAskOff = (IMM_getCookie("DontAskIE")!= "ON");
	var imm_NNDontAskOff = (IMM_getCookie("DontAskNN")!= "ON");
	if(imm_supportsImm)  {
	  // Assumed call: the body of this block is missing from the source listing.
	  IMM_sufficientSystem();
	}
	function IMM_sufficientSystem()  {
	  // If Netscape Navigator 4.x, Netscape 7.x or Mozilla 1.x and not Netscape 6
	  if(imm_isNN45plus && !imm_isNN6plus || imm_isNN7plus)  {
	    if(navigator.plugins["Immersion Web Netscape Plugin"] &&
	       IMM_sufficientNNPlugin())  {
	      // Plugin is installed and recent enough.
	      // Assumed statement: this line is missing from the source listing.
	      imm_onTouch = true;
	    }
	    else  {    // ImmWeb plugin is not installed
	      if(imm_NNDontAskOff)  {
	        imm_installNow = confirm("You must install the Immersion Web plugin to experience tactile sensations on this web page. Install now?");
	        if(imm_installNow)  {
	          alert("Sorry, Gecko based browsers are not currently supported.");
	        } else  {
	          imm_askAgain = confirm("You have chosen not to install the Immersion Web plugin at this time. Would you like to be reminded again later?");
	          if(!imm_askAgain)  {
	            IMM_setCookie("DontAskNN", "ON", "month");
	          }
	          else  {
	            IMM_setCookie("DontAskNN", "ON", "now");
	          }
	        }
	      } else {
	        alert("Your cookie setting is inhibiting you from downloading the Immersion Web Plugin. If you have changed your mind and would like to install, please click on the appropriate link above.");
	      }
	    }
	  }
	  // If Internet Explorer 4 or greater
	  if(imm_isIE4plus)  {
	    if(document.ImmWeb.ImmWebInstalled)  {
	      // Assumed statement: this line is missing from the source listing.
	      imm_onTouch = true;
	    }
	    else  {    // ImmWeb plugin is not installed
	      if(imm_IEDontAskOff)  {
	        imm_installNow = confirm("You must install the Immersion Web plugin to experience tactile sensations on this web page. Install now?");
	        if(imm_installNow)  {
	          // The install step is missing from the source listing.
	        }
	        else  {
	          imm_askAgain = confirm("You have chosen not to install the Immersion Web plugin at this time. Would you like to be reminded again later?");
	          if(!imm_askAgain)  {
	            IMM_setCookie("DontAskIE", "ON", "month");
	          }
	          else  {
	            IMM_setCookie("DontAskIE", "ON", "now");
	          }
	        }
	      }
	    }
	  }
	}   // end function IMM_sufficientSystem()
	function IMM_sufficientNNPlugin()  {
	  // Assumes Netscape plugin description = "Immersion Web Plugin a.b.c.d" where a.b.c.d = version
	  var needDownload = false;
	  var installedVersion = navigator.plugins["Immersion Web Netscape Plugin"].description.split(" ")[3].split(".");
	  // Need Download if installed plugin version < minimum plugin version.
	  // Compare components from most to least significant and stop at the first difference.
	  for(var index=0; index<4; index++)  {
	    if(installedVersion[index] > imm_NNMinPlugin[index])  {
	      break;
	    }
	    else if(installedVersion[index] < imm_NNMinPlugin[index])  {
	      needDownload = true;
	      break;
	    }
	  }
	  return (!needDownload);
	}   // end function IMM_sufficientNNPlugin()
	//****** Cookie utility functions ******//
	function IMM_setCookie(name, value, expire) {
	  var today = new Date();
	  var expiry = new Date();
	  // Any other value (such as "now") leaves the expiry at the current time.
	  switch(expire)  {
	    case "day":
	      expiry.setTime(today.getTime() + 1000*60*60*24*1);
	      break;
	    case "week":
	      expiry.setTime(today.getTime() + 1000*60*60*24*7);
	      break;
	    case "month":
	      expiry.setTime(today.getTime() + 1000*60*60*24*30);
	      break;
	    case "year":
	      expiry.setTime(today.getTime() + 1000*60*60*24*365);
	      break;
	  }
	  document.cookie = name + "=" + escape(value) + "; path=/; expires=" + expiry.toGMTString();
	}
	function IMM_getCookie(Name) {
	  var search = Name + "=";
	  if (document.cookie.length > 0) {                   // if there are any cookies
	    var offset = document.cookie.indexOf(search);
	    if (offset != -1) {                               // if cookie exists
	      offset += search.length;                        // set index of beginning of value
	      var end = document.cookie.indexOf(";", offset); // set index of end of cookie value
	      if (end == -1)
	        end = document.cookie.length;
	      return unescape(document.cookie.substring(offset, end));
	    }
	  }
	}
	// stop hiding -->
	</SCRIPT>
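The version check in IMM_sufficientNNPlugin compares a dotted version string against a minimum, component by component. As a standalone illustration of that technique, the sketch below converts each component to a number before comparing, so that, for example, 3.10 is treated as newer than 3.2. It is not part of the Immersion sample; the function name versionAtLeast is hypothetical.

```javascript
// Hypothetical helper: true if the installed dotted version is greater
// than or equal to the minimum. Missing components are treated as 0.
function versionAtLeast(installed, minimum) {
  const have = installed.split(".").map(Number);
  const need = minimum.split(".").map(Number);
  const length = Math.max(have.length, need.length);
  for (let i = 0; i < length; i++) {
    const h = have[i] || 0;
    const n = need[i] || 0;
    // The first differing component, most significant first, decides.
    if (h > n) return true;
    if (h < n) return false;
  }
  return true; // all components equal
}

// Usage with the minimum version used in the sample above (3.2.3.0).
console.log(versionAtLeast("3.2.2.9", "3.2.3.0"));
console.log(versionAtLeast("3.10.0.0", "3.2.3.0"));
```

Stopping at the first differing component matters: continuing the loop would let a lower-order component overturn a decision already made by a higher-order one.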

3Steps two and three are adapted from SVG Maps for People with Visual Impairment (www.svgopen.org/2003/papers/svgmappingforpeoplewithvisualimpairments)

4Proceedings of CHI '99, Pittsburgh, PA., May, 1999. Available online through portal.acm.org by search.

5Author's footnote: Note that Lynch refers to these items as "elements" and reserves a specific meaning for the term "landmark".

We acknowledge the financial support of the Department of Canadian Heritage through the Canadian Culture Online Program
