Supporting Collaborative Awareness in Tele-immersion

Kevin M. Curry

kcurry@csgrad.cs.vt.edu

Dept. of Computer Science, Virginia Tech

VT-CAVE, University Visualization & Animation Group

2000 Kraft Drive, Suite 2400, Blacksburg, VA, USA 24061

phone: 1 + (540) 231-2066

ABSTRACT

At the Virginia Tech CAVE™, we perceive a significant need to address fundamental human factors and usability issues in the system. In particular, as the desire to collaborate across CAVE™ networks increases, we are focusing on the user’s primary needs for working with others in a virtual environment. Drawing influence from research in Computer-Supported Cooperative Work (CSCW), we step into the phase beyond performance and look to support interaction, communication, and collaboration. A major component of Tele-immersive collaboration, especially in large-scale, potentially crowded virtual environments, is collaborative awareness. This paper presents some views on awareness and how it can be supported. The discussion is divided into two parts. The first part presents a general understanding of collaborative awareness, as well as particular components which seem to contribute to awareness. This discussion combines lessons derived from the literature and findings from my own experiences as a Graduate Research Assistant at the VT-CAVE™.

Part two presents an implementation of a user interface which supports, among other features, devices and utilities for collaborative awareness. These items include: three-dimensional and flat radar, a participant list, a status panel, highlighting, tethering, and multiple views. They are implemented as Performer-based objects and functions, created for LIMBO [Leigh URL 1]. The end result is the CAVE™ Collaborative Console, which is also introduced but continues to be developed. This application was developed as part of a team effort with Kent Swartz and John Kelso, also with VT-CAVE™. The idea of the Console began from a need to not only address issues of awareness, but to somehow provide an interface that gave the user control over his or her tools. But true to the Object-Oriented Paradigm, the Console is evolving into something virtual and need not even exist, from a user perspective, in order to be used. The latest iterations of the software design (v0.5) treat the Console as a communications manager between LIMBO and whatever devices the Console is managing. This approach has had the effect of freeing us, and hopefully others, from the unstable input problem. The new method is less invasive of the core LIMBO code and should give T-I applications developers the flexibility to choose their own input devices and to easily install their own collaborative support tools.

INTRODUCTION

The overall concept of Awareness is, perhaps, the most fundamental aspect of tele-immersive interaction. Without some minimal indication of other collaborators’ states within the environment and of your surroundings, it is extremely difficult to establish even a foundation for any kind of interaction. Actually, awareness is viewed as a set of issues that concern one’s knowledge about the state of other collaborators within a virtual environment and of the environment itself.

Collaborative awareness has long been an issue for the greater discipline of Computer-Supported Cooperative Work (CSCW). In their 1991 paper, Groupware: Some Issues and Experiences, Ellis, et. al., alluded to many elements which are now placed under the larger heading of awareness. These include "group focus and distraction issues" and "notification". Benford and others [Benford, et. al., 1994, Bowers, et. al., 1996] later looked at these issues as they applied to collaborative virtual environments (CVEs).

Tele-immersion is now becoming a powerful medium for collaborative interaction and is being investigated and deployed by an increasing number of organizations worldwide [URL 2]. Many of these facilities are being designed for use as collaborative engineering environments. The goal is to enable scientists to interact and design, in CVEs, as if co-located. The reasons motivating this type of initiative should now be well known in the immersive technologies community. As Tele-immersion plays a greater role as a form of "groupware", so, too, will awareness play a role in Tele-immersive, collaborative interaction.

As an augmented reality, Tele-immersion has the potential to support awareness both through natural and artificial means. But before these means can be discussed, we must arrive at a greater understanding of how this awareness is expressed and realized.

Awareness can be achieved by numerous means and through various media, each of which may have individual or combined, isolated or global implications. This seems especially true in Tele-immersion, which seeks to provide more "natural" forms of awareness. Awareness can be either mutual or one-way. It can have a wide or narrow scope of interest and extended or limited range. A participant’s desire to be noticed can be enhanced or restricted, thus increasing or decreasing the likelihood that others will be aware of that participant. One model that accurately captures these nuances for VR is the Spatial Model of Interaction [Benford, et. al. 1994, Greenhalgh and Benford, 1997].

The Spatial Model mediates interaction based on the physical properties of space. Thus, the abilities to see and to hear are affected by distance, direction, and possible obstruction. The main concepts of the Spatial Model are:

Medium - the mode of communication: text, audio, video, etc.

Aura - the portion of space for which interaction is enabled and allowed

Focus - a representation of the "observing object’s interest in a particular medium"; what the observing object is trying to perceive

Nimbus - a representation of the observed object’s desire to be perceived [in a particular medium]

Adapters - objects which can modify aura, focus, and nimbus in order to customize interaction between participants
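The concepts listed above can be sketched as a small C++ data structure. This is a minimal sketch only, assuming that aura, focus, and nimbus are spheres of fixed radius centered on an object and that awareness is a yes/no result; the Spatial Model itself allows arbitrary regions and graded awareness. The names Vec3, MediumState, aurasCollide, and canBeAwareOf are hypothetical and are not taken from MASSIVE, LIMBO, or the Console.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical position type (three floating point numbers).
struct Vec3 { double x, y, z; };

inline double dist(const Vec3& a, const Vec3& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// State of one object in ONE medium (e.g. audio). All three regions are
// assumed to be spheres around `position`, given by their radii.
struct MediumState {
    Vec3 position;
    double aura;    // region in which interaction is enabled
    double focus;   // how far the object is trying to perceive
    double nimbus;  // how far the object projects itself
};

// Interaction is not defined until the two auras overlap.
inline bool aurasCollide(const MediumState& a, const MediumState& b) {
    assert(a.aura >= 0.0 && b.aura >= 0.0);
    return dist(a.position, b.position) <= a.aura + b.aura;
}

// The observer can be aware of the observed object only if the auras
// collide, the observed object lies inside the observer's focus, and
// the observer lies inside the observed object's nimbus.
inline bool canBeAwareOf(const MediumState& observer,
                         const MediumState& observed) {
    if (!aurasCollide(observer, observed)) return false;
    const double d = dist(observer.position, observed.position);
    return d < observer.focus && d < observed.nimbus;
}
```

Because the check is directional, canBeAwareOf(a, b) and canBeAwareOf(b, a) may differ, which corresponds to one-way awareness.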

Extending the Spatial Model

Based on our experiences at the CAVE™, it is perhaps necessary to explain the assumptions, modifications, and extensions made here with regard to the Spatial Model.

It has been observed that in some sense, all objects in a virtual environment can have their own aura, focus, and nimbus. According to Benford, et. al., interaction between two objects is not defined until their auras collide. The term "interaction" may be taken to mean any sort of connection or communication between objects. Interaction may be as simple as mere sight of another object, or it may be as complex as a face-to-face conversation. Once collision of two or more auras has occurred, each object’s own focus and nimbus determine, respectively, the degree to which they can be aware of other objects and the degree to which other objects can be aware of them. This potential for awareness is emphasized because it is possible for someone with a large focus to be unaware of someone with no nimbus. Likewise, someone projecting a large nimbus may go unnoticed by someone with no focus toward that nimbus. In accordance with what we expect in the physical world, aura, focus, and nimbus are all defined based on the nature of the medium. For example, most people can see much farther than they can hear, and so the default aura for a graphical medium is expected to be larger than the default aura for an audio medium. Again, the "default" is used to remind us that we are dealing with a computer-mediated environment and that it may be desirable to artificially manipulate some particular component of our awareness. Indeed, there need be no restriction on focus that requires it to be in the same location as one’s presence. This makes it possible to share views or move one’s view to specific locations in space. Similarly, privacy can be maintained by combining a small or non-existent nimbus with a limited focus. Finally, the use of adapter objects, such as a virtual microphone, makes it possible to temporarily increase one’s nimbus in any particular medium. The result is that others temporarily have a greater ability to perceive you through that medium.
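The temporary nature of an adapter such as the virtual microphone maps naturally onto a scoped object in C++. The following is a hedged sketch, assuming that a participant's extents in one medium are plain radii and that an adapter simply scales the nimbus by a gain factor; Extents and NimbusAdapter are hypothetical names, not part of LIMBO or the Console.

```cpp
#include <cassert>

// Hypothetical per-medium extents (radii) for one participant.
struct Extents {
    double aura;
    double focus;
    double nimbus;
};

// Scoped adapter: enlarges the nimbus while the adapter object is alive
// and restores the saved value when it goes out of scope.
class NimbusAdapter {
public:
    NimbusAdapter(Extents& target, double gain)
        : target_(target), saved_(target.nimbus) {
        assert(gain >= 0.0);
        target_.nimbus *= gain;
    }
    ~NimbusAdapter() { target_.nimbus = saved_; }

    // Copying would restore the nimbus twice, so it is disabled.
    NimbusAdapter(const NimbusAdapter&) = delete;
    NimbusAdapter& operator=(const NimbusAdapter&) = delete;

private:
    Extents& target_;
    double saved_;
};
```

Tying the restore step to the destructor means the nimbus returns to its default even if the code using the microphone exits early.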

Gradient Awareness

Awareness is treated as a scalable concept composed of multiple components in multiple media. Put simply, if a participant is able to perceive all possible awareness cues in an environment, from all possible sources, then that participant is said to be totally aware. If a participant does not or cannot perceive any awareness cues, despite the transmission of such cues, then that participant is said to be totally unaware. The Spatial Model certainly allows for gradient awareness in the sense that focus and nimbus can be dynamically redefined by the user. But each medium may have multiple parameters which contribute to one’s level of awareness in that medium. Thus, it is possible for someone to be totally aware for some media and totally unaware for others. Likewise, it is possible for someone to be partially aware in all media.
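One way to make the gradient concrete is to count, per medium, how many of the transmitted cues a participant actually perceives. The sketch below is an illustration under that assumption only; the paper does not define a numeric measure, and CueCount, awarenessLevel, and classify are hypothetical names.

```cpp
#include <cassert>

// Hypothetical tally of awareness cues in one medium.
struct CueCount {
    int transmitted;  // cues sent by all sources
    int perceived;    // cues this participant actually perceived
};

enum class Awareness { TotallyUnaware, Partial, TotallyAware };

// Fraction of transmitted cues that were perceived, in [0, 1].
// Assumption: a medium with no transmitted cues counts as 0.
inline double awarenessLevel(const CueCount& c) {
    assert(c.perceived >= 0 && c.perceived <= c.transmitted);
    if (c.transmitted == 0) return 0.0;
    return static_cast<double>(c.perceived) / c.transmitted;
}

// Maps the fraction onto the three cases described in the text.
inline Awareness classify(const CueCount& c) {
    const double level = awarenessLevel(c);
    if (level == 0.0) return Awareness::TotallyUnaware;
    if (level == 1.0) return Awareness::TotallyAware;
    return Awareness::Partial;
}
```

Keeping one CueCount per medium allows a participant to be totally aware in audio, totally unaware in video, and partially aware in text at the same time.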

Within the context of total and partial awareness is the user’s level of global awareness. Global awareness means that the user is able to perceive any part of the world, from any perspective, at any time. A user who has encapsulated awareness is restricted to awareness of a particular area in the environment. The user’s control over his or her levels of awareness is sometimes referred to as Interest Management [Abrams, et. al., 1998]. Much of what is known about interest management speaks to issues of performance. This paper is among the first to discuss interest management from a user perspective. Awareness, particularly aura, focus, and nimbus, can be supported by auxiliary interface devices, working in multiple media.

Awareness is divided into 4 main categories: Action Awareness, Attention Awareness, Location Awareness, and Presence. These are described below. There is an important fifth component which follows separately as "Views".

Presence

Presence may be the most elementary component of virtual collaboration. The actual term, "presence", has been invoked under a broad definition, however. Generally, presence is the feeling of "being there". In the context of the environment, presence deals with the degree to which you feel a part of some virtual space; that the space exists and you are occupying it.

In the context of collaborators and objects, presence is mostly an issue of embodiment or some peripheral indicator (participant list, radar, etc.). This kind of presence provides the sense that other collaborators, and even other objects, share the environment with you. One must know that other collaborators exist before they can be included in any form of interaction. Basic indications of presence are not difficult to achieve. A simple list of each participant’s name is enough to tell you that others share your environment. Likewise, a textual description of the world around you is an indication of presence (e.g., MUDs). These methods are computationally inexpensive and satisfy the basic need. But textual information does little more to qualitatively reaffirm the presence of space and others who may be sharing that space with you.

In Tele-immersion, which is specifically built according to a supercomputing and very-high bandwidth architecture, presence rises to a new level. Tele-immersion places the user "inside" the environment and the way the user views the environment is much different than other kinds of CVEs. Accordingly, other collaborators in the environment can be represented by sophisticated embodiments that are unique to immersion. Avatars, graphical representations of remote collaborators, are examples of embodiments which establish a strong sense of presence. Avatars can be as simple as GIF images of the other participants, or they may one day be as detailed as live, three-dimensional video stream images of an actual person.

Through various levels of detail, particular attributes of the avatar can be used to stimulate different notions of presence. In tele-immersion, we can simply look at another avatar and instantly determine many things that otherwise might require complex and artificial indicators in more traditional groupware. If a remote user is distant from us in the virtual world, then her avatar, obeying the natural laws of space and light, appears small and difficult to see clearly. As the user moves closer, or we move closer to her, the avatar looks correspondingly larger and other details of appearance become clearer. When the avatar is standing directly in front of us, we see a life-size embodiment almost as if the person were there with us. These same things are also true for the environment. Standing at the base of a virtual 40-story building has quite a different effect than reading a description of the same scene, while perhaps viewing an image of the building.

Sound is also an important component to presence. In face-to-face interaction, we communicate by speaking. What we hear is strongly coupled to what we see; the hands waving, the lips moving, the eyes glancing around the room. All of these things together contribute to a greater feeling of shared space. In tele-immersion, the third dimension allows us to exploit sound even further. Already, much has been done to combine multi-channel sound with volumetric, virtual space so that users correctly perceive sounds based on distance, direction, and amplitude.

Such advances are not without limitations, however. A live, three-dimensional video stream, for example, is likely to require more bandwidth than is currently available for practical uses. More generally, there should be some relationship between the complexity of the avatar and the task at hand, so that a minimum set of elements is used to effectively convey presence [Leigh, et. al., 1997].

Still, as avatars more closely emulate humans, they have the potential to support other forms of action and attention awareness. Through the use of motion tracking devices, avatars have been used to convey simple gestures like eye gaze and hand-waving [Benford, et. al., 1994, Leigh, et. al., 1997]. Benford’s first implementation of MASSIVE showed how visible changes to an avatar’s appearance provided cues that related the remote user’s ability to communicate via two different media. When the remote user was able to hear what was spoken, his head-shaped avatar grew ears. When the remote user was able to communicate verbally, his avatar grew a mouth [Benford, et. al., 1994].

Action Awareness

Action Awareness concerns the ability to know what other collaborators are doing. To some extent, different looking avatars can be used to convey different actions of remote collaborators. Avatars can also be animated with motion feedback from tracking devices. In this way, it is not difficult to show when someone’s head is turned or in which direction they might be pointing.

Physical Gestures

Physical gestures play an important role in actual, face-to-face interaction. Shrugging one’s shoulders, waving one’s hands about while talking, and the many other elements in the vocabulary of human body language are indispensable in this type of interaction. Likewise, physical gesturing can play an important role in Tele-immersion. Indeed, there is reason to believe that gesturing can be especially useful in Tele-immersion. It has been demonstrated that head position and orientation, body direction, and hand position and orientation are enough to convey important physical gestures [Leigh, et. al., 1997].

Other Changes in an Avatar’s Appearance

MASSIVE demonstrated, among many things, that simple changes in an avatar’s appearance provide a sufficient means for viewers to determine which communications channels are available to a remote user. Leigh, et. al. further demonstrated that other changes could be made to the avatar to convey that remote user’s actions. They created two different concepts, "mortal" and "deity", which showed that even basic avatars could support action awareness through presence. While having other uses, the two modes effectively convey a cue about another collaborator’s action state by changing the size of that user’s avatar.

Depending on whether a participant needs to make gross or minute manipulations to the environment, he or she will switch, respectively, between the deity mode and the mortal mode. The mortal mode presents the environment to the user in fine detail, as if he or she is in the world and a part of it. This allows the user to manipulate small objects. The deity mode presents the environment in gross detail, as if the user is looking down on the world. This allows the user to manipulate large objects, and even the entire environment. Accordingly, a change in mode is reflected in a change in the size of the avatar that other collaborators perceive. Mortals appear to be "life-size", while deities are clearly much larger-than-life in size. This drastic change in presence indicates to other collaborators a change in action from minute manipulation to gross manipulation.

Attention Awareness

Attention awareness deals with a person’s focus and range of perception. In simple collaborative exchanges like on-line chat or UNIX talk, it is essentially impossible to determine a remote collaborator’s focus of attention. The impact of this deprivation on the collaborative session was illustrated in a study of MASSIVE [Bowers, et. al., 1996]. Through Conversation Analysis, Bowers, et. al. showed how a lack of perceptible cues caused confusion in the session when one member diverted his attention to a colleague in the real world.

Providing perceptibility of another person’s focus of attention seems to be one of the more desired, yet elusive, goals of collaborative VR. At a basic level, some attention awareness is provided by voice alone. Attention awareness increases through the use of avatars, which can be animated by motion tracking sensors. Attention might be even more acutely perceived if eye tracking sensors are also used, but studies are certainly needed to determine what is necessary and enriching.

Challenges for Attention Awareness

The ability to resolve the state of the remote user in a timely manner was shown to be of further significance in Bowers, et. al.’s study of MASSIVE. In this study, two particular incidents occurred frequently, which demonstrated this point. In the first incident, collaborators in the virtual world found it difficult to determine when their colleagues turned attention to collaborators in the real world. When a person shifted his attention to a real-world distraction, no indication was given in the virtual world that this was so. Consequently, remote collaborators would carry on as if the person was still there, unaware that this person had shifted his attention. The second incident was referred to as "corpsing it". In this case, a remote collaborator’s network connection would go down but his avatar would remain active in the virtual world. Those remaining in the virtual world were often found trying to interact with the "ghost" avatar.

Location Awareness

Location awareness is knowledge of both your position in the virtual world and the position of others in that world. Location awareness is one of the more easily implemented attributes of collaborative interaction. A person’s position in a virtual environment is nothing more than three floating point numbers. This information can be used to provide simple locators, such as blips on radar, for finding other collaborators or objects.
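As an illustration of how little is needed, a radar blip can be computed from two positions and a range. This is a minimal sketch, assuming a flat radar that ignores height, a z-up coordinate system, and blip coordinates normalized so that the edge of the radar is at distance 1; Position, Blip, and floorPlanBlip are hypothetical names and do not describe the Console's actual radar code.

```cpp
#include <cassert>
#include <cmath>

// A position in the virtual world: three floating point numbers.
struct Position { float x, y, z; };

// Hypothetical blip on a flat radar, in units of the radar's range.
struct Blip {
    float u, v;    // horizontal offset from the radar's center
    bool inRange;  // false if the target lies beyond the radar's edge
};

// Projects `target` onto the horizontal plane through `center`
// (z is assumed to be "up" and is ignored) and scales by `range`.
inline Blip floorPlanBlip(const Position& center, const Position& target,
                          float range) {
    assert(range > 0.0f);
    const float dx = target.x - center.x;
    const float dy = target.y - center.y;
    Blip b;
    b.u = dx / range;
    b.v = dy / range;
    b.inRange = std::sqrt(dx * dx + dy * dy) <= range;
    return b;
}
```

A spherical radar would keep the z offset as a third blip coordinate instead of discarding it.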

An Experience with Location Awareness

At Supercomputing ’98, in Orlando, Florida, I was fortunate enough to participate in a global Tele-immersion demonstration [Leigh and Johnson, 1998]. During this demonstration, I shared a large, multi-room CVE with as many as 10 remote collaborators physically located at different points around the globe. Each "room" of this VE, or "atrium" as it was called, was a separate demonstration showcasing a different achievement in Tele-immersion. The lack of, and need for, location awareness was quickly apparent. Most of the atrium’s rooms were also large-scale, multi-room models. Only the contribution from the VT-CAVE™ included support for location awareness. Upon entering an immense model of a Silk Road shrine, each at different times, several of us were left blindly wandering around in search of, and calling out to, one another. To overcome this frustrating problem, we were forced to exit back into the atrium and agree to pass through the portal in quick succession. Fortunately, we could exit quickly with the press of a button. In essence it was like resorting to the classic "reset" to resolve a problem. Upon entering the VT-CAVE™ model, on the other hand, users were greeted by heads-up displays of both a participant list and a radar. The participant list made it easy to tell who was in this part of the environment and how "far" away they were. Whenever we were not able to see a remote user’s avatar, we could easily locate that person with the help of the radar.

Views

A person’s literal point of view has been a driving source of inspiration for research in Tele-immersion. As a visualization medium, Tele-immersion provides unique frames of reference for viewing graphical data. As a collaborative medium, Tele-immersion can further leverage this advantage. The need to support individual and subjective views for each participant has been recognized for some time in CSCW [Snowdon and Jää-Aro, 1997]. Subjective views are, in fact, the natural mode for human-to-human interaction; everyone sees through his or her own eyes. Such is the case in Tele-immersion, but because Tele-immersion is an augmented reality, it has the ability to support many different points of view.

Camera Positions and Locations

Just as any large building may be equipped with a surveillance system, a large-scale VE or model might include multiple camera positions and locations. Camera positions typically allow a user to view an object or location from different directions and orientations (e.g., top, front, back, left, right). "Cameras" can be placed at fixed locations in the environment, or can be specified on command. Cameras are used in most 3D CAD interfaces and might similarly be useful in Tele-immersive design.

Shared Views

Shared viewing is literally seeing through another’s eyes. In CSCW, shared viewing is a classic example of WYSIWIS. One or more connected users see the world in the exact direction, orientation, and manner as the person whose view they are sharing.

In some sense, the concepts of shared viewing and multiple camera locations begin to blur the line between awareness and navigation. WYSIWIS guided touring and other methods of leading someone through an environment combine subjective and shared viewing with navigation and movement. Likewise, location awareness provides useful information for navigating a virtual environment. Tele-immersion, as an augmented reality, provides further support for navigation and movement through methods which are not new to either hypermedia or CSCW.

Configuration of the Collaborative Session in Tele-immersion

In cases where "true" Tele-immersion is required, multiple collaborators share virtual environments from networked CAVE™s. But the restriction of 3D, virtual, collaborative sessions to CAVE™s is neither practical nor probable. In fact, there are a number of ways to configure projection-based systems and even SGI workstations to facilitate collaboration in 3D virtual environments. To some extent these methods can also provide the sense of "being there." The most common configurations are among CAVE™s, Immersa-Desks (I-Desks), and CAVE™ Simulators running on workstations.

CAVERNSoft and LIMBO

The CAVE™ Collaborative Console was designed for use with LIMBO and CAVERNSoft [Leigh and Johnson, URL 3] from the Electronic Visualization Laboratory. More information can be found at [URL 3].

The CAVE™ Collaborative Console

The CAVE™ Collaborative Console extends CAVERNsoft and LIMBO by providing users with a suite of interface devices and functions that are needed for an efficient and effective collaborative session. The application is intended to be somewhat generic and flexible and yet is implemented with many utilities that support awareness, presence, and collaborative manipulation. Though I recognize the multi-disciplinary uses for Tele-immersion, I believe that all users will require at least a base set of collaborative utilities and tools to work cooperatively with multiple participants in most virtual environments. By designing a flexible interface, the console can be tailored to specific domains of use through minimal end-user programming. The console uses both existing tools (e.g., Media Recorder, FTP) and tools which were created at the VT-CAVE™.

Object-Oriented Design

The CAVE™ Collaborative Console was designed under the Object-Oriented Paradigm (OOP) in C++. It is primarily an Iris Performer package of connected components called ConsoleItems [Fig. 1]. Interface items are the only features of the console which rely on this binding to Performer. In the current version (v0.5), each awareness device is such an object.

A Word About Interaction in Tele-immersion

It is highly desirable that all collaborative functions be accessible while the user is still immersed. At the beginning of this research very few collaborative tasks could be performed that did not require the user to return to the terminal. While the wand has been available, there are limits to its role as a robust input device. Modeled after a mouse [Pape, et. al, 1996], it is essentially destined to serve as a point-and-click or grab-drag-and-drop device. In most cases, a single mode is mapped to a single button. Consistency among applications in the relationship between buttons and modes simply does not exist. I believe this "gulf of execution" [Norman in Carroll, 1991] has existed due to a lack of powerful design tools for building interaction into an environment. This seems natural, given the relative immaturity of Tele-immersive technologies, and the situation is beginning to improve.

Nonetheless, I have viewed the input problem as an unstable one and have consequently designed the CCC to work independently of the type of input device. The only constraint is that the input hardware’s signal must be translatable into the set of legal console commands. For example, if the task is to toggle a radar so that it can be seen, it matters not whether the show(radar) function was invoked by a button-press, a finger combination on a pinch glove, or a voice command.
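The separation between input devices and console commands can be sketched as a table of named commands plus one small translator per device. This is an illustrative sketch under stated assumptions, not the Console's actual design: CommandTable, fromWandButton, and fromVoice are hypothetical names, the button number and spoken phrase are invented for the example, and only the command name show(radar) comes from the text above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical table of legal console commands, keyed by name.
class CommandTable {
public:
    void add(const std::string& name, std::function<void()> action) {
        assert(static_cast<bool>(action));  // reject empty actions
        actions_[name] = std::move(action);
    }

    // Runs the named command; returns false if the name is not legal.
    bool invoke(const std::string& name) const {
        const auto it = actions_.find(name);
        if (it == actions_.end()) return false;
        it->second();
        return true;
    }

private:
    std::map<std::string, std::function<void()>> actions_;
};

// Hypothetical device translators. Each one turns a raw device signal
// into a command name, or an empty string if the signal is unmapped.
inline std::string fromWandButton(int button) {
    return button == 1 ? "show(radar)" : "";
}

inline std::string fromVoice(const std::string& phrase) {
    return phrase == "show radar" ? "show(radar)" : "";
}
```

Supporting a new device, such as a pinch glove, would then mean adding one more translator without touching the command table.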

Console Objects

The CCC was also designed to be somewhat flexible on the component side of the interface. The only constraint is that new ConsoleObjects be Performer-based objects. I believe that this constraint can be eliminated in future versions of the software.

FloorPlanRadar - represents the environment as a plane, centered around some pre-defined location in the virtual world (e.g., (0, 0, 0)).

ThreeSpaceRadar - represents the environment as a sphere, centered around some pre-defined location in the virtual world (e.g., (0, 0, 0)). By capturing the z-axis, the spherical model relates the space above and below the radar’s center. In some cases, however, only a "floor plan" view of the world is needed.

Participant List - a list of everyone sharing the environment. Other collaborators are displayed by name and distance. For example, "Mary: 25.5" would indicate that Mary shares the environment and that she is 25.5 feet/units away. Also: Object List.
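A Participant List entry of the kind shown above can be produced from a name and two positions. The following is a minimal sketch, assuming straight-line distance in world units printed to one decimal place; Point3 and participantEntry are hypothetical names rather than Console functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical world position.
struct Point3 { double x, y, z; };

// Builds one list entry, e.g. "Mary: 25.5", from the participant's name
// and the straight-line distance between the local user and that person.
inline std::string participantEntry(const std::string& name,
                                    const Point3& self,
                                    const Point3& other) {
    assert(!name.empty());
    const double dx = other.x - self.x;
    const double dy = other.y - self.y;
    const double dz = other.z - self.z;
    const double d = std::sqrt(dx * dx + dy * dy + dz * dz);
    char buf[32];
    std::snprintf(buf, sizeof buf, "%.1f", d);
    return name + ": " + buf;
}
```

The same function could serve an Object List by passing an object's name and position instead of a participant's.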

Console Utilities

shareView - strict WYSIWIS view as if looking directly from a point of view relative to the other person’s shutter glasses.

teleport - allows users to hyper-navigate to specific locations. If the location is that of another collaborator, permission to enter that space may be required. If the movement is to a location in the world, permission is only required if the location intersects with another person’s space.

tether/detach - (attach | follow) - bind your location relative to another participant or object at a specified distance. The actual location of your avatar and your view will be behind and to the right. The participant or object to whom you are attached will be displayed as text on the CAVE™’s master wall. All active awareness devices will uniquely distinguish that participant or object from other objects and participants. A tether is indicated by a cord that attaches you to the participant or object. The length of the cord can be changed using adjust. A tether is released using detach. [adjust - change the length of a tether cord to a specified length.]

view - view a specified object or absolute world location from a specified camera_position. camera_position can be: FRONT | BACK | LEFT | RIGHT | TOP | BOTTOM. The camera_position is also displayed as text on the CAVE™’s master wall. If the object being viewed is listed on the Participant List and/or shown in the radar, those indicators are uniquely distinguished from other objects.
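The placement rule for the tether utility described above ("behind and to the right" at a specified distance) can be sketched as follows. This is an illustration only, under several assumptions that the text does not state: z is up, a heading of 0 degrees faces +y with +x to the right, positive heading turns counter-clockwise, and "behind and to the right" means an even split between the two directions. Vec3 and tetheredPosition are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical world position.
struct Vec3 { double x, y, z; };

// Returns where a tethered user would be placed: `cordLength` away from
// the target, diagonally behind and to the right of the target's heading.
inline Vec3 tetheredPosition(const Vec3& target, double headingDeg,
                             double cordLength) {
    assert(cordLength >= 0.0);
    const double kPi = 3.14159265358979323846;
    const double h = headingDeg * kPi / 180.0;
    // Unit vectors for the target's forward and right directions.
    const Vec3 forward{-std::sin(h), std::cos(h), 0.0};
    const Vec3 right{std::cos(h), std::sin(h), 0.0};
    // (right - forward) has length sqrt(2), so scale to the cord length.
    const double s = cordLength / std::sqrt(2.0);
    return Vec3{target.x + s * (right.x - forward.x),
                target.y + s * (right.y - forward.y),
                target.z};
}
```

Under these assumptions, the adjust utility would amount to calling the same function with a new cordLength.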

Comment Concerning Future Work

The CAVE™ Collaborative Console is finally approaching a state which will allow for much needed usability assessment. Essentially nothing concrete is known about user preferences for each of the different radars, for example.

As was mentioned at the beginning of this paper, the CCC continues to be developed. Many other features are being included which leverage collaborative methods with which users are likely to already be familiar. Utilities like FTP and mail clients are being integrated into the virtual, immersive interface. Multiple methods of "discovery recording" [Leigh, et. al., 1997] are becoming available.

For further information regarding the CAVE™ Collaborative Console, or the VT-CAVE™, please refer to the URLs included at the end of the reference section.

REFERENCES

Abrams, H., Watsen, K., Zyda, M., Three-Tiered Interest Management for Large-Scale Virtual Environments, VRST 98, November 1998, Taipei.

Benford, S., Bowers, J., Fahlén, L., Greenhalgh, C., Managing Mutual Awareness in Collaborative Virtual Environments, Proceedings VRST ’94, August 1994.

Bowers, J., O’Brien, J., Pycock, J., Practically Accomplishing Immersion: Cooperation in and for Virtual Environments, Computer Supported Cooperative Work ’96, 1996.

Carroll, J., Designing Interaction, 1991, pp. 17-36.

Finholt, T. A., Olson, G. M., From Laboratories to Collaboratories: A New Organizational Form for Scientific Collaboration, Psychological Science, Volume 8, Number 1, January 1997, pp. 28-35.

Goldin, D., Tools for the Future, Remarks Delivered at the Annual NASA Employee Conference, January, 1998.

Greenhalgh, C., Benford, S., Boundaries, Awareness and Interaction in Collaborative Virtual Environments, Proceedings of the 6th International WET-ICE, June 1997.

Johnson, A., Leigh, J., DeFanti, T., Brown, M., Sandin, D., CAVERN: the CAVE™ Research Network, Proceedings of 1st International Symposium on Multimedia Virtual Laboratory, March 1998, pp. 15-27.

Leigh, J., Park, K., Kenyon, R., Wong, H., Preliminary STAR TAP Tele-immersion Experiments between Chicago and Singapore, Proceedings HPCAsia ’98, 1998.

Leigh, J., DeFanti, T., CAVERN: A Distributed Architecture for Supporting Scaleable Persistence and Interoperability in Collaborative Virtual Environments, Journal of Virtual Reality Research, Development and Applications, Vol. 2.2, December 1997 (1996), pp. 217-237.

Leigh, J., DeFanti, T., Johnson, A., Brown, M., Sandin, D., Global Tele-immersion: Better Than Being There, Proceedings of ICAT '97, Dec, 1997.

Snowdon, D., Jää-Aro, K., A Subjective Virtual Environment for Collaborative Information Visualization, Virtual Reality Universe ’97, April 1997.

REFERENCED URLs

  1. http://www.evl.uic.edu/cavern/cavernsoft/limbo.html
  2. http://www.ncsa.uiuc.edu/VR/cavernus/users.html
  3. http://www.evl.uic.edu/cavern/

OTHER URLs

CAVE Collaborative Console http://bleen.sv.vt.edu/~kcurry/ccc.html

VT-CAVE http://www.cave.vt.edu