
Kevin Curry
CS5604, Information Storage and Retrieval
Introduction
Exploiting the Third Dimension for Information Retrieval
Design Goals and Considerations for a CAVE-ETD
Navigation
Book Population
Browsing and Selection
Dynamic Scalability
Dynamic Environment
Multiple Simultaneous Users
Secure and Automatic Access and Checkout Policies
Subjective Views and User Profiles
Technical Details
Overview of Technical Aspects
Choosing a programming platform
Performer in a Nutshell
Mapping book collection attributes to visual attributes in CaveETD
User Interaction with CaveETD
System Structure
Navigation
CAVE, Performer and their Interfacing
Creating a Scene
Navigation within the Scene
Collision Detection
Teleportation
BookShelf Population
Bookcases, Booklists and Books
Optimizations
ETD Data extraction
Auxiliary Files and Data Formats
Selection
Functional Overview
Technical Overview
Evaluation
Analysis of Tools
Performer
CAVE Libraries
Coordinate Systems
SGI/CAVE
3-D Design Tools
Analysis of Design Goals
Alternative Interface
Navigation
Switching Modes
Layout of Books
Book Optimization
Preferences
Private Room
User Profiles
Orientation of Books
ETD Data
Conclusion
Further Work
Creation of Performer high-level interface
Customization
New User Interfaces
Direct Information Access
Search Facility
Non-regular Bookcases/Books
Appendix
CAVE Coordinate Systems
The goal of this project was to extend the three-dimensional VRML model of the physical library to a Cave Automated Virtual Environment (CAVE). A prior project had investigated the modeling of a virtual library in VRML; this project attempted to model the same virtual world for use in an immersive virtual reality environment known as a CAVE. The CAVE is a 10-foot-square room with stereoscopic projections on three walls and the floor, wherein the user may interact with the world through tracking devices for the eyeglasses and other input devices.
It is believed, though unproven, that inclusion of the third dimension can enhance the process of information retrieval in many ways. A brief discussion of how CAVEs might be used to visualize information retrieval is presented, including the method chosen. The ultimate goal of future work should be to merge these different methods into a single system. Initial design considerations, strategies, and goals for the system, including the choice of API and a hierarchical model, are discussed. This is followed by a detailed presentation of the system, called CAVE-ETD. In the presentation of the CAVE-ETD, those design goals that have been implemented are thoroughly covered. This coverage includes the motivations behind each piece of functionality and an explanation of how it fits into the model.
Exploiting the Third Dimension for Information Retrieval
There are essentially two main ways in which CAVEs might be used to enhance the process of information retrieval. One method relies heavily on the vector space model in which documents are represented as points within a volumetric abstraction. Document similarity and co-relevance are conveyed through notions of proximity, density, size, etc. One example of such a system is the Spatial Paradigm for Information Retrieval and Exploration system (SPIRE). SPIRE is an excellent candidate for extension into CAVEs. SPIRE’s Themescapes feature relies heavily on three dimensions and the aforementioned notions but can currently be rendered only on a standard two-dimensional monitor.
The second method of exploiting the third dimension for IR relies heavily on a user’s prior knowledge and experience in a real library. This is the method chosen for this study. In this method, a virtual environment is created that very closely resembles an actual library. "Books" are organized on "shelves", shelves are laid out in "aisles" in a "room", and rooms are arranged in some logical sequence. Books can be browsed on the shelves by "walking" or navigating through the room and reading the titles on the book spines. Because the virtual library is organized in the same manner as an actual library, a simple glance confirms the location of books similar to any book you may currently be examining.
Because the virtual library combines such a familiar concept with the power of super-computing, many simple yet powerful features can be added that cannot be found in a real library. These include, but are not limited to, end-user customization, dynamic scalability, enhanced security, and multi-modal navigation. These and other features are elaborated upon in the remainder of the report.
Design Goals and Considerations for a CAVE-ETD
During the design phase of the project, it was decided that an ideal CAVE-ETD should support the following, non-exhaustive list of features and functions:
Organization
The library should look as realistic as possible, while still exploiting the advantages of the virtual environment. The look-and-feel should be analogous to a physical library.
Navigation
Navigation through the library should be intuitive and correspond to well-known interfaces, while improving upon the clumsy joystick embedded within the wand used in the CAVE.
Book Population
Books contained in bookshelves should be added to the rooms much the same way as in a real library.
Browsing and Selection
Documents in a collection can be scanned as easily as books on a shelf. Browsing in a CAVE-ETD can be enhanced through items such as call-outs (popup text displays) that indicate brief abstracts or updated displays of the current circulation status for a document. Adapting the usage of the input devices to more natural use in selection is a major issue.
Dynamic Scalability
The size and organization of the collection can be changed at any time to reflect necessary additions, deletions, and reordering of documents.
Dynamic Environment
The way in which the virtual library appears to the user need not be static. For example, the user or administrator may wish to organize documents on shelves according to multiple parameters, each of which can be changed dynamically. Even color schemes can be modified according to user preferences.
Multiple Simultaneous Users
This is a feature of real libraries that might be easily overlooked in the virtual library. This feature will likely require a great number of additional considerations for managing access, subjective viewing, and multiple individual preferences.
Secure and Automatic Access and Checkout Policies
Collections requiring restricted and limited access can accomplish this with secure user authentication. The tagging of documents as "borrowed" can be managed automatically by the computer.
Subjective Views and User Profiles
The concept of subjective viewing deals with the ability to define the way in which a single user views the world without conflicting with the subjective view of other users. This is a key consideration for user customization.
Choosing a programming platform
Contrary to what might be expected, the current state of tools for development of immersive and interactive environments is well behind the sophisticated tools that exist for GUIs. There is nothing for the CAVE that is comparable to the development environments now usual on the desktop, where components with specific functions are deployed on the workspace, their behavior can be modified by assigning values to properties, and their system behavior is defined by connecting events in one module with event handlers in other modules. All that can be expected today for designing an interactive VR environment is some modeling tools to create geometric objects, and a limited menu of APIs and class libraries with which to build almost everything needed to create a virtual world.
The choices for development tools for this project were:
It was quickly discovered why virtual reality is often more virtual than real: it is a daunting task to build the environment, handle interaction, and modify the environment to react to the interaction. Skillful design of interaction in the CAVE is superseded by the time consuming task of handling low-level events like intersection detection and frequent geometric translations.
The choice of tools adopted was based on an early assessment of learning curve vs. benefits. CosmoWorlds was picked because, of the two modeling tools available (the other being Oxygen), it provided all the functionality needed with the most intuitive interface. Performer was picked over OpenGL for the abstract programming interface. Performer is a C++ class library, close to an object-oriented framework, that promised to provide a high-level design environment within which to construct the environment. Performer uses OpenGL as its underlying layer to perform all 3D drawing. In our experience, Performer barely rises above OpenGL in terms of providing higher-level, more powerful features.
Performer offers a model based on the concept of a scene graph that holds all objects in the world. Geometric and geometry-modifying objects (generically called nodes in Performer) are attached to the scene and to other objects. Spatial transformation objects (called Coordinate Systems or CSes in Performer) are inserted in this tree-like structure to specify the position, size, and orientation of all nodes that are directly below them in the tree. Once a scene is built, Performer handles the drawing of the scene and the updating of dynamic object attributes.
Performer is saddled by the steep learning curve associated with any object-oriented framework (Performer is quite close to one). Deficiencies in the design of the API, inadequate documentation, and conflicts between Performer and CAVE libraries all serve to weaken Performer as a design tool.
Mapping book collection attributes to visual attributes in CaveETD
The current iteration of the CaveETD design is as follows:
The user starts in the Main Foyer, where there are doors connecting to rooms. Each room represents a subset of the available collection. There are eight such subsets, one for the theses and dissertations of each college at Virginia Tech. The user can move freely between the foyer and the rooms. All data required to create a room and fill it with documents are read in at the moment the user enters the room. A network connection could be established to retrieve the books, but currently they are stored in local files.
Once inside a room, the user sees bookcases organized in rows. Books are sorted according to an ordering criterion (fixed to author’s name in this implementation) and arranged left to right and top to bottom in a bookcase. Rows of bookcases display the books in sort order from the front to the back of the room. The size of each book spine is linearly proportional to the size of the document it represents. The book color represents a unit in the sort criterion. In the current implementation books are sorted by author, so book color is assigned by a hash function of the author’s name. This way, documents whose authors have similar names will be of similar color.
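The mapping from document attributes to visual attributes can be sketched as follows. This is only an illustration: the names spineWidth, authorColor, Color and kWidthPerSizeUnit are hypothetical, the scale factor is invented, and the hash shown is an assumed one that keeps alphabetically close names close in color. The report does not give the actual hash used in CaveETD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical RGB colour with components in [0, 1].
struct Color { float r, g, b; };

// Hypothetical scale factor: spine width (world units) per unit of document size.
constexpr float kWidthPerSizeUnit = 0.0005f;

// Spine width grows linearly with the document size.
float spineWidth(int documentSize) {
    assert(documentSize >= 0);
    return kWidthPerSizeUnit * static_cast<float>(documentSize);
}

// Map an author name to a colour. Only the first three characters contribute,
// one per colour channel, so names that sort close together get close colours.
// This is an assumed hash, not the one used in CaveETD.
Color authorColor(const std::string& author) {
    std::uint32_t key = 0;
    for (std::size_t i = 0; i < 3; ++i) {
        const unsigned char c =
            i < author.size() ? static_cast<unsigned char>(author[i]) : 0;
        key = (key << 8) | c;
    }
    return Color{ static_cast<float>((key >> 16) & 0xFF) / 255.0f,
                  static_cast<float>((key >> 8) & 0xFF) / 255.0f,
                  static_cast<float>(key & 0xFF) / 255.0f };
}
```

With this choice, "Smith" and "Smyth" share the same red component and differ only in the later channels, which is one way to obtain the "similar authors, similar color" effect described above.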
The main interaction device presently used in the CAVE is the wand. The wand is far from direct and sometimes less than intuitive, thus presenting an additional source of complexity to the virtual world design problem. The wand (see figure below) is a 3-D input device that offers 6 degrees of freedom by handling the wand, 4 degrees of freedom by manipulating the joystick on the wand, and 3 buttons.
[Figure: the wand]
Although the wand offers sufficient degrees of freedom to implement most of the movements that a human being can make, this mapping is far from natural. Many CAVE users report problems using the joystick in conjunction with moving the wand, and the accuracy of detection of the wand position seems not to be uniform across the CAVE volume.
Perhaps due to these problems, but surely due to the immaturity of this technology, there seem to be no conventions for implementing interaction in the CAVE. While GUIs are all more or less based on the same set of well-defined actions, there are a multitude of interpretations for various combinations of wand movement, joystick movement, and button pressing in the CAVE.
With this in mind, it had to be decided:
The most important concern was to design a simple interaction, even if it contributed somewhat to the complexity of the program. Right from the beginning it was decided to discard the joystick and use only the wand itself as a pointing device. To move around, the angle of the wand with the XY plane would control the speed and direction of movement, and the angle with the YZ plane would control the rotation angle. Since walking was to be handled through the wand, a method had to be devised to point to particular books and retrieve information. The joystick could have been used for this purpose, but pointing at a book, as if reaching for it, was considered to be more intuitive than moving a pointer with the joystick.
This decision necessitated two user modes: Movement and Selection. In Movement mode, the wand controls walking as explained before; in Selection mode, the user’s position is fixed and the wand controls the movement of the selection pointer (a "hand" graphic in our case). Button 3 was assigned to the task of switching modes. In the task of approaching books and browsing them, the user walks until close enough to explore a bookcase. When the user is close enough to read the words on the spines of the books, he or she switches to Selection mode to browse the books in the bookcase. When functionality was added to show the abstract of the selected book, an additional button was used to control the displaying or hiding of the abstract. In summary, the design for interaction is as follows:
Two modes: Movement (default) and Selection.
In Movement mode:
Movement speed and direction = wand angles
Switch to Selection mode: Button 3
In Selection mode:
Book selection = wand position
Display/hide abstracts: Button 2
Switch to Movement mode: Button 3
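The interaction design summarized above amounts to a small two-state machine. The sketch below is an assumption about how it could be coded, not the CaveETD source: the types Mode, WandButtons and InteractionState and the function updateInteraction are hypothetical, and the rule that switching modes hides the abstract is a guess.

```cpp
#include <cassert>

enum class Mode { Movement, Selection };

// Hypothetical snapshot of the wand buttons for one frame. Each flag is true
// only on the frame in which the button goes down.
struct WandButtons {
    bool button2Pressed;
    bool button3Pressed;
};

struct InteractionState {
    Mode mode = Mode::Movement;    // Movement is the default mode
    bool abstractVisible = false;
};

// Apply one frame of button input to the interaction state.
void updateInteraction(InteractionState& s, const WandButtons& in) {
    if (in.button3Pressed) {
        // Button 3 toggles between the two modes.
        s.mode = (s.mode == Mode::Movement) ? Mode::Selection : Mode::Movement;
        s.abstractVisible = false;   // assumption: a mode switch hides the abstract
    } else if (s.mode == Mode::Selection && in.button2Pressed) {
        // Button 2 shows or hides the abstract, in Selection mode only.
        s.abstractVisible = !s.abstractVisible;
    }
    // Invariant: an abstract can only be on screen while selecting.
    assert(s.mode == Mode::Selection || !s.abstractVisible);
}
```

Keeping the mode logic in one function like this separates the button conventions from the per-mode work (walking or picking), so the button assignments can be changed without touching either.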
The internal structure of the CaveETD program is as follows:
[Figure: internal structure of the CaveETD program]
The left side of this scheme (initialization, main loop and clean up) is contained in the caveetd.c++ file. This module initializes the CAVE library and data structures used by the program. After creating the main foyer, where the user starts to move around the library, the main loop executes until the user exits through the Exit door in the foyer or presses the ESC key.
When the third wand button is pressed, the user enters selection mode and a mode flag is set to 1 to indicate that selection is active. On each iteration, if the mode flag is 1 the main loop calls the HandleSelection() routine. If selection mode is not active and the user is free to move about, adjustteleports() is called to check for possible intersection with a teleport, and adjustbumps() is called to calculate the new user position and to prevent the user from walking through bookcases and walls.
All the code to handle book selection and the drawing of abstracts is contained in the select.c++ file. The HandleSelection() function first checks for a button 2 event, which, when detected, will call either ShowAbstract() or HideAbstract(), depending on the current state of the abstract display. Finally, HandleSelection() checks to see if the wand ray is intersecting a book; if so, it highlights the book. Book detection is performed by the doselect() and dealWithPicking() functions, which traverse the scene graph, checking for items with an intersection mask set (bookcases) and collecting points in space where these items were intersected. Only the first intersection is considered. doselect() passes the intersection point to the bookAt() functions, defined in bookcase.c++, which determine the book that has been selected. doselect() then returns this book to the main HandleSelection() routine, which in turn calls the highlight and call-out functions.
The task of room creation is shared between the createFloor() routine in caveetd.c++ and putbooks() in putbooks.c++. The room geometry is assembled from a pre-designed set of wall and floor objects that createFloor() loads. The name of each room is also drawn on its back wall. fitShelves() calls putbooks() to create and fill the bookcases. putbooks() reads the designated book collection and uses an instance of the bookcaseFactory class to create all the bookcases needed and fill them with books, returning them as a linked list. When putbooks() returns, fitShelves() positions every bookcase in the room: it reads the definition file ("objects") for a list of coordinates at which to position the bookcases. Bookcases are added to the scene until either the linked list reaches its end or the definition file is exhausted. The format of the definition file is:
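The step performed by bookAt(), turning an intersection point on a bookcase into a particular book, can be illustrated with the sketch below. It is not the code from bookcase.c++: the Bookcase and BookHit structures, the function name bookAtPoint, and the local coordinate convention (x along the shelf, z upward from the bottom of the case) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookcase description in its own local coordinates:
// x runs left to right along a shelf, z runs upward from the bottom of the case.
struct Bookcase {
    float shelfHeight;                          // vertical spacing between shelves
    std::vector<std::vector<float>> shelves;    // spine widths per shelf, top shelf first
};

struct BookHit {
    int shelf;   // index into Bookcase::shelves, or -1 if nothing was hit
    int book;    // index into that shelf, or -1 if nothing was hit
};

// Map a local intersection point (x, z) to the book whose spine contains it.
BookHit bookAtPoint(const Bookcase& bc, float x, float z) {
    const BookHit miss{-1, -1};
    assert(bc.shelfHeight > 0.0f);
    const int shelfCount = static_cast<int>(bc.shelves.size());
    const float top = bc.shelfHeight * static_cast<float>(shelfCount);
    if (x < 0.0f || z < 0.0f || z >= top) return miss;

    // Shelves are stored top first, so convert the row counted from the bottom.
    int rowFromBottom = static_cast<int>(z / bc.shelfHeight);
    if (rowFromBottom >= shelfCount) rowFromBottom = shelfCount - 1;  // rounding guard
    const int shelf = shelfCount - 1 - rowFromBottom;

    // Walk along the shelf, accumulating spine widths until x is covered.
    const std::vector<float>& widths = bc.shelves[static_cast<std::size_t>(shelf)];
    float left = 0.0f;
    for (std::size_t i = 0; i < widths.size(); ++i) {
        if (x < left + widths[i]) return BookHit{shelf, static_cast<int>(i)};
        left += widths[i];
    }
    return miss;   // the ray hit empty shelf space to the right of the last book
}
```

Because only the bookcase carries an intersection mask, a lookup of this kind lets individual books be selected without making every book a separately pickable object.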
x-coordinate y-coordinate z-coordinate rotation
The file is terminated by the value 1000. As each bookcase is inserted, its attributes are also updated to allow for intersection detection, which is needed for selection.
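A reader for this definition file format might look like the sketch below. It assumes the terminating value 1000 appears where the next x-coordinate would be, which the report does not state explicitly; Placement, readPlacements, readPlacementsFromString and placeableCount are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One line of the definition file: position and rotation of a bookcase.
struct Placement { float x, y, z, rotation; };

// Assumption: the terminator 1000 appears in place of the next x-coordinate.
constexpr float kTerminator = 1000.0f;

// Read "x y z rotation" records until the terminator or the end of the input.
std::vector<Placement> readPlacements(std::istream& in) {
    std::vector<Placement> placements;
    float first = 0.0f;
    while (in >> first) {
        if (first == kTerminator) break;
        Placement p{first, 0.0f, 0.0f, 0.0f};
        if (!(in >> p.y >> p.z >> p.rotation)) break;   // incomplete record: stop
        placements.push_back(p);
    }
    return placements;
}

// Convenience wrapper for in-memory text.
std::vector<Placement> readPlacementsFromString(const std::string& text) {
    std::istringstream in(text);
    return readPlacements(in);
}

// Bookcases are placed until either the bookcase list or the file runs out.
std::size_t placeableCount(std::size_t bookcaseCount,
                           const std::vector<Placement>& placements) {
    return std::min(bookcaseCount, placements.size());
}
```

One consequence of a sentinel value is that a bookcase can never be placed at x = 1000 under this reading; an explicit record count or end-of-file termination would avoid that restriction.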
CAVE, Performer and their Interfacing
The CAVE libraries have a well-defined interface. A set of function calls has to be made in a particular order to initialize the hardware and start the processing of the graphics. In general, the main program starts with a set of initialization calls and ends with a set of cleanup routines. Between these two is a central loop, which calls three routines (pfCAVEPreFrame, pfCAVEPostFrame and pfFrame). pfFrame is the routine that draws the current scene graph in the CAVE. Thus, within this loop, the program can manipulate the scene graph, changing its objects or their locations. The scene graph is always drawn at the center of the CAVE coordinate system, which lies on the floor at the center of the CAVE.
The scene graph is constructed each time the user enters a new room. The old scene graph is disposed of before constructing the new one. Since this occurs between frame refreshes, it has no impact on the visuals in the CAVE. Using a separate scene graph for each floor, and keeping only one in memory at a time, allows Performer to handle a minimum number of objects, namely only those that are currently visible. The root node of the scene graph is always a DCS (dynamic coordinate system). This is so that the room can be repositioned according to where the user is within it. Thus, the user does not move within the room; the room moves around the user. Of course, the ultimate effect is exactly the same. The objects of the current scene are added to the root DCS. The light used is an ambient light, and this is also added to the root DCS. The walls, floor, and foyer objects are added to the root DCS through intermediate DCSes that position the elements appropriately. The definitions for these objects are read in from a definition file ("room"), with the following format:
x-coordinate y-coordinate z-coordinate rotation x-scale y-scale z-scale object
(note that in this discussion, x refers to the first coordinate in Performer, y the second and z the third). Each line of the file contains coordinates and scaling for an object, which is then read in from the specified file. The object is placed into the scene preceded by an appropriate DCS. Again, termination is signaled by the value 1000 on the last line of the file.
CAVENavTranslate is called to translate the world so the user appears at an appropriate starting location instead of the center of the room. The teleporting portals are added by calling fitTeleports(). fitTeleports() reads a definition file (either teleports.foyer or teleports.floor). Each set of coordinates is read in and a portal is added to the scene at that position. The format of the file is:
x-coordinate y-coordinate z-coordinate rotation teleportnumber
The teleport number allows the program to distinguish between portals when the user walks into one. The portal numbers are:
No | Meaning
2 | Foyer
3 | Agriculture and Life Sciences
4 | Arts and Sciences
5 | Engineering
6 | Human Resources and Education
7 | Architecture and Urban Studies
8 | Pamplin College of Business
9 | Forestry and Wildlife Resources
10 | Veterinary Medicine
11 | Exit from program
The final scene graph for a floor of the library looks as follows (the dashed portion is only present in selection mode):
[Figure: scene graph for a floor of the library]
The scene graph of the foyer is similar, except that there are only portals and the walls and floors are a single object instead of a list of objects. Cardinality of containment is 1 unless explicitly indicated to be different.
Navigation is accomplished by checking the coordinates of the wand once every time the scene is redrawn in the main loop. The coordinates are returned by a standard CAVE function. If the wand is moved sideways, the amount of movement is translated into an amount of rotation. This is restricted so that moving the wand more than 90 degrees in either direction parallel to the front wall will not result in any more than the maximum amount of motion. To scale the linear coordinates returned so that the amount of rotation depends on how far the wand is turned, a logarithmic formula was used:
log (-log (w[0]-0.1))
Translation was tied to the tilting of the wand in the plane parallel to the side walls. Tilting the front of the wand towards the floor moves the user forward and tilting the wand away from the floor moves the user backwards. This also used a similar logarithmic formula so that the amount of tilt results in a varying degree of movement.
Both of these navigation elements have cut-off values so that very slight movements of the wand would be totally ignored.
Finally, the new coordinates are applied to the root DCS so that the effect of the navigation is applied to the world, as perceived by the user.
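The combination of a cut-off, a clamp, and a logarithmic response curve can be sketched as below. This is a stand-in, not the CaveETD formula: it substitutes a log1p curve for the log(-log(...)) expression given above, it assumes the wand direction component w lies in [-1, 1], and the constants kDeadZone and kMaxRate are invented for the example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical tuning constants; the report does not give the actual values.
constexpr double kDeadZone = 0.1;   // wand components smaller than this are ignored
constexpr double kMaxRate  = 1.0;   // rate returned at full deflection

// Convert one component of the wand direction, assumed to lie in [-1, 1],
// into a signed rotation or translation rate.
double wandToRate(double w) {
    assert(!std::isnan(w));
    // Clamp: deflection beyond the maximum adds no further motion.
    const double mag = std::fmin(std::fabs(w), 1.0);
    // Cut-off: very slight movements of the wand are ignored entirely.
    if (mag < kDeadZone) return 0.0;
    // Rescale [kDeadZone, 1] to [0, 1], then apply a logarithmic response curve.
    const double t = (mag - kDeadZone) / (1.0 - kDeadZone);
    const double rate = kMaxRate * std::log1p(t) / std::log1p(1.0);
    return std::copysign(rate, w);
}
```

The same function could serve both the sideways (rotation) and tilt (translation) components, with the resulting rates applied to the root DCS each frame.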
Every solid object in the system is surrounded by a pfBox which indicates the bounds of the object. This includes the walls, bookcases and any future objects that ought to implement collision detection. Any object of this type is derived from a class created for this express purpose, called pfBump. pfBump contains virtually no functionality except that it has a pfBox as an element.
Whenever Performer traverses the scene graph, it calls a callback function (checkbump) on each object derived from pfBump. The callback checks whether the user has stepped into the object by testing if the coordinates of the user’s head lie within the bounding box around the object (extended vertically to infinity). If so, a flag (bump) is set and the callback function returns. This flag is interpreted on the next iteration of the loop in the main program, where the change to the user’s position is reversed so that the effect of the movement is undone. Note that the user is prevented from moving too close to an object by using an offset from the actual bounds. This is because the position of the user’s head is not the same as the position of the user’s eyes, making it possible for the head to be inside the room while the eyes are outside it.
The advantage of reversing the user’s movements instead of just going to the old position is that it also caters for the possibility of the user physically walking into an object within the CAVE. In such situations, it was decided that it would be more normal for the world to move away from a user than for the user to walk into a solid object. Thus, movement is tracked at the level of the user’s head position in the world instead of the current transformation applied to the root DCS.
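The bump test and the reversal of movement can be sketched as follows. The sketch is an interpretation of the description, not the pfBump code: it collapses the callback and the flag into one function, and Footprint, Vec2, Navigator, advanceFrame and the margin value are hypothetical. The head position in the world is modeled as the navigation translation plus the tracked position inside the physical CAVE, so a bump caused by physically walking is undone by moving the world.

```cpp
#include <cassert>
#include <vector>

struct Vec2 { float x, y; };

// Footprint of a solid object on the floor plane. The vertical axis is ignored,
// which corresponds to a bounding box extended vertically to infinity.
struct Footprint { float xmin, xmax, ymin, ymax; };

// Hypothetical offset that keeps the head away from the actual bounds.
constexpr float kMargin = 0.5f;

bool bumps(const Footprint& f, const Vec2& head) {
    return head.x > f.xmin - kMargin && head.x < f.xmax + kMargin &&
           head.y > f.ymin - kMargin && head.y < f.ymax + kMargin;
}

struct Navigator {
    Vec2 nav{0.0f, 0.0f};        // accumulated navigation translation of the world
    Vec2 lastHead{0.0f, 0.0f};   // last head position in the world that did not bump
                                 // (assumes the starting position is valid)
};

// One frame: 'tracked' is the head position inside the physical CAVE and
// 'move' is the wand-driven movement. Returns the head position in the world.
Vec2 advanceFrame(Navigator& n, Vec2 tracked, Vec2 move,
                  const std::vector<Footprint>& solids) {
    assert(kMargin >= 0.0f);
    Vec2 nav{n.nav.x + move.x, n.nav.y + move.y};
    Vec2 head{nav.x + tracked.x, nav.y + tracked.y};
    bool bump = false;
    for (const Footprint& f : solids) {
        if (bumps(f, head)) bump = true;
    }
    if (bump) {
        // Shift the world so the head returns to its last valid world position,
        // whether the bump came from the wand or from physically walking.
        nav = Vec2{n.lastHead.x - tracked.x, n.lastHead.y - tracked.y};
        head = n.lastHead;
    }
    n.nav = nav;
    n.lastHead = head;
    return head;
}
```

Tracking the head position rather than the root transformation is what lets the third case in the usage below work: the user steps forward physically, and the world retreats by the same amount.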
Teleportation is the ability to move to a different room. To do this, the user walks into a portal object. Portals are derived from the pfBump class since their behavior is similar. This time, a different flag (teleport) is set in the callback function. The teleport flag is set to the number of the portal, as specified when it was added to the scene. Then, during the next iteration of the main loop, the teleport flag is checked and the scene is changed to the one specified by the flag.
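The deferred handling of the teleport flag can be sketched as below, using the portal numbers listed earlier. LibraryState, onPortalEntered, applyTeleport and the use of 0 for "no portal entered" are assumptions made for the example; in the real program the scene graph would be rebuilt where the sketch only records the new scene number.

```cpp
#include <cassert>
#include <string>

constexpr int kNoPortal = 0;    // assumption: 0 means no portal was entered
constexpr int kFoyer    = 2;
constexpr int kExit     = 11;

// Portal numbers and their meanings, as listed in the portal table.
std::string sceneForPortal(int portal) {
    switch (portal) {
        case 2:  return "Foyer";
        case 3:  return "Agriculture and Life Sciences";
        case 4:  return "Arts and Sciences";
        case 5:  return "Engineering";
        case 6:  return "Human Resources and Education";
        case 7:  return "Architecture and Urban Studies";
        case 8:  return "Pamplin College of Business";
        case 9:  return "Forestry and Wildlife Resources";
        case 10: return "Veterinary Medicine";
        case 11: return "Exit from program";
        default: return "";
    }
}

struct LibraryState {
    int teleportFlag = kNoPortal;
    int currentScene = kFoyer;    // the user starts in the foyer
    bool running = true;
};

// Called from the traversal callback: only records which portal was entered.
void onPortalEntered(LibraryState& s, int portalNumber) {
    assert(portalNumber >= kFoyer && portalNumber <= kExit);
    s.teleportFlag = portalNumber;
}

// Called at the start of the next main-loop iteration: acts on the flag.
void applyTeleport(LibraryState& s) {
    if (s.teleportFlag == kNoPortal) return;
    if (s.teleportFlag == kExit) {
        s.running = false;
    } else {
        s.currentScene = s.teleportFlag;   // the real program rebuilds the scene here
    }
    s.teleportFlag = kNoPortal;
}
```

Deferring the scene change to the main loop keeps the scene graph from being replaced while Performer is still traversing it.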
Bookcases, Booklists and Books
The crux of the bookcase population routines is in the putbooks module. First the library object is used to read in a list of the books (titles, authors, abstracts and sizes) from the specified data file into a booklist object. bookcaseFactory is an object that lays the books out on the shelves in the standard order from left to right and top to bottom, adding more shelves as needed. It is initialized with the information needed for the bookcase so that a single bookcase object can be "cloned" instead of copied. Then, the booklist is passed to it as a parameter and it creates the geometry for books, divides books into shelves, adds bookcases and essentially creates the entire Performer scene sub-graph associated with the books.
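The layout order described here (left to right, then top to bottom, then a new bookcase) can be illustrated with a short sketch. It is not the bookcaseFactory code: Slot and layoutBooks are hypothetical names, all shelves are assumed to have the same usable width, and a book wider than a shelf is placed alone on a shelf rather than skipped.

```cpp
#include <cassert>
#include <vector>

// Where the left edge of a book goes: which bookcase, which shelf (0 = top),
// and the horizontal offset along that shelf.
struct Slot { int bookcase; int shelf; float x; };

// Assign already-sorted books to slots: left to right along a shelf, then
// top to bottom, then on to a new bookcase.
std::vector<Slot> layoutBooks(const std::vector<float>& spineWidths,
                              float shelfWidth, int shelvesPerCase) {
    assert(shelfWidth > 0.0f && shelvesPerCase > 0);
    std::vector<Slot> slots;
    int bookcase = 0;
    int shelf = 0;
    float x = 0.0f;
    for (float w : spineWidths) {
        // Start a new shelf when the book does not fit, unless the shelf is
        // still empty (an over-wide book is placed alone rather than skipped).
        if (x > 0.0f && x + w > shelfWidth) {
            x = 0.0f;
            if (++shelf == shelvesPerCase) {
                shelf = 0;
                ++bookcase;   // add another bookcase as needed
            }
        }
        slots.push_back(Slot{bookcase, shelf, x});
        x += w;
    }
    return slots;
}
```

The number of bookcases needed is then one more than the bookcase index of the last slot, which matches the idea of adding shelves and bookcases on demand while walking the sorted booklist once.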
Although the machine running the CAVE has an impressive amount of processing power, image rendering and world management consume much of that power. From the beginning it was recognized that there was a need to optimize the way the library was built and displayed.
A major objective was to simplify the scenes. Two big sets of objects were candidates for optimization: the books and their titles. The bookcases viewed in the CAVE contain what was termed "fake" books. Instead of having separate book geometry for every book in the collection, what was actually done is illustrated below:
The first illustration indicates the regular scenario, with separate objects for each book.

In the current implementation, this was replaced with "book blocks" for a whole set of books on a shelf, thus eliminating the need for additional planes which were not visible but which would have added to the complexity of the scene for rendering purposes.

The second set of objects that needed to be optimized for rendering was the book titles. Since every letter in every book title is a separate geometric object to render, having 100 books easily creates more than 1800 objects that need to be rendered. Their impact on the scene performance was immediately noticeable. Since the resolution of the CAVE is limited, it was decided not to show titles that were farther than a certain distance from the user. Every book title was inserted within a Level of Detail (LOD) Performer object. LODs are able to hold multiple representations of an object and display one depending on the distance to the observer. The LOD was used as a threshold, showing the title only if the user is within a moderate distance of the books. This strategy worked surprisingly well: no user noticed that the titles were invisible because of an artifact (the LOD) rather than because of the resolution of the CAVE.
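The threshold behavior of the LOD can be sketched in a few lines. This shows only the distance test, not Performer's LOD node; Point3, distanceBetween, selectTitleLod and the range parameter are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Point3 { float x, y, z; };

// Euclidean distance between the viewer and a book.
float distanceBetween(const Point3& a, const Point3& b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Index of the representation to draw:
// 0 = spine with its title text, 1 = plain spine without any text geometry.
int selectTitleLod(const Point3& viewer, const Point3& book, float titleRange) {
    assert(titleRange >= 0.0f);
    return distanceBetween(viewer, book) <= titleRange ? 0 : 1;
}
```

With a threshold like this, the per-letter geometry is only submitted for the few bookcases near the user, which is where the titles are legible anyway.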
ETD data was extracted from an SQL dump of the ETD database.
Two commands were used:
1. grep "INSERT INTO etd_main" etd_available.dmp > data1
2. split.pl
data1 contains only the SQL records for each ETD submission.
split.pl parses the SQL dump file (data1), extracts the relevant records, and classifies them according to college. It produces a set of 8 text data files named etddata.1, etddata.2, etc. The format of each record in these files is as follows:
<name of etd>
<name of author>
<size of document>
<abstract>
Each field is on a separate line. There are no carriage returns or linefeeds within a field, and there are no record separators. The size field is generated randomly in the range 100-800 since there is no size information in the ETD database. Some of the data in these fields contain non-standard characters. CaveETD ignores these characters if their ASCII values are less than 32; otherwise they show up in various ways, depending on the font character mapping.
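Reading this four-line record format, including the filtering of characters below ASCII 32, might look like the sketch below. It is an illustration rather than the library object's actual code; EtdRecord, stripControl, readEtdRecords and readEtdRecordsFromString are hypothetical names, and a trailing incomplete record is assumed to be dropped.

```cpp
#include <cassert>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct EtdRecord {
    std::string title;
    std::string author;
    int size;
    std::string abstractText;
};

// Drop characters with values below 32, which CaveETD ignores.
std::string stripControl(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        if (c >= 32) out.push_back(static_cast<char>(c));
    }
    return out;
}

// Read records of four lines each: title, author, size, abstract.
// Assumption: a trailing incomplete record is silently dropped.
std::vector<EtdRecord> readEtdRecords(std::istream& in) {
    std::vector<EtdRecord> records;
    std::string title, author, size, abstractText;
    while (std::getline(in, title) && std::getline(in, author) &&
           std::getline(in, size) && std::getline(in, abstractText)) {
        records.push_back(EtdRecord{stripControl(title), stripControl(author),
                                    std::atoi(size.c_str()),
                                    stripControl(abstractText)});
    }
    return records;
}

// Convenience wrapper for in-memory text.
std::vector<EtdRecord> readEtdRecordsFromString(const std::string& text) {
    std::istringstream in(text);
    return readEtdRecords(in);
}
```

Because there are no record separators, a single missing line shifts every later record; a reader like this cannot detect that, which is one argument for validating the size field as it is read.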
Auxiliary Files and Data Formats
CaveETD uses some auxiliary files that must be in the same directory as the main program. These files are:
Name: case.def
Purpose: Defines physical measures of a bookcase.
Format: This ASCII file consists of a file path followed by 12 numeric values. In order, the entries are:
Path of the .pfb file where the bookcase geometry is stored,
width of the bookcase,
height of the bookcase,
depth of the bookcase,
spacing (height of a shelf without considering the width of the shelf itself),
number of shelves,
thickness of the bookcase wall
Relative boundaries of a bookcase, in the following order: xmin, xmax, ymin, ymax, zmin and zmax.
Name: etddata.1,…etddata.8
Purpose: Holds the data for the book collection of every college. The number specifies which college the file refers to, in the following order:
Agriculture | 1
Arts and Sciences | 2
Engineering | 3
Human Resources | 4
Architecture | 5
Business | 6
Forestry | 7
Veterinary | 8
Format: These files consist of text fields delimited by carriage returns. Each set of 4 lines represents a document. In each record, the information is held in this order:
Title
Authors
Size
Abstract
Name: case.pfb, cube.pfb, foyer.pfb, teleport.pfb, brickwall.pfb, room.pfb
Purpose: Pre-designed objects containing geometry and references to textures, which are used as building blocks throughout the program.
Format: See Performer Reference Guide for pfb formats.
Once the user has entered a room and begins to walk about it, he or she immediately sees the collection of documents organized as books on shelves. By walking up to a bookcase, the user can easily read the truncated book title located on each book’s spine. The user can continue to scan the spines of the books by walking down the aisle. When the user comes across a section of books he or she wishes to explore in further detail, he or she can gain access to more information about each book by switching to selection mode. Selection mode is activated when the user presses the third wand button once and it is confirmed when the user sees a virtual pointer extending from the wand.
While in selection mode, the user can manipulate the wand to point toward and "select" any book on the shelf. When the pointer intersects with a book on the shelf, that book is highlighted in red and its complete title and author are displayed. An additional feature enables the user to quickly view the abstract corresponding to the book being highlighted. While in selection mode and pointing at a book, the user can press wand button 2 to display a text call-out containing the document’s abstract.
The following figure illustrates the calling hierarchy for the HandleSelection() routine, which is contained in the files select.h and select.c++.
[Figure: calling hierarchy for the HandleSelection() routine]
Just before HandleSelection() is first called, the detection of a button 3 event first calls a function that initializes all objects and data used to display the call-outs, highlights, and abstracts. InitializeSelection() creates DCSes for the highlight, callout, and abstract "window", "hides" them by positioning the coordinate systems under the floor, and attaches them to the scene. A corresponding CleanUpSelection() function is called that removes any objects added to the highlight, call-out, and abstract DCSes and deletes the DCSes as well.
While selection mode is active, HandleSelection() is continually called by the main loop. HandleSelection() immediately checks for a button 2 event to determine if an abstract should be shown or hidden (This would require that a book was selected on the previous call to HandleSelection()). After the event check, the functions doselect() and dealWithPicking() are called to handle intersection detection. doselect() returns a pointer to the DCS that holds the selected book. The book and the scene are then passed to HighlightBook() and ShowCallout(). HighlightBook() adjusts the size of the highlight based on the width of the book and attaches the highlight to the room at the location of the book. Attaching the highlight to the room sets the highlight’s position to be relative to the book. ShowCallout() reads in the name and author of the book and attaches them as pfText objects directly to the scene. Attaching the call-out directly to the scene sets the call-out’s position to be relative to the viewer. When a button 2 event is detected, either ShowAbstract() or HideAbstract() is called, depending on the current state of the abstract window. ShowAbstract() also receives a book and the scene. The routine reads the abstract information from the book and attaches it to the abstract DCS. The abstract DCS is then brought into view of the user. Both ShowCallout() and ShowAbstract() rely on a small parsing routine to convert text from String objects to pfText objects.
Additional features could be added to this simple interaction style to enhance the CAVE-ETD experience. For example, the title and author call-out could be replaced with a menu of options (view abstract, view text, check out book) together with information such as an updated list of the books currently checked out by the user and the due dates of books checked out by others.
The initial objective was to create a visualization system for information while exploiting the unique immersive characteristics of the CAVE. After much debate, all agreed that the utility of such an immersive implementation would lie largely in the ability to present and customize information to the user in ways that are not possible through conventional 2-D interfaces. To this end, the project was divided into stages, the most important separation being that between the basic features and the customization. At the end of the project’s initial life cycle, only the basic features had been included. There were many reasons for this, mostly related to the complexity of the platform on which the project was implemented and to the non-negotiable design decisions that increased complexity. Each of these is considered in the following discussion.
At the very outset, it was decided to use the IRIS Performer library instead of the OpenGL library to create and manipulate the graphics environment. This decision was taken solely on the basis of the expected and implied ease of use associated with Performer, because of its scene graph layer of abstraction over the graphics primitives. However, it rapidly became apparent that this was not sufficient to create complex 3-D graphics in the CAVE. While Performer provided the ability to interpret a scene graph and render the primitives, most other tasks involving graphics had to be coded in a manner almost identical to OpenGL usage. For example, the creation of a single word of text in the CAVE requires manipulation of multiple Performer objects. In addition, the text needs a font definition, which can either be hand-coded (vertex by vertex) or loaded from a font definition file. Only two sample fonts were distributed with the system, the assumption being that high-end users would either define their own fonts manually or subscribe to developer services that distributed such fonts. Many problems with the Performer library had to be resolved through trial and error simply because the documentation was insufficient to provide insight into some of the problems experienced.
As an example of something that Performer should be able to do but does not, consider a cube that the viewer should not be able to pass through. One would expect to execute a command such as:
cube.setSolid(true);
However, in Performer, the programmer has to attach a box around the cube, tell Performer to detect the intersection with the box, and add a callback function to be called when the intersection occurs. Further, it is the programmer’s responsibility to figure out how to avoid the intersection.
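The three steps just described (a bounding box, an intersection test, and a callback) are what a `setSolid`-style convenience call would have to hide. The following is a minimal sketch in plain C++ with no Performer calls; `Vec3`, `Box`, `SolidObject`, and `resolveMove` are hypothetical names, and "avoiding the intersection" is reduced to rejecting the move, which is only one of several possible policies.

```cpp
#include <functional>
#include <utility>

struct Vec3 { float x, y, z; };

// Axis-aligned bounding box: the "box around the cube".
struct Box {
    Vec3 min, max;
    bool contains(const Vec3& p) const {
        return p.x >= min.x && p.x <= max.x &&
               p.y >= min.y && p.y <= max.y &&
               p.z >= min.z && p.z <= max.z;
    }
};

// Hypothetical high-level wrapper offering the setSolid() call wished for above.
class SolidObject {
public:
    explicit SolidObject(Box bounds) : bounds_(bounds) {}

    void setSolid(bool solid) { solid_ = solid; }

    // Optional notification, playing the role of the intersection callback.
    void onCollision(std::function<void(const Vec3&)> cb) { callback_ = std::move(cb); }

    // Returns the position the viewer may occupy after trying to move from 'from' to 'to'.
    Vec3 resolveMove(const Vec3& from, const Vec3& to) const {
        if (solid_ && bounds_.contains(to)) {
            if (callback_) callback_(to);
            return from;  // simplest avoidance policy: reject the move
        }
        return to;
    }

private:
    Box bounds_;
    bool solid_ = false;
    std::function<void(const Vec3&)> callback_;
};
```

A caller would construct the object once, call `setSolid(true)`, and route every proposed viewer position through `resolveMove`.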
As another example, consider building a geometric object. A programmer would expect to create the object, attach points or faces to it, and specify their colors. To accomplish this in Performer, a GeoSet (geometry) must be created and placed inside a Geode (scene node), and its many complex attributes must still be set by hand, despite being fairly common.
The CAVE libraries were fairly stable, working as expected and providing all the necessary functionality without the need for much, if any, low-level code.
A major source of confusion was the incompatibility between coordinate systems. This manifested itself in sign differences as well as range and positional differences between the CAVE navigation, CAVE wand, Performer and CosmoWorlds. CosmoWorlds was used to create some of the basic objects loaded into the scene. It was found experimentally that 1 metre in CosmoWorlds on some computers corresponded to 1 foot in the CAVE after conversions. Under different circumstances, it was discovered that the ratio was 10 metres to 1 foot. After conversion to Performer files, the objects acquired a fixed size, but the ratio had still to be determined experimentally before each session.
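Since the ratio between modelling units and CAVE feet had to be found experimentally, one way to contain the problem is to measure a single reference object at the start of a session and derive the scale from it. The sketch below assumes exactly that procedure; `UnitCalibration`, `calibrate`, and `toCaveFeet` are hypothetical names, not part of the CAVE library or Performer.

```cpp
#include <cassert>

// Scale factor from modelling-tool units to CAVE feet, determined once per session.
struct UnitCalibration {
    float feetPerModelUnit;
};

// modelledLength: length of a reference object in the modelling tool's units.
// measuredFeet:   length of the same object as observed in the CAVE.
UnitCalibration calibrate(float modelledLength, float measuredFeet) {
    assert(modelledLength > 0.0f && measuredFeet > 0.0f);
    return UnitCalibration{measuredFeet / modelledLength};
}

// Converts any other modelled length using the session's calibration.
float toCaveFeet(const UnitCalibration& c, float modelLength) {
    return c.feetPerModelUnit * modelLength;
}
```

With the two ratios reported above, `calibrate(1.0f, 1.0f)` and `calibrate(10.0f, 1.0f)` would give factors of 1 and 0.1 feet per modelling unit respectively.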
The SGI hardware was adequate as a programming platform. The major hardware problem was the limited access to the CAVE itself, which resulted in almost all of the development being done using a simulator. The few practice runs conducted in the CAVE confirmed that the look and feel of the interface there was different from that in the simulator. The textures of certain parts of the scene were not rendered properly on certain walls of the CAVE and, as far as could be identified, this seemed to be a problem with the graphics hardware and not with the software. The actual wand also differed from the simulated one in that its positioning was not as accurate, even yielding the wrong sign at particular positions in the CAVE.
Another issue that could not be solved was the inability of the CAVE libraries to display their output remotely on an X server running on Windows 95, even with all the additional OpenGL libraries included. This was apparently due to the attempted use of a non-standard remote procedure call, and it effectively restricted the sites at which development could be done.
IRIS Showcase, Oxygen and CosmoWorlds were all tested. The first two produced native IV files (Inventor, which Performer could understand). However, they were inadequate because the former had no means of building precisely measured objects and the latter was more of a scene graph editor than a visual tool. The third tool was chosen because it could be used to design precisely positioned and scaled objects. In the process of using CosmoWorlds, the issue of coordinates, as discussed above, had to be resolved first. In addition, the only output format of CosmoWorlds was VRML2, and at this point it was discovered that Performer could not load multiple VRML2 files simultaneously. Like all other problems encountered in the use of Performer, this was not documented, and a conversion program had to be found to convert from VRML2 to a native Performer format.
One of the basic parts of any VR program is its navigation component. Most of the existing CAVE programs make use of a little joystick embedded within the CAVE wand to achieve navigation. This seemed particularly difficult in reality because of the physical construction of the joystick. To avoid the usual clumsy interface, gesturing movements of the wand were used to indicate movement forward, backwards or rotation in either direction. This makes it somewhat easier to move around. A direct implication of this feature is that the user had to switch between navigation and selection modes in order to prevent moving the user in the world when selecting books out of immediate reach. Even though 2 buttons were eventually used, the principle of keeping the interface simple was met adequately.
In trial runs in the CAVE, it was noticed that new users may have initial difficulty adjusting to the interface. Although the interface corresponds most closely with that used in most 3-D games, the requirement that the wand be in a horizontal forward-facing position to remain stationary is not natural. This would be difficult to resolve without having to resort to use of the joystick. It was also pointed out that the ability to side-step would be very useful to browse through books on shelves.
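The gesture scheme can be illustrated as a mapping from wand orientation to viewer motion, with a dead zone around the horizontal, forward-facing rest pose. Everything specific in this sketch is an assumption made for illustration: that pitch drives forward and backward motion, that yaw drives rotation, and the dead zone and gain values. The names `WandPose`, `Motion`, and `motionFromWand` are hypothetical.

```cpp
#include <cmath>

// Wand orientation in degrees; (0, 0) is the horizontal, forward-facing rest pose.
struct WandPose { float pitchDeg; float yawDeg; };

// Resulting viewer motion: positive forwardSpeed moves forward, negative backward.
struct Motion { float forwardSpeed; float turnRate; };

// Returns 0 inside the dead zone, otherwise a value proportional to the excess angle.
float applyDeadZone(float angleDeg, float deadZoneDeg, float gain) {
    float magnitude = std::fabs(angleDeg);
    if (magnitude <= deadZoneDeg) return 0.0f;
    float excess = magnitude - deadZoneDeg;
    return (angleDeg > 0.0f ? excess : -excess) * gain;
}

Motion motionFromWand(const WandPose& wand) {
    const float deadZoneDeg = 10.0f;  // assumed tolerance around the rest pose
    const float speedPerDeg = 0.05f;  // assumed gain for forward/backward motion
    const float turnPerDeg = 1.0f;    // assumed gain for rotation
    return Motion{applyDeadZone(wand.pitchDeg, deadZoneDeg, speedPerDeg),
                  applyDeadZone(wand.yawDeg, deadZoneDeg, turnPerDeg)};
}
```

A wider dead zone would make it easier for new users to stay stationary, at the cost of requiring larger gestures to start moving.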
Many CAVE visualization applications do not sufficiently stress the importance of the VR environment being realistic. Since this was one of the design goals, it implied that a user ought not to be able to walk into objects. This was implemented so that even if a user physically walked into an object within the CAVE, the world would readjust to prevent such abnormalities. The philosophy behind this was that it would be more realistic to have the room shift slightly than for the user to walk into a solid object.
To minimize the amount of complexity of the interface and allow the user to select books using the same wand movements as navigation, there are two modes of operation: navigation and selection. To keep things simple, these were implemented as a toggle switch attached to the third button on the wand. A similar approach was taken to the displaying of the abstracts. It has been noted that the placement of the title text above the user’s head can be distracting and it should be placed nearer the actual books. Also, the abstracts needed resizing. Both of these features can easily be corrected with access to the CAVE hardware.
It was decided to lay out the books for a single college on shelves in a grid format on the floor, to make the library look as close as possible to a regular library. A feasible alternative would have been to divide the books into smaller units (e.g. departments), which would have made navigation faster. Another option would be to allow the user to translocate directly to the point of interest in the library using a navigation control device. Another issue not addressed was the labelling of shelves, which would definitely be needed for large collections but was not critical for our application.
In order to reduce the complexity of the scene as far as possible, the books were not modelled as single objects – instead the entire row of books was one object. Thus, there were significant savings in terms of book ends not having to be part of the scene. The top of the books was also a single plane. To avoid relocation costs in the scene graph, the books were offset directly from the DCS of the bookcase.
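When a whole row is a single object, the position of each book reduces to a running sum of widths measured from the start of the row. The sketch below shows that arithmetic, plus one possible way of mapping a hit position along the row back to a book index. It is an illustration under assumptions, not the project's code: `bookOffsets` and `bookAt` are hypothetical helpers, and positions are one-dimensional distances along the shelf.

```cpp
#include <cstddef>
#include <vector>

// For each book, the offset of its left edge from the start of the shelf row.
std::vector<float> bookOffsets(const std::vector<float>& widths) {
    std::vector<float> offsets;
    offsets.reserve(widths.size());
    float cursor = 0.0f;
    for (float w : widths) {
        offsets.push_back(cursor);
        cursor += w;
    }
    return offsets;
}

// Maps a hit position along the row to a book index; returns -1 if outside the row.
int bookAt(const std::vector<float>& widths, float hit) {
    if (hit < 0.0f) return -1;
    float cursor = 0.0f;
    for (std::size_t i = 0; i < widths.size(); ++i) {
        cursor += widths[i];
        if (hit < cursor) return static_cast<int>(i);
    }
    return -1;
}
```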
While user preferences were considered vital to justify an immersive implementation, there was not enough time to implement them. They can be incorporated by having a room for preferences lead off the foyer, thus not impacting the core program too much.
The private room was not implemented because selection and states of books were not implemented. In order to have a private room of books, it is first necessary to have the ability to remove a book (or get a copy) from the shelf, and to mark each book with a reference counter.
User profiles were not implemented, but they could be used to customize the layout of the library for individual users.
The books were placed horizontally on the shelves just as they appear in a regular library. However, midway through the project it was realized that even this was not necessary, since the virtual bookshelves did not inherit the problems associated with removing and re-shelving books in a vertical stack. A vertical stack would, however, allow easier reading of the titles. Such a bookcase was designed, but because of the vast amount of code changes required, it was not included.
ETD data was initially obtained in the form of a database dump from the library. Originally it would have been preferable to have live access to the data over a network, but since speed was an issue in the CAVE, the data was pre-processed and stored on disk. The data was organized into a regular text file and optimized by removing parts that would never be needed.
During preprocessing it was discovered that people used many different forms of the names for their departments. Some people used names that were not correct and others used names that have subsequently been changed. Thus, during processing of the data, it was necessary to manually derive a classification of departments into colleges.
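A manually derived classification of this kind is essentially a lookup table from name variants to colleges, applied after light normalization of the raw text. The sketch below shows one way that step could look; the function names, the `UNCLASSIFIED` fallback, and any table entries are hypothetical, and the actual table used in the project is not reproduced here.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>

// Lowercases, trims, and collapses runs of whitespace so that trivially
// different spellings of a department name compare equal.
std::string normalize(const std::string& raw) {
    std::string out;
    bool pendingSpace = false;
    for (unsigned char ch : raw) {
        if (std::isspace(ch)) {
            pendingSpace = !out.empty();  // ignore leading whitespace
            continue;
        }
        if (pendingSpace) {
            out.push_back(' ');
            pendingSpace = false;
        }
        out.push_back(static_cast<char>(std::tolower(ch)));
    }
    return out;
}

// Looks up the college for a raw department string. The table keys are
// normalized name variants, compiled by hand; unknown names are flagged
// so that they can be reviewed and added to the table.
std::string collegeFor(const std::string& rawDepartment,
                       const std::unordered_map<std::string, std::string>& table) {
    auto it = table.find(normalize(rawDepartment));
    return it == table.end() ? std::string("UNCLASSIFIED") : it->second;
}
```

Normalization only removes superficial differences; incorrect or outdated department names still need an explicit entry in the table.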
The project was successful in creating a prototype model of a digital library in the CAVE immersive virtual reality environment. It has shown that it is possible to exploit the third dimension to help visualize information. Many people argue that it is not necessary to use VR techniques for digital libraries because search engines and regular information browsers work just as well (if not better). However, on a philosophical level, the only reason people use the current types of information retrieval tools is that there exist no alternatives. If there were alternatives that were closer to reality, they would probably be preferable, since the learning curve would be less steep for non-technophiles. This project has shown that it is possible to model information browsing in a 3-D environment. However, much more work needs to be done to create better interfaces for information retrieval in such environments.
The major drawback was the inadequacy of the tools. Any future work would depend on the state of development tools. As far as hardware is concerned, it was discovered that the CAVE hardware is still not fast enough for the visualization of information because of the complexity involved. Even while using lots of optimization techniques, the CAVE slowed down substantially because of the level of complexity of the book objects, especially because of the text titles on the spines. For realistic visualizations, the hardware will have to be faster or the data will have to be split into smaller collections.
Creation of Performer high-level interface
Since the single most inhibiting factor in the development of this project was the graphics library used, any further development would benefit most from a better library. It would seem advisable either to build a library from scratch using OpenGL as its underlying layer, or to build a simpler layer on top of Performer. Features of this library should include the ability to model objects as complete entities with collision detection (prevention), and the ability for objects to be selected, manipulated (resized, moved, etc.), created and destroyed without the programmer having to be concerned with memory management. Also, the coordinate system of a high-level layer could perform all the necessary conversions to synchronize with the CAVE without any explicit conversions.
At the very outset, it was determined that the utility of this project would be greatly enhanced if the user could do things not possible in a traditional library. This would include the ability to sort book collections or subsets, or even to reorganize the entire library. Thus, a user using the Library of Congress classification could order the whole library according to this scheme, while a user who is accustomed to the Dewey Decimal System would likewise have the same ability, without affecting the base collection of data stored in the library. This would also require the storing of user profiles for each user. Other obvious customizations could include the ability to change the number, height and/or orientation of the shelves to suit different people, and changing the surrounding scenery of the library so that it would be more pleasant for the user. The size of text could be scalable for people with differing eyesight. Even the wand interface could be replaced by any of the other interfaces, as preferred by the particular user.
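Reordering the library per user without touching the base collection amounts to sorting a list of indices rather than the records themselves. The following sketch shows that idea; `BookRecord`, its fields, and `orderedView` are hypothetical, and plain string comparison is used as a simplification (real call numbers need scheme-specific comparison rules).

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical record; the field names are illustrative only.
struct BookRecord {
    std::string title;
    std::string locCallNumber;  // Library of Congress classification
    std::string deweyNumber;    // Dewey Decimal classification
};

// Returns the order in which to shelve the books for one user, as indices into
// the shared base collection. The base collection itself is never modified.
template <typename KeyFn>
std::vector<std::size_t> orderedView(const std::vector<BookRecord>& base, KeyFn key) {
    std::vector<std::size_t> order(base.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) {
                         return key(base[a]) < key(base[b]);
                     });
    return order;
}
```

Each user profile would then only need to store which key function (and possibly which subset) to apply, while all users share one copy of the data.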
The CAVE has many emerging interfaces besides the wand. The glove is a good candidate for inclusion in digital library simulation because the user can then reach out and "touch" books in the library. Another possibility is the future inclusion of force-feedback devices. These would allow the user to "feel" the books or know when they have walked back into a wall.
In order to make this system feasible for constant usage (e.g. within a library), the access to information has to be live. Instead of downloading the data and pre-processing it, the system should interact with a live database over a network. Thus, as soon as a book is inserted into the master collection, it would appear in the library. Of course, the library should not change while being viewed by the user, but such changes can take effect upon the next visit to the relevant subsection.
Traditional information retrieval systems include a front-end that usually consists of a search engine. This may be very useful in this context as well. Thus the user should be able to search for data before browsing in the CAVE. Incorporating searching into the CAVE environment would require either a keyboard or a voice recognition interface to be provided within the CAVE.
As previously mentioned, vertical bookcases would be very useful, as the user would not have to strain his or her neck to read the titles. Also, it may be possible to make the titles longer if the books are not constrained to the proportions of real books; they may be much longer. While this is somewhat different from traditional libraries, it is not a distortion of reality. Such improvements should be exploited for their ability to benefit our activities without incurring the real-world problems.
This appendix is added to help future developers understand the coordinate transformations inside the CAVE-ETD code. Performer, the CAVE library, and CosmoWorlds give different meanings to the vector components (the number in parentheses indicates the index of a component in the vector):
| API or Application | Coordinate System Ordering and Orientation |
| Performer and Wand | |
| CAVE Library | |
| CosmoWorlds | |
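As an illustration of the kind of conversion involved, the sketch below converts between two conventions that are stated here as assumptions rather than taken from the table: a Performer-style system with X to the right, Y forward (into the scene) and Z up, and a CAVE-library-style system with X to the right, Y up and Z pointing back toward the viewer. The function names and the `Vec3` type are hypothetical, and unit scaling is deliberately left out.

```cpp
// Component order is (x(0), y(1), z(2)) in both systems.
struct Vec3 { float x, y, z; };

// Assumed conventions (to be checked against the actual libraries in use):
//   CAVE-style:      X right, Y up,      Z toward the viewer
//   Performer-style: X right, Y forward, Z up
Vec3 caveToPerformer(const Vec3& c) {
    // "Up" moves from index 1 to index 2; "forward" is the negated CAVE Z.
    return Vec3{c.x, -c.z, c.y};
}

Vec3 performerToCave(const Vec3& p) {
    // Inverse of the mapping above.
    return Vec3{p.x, p.z, -p.y};
}
```

Keeping both directions in one place makes sign errors easier to spot than when the swaps are scattered through navigation and selection code.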