31 May, 2005 at 18:10

Visualization of Abstract Information

©Copyright Rolf Daessler, 1995. This document may be reproduced in whole or in part provided that this copyright notice is reproduced on each copy made.
[seems like what we are doing ourselves – the EAM network – falls under this banner]


Abstract. This article discusses recent concepts for the visualization of abstract information contained in document databases. The increasing information glut and the advances in data access on digital networks have created a demand for new concepts to retrieve information. Information visualization, based on the advances in computer graphics, seems to be the key concept for the design of future user interfaces. New prototypes of information visualizers open new dimensions of user interaction such as navigation in a virtual information space and structural browsing. Three different data-link models are discussed: information network, information tree and hypertext. Prototypes of existing information visualizers such as the Information Visualizer© and FSN-File System Navigator© are examined. Hierarchical data structures allow for a global structural browsing of large collections of multivariate data by exploration of virtual information landscapes. Hypertext browsers, on the other hand, are efficient for retrieving specific information in local information domains and have become the standard browsers on the World Wide Web. A combination of both might be a powerful concept for the next generation of user interfaces in information retrieval systems.

The tutorial covers the following topics:

  1. Introduction
  2. Information Retrieval versus Information Visualization
  3. Data Models
  4. Prototypes of Information Visualizers
  5. Summary
  6. References

Introduction


Never before have we been confronted with such an information glut. In the information highway environment we have access to an exploding amount of data and information. In the near future nearly all information contained in databases, digital libraries and other massive data collections will be available on the Internet. The question of how to benefit from these new technologies leads to the question of how to find specific information. One of the most powerful concepts is to use the visualization paradigm for navigation, retrieval and access of information (1). Current advances in computer graphics hardware and software have created the basis for a new generation of user interfaces as well as for information retrieval systems. The evolution of the Human-Computer-Interface (HCI) is strongly coupled with advances in computer technology (Figure 1).

Figure 1 Evolution of User Interfaces.

In a text-oriented environment the user communicates by a command language and, in retrieval systems, by a query language. Most of the recent on-line databases and digital libraries still work in this environment. Graphical User Interfaces (GUIs), on the other hand, allow for advanced user interactions. Window, menu, point-and-click interfaces are available on nearly all platforms. This environment allows for hypertext and multimedia applications. The virtual space concept opens additional options such as the visualization of large data structures, navigation in data space and interaction with objects. Most of the recent progress in visualization techniques has occurred in the field of scientific visualization. However, the concepts of physical data visualization differ from those of information visualization. Representing abstract data requires the creation of a Virtual Reality (VR). A virtual space with a real-world analog, which best supports our cognitive communication abilities, will probably find the widest acceptance. One of the key problems is the definition of an abstract data model as the basis for the information space. The creation of an information space can be more difficult than a scientific visualization because the information space can be multidimensional and the kind of data representation depends much more on the specific nature of the information. To understand the information space it is important to see the relationships between objects in a semantic context. One of the most sophisticated concepts is undoubtedly the application of the Virtual Reality paradigm to create and explore the information space.


Information Retrieval versus Information Visualization


Information retrieval is concerned with the representation, storage, organization and accessing of information items (2). The advances in high-speed storage devices and networking have led to an imbalance in the information sciences. Because information produces more information, we have an increasing amount of data and information to which accurate and speedy access is becoming more and more difficult. Today there is a need for information management that can match information storage capabilities. The basic retrieval process is quite simple. Information is stored and later retrieved when it is needed. On-line retrieval systems typically consist of a large document database. Terms that describe the document contents (the index) are selected by manual or automatic indexing. The index terms are descriptors of the represented document. Queries are requests to process information, and a search query consists of different terms combined in a structured query language. The traditional information retrieval paradigm is a matching process based on the similarity between the keyword index entries and the search query. The problem is to find all and only the relevant documents. To evaluate the retrieval results two measures are used: recall, the percentage of all relevant documents that are found, and precision, the percentage of the documents found that are relevant. From the evaluation of the retrieval results one can formulate a new search query. The traditional model of information retrieval shown in Figure 2, however, has some strong limitations.

Figure 2 Information Retrieval System.

The retrieval process might be very inefficient if the user vocabulary does not match the index vocabulary and the user cannot explore parts of the document space by a context-oriented search path. An information visualizer used as the user interface could help to overcome these problems. As shown in Figure 3, the abstract data model represented by the index is visualized as an information space.

Figure 3 Information Visualizer.

The user now has different ways to communicate. Navigation inside the information space helps the user follow a context-oriented search path to find certain domains of interest. On the other hand, a query leads to a certain position in the information space, from which a context-oriented exploration of related information is possible. The information visualization paradigm will improve both the recall and the precision of the retrieval process as a whole.
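The recall and precision measures defined in this section can be sketched in a few lines of Python. This is only an illustration: the function name and the sample document identifiers are hypothetical, and the values are returned as fractions rather than percentages.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for one query, both as fractions in [0, 1]."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually found
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: four relevant documents, three retrieved, two in common.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d7"}
recall, precision = recall_and_precision(retrieved, relevant)
```

In this example half of the relevant documents are found, while two of the three retrieved documents are relevant, which shows that the two measures can diverge for the same query.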


Data Models


To visualize information it is necessary to define an information space. Information spaces are abstract and differ from physical data spaces, which typically have a spatial mapping. The information space of a document database is usually determined by the topics of the documents included. Each topic is represented by a subset of certain keywords (descriptors). Links between documents can be defined if different documents are described by the same keywords. All possible links create a network of document-term relations in the document space (Figure 4).

Figure 4 Document-term network. Topics (blue nodes) and characteristic terms (red nodes) are connected by common terms (green links).
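The link rule described above (two documents are linked when they share keywords) can be sketched as follows. The index contents and the function name are hypothetical and do not reproduce the network of Figure 4.

```python
from itertools import combinations


def document_links(index):
    """Map each pair of documents to the set of index terms they share."""
    links = {}
    for a, b in combinations(sorted(index), 2):
        shared = index[a] & index[b]
        if shared:  # a link exists only if at least one keyword is common
            links[(a, b)] = shared
    return links


# Hypothetical index: document -> set of descriptors.
index = {
    "A": {"visualization", "retrieval"},
    "B": {"retrieval", "hypertext"},
    "C": {"hypertext", "navigation"},
    "D": {"visualization", "navigation"},
}
links = document_links(index)
```

Because every pair of documents is compared, the number of candidate links grows quadratically with the number of documents, which is one reason such a network quickly becomes hard to display.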

The type of network in Figure 4 describes the importance of a term with respect to a document. This is only one aspect of possible semantic relationships between the topics. However, the visualization of this network becomes very complex with an increasing number of documents. The structure and contents of such a network would be very difficult to understand and explore. In this case the visualization would not help to simplify the data model. A better approach is the visualization of trees with a hierarchical data structure. Because tree models are well understood, nearly all of the recent information visualizers use this data model. The document-term network is projected onto a topological tree structure, which can be visualized in 3-D space. Figure 5 shows a transformation of the network of Figure 4 into a tree with document A as the starting point.

Figure 5 Document-term tree starting from topic A. Because of the specific tree structure some links will not be visible (hidden links).
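One way to illustrate such a projection is a breadth-first traversal from the starting document: links used by the traversal become tree edges, and all remaining links become hidden links. This is a minimal sketch under that assumption; the article does not state which projection method was used, and the small network below is hypothetical.

```python
from collections import deque


def network_to_tree(neighbors, root):
    """Breadth-first projection of a network onto a tree rooted at `root`.

    Returns (parent, hidden): `parent` maps each reached node to its tree
    parent, `hidden` lists the network links that the tree does not show.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for other in sorted(neighbors[node]):
            if other not in parent:
                parent[other] = node
                queue.append(other)
    hidden = sorted(
        (a, b)
        for a in neighbors
        for b in neighbors[a]
        if a < b and parent.get(a) != b and parent.get(b) != a
    )
    return parent, hidden


# Hypothetical undirected network given as adjacency sets.
neighbors = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C"},
}
parent, hidden = network_to_tree(neighbors, "A")
```

Choosing a different starting document changes which links end up hidden, so the same network can yield several different trees.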

Depending on the starting point, it is sometimes necessary to duplicate nodes or to create hidden links to reproduce all the relations of the network. Hence one problem of tree structures might be the amount of redundant information or, alternatively, a loss of connectivity. Another problem is the automated generation of tree structures, which in general requires complex classification procedures. There are textual, statistical and linguistic approaches. A common method is the cluster analysis (3) of the document space, where each document is represented by a vector of characteristic terms (Figure 6).

Figure 6 Information space represented by a multidimensional vector space spanned by the set of term vectors.
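To make the vector representation concrete, the sketch below compares documents by the cosine of the angle between their term vectors and groups them with a simple single-pass (leader) clustering. Both choices are assumptions made for illustration, not the specific method of reference (3), and the term weights and threshold are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length term vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def single_pass_clusters(vectors, threshold):
    """Assign each document to the first cluster whose seed it resembles."""
    clusters = []  # list of (seed_vector, [document names])
    for name in sorted(vectors):
        for seed, members in clusters:
            if cosine(vectors[name], seed) >= threshold:
                members.append(name)
                break
        else:  # no existing cluster is similar enough: start a new one
            clusters.append((vectors[name], [name]))
    return [members for _, members in clusters]


# Hypothetical term order: visualization, retrieval, hypertext.
vectors = {
    "A": [2, 1, 0],
    "B": [1, 1, 0],
    "C": [0, 0, 3],
    "D": [0, 1, 2],
}
clusters = single_pass_clusters(vectors, threshold=0.7)
```

Applying the same grouping step again to the resulting clusters would yield the levels of a hierarchy, which is the kind of tree structure the visualizers discussed here rely on.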

Tree structures give the best opportunity for a structural search following a context-oriented search path. The classification of the documents and the hierarchical structure make it easy to browse the information space and finally to find information domains of special interest. A different approach is the hyperlink model. Figure 7 shows a transformation of the network of Figure 4 into a hyperlink net that starts with document A.

Figure 7 Document-term hyperlink net starting from topic A. Alternative links make it difficult to find all related information.

This model allows for a fast retrieval of a subset of related documents. Hypertext is the preferred concept for WWW browsers on the Internet. However, it might be difficult to find complete information because alternative links can be missed. The lack of any classification structure turns the search for specific information into a walk through a labyrinth. On the other hand, there is a high probability of redundant links. The visualization of the link structure in this particular case involves the same problems as the visualization of the document-term network in Figure 4. Hypermedia structures are in principle unable to provide an overall understanding of an information space.


Prototypes of Information Visualizers


In science and engineering, visualization is a common way, and in many cases the only way, to understand large data sets. Although much work has been done using 3-D graphics to visualize physical entities, only a few retrieval prototypes take advantage of 3-D visualization or Virtual Reality techniques to visualize abstract data. However, information visualizers are able to display structural relationships in a semantic context, which is rather difficult to obtain in the traditional retrieval process (Figure 2).

Perspective Wall and Cone Tree (XEROX PARC, XSoft)

Perspective Wall and Cone Tree are metaphors for the 3-D visualization of linear and hierarchical abstract data respectively. Both visualization techniques use interactive animation to explore dynamically changing views of information structures (4). They are included in the commercial software product Visual Recall© (XSoft, 1994), which supports the management of large file collections. The main problem in the visualization of linear information structures is accommodating the extreme information aspect ratio on the computer monitor. A common technique for this particular problem is the integration of a detailed view and a scale-reduced contextual view. Figure 8 shows the Perspective Wall representing different types (authors) of document files and a file chronology.

Figure 8 The Perspective Wall in Visual Recall© represents vertically different types (author) of document files and horizontally the file chronology according to the creation or last modification date.

The perspective wall has additional features for user interaction. The wall moves a selected item into the center panel with a smooth animation. The user can adjust the ratio of detail and context, and a document browser allows for the inspection of each item while keeping the context view of the selected item.

Hierarchical data models represent appropriate structures for visualization and navigation. Cone trees are hierarchies laid out uniformly in three dimensions to minimize the size of the visualized structure and to enable a view of the whole data structure. Figure 9 shows a cone tree visualization of the file hierarchy containing the same files as presented in the perspective wall (Figure 8).

Figure 9 Cone tree visualization of a directory hierarchy in Visual Recall©

The cone tree visualization shows the classification of the documents according to the structure of the filesystem. A typical search will find all items related to a selected item. The user can rotate the tree to bring a particular item to the front. As with the perspective wall, a document browser allows for the inspection of selected items. Cone trees are a common technique to visualize hierarchical structures of abstract information in 3-D (5).
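The geometric idea of a cone tree can be sketched as follows: each node sits at the apex of a cone and its children are spaced evenly around the circular base one level below. This is a simplified illustration, not the layout algorithm of Visual Recall© or of reference (4); the fixed level height and the halving of the radius per level are assumptions.

```python
import math


def cone_tree_layout(tree, root, radius=1.0, level_height=1.0):
    """Return {node: (x, y, z)} with children on a circle below each parent."""
    positions = {root: (0.0, 0.0, 0.0)}
    stack = [(root, radius)]
    while stack:
        node, r = stack.pop()
        children = tree.get(node, [])
        px, py, pz = positions[node]
        for i, child in enumerate(children):
            angle = 2 * math.pi * i / len(children)  # even angular spacing
            positions[child] = (
                px + r * math.cos(angle),
                py - level_height,  # one level further down
                pz + r * math.sin(angle),
            )
            stack.append((child, r / 2))  # assumed: narrower cones further down
    return positions


# Hypothetical directory hierarchy: node -> list of children.
tree = {"root": ["docs", "src", "tmp"], "docs": ["a.txt", "b.txt"]}
positions = cone_tree_layout(tree, "root")
```

Rotating the tree to bring an item to the front then amounts to adding a common offset to every `angle` of the affected cone.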

3-D Information Landscape© (Silicon Graphics)

In 1992 Silicon Graphics introduced the paradigm of Information Landscapes (5) for managing large collections of hierarchically structured data such as computer filesystems (FSN-File System Navigator©). An information landscape is created by 3-D bar charts that are connected by a topology on an extended landscape plane (Figure 10).

Figure 10 3-D FSN-File System Navigator© (Silicon Graphics). The two panels show the same UNIX filesystem from different perspectives. The virtual landscape is built by cells (directories) containing data blocks (files). The volume of the data blocks and pedestals represents the size of the files or the cumulative size of all files in a directory respectively. Multidirectional connection lines between cells show the topology of the filesystem. The spotlighting in the upper panel marks a selected file and moves the object of interest into the foreground.

The basic visualization objects are data blocks (files) and container objects (directories). Multidirectional connection lines between containers represent the contextual relations (directory hierarchy). The data blocks represent a variety of object properties that are mapped into graphical attributes (color = age, volume = size, icon = type, text = name). Perspective produces a fisheye effect that allows for a detailed view of the information in the foreground in the context of a global background. The virtual information landscape simulates a natural cognitive space for the users. Advanced navigational techniques allow for a flight through the virtual space and multimedia browsing of selected documents. The information landscape concept is one of the most advanced graphical user interfaces for any kind of hierarchical data structure.
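The mapping of object properties to graphical attributes listed above (color = age, volume = size, icon = type, text = name) can be sketched as a small function. The `FileInfo` record, the `to_block` function and the green-to-red color scale are hypothetical and are not taken from FSN itself.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    age_days: float
    kind: str


def to_block(info, max_age_days=365.0):
    """Map file properties to the graphical attributes of one data block."""
    # 1.0 for a brand-new file, 0.0 for files at or beyond max_age_days.
    freshness = max(0.0, 1.0 - info.age_days / max_age_days)
    return {
        "text": info.name,                 # text = name
        "icon": info.kind,                 # icon = type
        "volume": float(info.size_bytes),  # volume = size
        # color = age, as an assumed RGB scale from green (new) to red (old)
        "color": (1.0 - freshness, freshness, 0.0),
    }


block = to_block(FileInfo("report.txt", 2048, 0.0, "text"))
```

Keeping the mapping in one place makes it easy to try a different encoding, for example a logarithmic volume scale for collections with very uneven file sizes.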


Summary


Increasing amounts of data and information and the unlimited access to information in advanced network environments such as the Internet have created the need for a new generation of user interfaces for querying, accessing and retrieving information. Major commercial on-line information services and network engines use keyword matching or text-oriented query languages for information management. On the other hand, advances in graphics technology have created new possibilities for Graphical User Interfaces and Virtual Reality concepts. Information visualization seems to be the key concept for the design of future user interfaces for information retrieval systems. To visualize information it is necessary to define a virtual information space. Information spaces are abstract and differ from physical data spaces, which typically have a spatial mapping. Tree structures give the best opportunity for a structural search following a context-oriented search path. The classification of objects and the hierarchical data structure make it easy to browse the information space and finally to find information domains of special interest. Perspective Wall and Cone Tree are examples of 3-D visualization metaphors for linear and hierarchical data structures. These visualizations use interactive animation to explore dynamically changing views of the information structures. Hierarchical data structures allow for a global structural browsing of large collections of multivariate data by navigation in virtual information landscapes. Hypertext browsers, on the other hand, are efficient for retrieving specific information in local information domains and have become a standard browsing tool on the World Wide Web. A combination of both paradigms might be a powerful concept for the next generation of user interfaces for information retrieval systems.

Although a number of prototypes (6) use an information visualizer for user interaction, the visualization paradigm has so far not been implemented in commercial retrieval systems. The following reasons might explain this lack of popularity.

  1. The command-oriented query languages are sufficiently efficient for searching abstract information, so there is no need for advanced visual retrieval techniques.
  2. The visualization of abstract information is based on a structured (hierarchical) data model. However, the automatic classification of abstract information including semantic aspects is still unsolved and needs further research. On the other hand, current database systems allow for the management of more complex structures (relational, entity-relationship, object-oriented) than tree structures.
  3. Until now the application of advanced interactive computer graphics has required advanced and expensive hardware capabilities.
  4. The existing prototypes of information visualizers are still too complex and need to become more user-friendly. The user does not see a qualitative improvement in the costs for information retrieval.

However, there is a chance that information visualizers for abstract information will gain the same acceptance as browsing tools on the Internet. The key question is how to use graphics technology to simplify human-computer interaction so that it becomes widely popular. A step in this direction might be the recent introduction of Webspace©, an interactive 3-D user interface for popular Web browsers such as Netscape© (Figure 11).

Figure 11 Netscape© World-Wide-Web Browser with Webspace© 3D-Viewer. The 3-D navigator in Webspace© allows for a variety of navigational operations. The virtual space can be explored from different perspectives.

The 3-D viewer includes the same graphical techniques as the File System Navigator©. This will probably help to increase the popularity of the Virtual Reality paradigm. In addition, the Webspace© viewer supports the Virtual Reality Modeling Language (VRML), an open-architecture, platform-independent file format for 3-D graphics on the Internet (Figure 12), similar in concept to the Web text standard Hyper Text Markup Language (HTML).

Figure 12 The Virtual Reality Modeling Language (VRML)

These concepts open new possibilities to visualize abstract information in the information highway environment.


References


  1. Information Visualization: The Next Frontier, in Computer Graphics Proceedings, Annual Conference Series, 1994, pp. 485-486.
  2. Salton G. and McGill M.J. Introduction to modern information retrieval, McGraw-Hill, New York, 1983.
  3. Salton G. Automatic Text Processing: The transformation, analysis, and retrieval of information by computer. Addison Wesley Publishing Comp., 1989.
  4. Robertson, G.G., Card, S.K., Mackinlay, J.D. Information Visualization using 3D interactive animation, Communications of the ACM, Vol.36, No.4, 1993, pp. 57-71.
  5. Tesler, J. and Strasnick, S. FSN: 3D Information landscapes. Man pages entry for an unsupported but publicly released system from Silicon Graphics, Inc. Mountain View, Calif., 1992.
  6. Fairchild K.M. Information Management using Virtual Reality-based Visualizations in Virtual reality: Applications and Explorations, Academic Press Professional, Boston, 1993, pp.45-74.