IBM Research, Tokyo Research Laboratory (TRL)

Data jewel-box: a graphics showcase for large-scale hierarchical data visualization


Outline

Computer file systems, human resources organizations in large companies, and category-based websites: we find various kinds of hierarchical data in our daily life. How can we present an overview of such data by placing all of it in a single display space? How can we show all the data at once, like the showcases in a jewelry shop? That is the motivation of our research.

The above figure is an example of our data visualization technique. The technique maps each data item to an icon (the colored spheres in the above figure) and each cluster to a rectangle enclosing its icons. A set of nested rectangles represents the hierarchy of the data.

We are studying fast data layout techniques in order to make this style of visualization interactive. This page proposes a fast data layout technique that uses Delaunay triangular meshes.

Research contents

The core technology of this research is the rectangle layout technique that satisfies the following conditions:

  • places rectangles without overlaps,
  • places rectangles in a short computation time, and
  • places rectangles inside a small rectangular area.

This page proposes an algorithm that places all the given rectangles one by one, in order of their area (starting with the largest), by finding gaps where each rectangle can be placed.
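The three conditions above can be illustrated with a small sketch. It is not the mesh-based method of this page: to stay short, it replaces the mesh-based gap search with a brute-force scan of candidate positions flush against already placed rectangles, and picks the candidate that keeps the overall bounding box smallest. The names `Rect`, `overlaps`, `bounding_area`, and `pack` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float  # left edge
    y: float  # bottom edge
    w: float  # width
    h: float  # height


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b intersect (touching edges are allowed)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def bounding_area(rects) -> float:
    """Area of the smallest axis-aligned rectangle enclosing all rects."""
    x0 = min(r.x for r in rects)
    y0 = min(r.y for r in rects)
    x1 = max(r.x + r.w for r in rects)
    y1 = max(r.y + r.h for r in rects)
    return (x1 - x0) * (y1 - y0)


def pack(sizes):
    """Place (w, h) sizes one by one, largest area first (brute-force stand-in)."""
    order = sorted(sizes, key=lambda s: s[0] * s[1], reverse=True)
    if not order:
        return []
    w, h = order[0]
    placed = [Rect(-w / 2, -h / 2, w, h)]  # largest rectangle at the center
    for w, h in order[1:]:
        best = None
        for p in placed:
            # Candidates: flush against the right, left, top, bottom of p.
            for x, y in ((p.x + p.w, p.y), (p.x - w, p.y),
                         (p.x, p.y + p.h), (p.x, p.y - h)):
                cand = Rect(x, y, w, h)
                if any(overlaps(cand, q) for q in placed):
                    continue  # condition 1: no overlaps
                area = bounding_area(placed + [cand])
                if best is None or area < best[0]:
                    best = (area, cand)  # condition 3: small enclosing area
        # The slot to the right of the rightmost rectangle is always free,
        # so at least one candidate survives.
        placed.append(best[1])
    return placed
```

This scan tests every candidate against every placed rectangle, so it does not meet the second condition for large inputs. That cost is what a mesh-based gap search is meant to avoid.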

This algorithm generates a triangular mesh that connects the center points of previously placed rectangles. First, it places the largest rectangle at the center of the display space and generates a triangular mesh that connects the center point of that rectangle to the four corner points of an area that encloses the rectangle.
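As a minimal sketch of this first step, the initial mesh can be written as five points and four triangles. The function name `initial_mesh` and the `margin` parameter (how much larger the enclosing area is than the rectangle) are assumptions for illustration; the page does not state how the enclosing area is sized.

```python
def initial_mesh(w, h, margin=1.0):
    """Return (points, triangles) for the first, largest rectangle of size w x h.

    points[0] is the rectangle's center, placed at the origin.
    points[1..4] are the corners of the enclosing area, assumed here to be
    the rectangle grown by `margin` on every side.
    Each triangle is a tuple of three indices into `points`,
    listed counter-clockwise.
    """
    hx, hy = w / 2 + margin, h / 2 + margin
    points = [(0.0, 0.0), (-hx, -hy), (hx, -hy), (hx, hy), (-hx, hy)]
    # Four triangles fan out from the center, one per side of the enclosing area.
    triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
    return points, triangles


def signed_area(p, q, r):
    """Signed area of triangle pqr; positive when p, q, r are counter-clockwise."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))
```

The four triangles tile the enclosing area exactly, so their areas sum to its area.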

The algorithm then places the other rectangles one by one, using the triangular mesh to find gaps where each rectangle can be placed. After each placement it updates the triangular mesh by connecting the center point of the newly placed rectangle.
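The mesh update can be sketched as a point insertion: find the triangle that contains the new center point and split it into three. This is a simplification under stated assumptions. A full Delaunay update would also flip edges afterwards to restore the empty-circumcircle property, and that step is omitted here. The names `orient2` and `insert_point` are hypothetical, and triangles are assumed to be stored as counter-clockwise index triples.

```python
def orient2(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def insert_point(points, triangles, p):
    """Add p to the mesh by splitting the triangle that strictly contains it.

    `points` is a list of (x, y) tuples; `triangles` is a list of
    counter-clockwise index triples. Both lists are modified in place.
    Returns the index of the new point. Points lying exactly on an edge
    are rejected to avoid creating zero-area triangles.
    """
    for i, (a, b, c) in enumerate(triangles):
        pa, pb, pc = points[a], points[b], points[c]
        if (orient2(pa, pb, p) > 0 and orient2(pb, pc, p) > 0
                and orient2(pc, pa, p) > 0):
            points.append(p)
            n = len(points) - 1
            # Replace the containing triangle with three smaller ones.
            triangles[i:i + 1] = [(a, b, n), (b, c, n), (c, a, n)]
            return n
    raise ValueError("point is outside the mesh or lies on an edge")
```

The split keeps the total mesh area unchanged and preserves the counter-clockwise orientation of every triangle.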

Our technique places all the data in the display space by repeating the above algorithm from the lowest level of the hierarchy up to the highest.
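The bottom-up order can be sketched with a small recursive function: each cluster is laid out only after its children, so that every child's size is known when its parent packs it. The `Node` class, the `pad` spacing, and the simple row-based packing used here are assumptions for illustration. The row packing is a stand-in and is not the mesh-based algorithm described above.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)
    w: float = 1.0  # leaf icons default to unit squares
    h: float = 1.0
    x: float = 0.0  # position relative to the parent's rectangle
    y: float = 0.0


def layout(node, pad=0.5):
    """Lay out `node` bottom-up; afterwards node.w and node.h hold its size."""
    if not node.children:
        return  # a leaf icon keeps its own size
    for child in node.children:
        layout(child, pad)  # lower levels first, so child sizes are known
    kids = sorted(node.children, key=lambda c: c.w * c.h, reverse=True)
    # Aim for a roughly square cluster, counting the padding around each child.
    target = math.sqrt(sum((c.w + pad) * (c.h + pad) for c in kids))
    x = y = pad
    row_h = width = 0.0
    for c in kids:
        if x > pad and x + c.w > pad + target:
            x, y, row_h = pad, y + row_h + pad, 0.0  # start a new row
        c.x, c.y = x, y
        x += c.w + pad
        row_h = max(row_h, c.h)
        width = max(width, x)
    node.w, node.h = width, y + row_h + pad
```

For example, a cluster of four unit icons with the default padding becomes a 3.5 by 3.5 rectangle, which its parent then treats as a single rectangle to place.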

The features of the proposed technique are as follows:

  • It provides an overview of all the lower-level data in a hierarchy. (On the other hand, it is not always convenient if users first want to visualize the higher levels of the data.)
  • It is friendly to people who are not experts in 3D graphics, because it places all the data on a plane.
  • It can show multiple attribute values of the data by mapping them to the areas, heights, and colors of icons.
  • It can be combined with graph visualization techniques.
  • It requires some computation time, but this is not severe, because it is shorter than the time needed to read the given data.


Here we show some examples of our visualization.

The following figure visualizes a website together with the other webpages directly linked from it. This example represents webpages as colored square icons. The color denotes the update time of each webpage: red icons denote new webpages, and blue icons denote old ones. The data includes 835 icons and 160 rectangles. Our technique required 0.3 seconds to lay out all the data by repeating the above algorithm 160 times. (We used an IBM IntelliStation Z-Pro with a 933 MHz CPU.)

Our implementation can represent two attribute values of each data item at the same time, by mapping one of them to the height of the icon. The following figure represents the update time of webpages by the colors of the icons, and the access frequency of webpages by the heights of the icons.
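The two attribute mappings can be sketched as follows. The page states only that red denotes new webpages, blue denotes old ones, and height denotes access frequency. The linear blend, the clamping, and the names `age_to_color` and `frequency_to_height` are assumptions made for this illustration.

```python
def clamp01(t):
    """Clamp t to the range [0, 1]."""
    return min(max(t, 0.0), 1.0)


def age_to_color(age, max_age):
    """Map a page's age to an (R, G, B) color with components in 0..255.

    Age 0 (newest) gives pure red; age >= max_age (oldest) gives pure blue.
    The linear blend in between is an assumption.
    """
    t = clamp01(age / max_age)
    return (round(255 * (1 - t)), 0, round(255 * t))


def frequency_to_height(freq, max_freq, max_height=10.0):
    """Map an access frequency to an icon height, linearly up to max_height."""
    return max_height * clamp01(freq / max_freq)
```

Because color and height are computed independently, each icon can carry both attributes at once.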

The following figure is an example of our visualization for a larger hierarchical data set. The data includes 5684 icons and 892 rectangles. Our technique required 2.5 seconds to repeat the above algorithm 892 times. (We used an IBM IntelliStation Z-Pro with a 933 MHz CPU.)

For even larger data, it is usually difficult to place all the data elements in a single display space. We developed a visualization prototype for overview and navigation of such very large-scale data, as follows.

Please see this image and this image. Both images are partial representations of a huge website that has about 10 million webpages.
Our prototype first divides the 10 million webpages into 3259 groups according to their URLs. Dots in the left part of the images represent the 3259 groups.
When a user clicks one of the dots, the prototype displays the webpages of the clicked group in the right part of the display space. The above two images show the results for different groups. The prototype displays a group of webpages in 2 or 3 seconds.
The prototype thus enables interactive navigation and visualization of large-scale data comprising 10 million webpages.
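Grouping webpages by their URLs can be sketched as below. The page does not say which part of the URL defines a group, so the rule used here (host name plus the first directory of the path) is a hypothetical choice, as are the names `group_key` and `group_by_url` and the example URLs.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_key(url):
    """Hypothetical grouping rule: host name plus the first directory in the path."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # A single segment is treated as a file at the site root, not a directory.
    first = segments[0] if len(segments) > 1 else ""
    return f"{parts.netloc}/{first}"


def group_by_url(urls):
    """Return a dict mapping each group key to the list of URLs in that group."""
    groups = defaultdict(list)
    for url in urls:
        groups[group_key(url)].append(url)
    return dict(groups)


urls = [
    "http://www.example.com/a/x.html",
    "http://www.example.com/a/y.html",
    "http://www.example.com/b/z.html",
    "http://www.example.com/index.html",
]
groups = group_by_url(urls)
```

With such a grouping, only the pages of the selected group need to be laid out when a user clicks a dot, rather than the whole data set.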



Last modified 27 Dec 2001