information visualization
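According to Bertin [5], graphics have at least two uses: first, as a means of communicating information that a person already understands; second, as the manipulation and perception of graphical objects by a person who is trying to understand the information. These two uses should not be confused.
Previous work on the information visualization design space includes Keller [1], who lists techniques of scientific visualization; Chuah and Roth [2], who propose a taxonomy of information visualization tasks; and Shneiderman [3], who proposes a "data type by task" matrix as a classification framework. This study integrates the work above and, drawing on Bertin's [5, 6] and Mackinlay's [7] semiotics of graphics, proposes a framework for the information visualization design space, as shown in the figure below.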
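In the framework above, the data part comprises the original data D and the new data D' obtained by filtering or recoding D through some transformation function F. The visualization part comprises marks (M), controlled processing (CP), retinal properties (R), position in space (XYZ), and time (T). The interaction part comprises view techniques (V) and widget techniques (W). Data may be Nominal, Ordered, or Quantitative; quantitative data also include spatial data and geographical coordinates. A visualization basically consists of a set of marks together with their retinal properties and their positions in space and time. Marks can be points, lines, areas, surfaces, or volumes; retinal properties include color, size, connection, and enclosure.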
Our analysis builds on recent attempts to understand parts of the design space.
Keller[1] lists techniques used in scientific visualization.
Chuah and Roth[2] taxonomize the tasks of information visualization.
Shneiderman[3] proposes a “data type by task” matrix.
Our analysis is closest in spirit to Tweedie’s [4], who also starts from Bertin.
Our analysis starts from an expanded version of Bertin’s [5, 6] and Mackinlay’s [7] analysis of the semiotics of graphics.
Graphics, according to Bertin[5], have at least two distinct uses, which should not be confused: first, as the means of communicating some information (in which case a person already understands the information) and second, for graphical processing (in which case a person uses the manipulation and perception of graphical objects to understand the information).
The major distinction we make for data is whether their values are
Nominal (can only be = or ≠ to other values),
Ordered (obey a < relation), or
Quantitative (support arithmetic).
We denote these as N, O, and Q respectively.
In a more detailed analysis, we would also note the cardinality of a variable, since one of the points of information visualization is to allow visual processing in regions of high cardinality.
We distinguish subtypes of Q for intrinsically spatial variables Qxy and spatial variables that are actually geophysical coordinates Qlon.
We also distinguish data D that is in the original dataset from data D’ that has been selected from this set and possibly transformed by some filter or recoding function F.
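As a rough illustration of the N/O/Q distinction and of a recoding function F, the sketch below bins quantitative values into an ordered set. The function name `recode_q_to_o`, the thresholds, the labels, and the sample numbers are all assumptions made for this example; none of them come from the paper.

```python
from bisect import bisect_right
from enum import Enum


class DataType(Enum):
    N = "nominal"       # values can only be compared with = or !=
    O = "ordered"       # values also obey a < relation
    Q = "quantitative"  # values also support arithmetic


def recode_q_to_o(values, thresholds, labels):
    """A hypothetical recoding function F: bin Q values into ordered labels."""
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # bisect_right returns the index of the bin that each value falls into.
    return [labels[bisect_right(thresholds, v)] for v in values]


# Illustrative numbers only; they are not taken from the source.
D = [12.0, 47.5, 80.1, 33.3]
D_prime = recode_q_to_o(D, thresholds=[30.0, 60.0], labels=["low", "medium", "high"])
print(D_prime)  # ['low', 'medium', 'high', 'medium']
```

The recoded D’ keeps the ordering of the bins but no longer supports arithmetic, which is the sense in which a Q variable becomes an O variable.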
Human visual processing involves two levels: automatic and controlled processing[8].
Automatic processing works on visual properties such as position and color. It is highly parallel, but limited in power.
Controlled processing works on abstract encodings such as text. It has powerful operations, but is limited in capacity.
An elementary visual presentation consists of a set of marks (such as Points, Lines, Areas, Surfaces, or Volumes), their retinal properties (such as Color and Size), and their position in space and time (such as the XY plane in classical graphics and XYZT or 3D space plus time in information visualization). We also include, following [7], the properties of Connection (denoted “—”) and Enclosure (denoted “[]”).
Thus, visualizations are composed from the following visual vocabulary:
Marks: (Point, Line, Area, Surface, Volume)
Controlled Processing Graphical Features
Automatically Processed Graphical Properties
Retinal encodings: (Color, Size, Shape, Gray-level, Orientation, Texture, Connection, Enclosure)
Position: (X, Y, Z, T)
We focus here on two interactive techniques: View techniques (such as focus+context), which distort the space-time of the visualization, and Widget techniques, which add user interface objects (such as buttons) to the visualization.
Symbol | Meaning |
D | Data Type ::= |
F | Function for recoding data ::= |
D’ | Recoded Data Type (see D) |
CP | Controlled Processing ::= tx (text) |
M | Mark types ::= |
R | Retinal properties ::= |
XYZT | Position in space-time ::= N, O, Q, * (non-semantic use of space-time) |
V | View transformation ::= hb (hyperbolic mapping) |
W | Widget ::= sl (slider), rb (radio buttons) |
Scientific visualization generally starts from data whose variables are intrinsically spatial.
Variable | D | F | D' | CP | M | R | X | Y | Z | T | V | W |
Samples | N | | | | P | | | | | | | |
Ozone | Q | f | O | | | C | | | | | | |
Lon. | Qlon | | | | | | Q | | | | | |
Lat. | Qlat | | | | | | | Q | | | | |
Height | Q | | | | | | | | Q | | | |
Date | Q | | | | | | | | | Q | | |
The rows of the table describe the variables with the case variable (“Samples”) at the top and the value variables below.
The nominal (N) set of Samples is mapped to point marks (P in column M), which have their retinal property of color (C in column R) mapped to the Ozone variable.
The ozone mapping includes a function (f) that converts the quantitative (Q) ozone measurements to an ordinal (O) set that can be easily mapped to a set of colors.
The quantitative (Q) variables of Longitude, Latitude, and Height are mapped to the positions X, Y, and Z, which determine the position of the point marks. The Date variable is mapped to time (T), which creates an animated visualization.
Table 1 makes it clear that Figure 1 is a 3D animated visualization involving colored points.
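The reading of Table 1 described above can be sketched in code. The dictionary layout, the `summarize` helper, and the two lookup tables are assumptions made for illustration; only the cell values come from the table and its description.

```python
# Each row mirrors one line of Table 1; columns left out of a row are empty.
TABLE_1 = [
    {"Variable": "Samples", "D": "N", "M": "P"},
    {"Variable": "Ozone", "D": "Q", "F": "f", "D'": "O", "R": "C"},
    {"Variable": "Lon.", "D": "Qlon", "X": "Q"},
    {"Variable": "Lat.", "D": "Qlat", "Y": "Q"},
    {"Variable": "Height", "D": "Q", "Z": "Q"},
    {"Variable": "Date", "D": "Q", "T": "Q"},
]

# Only the codes that appear in this text are listed here.
MARK_NAMES = {"P": "point", "L": "line"}
RETINAL_NAMES = {"C": "color", "Sz": "size"}


def summarize(rows):
    """Describe a visualization from the columns its rows fill in."""
    used = {col for row in rows for col in row}
    axes = [a for a in ("X", "Y", "Z") if a in used]
    return {
        "dimensions": len(axes),
        "animated": "T" in used,
        "marks": sorted({MARK_NAMES[r["M"]] for r in rows if "M" in r}),
        "retinal": sorted({RETINAL_NAMES[r["R"]] for r in rows if "R" in r}),
    }


print(summarize(TABLE_1))
# {'dimensions': 3, 'animated': True, 'marks': ['point'], 'retinal': ['color']}
```

The summary matches the prose reading: three spatial axes, a time mapping, point marks, and color.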
Variable | D | F | D' | CP | M | R | X | Y | Z | T | V | W |
Office | | | | | L | | | | | | | |
Lon. | Qlon | | | | | | Q | | | | | |
Lat. | Qlat | | | | | | | Q | | | | |
Profit | Q | | | | | Sz | | | Q | | | |
(Profit) | | f | N | | | C | | | | | | |
The Offices variable is mapped to line marks (L).
The Profit variable is mapped to the size of these lines (Sz in the R column). Profits are also mapped to the Z-axis and via a function (f) to a nominal set indicating the sign of the profits. This nominal set is mapped to the color of the lines (C in the R column). Table 2 clearly reveals that multiple graphical techniques are used to describe the Profit variable in order to enhance the perception of this important data variable.
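One way to make the observation about Profit mechanical is to list every (variable, column, code) mapping in Table 2 and look for variables that appear more than once. The tuple layout and the helper name `redundantly_encoded` are hypothetical and chosen only for this sketch.

```python
# Table 2 as a list of (variable, target column, code) mappings.
# Profit occupies two table rows, so it contributes several entries.
TABLE_2_MAPPINGS = [
    ("Office", "M", "L"),
    ("Lon.", "X", "Q"),
    ("Lat.", "Y", "Q"),
    ("Profit", "R", "Sz"),  # size of the line
    ("Profit", "Z", "Q"),   # height along the Z-axis
    ("Profit", "R", "C"),   # color, via f: Q -> N (sign of the profit)
]


def redundantly_encoded(mappings):
    """Return the variables that are shown through more than one graphical property."""
    targets = {}
    for variable, column, code in mappings:
        targets.setdefault(variable, []).append(f"{column}:{code}")
    return {v: t for v, t in targets.items() if len(t) > 1}


print(redundantly_encoded(TABLE_2_MAPPINGS))
# {'Profit': ['R:Sz', 'Z:Q', 'R:C']}
```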
Multi-dimensional plots take variables that are not intrinsically spatial and map them onto X and Y, e.g.,
Q --> X,
Q --> Y.
When point marks are positioned on these axes, the result is the conventional scatterplot that is often used in statistical graphics.
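A minimal sketch of the Q --> X, Q --> Y mapping, assuming a simple linear scale from a data domain to a pixel range. The helper `make_scale`, the sample records, and the pixel ranges are illustrative assumptions. The inverse function is included because a reader has to be able to go from a position back to the data value.

```python
def make_scale(domain, pixel_range):
    """Return (forward, inverse) linear maps between a data domain and a pixel range."""
    d0, d1 = domain
    r0, r1 = pixel_range
    if d0 == d1 or r0 == r1:
        raise ValueError("domain and pixel range must both have non-zero extent")
    k = (r1 - r0) / (d1 - d0)

    def forward(v):
        return r0 + (v - d0) * k

    def inverse(p):
        return d0 + (p - r0) / k

    return forward, inverse


# Hypothetical records with two quantitative variables.
records = [(0.0, 10.0), (5.0, 20.0), (10.0, 30.0)]
to_x, from_x = make_scale((0.0, 10.0), (0.0, 400.0))
to_y, from_y = make_scale((10.0, 30.0), (300.0, 0.0))  # screen Y grows downward

points = [(to_x(a), to_y(b)) for a, b in records]
print(points)         # [(0.0, 300.0), (200.0, 150.0), (400.0, 0.0)]
print(from_x(200.0))  # 5.0
```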
Landscapes lay information out on a surface, typically the XY plane. Landscapes may be of several sorts: real geographical coordinates, real spatial variables, or completely abstract mappings:
{Qlon or QX or Q} --> X,
{Qlat or QY or Q} --> Y.
If the mapping extends to
Q --> Z, we call it an information space.
Node and link diagrams allow the encoding of linkage information between entities. They can be thought of as a mapping from a Nominal set to itself {NxN}. These are then mapped into XY.
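The two steps, a relation on a nominal set and then a mapping into XY, can be sketched as follows. The circular layout is an assumption chosen for brevity and is not prescribed by the text; the node names and links are made up.

```python
import math

nodes = ["a", "b", "c", "d"]                  # a nominal set N
links = {("a", "b"), ("b", "c"), ("a", "d")}  # a subset of N x N


def circular_layout(names, radius=1.0):
    """Place each node on a circle; any other layout could be substituted."""
    step = 2 * math.pi / len(names)
    return {
        n: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, n in enumerate(names)
    }


positions = circular_layout(nodes)
# Each link becomes a line segment between the XY positions of its endpoints.
segments = [(positions[s], positions[t]) for s, t in sorted(links)]
print(len(segments))  # 3
```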
Trees can also be visualized as nested enclosures. Shneiderman and colleagues [16] have done a space-filling form of enclosure tree called Tree-Maps. At one level in a tree, the children of a node divide up the X dimension of the visualization, at the next level they divide up the Y dimension of the node in which they are enclosed. The division proceeds alternating between X and Y until the leaves of the tree are reached. This method uses all of the space.
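The alternating subdivision described above can be sketched as a short recursive function. This is an illustrative reading of the paragraph, not the Tree-Maps implementation from [16]. The `(name, size, children)` tuple layout and the sample tree are assumptions, and leaf sizes are assumed to be positive.

```python
def weight(node):
    """Total size of a node: its own size for a leaf, else the sum over its children."""
    name, size, children = node
    return size if not children else sum(weight(c) for c in children)


def treemap(node, x, y, w, h, split_x=True, out=None):
    """Assign a rectangle (x, y, w, h) to every leaf, alternating the split direction."""
    if out is None:
        out = {}
    name, size, children = node
    if not children:
        out[name] = (x, y, w, h)
        return out
    total = weight(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        share = weight(child) / total
        if split_x:   # children divide up the X dimension at this level
            treemap(child, x + offset * w, y, share * w, h, False, out)
        else:         # children divide up the Y dimension at the next level
            treemap(child, x, y + offset * h, w, share * h, True, out)
        offset += share
    return out


tree = ("root", 0, [("A", 2, []), ("B", 0, [("B1", 1, []), ("B2", 1, [])])])
rects = treemap(tree, 0.0, 0.0, 8.0, 4.0)
print(rects)
# {'A': (0.0, 0.0, 4.0, 4.0), 'B1': (4.0, 0.0, 4.0, 2.0), 'B2': (4.0, 2.0, 4.0, 2.0)}
```

The leaf areas in this example add up to the full 8 by 4 rectangle, which is the space-filling property the paragraph mentions.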
In this paper we have sketched part of a scheme for mapping the morphology of the design space of visualizations.
Two levels of analysis not addressed in this short paper are the larger organizational structure of information spaces and the organization of user tasks.
With respect to the larger organizational structure, we have previously suggested in the text area an analysis into information space, workspace, sensemaking tools, and documents and surveyed systems in each of these areas [20].
For users’ tasks, we have previously suggested notions of “knowledge crystallization”, comprising in part “information foraging” [21] and “sensemaking”[22].
Besides helping to organize the literature, our present analysis suggests regions of new visualizations because it concentrates on the mappings between data and presentation.
The table notation, in particular, organizes these mappings in a way that reveals when a data set is mapped to a graphical property in isolation, with overloading, or via distortion.
The key issue for effective visualization is that users must be able to invert this mapping and perceive the data in the visualization.