Chen, M., Ebert, D., Hagen, H., Laramee, R. S., Van Liere, R., Ma, K. L., ... & Silver, D. (2009). Data, information, and knowledge in visualization. Computer Graphics and Applications, IEEE, 29(1), 12-19.
information visualization
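This study distinguishes data, information, and knowledge from the perspective of the visualization process, examines the role that information and knowledge currently play in the development of visualization technology, and suggests that understanding and applying the process by which data becomes information and knowledge can strengthen future visualization systems. The authors use two notations for data, information, and knowledge. ℙ denotes all explicit and implicit human memory; ℙdata, ℙinfo, and ℙknow denote human memory about data, information, and knowledge, respectively, with ℙdata ⊂ ℙ, ℙinfo ⊂ ℙ, and ℙknow ⊂ ℙ. ℂ denotes all representations in computer memory; likewise, ℂdata, ℂinfo, and ℂknow denote computer representations of data, information, and knowledge. Under this notation, information visualization is needed when people have difficulty obtaining sufficient information (Pinfo ⊂ ℙinfo) and knowledge (Pknow ⊂ ℙknow) from computer data (Cdata ⊂ ℂdata). A typical visualization process, shown in Figure 1, uses visualization techniques to transform computer data Cdata into imagery data Cimage, so that information (Pinfo) and knowledge (Pknow) can be acquired effectively and efficiently. Figure 1 also shows the control data Cctrl, which covers the visualization tools the user chooses for exploring the data as well as styles, layout, viewing position, color maps, and transfer functions. Through these control data, the user converts the computer data into imagery data Cimage that he or she finds satisfactory.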
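Under this view, information visualization is a search process over a very large parameter space, and that space keeps growing as the volume of data to analyze and the number of available visualization techniques increase. Information-assisted visualization systems have therefore been proposed. They provide information about the input data, attributes of the visualization process and its results, and characteristics of the user's perceptual behavior. The user can use this information to narrow the search space of control parameters and make the interaction more effective. Figure 2 illustrates the concept of an information-assisted visualization system.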
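The user's knowledge is an indispensable part of the visualization process. Knowledge-assisted visualization collects knowledge from expert users, learns and models the best practice, and uses it to develop and improve visualization infrastructures. Its aim is to incorporate the domain knowledge of different users and to reduce the burden on users of acquiring complex skills. The knowledge-assisted visualization system in Figure 3 uses rule-based reasoning to establish suitable sets of control parameters and thereby reduce the search space. The problem with such systems is that collecting expert knowledge and representing it completely is not easy.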
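Figure 5, by contrast, uses case-based reasoning to collect, process, and analyze data about visualization processes. From the successes and failures of cases, the associations between data and control parameters, and other patterns concerning visualization tasks, tools, and users, it infers knowledge such as commonly used approaches and parameters, the best practice, and optimization strategies.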
Researchers have attempted to clarify the taxonomy of terms used in the visualization community (for example, in the work of Ed H. Chi [4], Ben Shneiderman [5], and Melanie Tory and Torsten Möller [6]). However, the terms data, information, and knowledge remain ambiguous.
This article doesn't attempt to offer a different taxonomy for visualization. Instead, we differentiate these three terms from the perspective of visualization processes. Furthermore, we examine the current and future role of information and knowledge in the development of visualization technology.
Let ℙ be the set of all possible explicit and implicit human memory. The former encompasses the memory of events, facts, and concepts, and the understanding of their meanings, context, and associations. The latter encompasses all non-conscious forms of memory, such as emotional responses, skills, and habits. [9] We can thus focus on three subsets of memory, ℙdata ⊂ ℙ, ℙinfo ⊂ ℙ, and ℙknow ⊂ ℙ, where ℙdata, ℙinfo, and ℙknow are the sets of all possible explicit and implicit memory about data, information, and knowledge, respectively.
Let ℂ be the set of all possible representations in computer memory. Similarly, we can consider three subsets of representations, ℂdata, ℂinfo, and ℂknow. ... A computer representation of visualization is also a form of visual data.
Figure 1 shows a typical visualization process, illustrating instances of data, information, and knowledge in both computational space and perceptual and cognitive space. Hence, the need for visualization is based on the difficulties humans face in acquiring a sufficient amount of information (Pinfo ⊂ ℙinfo) or knowledge (Pknow ⊂ ℙknow) directly from a data set (Cdata ⊂ ℂdata), where the blackboard-bold symbols denote the sets of all possible instances. The process of creating visualization is a function that maps from ℂdata to the set of all imagery data, ℂimage. It transforms a data set Cdata to a visual representation Cimage, which facilitates a more efficient and effective cognitive process for acquiring Pinfo and Pknow.
Given a data set Cdata, a user first makes decisions about which visualization tools to use for exploring the data set. The user then experiments with different controls, such as styles, layout, viewing position, color maps, and transfer functions, until he or she obtains a satisfactory collection of visualization results, Cimage.
Depending on the visualization tasks, satisfaction can come in many forms. For example, the user may have obtained sufficient information or knowledge about the data set, or may have obtained the most appropriate illustration about the data to assist others in the knowledge acquisition process.
Such a visualization process is fundamentally the same as a typical search process, except that it is usually much more complex than plugging a few keywords into a search engine. In visualization, the tools for the “search” tasks are usually application-specific (for example, network, flow, volume visualization). The parameter space for the “search” is normally huge (for example, exploring many viewing positions or trying out many different transfer functions). The user interaction for the “search” sometimes can be very slow, especially in handling very large data sets.
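To make the search analogy concrete, the following is a minimal Python sketch of an exhaustive search over control parameters. It is only an illustration under stated assumptions: the parameter names, the value ranges, and the `toy_satisfaction` function (a stand-in for the user's judgment of a rendered image) are hypothetical and do not come from the article.

```python
from itertools import product

# Hypothetical control parameters (Cctrl). Names and value ranges are
# illustrative assumptions, not taken from the article.
CONTROL_SPACE = {
    "view_azimuth_deg": list(range(0, 360, 30)),  # 12 viewing positions
    "color_map": ["grayscale", "rainbow", "heat"],
    "opacity_scale": [0.25, 0.5, 0.75, 1.0],      # crude stand-in for a transfer function
}


def search_space_size(space):
    """Number of distinct parameter combinations in the space."""
    size = 1
    for values in space.values():
        size *= len(values)
    return size


def exhaustive_search(space, satisfaction):
    """Try every combination and keep the one the scoring function likes best."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for combo in product(*(space[name] for name in names)):
        params = dict(zip(names, combo))
        score = satisfaction(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_satisfaction(params):
    """Assumed placeholder for a user's judgment of the resulting image."""
    score = -abs(params["view_azimuth_deg"] - 90) / 90
    score -= abs(params["opacity_scale"] - 0.5)
    if params["color_map"] == "heat":
        score += 0.1
    return score


if __name__ == "__main__":
    print(search_space_size(CONTROL_SPACE))
    print(exhaustive_search(CONTROL_SPACE, toy_satisfaction))
```

Even this toy space has 12 × 3 × 4 combinations, and each added control multiplies the count, which is one way to read the claim that the parameter space is normally huge.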
However, with the growing amount of data and increasing availability of different visualization techniques, the search space for a visualization process is also expanding. Like the Internet search problem, interactive visualization alone is no longer adequate.
Figure 2 illustrates an information-assisted visualization process. Some techniques use information captured in the visualization process to improve visualization efficiency and effectiveness.
In information-assisted visualization, the system provides the user with a second visualization pipeline (see Figure 2), which typically displays the information about the input data set. But it can also present attributes of the visualization process, the properties of the results, or characteristics of the user’s perceptual behaviors. The user uses such information to reduce the search space for optimal control parameters, hence making the interaction much more cost effective.
In a visualization process, the user’s knowledge is an indispensable part of visualization. ... Meanwhile, the lack of certain user knowledge is often a major obstacle in deploying visualization techniques. The user might not have received adequate training about how to specify transfer functions, or might not have sufficient time or navigation skills to explore all possible viewing positions.
The objectives of knowledge-assisted visualization include sharing domain knowledge among different users and reducing the burden on users to acquire knowledge about complex visualization techniques. It also enables the visualization community to learn and model the best practice, so that powerful visualization infrastructures can develop and evolve.
If a visualization system could collect a large repository of such knowledge, it could then choose an appropriate transfer function based on the attributes of an input data set.
Figure 3 shows a visualization pipeline supported by a knowledge base (Cknow) that stores knowledge representations captured from expert users. The system can use rule-based reasoning to establish an appropriate set, or several optional sets, of control parameters that can significantly reduce the search space, especially for inexperienced users. The system component for reasoning is commonly called an inference engine in knowledge-based systems (or expert systems).
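As a rough illustration of rule-based reasoning over a knowledge base, the sketch below matches attributes of an input data set against a list of rules and returns the suggested control-parameter sets. All rule names, attribute keys, and parameter values are hypothetical examples, and the matching loop is a deliberately simple stand-in for a real inference engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict], bool]  # tests attributes of the input data set
    control_params: Dict               # suggested subset of control data (Cctrl)


# Hypothetical knowledge base (Cknow); every rule here is an illustrative assumption.
KNOWLEDGE_BASE: List[Rule] = [
    Rule(
        "ct_bone",
        lambda a: a.get("modality") == "CT" and a.get("target") == "bone",
        {"transfer_function": "high_density_ramp", "color_map": "grayscale"},
    ),
    Rule(
        "flow_default",
        lambda a: a.get("kind") == "vector_field",
        {"technique": "streamlines", "color_map": "velocity_magnitude"},
    ),
    Rule(
        "large_volume",
        lambda a: a.get("voxels", 0) > 10**8,
        {"sampling": "downsample_2x"},
    ),
]


def infer_control_params(attributes: Dict, rules: List[Rule]) -> List[Dict]:
    """Return the parameter sets of every rule whose condition holds."""
    return [rule.control_params for rule in rules if rule.condition(attributes)]


if __name__ == "__main__":
    dataset = {"modality": "CT", "target": "bone", "voxels": 5 * 10**8}
    for params in infer_control_params(dataset, KNOWLEDGE_BASE):
        print(params)
```

Each matching rule narrows the controls the user still has to explore, which is the sense in which such a system reduces the search space. The sketch also hints at the limitation of the approach: every rule has to be written down by someone who already holds the expertise.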
The shortcomings of such a system include the difficulties in specifying comprehensively what knowledge to capture and the inconvenience of collecting knowledge from experts. This constrains the deployment of such a system to specific application domains.
An alternative approach is to establish a visualization infrastructure, where the system can systematically collect, process, and analyze data about visualization processes. Using case-based reasoning, it can infer knowledge from cases of successes and failures, common associations between data sets and control parameters, and many other patterns exhibited by visualization tasks, tools, users, and interactions. Such knowledge might include a popular approach, commonly used parameter sets, the best practice, an optimization strategy, and so forth.
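One simple way to picture case-based reasoning is nearest-neighbor retrieval over recorded visualization sessions. The sketch below is an assumption-laden illustration rather than the mechanism described in the article: the `Case` fields, the attribute-overlap similarity, and the sample case base are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Case:
    data_attrs: Dict[str, str]      # attributes of the data set that was visualized
    control_params: Dict[str, str]  # control parameters the user ended up with
    success: bool                   # whether the session was judged successful


def similarity(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Fraction of attribute keys on which the two descriptions agree."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    matches = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return matches / len(keys)


def retrieve(case_base: List[Case], query_attrs: Dict[str, str]) -> Optional[Case]:
    """Return the most similar successful case, or None if there is none."""
    successes = [c for c in case_base if c.success]
    if not successes:
        return None
    return max(successes, key=lambda c: similarity(c.data_attrs, query_attrs))


# Hypothetical recorded cases; the values are illustrative only.
CASE_BASE = [
    Case({"kind": "volume", "modality": "CT"}, {"transfer_function": "bone_ramp"}, True),
    Case({"kind": "volume", "modality": "MRI"}, {"transfer_function": "soft_tissue"}, True),
    Case({"kind": "volume", "modality": "MRI"}, {"transfer_function": "binary_threshold"}, False),
    Case({"kind": "graph"}, {"layout": "force_directed"}, True),
]

if __name__ == "__main__":
    best = retrieve(CASE_BASE, {"kind": "volume", "modality": "MRI"})
    print(best.control_params if best else None)
```

Unlike the rule-based approach, nothing here has to be authored by an expert up front. The case base grows as sessions are recorded, and failed cases are kept so that they can be filtered out or analyzed later.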
Such an infrastructure is general purpose and can support multiple application domains. It can potentially enable applications to benefit from the best practice as well as software developed for other applications.
As a discipline, visualization has thrived on helping application users transfer data (Cdata) in the computational space to information (Pinfo) and knowledge (Pknow) in the perceptual and cognitive space. As a discipline, we need infrastructures to collect data about visualization processes and to transfer this data to information and knowledge to further our understanding and enhance visualization technology.