Saturday, January 2, 2016

Borkin, M., Vo, A., Bylinskii, Z., Isola, P., Sunkavalli, S., Oliva, A., & Pfister, H. (2013). What makes a visualization memorable? IEEE Transactions on Visualization and Computer Graphics, 19(12), 2306-2315.


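Many visualization experts, such as Edward Tufte and Stephen Few, have argued that a visualization should not contain "chart junk" and should show the data as clearly as possible [13, 14, 37, 38], and psychology experiments have confirmed that simple, clear visualizations are easier to read [11, 24]. More recently, however, some researchers have reported that chart junk may improve retention and push viewers to put more effort into understanding a chart, increasing their knowledge and understanding of the data [4, 8, 19]. Bateman et al. [4] found that embellished charts were recalled better than plain ones and were understood no worse, and neurobiology offers the hypothesis that adding "visual difficulties" can enhance a viewer's comprehension [8, 19]. Besides chart junk, the chart type, color, and other aesthetic factors also affect how people see, interpret, and remember a chart. This study ran a memorability experiment on 410 visualizations taken from published media with 261 participants. It asks whether the memorability of visualizations is consistent across people, as it is for images of natural scenes [20], and tests which visualization types and attributes are more memorable.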

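To study whether chart types and attributes affect memorability, the study first drew on earlier work to build a visualization taxonomy. Earlier taxonomies were based on graphical perception models, visual and organizational layout, and graphical data encodings [6, 12, 35, 33], or on the visualization algorithms [32, 36]; more recent ones classify interactive visualizations and the tasks they enable [16, 17, 33, 35]. The taxonomy in this study builds mainly on the vocabulary proposed by Harris [15], together with the emphasis on syntactic structure and information type in graphic representation (Englehardt [12]) and on human graphical perception (Cleveland and McGill [11]). It divides visualizations into 12 categories, each with several subtypes, as listed in the paper's taxonomy table.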


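On top of the taxonomy, each chart is annotated with several properties and attributes. The properties are dimension (2D or 3D), multiplicity (single, grouped, multi-panel, combination), pictorial, and time series. The attributes are whether the chart is black and white, the number of colors, the ratio of data to non-data elements (data-ink ratio), the share of the image area taken up by visual elements (visual density), whether the chart contains pictograms, and whether it depicts a human.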

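Among the visualization sources, scientific publications and infographics show very high multiplicity. Figures from these two sources often combine several charts to express a concept or theory, and saving space is another consideration. By contrast, government and organization publications mostly use single visualizations.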

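In terms of visualization type, scientific publications use a very high proportion of diagrams to explain the concepts of an article or the results of a theory. Scientific articles also make heavy use of basic visual encoding techniques such as line graphs, bar charts, and scatter plots, and some fields use specific encodings such as grid and matrix plots and trees and networks. Compared with the other sources, scientific publications therefore use trees and networks, grid and matrix plots, and scatter plots more often.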

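Infographics also use many diagrams, mainly flow charts and timelines. Tables are common in infographics as well, usually decorated with illustrations. Compared with the other sources, though, infographics make little use of line graphs.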

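News media and government publications mainly use bar charts, line graphs, maps, and tables; the line graphs usually show time series data. The difference between the two sources is that government reports make heavy use of circle plots such as pie charts.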

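To rate how memorable a chart is, the study uses the hit rate (HR) and the false alarm rate (FAR), and combines the two for ranking with the sensitivity index, the d-prime metric: d' = Z(HR) − Z(FAR), where Z is the inverse of the cumulative distribution function (CDF) of the Gaussian distribution. A higher d' means a higher HR together with a lower FAR, that is, a visualization that is easier to remember. The mean memorability measured for visualizations was HR = 55.36% (SD = 16.51%) and FAR = 13.17% (SD = 10.73%). Compared with earlier image memorability studies, this is lower than for natural scenes (HR = 67.5%, SD = 13.6%; FAR = 10.7%, SD = 7.6%) [21] but close to faces (HR = 53.6%, SD = 14.3%; FAR = 14.5%, SD = 9.9%) [2]. When the participants were randomly split into two groups, averaged over 25 such splits, the Spearman rank correlations between the two groups were 0.83 for HR, 0.78 for FAR, and 0.81 for d-prime. Memorability measurements of visualizations are therefore consistent across participants, so memorability is a property of the visualization itself.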

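The findings on memorability by visualization type, property, and attribute are as follows:

Visualizations with pictograms are clearly more memorable than those without.

The more colors a visualization has, the more memorable it is; even among visualizations without pictograms, those with many colors are more memorable than those with only one color.

Visualizations in which visual elements cover a larger share of the image (higher visual density) are also easier to remember.

Visualizations with a lower data-ink ratio are more memorable.

Types with a more unique appearance, such as diagrams, grid and matrix plots, and trees and networks, are more memorable than types with a more common appearance, such as bar charts, line graphs, and tables. The gap is largest when the charts contain no pictograms, because common charts look alike and interfere with one another, which lowers the hit rate and raises the false alarm rate.

Among the sources, infographics are the most memorable, followed by scientific publications; the least memorable source is government and world organizations.

Charts with circles or rounded edges are also more memorable. The authors suggest that pictograms and round shapes both make a visualization look more natural and therefore easier to remember.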


We ran the largest scale visualization study to date using 2,070 single-panel visualizations, categorized with visualization type (e.g., bar chart, line graph, etc.), collected from news media sites, government reports, scientific journals, and infographic sources. Each visualization was annotated with additional attributes, including ratings for data-ink ratios and visual densities.

Altogether our findings suggest that quantifying memorability is a general metric of the utility of information, an essential step towards determining how to design effective visualizations.

The conventional view, promoted by visualization experts such as Edward Tufte and Stephen Few, holds that visualizations should not include chart junk and should show the data as clearly as possible without any distractors [13, 14, 37, 38]. This view has also been supported by psychology lab studies, which show that simple and clear visualizations are easier to understand [11, 24].

At the other end of the spectrum, researchers have published that chart junk can possibly improve retention and force a viewer to expend more cognitive effort to understand the graph, thus increasing their knowledge and understanding of the data [4, 8, 19].

What researchers agree on is that chart junk is not the only factor that influences how a person sees, interprets, and remembers a visualization. Other aspects of the visualization, such as graph type, color, or aesthetics, also influence a visualization’s cognitive workload and retention [8, 19, 39].

Recent work has shown that memorability of images of natural scenes is consistent across people, suggesting that some images are intrinsically more memorable than others, independent of an individual’s contexts and biases [20].

This is because given limited cognitive resources and time to process novel information, capitalizing on memorable displays is an effective strategy.

Recent large-scale visual memory work has shown that existing categorical knowledge supports memorability for item-specific details [9, 22, 23]. In other words, many additional visual details of the image come for free when retrieving memorable items.

For our research, we first built a new broad taxonomy of static visualizations that covers the large variety of visualizations used across social and scientific domains. These visualization types range from area charts, bar charts, line graphs, and maps to diagrams, point plots, and tables.

We then used these 2,070 visualizations in an online memorability study we launched via Amazon’s Mechanical Turk with 261 participants. This study allowed us to gather memorability scores for hundreds of these visualizations, and determine which visualization types and attributes were more memorable.

Perception Theory and the Chart Junk Debate:

Bateman et al. conducted a study to test the comprehension and recall of graphs using an embellished version and a plain version of each graph [4]. They showed that the embellished graphs outperformed the plain graphs with respect to recall, and the embellished versions were no less effective for comprehension than the plain versions.

There has been some support for the comprehension results from a neurobiological standpoint, as it has been hypothesized that adding “visual difficulties” may enhance comprehension by a viewer [8, 19].

Other studies have shown that the effects of stylistic choices and visual metaphors may not have such a significant effect on perception and comprehension [7, 39].

In response to the Bateman study, Stephen Few wrote a comprehensive critique of their methodology [14], most of which also applies to other studies. A number of these studies were conducted with a limited number of participants and target visualizations. Moreover, in some studies the visualization targets were designed by the experimenters, introducing inherent biases and over-simplifications [4, 8, 39].

Visualization Taxonomies:

Within the academic visualization community there have been many approaches to creating visualization taxonomies.

Traditionally many visualization taxonomies have been based on graphical perception models, the visual and organizational layout, as well as the graphical data encodings [6, 12, 35, 33].

Another approach to visualization taxonomies is based on the underlying algorithms of the visualization and not the data itself [32, 36].

There is also recent work on taxonomies for interactive visualizations and the additional tasks they enable [16, 17, 33, 35].

Outside of the academic community there is a thriving interest in visualization collections for the general public. For example, the Periodic Table for Management [26] presents a classification of visualizations with a multitude of illustrated diagrams for business. The online community Visualizing.org introduces an eight-category taxonomy to organize the projects hosted on their site [25]. InfoDesignPatterns.com classifies visualization design patterns based upon visual representation and user interaction [5].

Cognitive Psychology:

These studies have demonstrated that the differences in the memorability of different images are consistent across observers, which implies that memorability is an intrinsic property of an image [21, 20].

Brady et al. [9] tested the long-term memory capacity for storing details by detecting repeat object images when shown pairs of objects, one old and one new. They found that participants were accurate in detecting repeats with minimal false alarms, indicating that human visual memory has a higher storage capacity for minute details than was previously thought.

More recently, Isola et al. have annotated natural images with attributes, measured memorability, and performed feature selection, showing that certain features are good indicators of memorability [20, 21]. Memorability was measured by launching a “Memory Game” on Amazon Mechanical Turk, in which participants were presented with a sequence of images and instructed to press a key when they saw a repeat image in the sequence. The results showed that there was consistency across the different participants, and that people and human-scale objects in the images contribute positively to the memorability of scenes. That work also showed that unusual layouts and aesthetic beauty were not overall associated with high memorability across a dataset of everyday photos [20].


The taxonomy classifies static visualizations according to the underlying data structures, the visual encoding of the data, and the perceptual tasks enabled by these encodings.

It contains twelve main visualization categories and several popular sub-types for each category. In addition, we supply a set of properties that aid in the characterization of the visualizations.

This taxonomy draws from the comprehensive vocabulary of information graphics presented in Harris [15], the emphasis on syntactic structure and information type in graphic representation by Englehardt [12], and the results of Cleveland and McGill in understanding human graphical perception [11].



Dimension represents the number of dimensions (i.e., 2D or 3D) of the visual encoding.

Multiplicity defines whether the visualization is stand-alone (single) or somehow grouped with other visualizations (multiple). We distinguish several cases of multiple visualizations. Grouped means multiple overlapping/superimposed visualizations, such as grouped bar charts; multi-panel indicates a graphic that contains multiple related visualizations as part of a single narrative; and combination indicates a graph with two or more superimposed visualization categories (e.g., a line plot over a bar graph).

The pictorial property indicates that the encoding is a pictogram (e.g., a pictorial bar chart). Pictorial unit means that the individual pictograms represent units of data, such as Isotype (International System of Typographic Picture Education), a form of infographics based on pictograms developed by Otto Neurath in the early 20th century [29].

Finally, time is included, specifically as a time series, as it is such a common feature of visualizations and dictates specific visual encoding aspects regarding data encoding and ordering.

The first two attributes, black & white and number of distinct colors, give a general sense of the amount of color in a visualization.

A measure of chart junk and minimalism is encapsulated in Edward Tufte’s data-ink ratio metric [37], which approximates the ratio of data to non-data elements.

The visual density rates the overall density of visual elements in the image without distinguishing between data and non-data elements.

Finally, we have two binary attributes to identify pictograms, photos, or logos: human recognizable objects and human depiction. We explicitly chose to have a separate category for human depictions due to prior research indicating that the presence of a human in a photo has a strong effect on memorability [21].
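The properties and attributes described above can be pictured as one annotation record per visualization. The sketch below is only an illustration: the class and field names are hypothetical, and the integer coding of the data-ink and visual density ratings (1 to 3) is an assumption, since the excerpts name only the end points of those scales. The color bins follow the grouping used in the results (1 color, 2-6 colors, 7 or more).

```python
from dataclasses import dataclass
from enum import Enum


class Multiplicity(Enum):
    """The multiplicity cases named in the taxonomy."""
    SINGLE = "single"
    GROUPED = "grouped"
    MULTI_PANEL = "multi-panel"
    COMBINATION = "combination"


@dataclass(frozen=True)
class VisualizationAnnotation:
    """Hypothetical record holding one visualization's annotations."""
    category: str                    # one of the twelve main categories, e.g. "bar"
    dimension: int                   # 2 or 3
    multiplicity: Multiplicity
    pictorial: bool
    pictorial_unit: bool
    time_series: bool
    black_and_white: bool
    num_colors: int                  # number of distinct colors
    data_ink_ratio: int              # assumed coding: 1 = "bad" (low) .. 3 = "good"
    visual_density: int              # assumed coding: 1 = low .. 3 = very dense
    human_recognizable_object: bool
    human_depiction: bool

    def __post_init__(self):
        if self.dimension not in (2, 3):
            raise ValueError("dimension must be 2 or 3")
        for name in ("data_ink_ratio", "visual_density"):
            if getattr(self, name) not in (1, 2, 3):
                raise ValueError(f"{name} must be 1, 2, or 3")

    def color_bin(self) -> str:
        """Bin the color count the way the results section groups it."""
        if self.black_and_white or self.num_colors <= 1:
            return "1"
        return "2-6" if self.num_colors <= 6 else "7+"


example = VisualizationAnnotation(
    category="bar", dimension=2, multiplicity=Multiplicity.GROUPED,
    pictorial=False, pictorial_unit=False, time_series=True,
    black_and_white=False, num_colors=4, data_ink_ratio=3,
    visual_density=1, human_recognizable_object=False, human_depiction=False,
)
print(example.color_bin())
```

Keeping human depiction as its own field mirrors the authors' choice to separate it from other recognizable objects.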



There is also a very high percentage of multiple visualizations in the scientific publication category. There are two primary explanations for this observation. First, like infographics, multiple individual visualizations are combined in a single figure in order to visually explain scientific concepts or theories to the journal readers. Second, combining visualizations into a single figure (even if possibly not directly related) saves page count and money.

In contrast, a very high ratio of single visualizations is seen in government / world organizations. These visualizations are usually published one-at-a-time within government reports, and there are no page limits or space issues as with scientific journals.



Scientific publications, for example, have a large percentage of diagrams. These diagrams are primarily used to explain concepts from the article, or illustrate the results or theories. Also included are renderings (e.g., 3D molecular diagrams). The scientific articles also use many basic visual encoding techniques, such as line graphs, bar charts, and point plots. Domain-specific uses of certain visual encodings are evident, e.g., grid and matrix plots for biological heat maps, trees and networks for phylogenetic trees, etc.

Infographics also use a large percentage of diagrams. These diagrams primarily include flow charts and timelines. Also included in infographics is a large percentage of tables. These are commonly simple tables or ranked lists that are elaborately decorated and annotated with illustrations. Unlike the other categories, there is little use of line graphs.

In contrast to the scientific and infographic sources, the news media and government sources publish a more focused range of topics, thus employing similar visualization strategies. Both sources primarily rely on bar charts and other “common” (i.e., learnt in primary school) forms of visual encodings such as line graphs, maps, and tables. The line graphs are most commonly time series, e.g., of financial data. One of the interesting differences between the categories is the greater use of circle plots (e.g., pie charts) in government reports.

Looking at specific visualization categories, tree and network diagrams only appear in scientific and infographic publications. This is probably due to the fact that the other publication venues do not publish data that is best represented as trees or networks. Similarly, grid and matrix plots are primarily used to encode appropriate data in the scientific context. Interestingly, point plots are also primarily used in scientific publications. This may be due to either the fact that the data being visualized are indeed best visualized as point plot representations, or it could be due to domain-specific visualization conventions, e.g., in statistics.

Worth noting is the absence of text visualizations from almost all publication venues. The only examples of text-based visualizations were observed in the news media. Their absence may be explained by the fact that their data, i.e., text, is not relevant to the topics published by most sources. Another possible explanation is that text visualizations are not as “mainstream” in any of the visualization sources we examined as compared to other visualization types.

Performance Metrics:

Workers saw each target image at most 2 times (less than twice if they prematurely exited the game). We measure an image’s hit rate (HR) as the proportion of times workers responded on the second (repeat) presentation of the image. In signal detection terms: HR = HITS/(HITS+MISSES).

We also measured how many times workers responded on the first presentation of the image. This corresponds to workers thinking they have seen the image before, even though they have not. This false alarm rate (FAR) is calculated: FAR = FA/(FA+CR), where FA is the number of false alarms and CR is the number of correct rejections (the absence of a response).

For performing a relative sorting of our data instances we used the d-prime metric (otherwise called the sensitivity index). This is a common metric used in signal detection theory, which takes into account both the signal and noise of a data source, calculated as: d' = Z(HR) − Z(FAR) (where Z is the inverse cumulative Gaussian distribution). A higher d' corresponds to a signal being more readily detected. Thus, we can use this as a memorability score for our visualizations. A high score will require the HR to be high and the FAR to be low. This will ensure that visualizations that are easily confused for others (high FAR) will have a lower memorability score.
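The three formulas above can be put together in a few lines. This is a minimal sketch using only the Python standard library; the function names are made up for illustration, and clipping the rates away from 0 and 1 (so that Z stays finite) is an assumption, since the excerpts do not say how such edge cases were handled.

```python
from statistics import NormalDist


def hit_rate(hits: int, misses: int) -> float:
    """HR = HITS / (HITS + MISSES)."""
    return hits / (hits + misses)


def false_alarm_rate(false_alarms: int, correct_rejections: int) -> float:
    """FAR = FA / (FA + CR)."""
    return false_alarms / (false_alarms + correct_rejections)


def d_prime(hr: float, far: float, eps: float = 1e-3) -> float:
    """d' = Z(HR) - Z(FAR), with Z the inverse standard normal CDF.

    Rates are clipped to [eps, 1 - eps] (an assumption of this sketch)
    because Z is infinite at exactly 0 or 1.
    """
    z = NormalDist().inv_cdf
    hr = min(max(hr, eps), 1 - eps)
    far = min(max(far, eps), 1 - eps)
    return z(hr) - z(far)


# Plugging in the mean rates reported in the study.
print(round(d_prime(0.5536, 0.1317), 2))
```

With the reported mean HR and FAR this gives a d' of about 1.25. That is the d' of the mean rates, which is not the same thing as the mean of the per-image d' scores.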

In other words, we have measured how visualizations would be remembered if they were images. We observed a mean HR of 55.36% (SD = 16.51%) and mean FAR of 13.17% (SD = 10.73%).

For comparison, scene memorability has a mean HR of 67.5% (SD = 13.6%) with mean FAR of 10.7% (SD = 7.6%) [21], and face memorability has a mean HR of 53.6% (SD = 14.3%) with mean FAR of 14.5%(SD = 9.9%) [2]. This possibly supports our first hypothesis that visualizations are less memorable than natural scenes.

This demonstrates that there is memorability consistency with scenes, faces, and also visualizations, thus memorability is a generic principle with possibly similar generic, abstract, features.

We also measured the consistency of our memorability scores [2, 21]. By splitting the participants into two independent groups, we can measure how well the memorability scores of one group on all the target images compare to the scores of another group (Fig. 3).

Averaging over 25 such random half-splits, we obtain Spearman’s rank correlations of 0.83 for HR, 0.78 for FAR, and 0.81 for d-prime, the latter of which is plotted in Fig. 3.
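The half-split procedure can be sketched as follows. The data layout (a dictionary from participant to per-image responses) and all function names are assumptions made for illustration. The sketch scores each image by its hit rate within a group; the same loop would apply to FAR or d-prime. Spearman's correlation is computed as the Pearson correlation of tie-averaged ranks.

```python
import random
from statistics import mean


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def split_half_consistency(responses, n_splits=25, seed=0):
    """Mean Spearman correlation of per-image scores over random half-splits.

    responses: {participant_id: {image_id: 1 if the repeat was caught else 0}}
    (a hypothetical layout, not the study's actual data format).
    """
    rng = random.Random(seed)
    participants = list(responses)
    images = sorted({img for r in responses.values() for img in r})
    rhos = []
    for _ in range(n_splits):
        rng.shuffle(participants)
        half = len(participants) // 2
        scores = []
        for group in (participants[:half], participants[half:]):
            per_image = []
            for img in images:
                seen = [responses[p][img] for p in group if img in responses[p]]
                per_image.append(mean(seen) if seen else 0.0)
            scores.append(per_image)
        rhos.append(spearman(scores[0], scores[1]))
    return mean(rhos)
```

A value near 1 means the two halves rank the images almost identically, which is the sense in which the scores are consistent across participants.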

This high correlation demonstrates that the memorability of a visualization is a consistent measure across participants, and indicates real differences in memorability between visualizations.

In other words, despite the noise introduced by worker variability and by showing different image sequences to different workers, we can nevertheless show that memorability is somehow intrinsic to the visualizations.

Of our 410 target visualizations, 145 contained either photographs, cartoons, or other pictograms of human recognizable objects (from here on out referred to broadly as “pictograms”). Visualizations containing pictograms have on average a higher memorability score (Mean (M) = 1.93) than visualizations without pictograms (M = 1.14, t(297) = 13.67, p < 0.001). This supports our second hypothesis.
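The excerpts do not name the t-test variant behind these comparisons. A pooled-variance test on 145 and 265 images would have 408 degrees of freedom, so the reported 297 may point to an unequal-variance (Welch) test; that reading is an assumption. The sketch below shows how Welch's statistic and its degrees of freedom are computed. The function name and the toy scores are invented for illustration and are not data from the study.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy d-prime scores, invented for illustration only.
with_pictograms = [2.1, 1.8, 2.0, 1.7, 2.2]
without_pictograms = [1.2, 1.0, 1.3, 1.1, 0.9]
t, df = welch_t(with_pictograms, without_pictograms)
print(f"t({df:.1f}) = {t:.2f}")
```

Turning t and df into a p-value needs the Student t distribution, which the standard library does not provide, so that step is left out here.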

Thus not all chart junk is created equal: annotations and representations containing pictograms are across the board more memorable.

Thus an image, or image of a visualization, containing a human recognizable object will be easily recognizable and probably memorable. Due to this strong main effect of pictograms, we examined our results for both the cases of visualizations with and without pictograms. As shown in the left-most panel of Fig. 1, all but one of the overall top most memorable images (as ranked by their d-prime scores) contain human recognizable pictograms.

As shown in Fig. 4, there is an observable trend of more colorful visualizations having a higher memorability score: visualizations with 7 or more colors have a higher memorability score (M = 1.71) than visualizations with 2-6 colors (M = 1.48, t(285) = 3.97, p < 0.001), and even more than visualizations with 1 color or black-and-white gradient (M = 1.18, t(220) = 6.38, p < 0.001).

When we remove visualizations with pictograms, the difference between visualizations with 7 or more colors (M = 1.34) and those that have only 1 color (M = 1.00) remains statistically significant (t(71) = 3.61, p < 0.001).

Considering all the visualizations together, we observed a statistically significant effect of visual density on memorability scores with a high visual density rating of “3” (M = 1.83), i.e., very dense, being greater than a low visual density rating of “1” (M = 1.28, t(115) = 6.08, p < 0.001) as shown in Fig. 5.

We also observed a statistically significant effect of the data-to-ink ratio attribute on memorability scores with a “bad” (M = 1.81), i.e., low data-to-ink ratio, being higher than a “good” rating (M = 1.23, t(208) = 6.92, p < 0.001) as shown in Fig. 6. Note that using a corrected t-test, we also arrive at the result that the 3 levels of data-ink ratio are pairwise significantly different from each other.

Summarizing all of these attribute results: higher memorability scores were correlated with visualizations containing pictograms, more color, low data-to-ink ratios, and high visual densities.

As shown in Fig. 7, diagrams were statistically more memorable than points, bars, lines, and tables. These trends remain observable even when visualizations with pictograms are removed from the data. Other than some minor ranking differences and addition of the map category, the main difference is in the ranking of the table visualization type, which without pictograms becomes least memorable.

The middle panel of Fig. 1 displays the most memorable visualizations that do not contain pictograms. Why are these visualizations more memorable than the ones in the right-most panel?

To start with, qualitatively viewing the most memorable visualizations, most are high contrast. These images also all have more color, a trend quantitatively demonstrated in Sec. 7.2 to be correlated with higher memorability. As compared to the more subdued less memorable visualizations, the more memorable visualizations are easier to see and discriminate as images.

Another possible explanation is that “unique” types of visualizations, such as diagrams, are more memorable than “common” types of visualizations, such as bar charts. This trend is also evident in Fig. 7 in which grid/matrix, trees and networks, and diagrams have the highest memorability scores.

Examples of these unique types of visualizations are each individual and unique, whereas bar charts and line graphs are uniform with limited variability in their visual encoding methodology. Previously it has been shown that an item is more likely to interfere with another item if it has similar category or subordinate category information, but unique exemplars of objects can be encoded in memory quite well [22]. This supports our findings that show high FAR and low HR for table and bar visualizations, which both have very similar visuals within their category (i.e., all the bar charts look alike).

Another possible explanation is that visualizations like bar and line graphs are just not natural. If image memorability is correlated with the ability to recognize natural, or natural looking, objects then people may see diagrams, radial plots, or heat maps as looking more “natural”.

One common visual aspect of the most memorable visualizations is the prevalence of circles and round edges. Previous work has demonstrated that people’s emotions are more positive toward rounded corners than sharp corners [3]. This could possibly support both the trend of circular features in the memorable images as well as the concept of natural-looking visualizations being more memorable since “natural” things tend to be round.

As shown in Fig. 8, regardless of whether the visualizations did or did not include pictograms, the visualization source with the highest memorability score was the infographic category (M = 1.99, t(147) = 5.96, p < 0.001 when compared to the next highest category, scientific publications with M = 1.48), and the visualization source with the lowest memorability score was the government and world organizations category (M = 0.86, t(220) = 8.46, p < 0.001 when compared to the next lowest category, news media with M = 1.46).

This may be a contributing factor to the observed trend (see Fig. 8) that visualization sources that have non-uniform aesthetics tend to have higher memorability scores than sources with uniform aesthetics. This observation refutes our last hypothesis that visualizations from scientific publications are less memorable. This may also be due to the fact that visualizations in scientific publications have a high percentage of diagrams (Fig. 2), similar to the infographic category.

The results of our memorability experiment show that visualizations are intrinsically memorable with consistency across people.

They are less memorable than natural scenes, but similar to images of faces, which may hint at generic, abstract, features of human memory.

Not surprisingly, attributes such as color and the inclusion of a human recognizable object enhance memorability. And similar to previous studies we found that visualizations with low data-to-ink ratios and high visual densities (i.e., more chart junk and “clutter”) were more memorable than minimal, “clean” visualizations.

It appears that we are best at remembering “natural” looking visualizations, as they are similar to scenes, objects, and people, and that pictorial and rounded features help memorability.

More surprisingly, we found that unique visualization types (pictorial, grid/matrix, trees and networks, and diagrams) had significantly higher memorability scores than common graphs (circles, area, points, bars, and lines).
