Communicating data: interactive infographics, scientific data and credibility

18/06/2018

Abstract: 

Information visualization could be used to leverage the credibility of displayed scientific data. However, little was known about how display characteristics interact with individuals' predispositions to affect perception of data credibility. Using an experiment with 517 participants, we tested perceptions of data credibility by manipulating data visualizations related to the issue of nuclear fuel cycle based on three characteristics: graph format, graph interactivity, and source attribution. Results showed that viewers tend to rely on preexisting levels of trust and peripheral cues, such as source attribution, to judge the credibility of shown data, whereas their comprehension level did not relate to perception of data credibility. We discussed the implications for science communicators and design professionals.


Commonly described as a “computer-supported, interactive visual representation of abstract data” [Card, Mackinlay and Shneiderman, 1999, p. 9], information visualization has undergone a surge in its number of applications to science communication in the past twenty years [Welles and Meirelles, 2014]. Innovative forms of data visualizations, ranging from simple proportional area charts showing global carbon footprints [e.g. Lavelle, 2013] to complex 3D animations representing the results of biomedical scanning [e.g. Animation World Network, 2015], have gained increasing popularity among the scientific community. Scientists, researchers, and data professionals have employed computational visualizations to reveal data patterns that are not discernible when presented in non-visual formats. Interactive visual representations are used to augment analytical reasoning processes, which empower audiences to explore visual data to obtain decision-supporting insights and knowledge [Fisher, Green and Arias-Hernández, 2011; Thomas and Cook, 2005]. More recently, the rise of data journalism has fueled interest in visual narratives in which an interactive visual plays a vital role in engaging a mass audience [Segel and Heer, 2010].

For science communicators, the potential utility of information visualization expands beyond visually representing a dataset or empowering calculative analysis. Information visualization and other forms of visual displays have been put forward as tools to facilitate public understanding of science and to mitigate the persistent influence of misinformation [e.g., Dixon et al., 2015; O’Neill and Smith, 2014]. For instance, visuals (e.g., pie charts) were shown to be more effective than text-only materials when conveying the scientific consensus on climate change to people with skeptical views [van der Linden, Clarke and Maibach, 2015; van der Linden et al., 2014]. In addition, individuals viewing visual exemplars accompanied with a textual description of the debunked MMR-autism linkage ended up having more accurate views than those reading two-sided information with no visuals [Dixon et al., 2015]. More importantly, people turned out to be less likely to disregard messages that threaten their beliefs or group identities if they were encouraged to make sense of the data through a visual display and if scientific credibility was leveraged in the process [Hall Jamieson and Hardy, 2014].

In spite of the growing interest in leveraging scientific credibility through visual techniques, little theory has considered the effects of visual characteristics, such as graph format and source attribution, on the perceived credibility of visualized data. To our knowledge, no studies have examined how people assess the credibility of visually displayed data based on their predispositions, such as attitudes toward data source, numeracy skills, and self-perceived efficacy. With these considerations in mind, we intended to examine the effects of extrinsic factors, specifically visual format, interactivity, and source attribution, on lay audiences’ perception of data credibility. We also tested the relationship between perception of data credibility and individuals’ predispositions, comprehension, and evaluations of design quality.

1 Nuclear fuel cycle as a case study

To examine the aforementioned processes, we chose to use the issue of nuclear fuel cycle as a case study. The term “nuclear fuel cycle” refers to all activities involved in the production of nuclear energy, which typically includes uranium mining, enrichment, fuel fabrication, waste management and disposal. Depending on the specification (e.g., once-through or advanced cycles), nuclear fuel cycles can impose varying economic and environmental influences on local communities adjacent to nuclear facilities [see Wilson, 2011, for a review]. As of 2016, there were 99 nuclear reactors in 30 states in the United States, producing 19.7% of the total electrical output and 63% of carbon-free electricity [World Nuclear Association, 2018]. However, despite its reliance on nuclear energy, the U.S. government had not granted permission to construct any new reactor since 1977 until 2013, largely because of public fears resulting from the Three Mile Island accident in 1979 [World Nuclear Association, 2018]. Nonetheless, public opinion remained generally favorable toward nuclear energy after the 2011 Fukushima Daiichi nuclear accident in Japan, as 60% of Americans regarded nuclear power generation as “inevitable” [Kitada, 2016]. In addition, the tone of English-language tweets on nuclear energy had shifted from being predominantly pessimistic to neutral over the first nine months after the Fukushima accident [Li et al., 2016].

While public support did not drastically decline after Fukushima, local opposition to expanding nuclear energy has never ceased. For instance, despite being a supplier of affordable power to New York City and Westchester County, the Indian Point power plant was slated to shut down as soon as 2021 due to the “serious risks posed to the surrounding communities and the environment” [Yee and McGeehan, 2017, para. 7]. Activists, local officials and concerned citizens were worried about the potential risks and used the Fukushima Daiichi nuclear accident to galvanize support for shutting down the Indian Point power plant.

Given the political controversy surrounding the domestic use of nuclear power, scientists and technical experts alike are obligated to demonstrate the risks and benefits of nuclear energy to concerned citizens and community leaders. In particular, to maximize the legitimacy of their policy decisions, policymakers and local officials would need to justify their decisions based on scientific data and an empirical comparison of the performance of different fuel cycle options [Li et al., 2018]. Indeed, there are a few simulation and visualization tools being developed nationwide with an aim to inform policymakers’ decisions [see Flanagan and Schneider, 2013, for an example]. The issue of nuclear fuel cycle hence presents an ideal context to test how the presentation format of scientific data might influence nontechnical audiences’ perception of data credibility. An empirical test of the effectiveness of interactive visualizations will not only shed light on the cognitive mechanism underlying people’s processing of such information, but also assist scientists with refining their visualization tools to achieve a better end.

Nevertheless, to avoid the potential confounding impact of individuals’ preexisting attitudes toward nuclear energy on their perception of data credibility, we accompanied the tested visuals with neutral and highly technical discourse, such as costs of “wet storage,” “dry storage,” “repository,” and “waste recycling.” Such discourse should prevent participants from linking a technical comparison of fuel cycle performance to societal debates of nuclear energy. To ensure the scientific validity of shown stimuli, we teamed up with nuclear scientists at a research university to develop visualizations as experimental stimuli.

2 Perception of data credibility

Data credibility is one of the most important components of data quality [Wang and Strong, 1996]. Individuals often evaluate data credibility based on their perceptions of characteristics such as accuracy and trustworthiness — an overarching category that includes aspects of currency, completeness, internal consistency, and subjectivity [Wang and Strong, 1996]. Not surprisingly, when people perceive a piece of information to be highly credible, they often develop positive attitudes toward its source [e.g. Hall Jamieson and Hardy, 2014]. In particular, information sources will be judged favorably “when identifiable characteristics of the source, content, delivery, and context prompt the conclusion that the communicator has expertise on the issue at hand and interests in common with the audience” [Hall Jamieson and Hardy, 2014, p. 13599]. For example, when people perceive a commercial website to be credible and informative, they are more likely to build a relationship with the organization who owns it [Lowry, Wilson and Haig, 2014]. The leveraged favorability resulting from evidence exposure might also minimize the likelihood that audiences will reject the conveyed message due to biased processing [Hall Jamieson and Hardy, 2014].

2.1 Visual format

The format of a visual display usually plays a dominant role in shaping viewers’ perceived credibility of the shown content. For instance, when evaluating the credibility of a website, people often mentioned the aesthetic appearance, such as layout, typography, images, and color themes [Fogg et al., 2003]. Designers commonly manipulate certain visual characteristics to give the appearance of credibility. For example, icons that look more dated usually imply longevity and stability and may increase perception of credibility for an old-fashioned company [Lowry, Wilson and Haig, 2014]. In contrast, logos with pieces of characters intentionally missing (termed “incomplete typeface logos”) reduced perceptions of brand trustworthiness for certain companies [Hagtvedt, 2011]. In other words, communicators can boost perception of credibility by incorporating visual cues that imply relevant concepts, such as sound experience or professionalism.

In a similar vein, researchers found the appearance of being “scientific” could increase message persuasiveness through elevating perceptions of credibility. Tal and Wansink [2016] randomized participants into two groups, with one group reading a verbal narrative about a new medication that ostensibly enhances the immune system and reduces occurrences of common cold; the other group read the same message accompanied by a bar graph showing the enclosed data. Compared to the text-only group, people who saw the bar graph were more likely to believe the medication was effective. As the authors argued, although the bar graph does not contain any new information, the visual can “signal a scientific basis for claims” that lead people to believe the message is scientifically legitimate and credible [Tal and Wansink, 2016, p. 7].

Similarly, other graphs commonly used to show descriptive statistics, such as line or area graphs, may also appear “scientific” and create a pseudo sense of trustworthiness among viewers. However, when viewing nontraditional forms of visualizations, such as proportional area graphs (also known as “bubble graphs”), people might be suspicious as they lack the scientific feel embedded in classic graphs. Our first hypothesis (H) addresses the relationship between visual format and perceived data credibility:

H1a: perceived credibility of visualized data is higher when data is presented in a traditional graph (e.g., area graph) than when it is presented in an innovative graph (e.g., proportional area graph).

2.2 Interactivity

Additionally, as digital technologies mature and further integrate with the Web, it becomes possible to include various levels of interactivity in information visualization. Prominent news organizations, including The New York Times, Washington Post, and Guardian, regularly incorporate interactive data visualizations into their news stories. By employing animation techniques, such as zooming, filtering, linking, and drill-down operations, users can freely explore visualized data and find the exact data value of interest [Segel and Heer, 2010]. These techniques also support tasks such as data diagnostics, pattern discovery, and hypothesis formation [Hegarty, 2011]. In addition, interactive visualizations can encourage author-reader interaction by inviting readers to freely explore specific data or details within a larger framework set up by the author [Segel and Heer, 2010].

Nonetheless, empirical evidence on the actual effectiveness of interactive visualizations is mixed. While some suggested that interactive graphics are superior to static ones, especially for situations where people are asked to track moving objects within a display or to follow data trends over time [Heer and Robertson, 2007], others argued this might not be true if the interactions were too complex [Tversky, Morrison and Betrancourt, 2002]. Recent research suggested that interactive visualizations only augment comprehension when they allow users to (a) offload internal mental computations onto external manipulations of the display itself; and (b) filter out task-irrelevant information [Hegarty, 2011].

With respect to the potential relationship between visual interactivity and perceived data credibility, people may perceive the data to be more credible when viewing an interactive display because of the precision and autonomy it affords. Interactive visualizations usually offer a greater level of precision than static ones. For example, an individual may conclude that the population of a region lies between 40,000 and 50,000 based on her quick reading of a static map; with an interactive display, the same individual can easily figure out that the exact population of the region is 42,317 [Maciejewski, 2011]. Since humans often misinterpret precision as accuracy, people viewing an interactive graph that shows precise numbers may perceive it to be accurate and hence attach more credibility to it [Wang and Strong, 1996].

In addition, because interactive displays encourage people to explore the data and make sense of it by themselves, thereby empowering them, this type of visual display may increase the perception of credibility. Previous research suggested that when people are prompted to achieve an autonomous understanding of mediated information, they tend to assign more importance and credibility to it [Hall Jamieson and Hardy, 2014; Sillence et al., 2007]. We therefore propose the following hypothesis:

H1b: perceived credibility of visualized data is higher when the data is presented in an interactive graph than when it is presented in a static graph.

2.3 Source attribution and trust

As with any type of mediated information, people can rely on peripheral cues, such as source attribution, to judge the credibility of visualized data. Atkinson and Rosenthal [2014], for instance, presented participants with eco-labels certified by either the United States Department of Agriculture (USDA), or the product manufacturer. Results showed that consumers found the USDA label more trustworthy than the corporate label, and developed more favorable attitude toward the USDA-labeled product. Similarly, participants were more inclined to believe a science story from an .edu site (indicating a website from a higher education institution), than a .gov site (indicating a government website) [Treise et al., 2003].

Indeed, human beings are cognitive misers, or at least satisficers, who collect only as much information about a topic as they think is necessary to reach a decision [Popkin, 1991]. Therefore, when facing a situation in which they do not have enough information to judge the credibility of a dataset, people will make an informed guess based on their confidence in the source. Particularly for scientific topics that are remote from everyday experience and characterized by highly technical discourse, people are likely to engage in heuristic processing and rely on endorsement from experts to make judgments [Brossard and Nisbet, 2007]. Previous research found that the American public has different levels of trust in social institutions (e.g., university scientists, federal agencies, and regulators) regarding the development of risky technologies. Generally, university scientists are rated more favorably than federal agencies as sources of risk-related information [e.g. Whitfield et al., 2009]. Therefore, we hypothesize that the perceived credibility of visualized data will vary as a function of source attribution:

H2a: perceived credibility of visualized data varies as a function of source attribution.

However, this hypothesized relationship between source attribution and data credibility might be conditional on an individual’s trust in the source. For example, those who assign equal levels of trust to university scientists and governmental agencies might then ascribe similar levels of confidence in evidence attributed to each of them, rather than rating university scientists higher. In fact, heuristic cues (or “mental shortcuts”) work most effectively when they resonate with long-term schemas held by audiences [see Scheufele and Iyengar, 2013, for a review]. However, this sensitivity to source manipulation applies only when the embedded heuristic (i.e., source attribution) is relevant to individuals’ underlying schema. For example, if university scientists were perceived more trustworthy than governmental agencies, people might think information from the former party is more credible than from the latter. It is also possible for those who favor governmental agencies to perceive their data as more credible than that from university scientists. A hypothesis is therefore proposed:

H2b: the relationship between perceived credibility of visualized data and source attribution varies for people with different levels of trust in the given sources.

2.4 Self-assessed design quality

In addition to extrinsic factors, such as design characteristics and source attribution, visualized evidence evaluations may also be influenced by perceived design quality. Design quality, in our case, refers to individuals’ subjective evaluations of whether the information is presented in a visually clear and concise manner based on design elements (e.g., color, font, layout etc.). During an initial scan by an individual, visualizations are usually viewed as one holistic message. Champlin et al. [2014] argue that visual media “is first viewed as a whole before drilling down to interpret the content word by word or through specific visual graphics” (p. 285). After the initial holistic interpretation, judgments and impressions about visual messages often focus on clarity and complexity [Champlin et al., 2014].

Information clarity, or the extent to which the information can be easily understood, is frequently mentioned by Internet users when asked to evaluate a site’s credibility [Sillence et al., 2007]. Additionally, the presence of a “moderately complex” layout, which can be achieved by a deliberate balance of information and graphic design elements, suggests greater design quality for a visual message [Geissler, Zinkhan and Watson, 2006]. Research showed that health advertisements of a mid-level design complexity consistently received more positive evaluations (e.g., like it more, easier to understand, and includes more information about health) than either more or less complex advertisements [Champlin et al., 2014]. Also, digitalized health messages with higher design quality led viewers to perceive the content to be more informative [Lazard and Mackert, 2014]. A hypothesis is therefore proposed:

H3: perception of credibility for visualized data positively relates to viewers’ subjective evaluation of the graph’s design quality.

2.5 Comprehension

Another factor that might influence perception of credibility is the extent to which viewers could comprehend the visualized content. While comprehension can mean various things in different contexts, we focus on translating and interpreting visualized evidence [Shah, Mayer and Hegarty, 1999]. Translation means to describe the visualized content and to identify specific values of interest. Interpretation, in contrast, means to look for relationships among variables and to sort out important factors [Shah, Mayer and Hegarty, 1999].

So far, research examining the relationship between comprehension and perceived credibility of visualized data has found conflicting results. One study showed that giving audiences information through visuals with the intent of enhancing understanding of the shown data can help increase perceived credibility of science [Hall Jamieson and Hardy, 2014]. Yet, another study found that stories including a graphic are rated as less trustworthy than the same story without it, despite improvement in understanding the conveyed numerical information associated with the graphic [Johnson and Slovic, 1995]. Given these mixed findings, we propose a research question (RQ) regarding the relationship between comprehension and perceived data credibility:

RQ1: what is the relationship between viewers’ perceived credibility of visualized data and their comprehension?

2.6 Predispositions

Predispositions, including graph efficacy, numeracy skills, and domain knowledge, may potentially influence perception of credibility given their intrinsic relationships with comprehension. Cognitive psychologists have long contended that comprehension hinges on graph efficacy, which refers to people’s perceived capabilities to comprehend graphically represented information [Galesic and Garcia-Retamero, 2011]. More than just an assessment of task-specific abilities, graph efficacy predicts how well people can understand a given standard graph [Garcia-Retamero and Galesic, 2010]. In a similar vein, numeracy skills, a measure of the ability and disposition to make use of quantitative information, also influence comprehension in visual contexts [Fagerlin et al., 2007; Garcia-Retamero and Galesic, 2010]. Research shows that people with low graph efficacy often have low numeracy skills ratings. Predictably, graphic tools help low-numeracy people with relatively high graph literacy to understand the results of a randomized experiment, but do not help those with low graph literacy [Garcia-Retamero and Galesic, 2010]. Additionally, knowledge about a specific topic of interest, which helps people direct their attention to task-relevant information while ignoring irrelevant information in a visual display, shapes understandings of the displayed content [Hegarty, Canham and Fabrikant, 2010]. To factor out the potential implications of these dispositional factors on the perception of credibility, we included self-reported measures of graph efficacy, numeracy skills, and domain knowledge as independent variables in the model.

3 Methods

3.1 Participants

We recruited participants from a number of courses at a large Midwestern university in May 2014 and asked them to complete a computer-assisted experiment at one of the two designated locations on campus. Notably, the state where the university is located had only one operating nuclear power plant in 2014, producing 14% of the state’s electricity [Public Service Commission of Wisconsin, 2013]. In fact, the state Assembly passed a bill in 2016 lifting a restriction on new nuclear power plants, which would “place nuclear power ahead of natural gas, oil and coal on the state’s prioritized list of energy sources” [Beck and DeFour, 2016, para. 2].

Upon survey completion, participants received extra course credit as compensation and were given a short debrief after participation. In total, 517 valid responses were collected. Participants majored in a wide variety of fields, ranging from natural sciences or engineering (28.7%) to humanities (31.9%) and social sciences (32.9%). Most participants (98.1%) were between 18 and 24 years old (M=20.3, SD=5.3). Sixty-four percent of participants were female. Notably, ninety-five percent of participants had taken at least one, while 17.4% had taken more than five, college-level courses in the field of science or engineering. As participants who had more formal education in scientific fields might be more familiar with data visualization and its conventions, we included the number of science courses as a control variable to factor out potential confounding effects.

3.2 Procedure

During the experiment, participants were first asked about their knowledge of and attitudes toward the nuclear fuel cycle development, as well as trust in various social institutions. Then they were randomly assigned to one of eight conditions. Each condition included viewing a long-term projection of the performance of three different nuclear fuel cycles between 2000 and 2100. Each comparison focused on either (a) the projected volume of waste streams produced by each fuel cycle or (b) the cost projections for waste disposal. Researchers specializing in nuclear engineering provided the simulated data and collaborated in designing the stimuli.

Individuals participated in the experiment in a lab setting and did not have access to external sources of information. While viewing a specific graph, participants were asked to retrieve numerical values and to answer questions about the characteristics and performance of different fuel cycle options. After finishing the tasks, participants reported how credible the shown data was, evaluated the design quality, and answered questions measuring numeracy skills and assessing demographics.

3.3 Conditions

The experimental stimuli followed a between-subjects 2 (traditional area chart vs. innovative proportional area chart) × 2 (static vs. dynamic) × 2 (university scientist vs. governmental agency) design. Within each of the conditions, three separate charts representing information for three different fuel cycle options were juxtaposed. Each stimulus included a brief introduction about either the costs or the radioactive waste associated with the nuclear fuel cycle in question. Additional information was provided about each specific type of cost/waste shown in the stimuli.
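
As a minimal sketch of how the eight cells of this factorial design could be enumerated and assigned, the following assumes simple equal-probability randomization; the text states only that participants were "randomly assigned," so the scheme, the label strings, and the function name `assign_condition` are illustrative assumptions rather than the procedure actually used.

```python
import itertools
import random

# Factor levels from the design description; the label strings are illustrative.
FORMATS = ("area", "proportional_area")
INTERACTIVITY = ("static", "dynamic")
SOURCES = ("university", "government")

# Full factorial crossing: 2 x 2 x 2 = 8 cells.
CONDITIONS = list(itertools.product(FORMATS, INTERACTIVITY, SOURCES))


def assign_condition(rng):
    """Assign one participant to a cell with equal probability (assumed scheme)."""
    return rng.choice(CONDITIONS)


# Arbitrary seed so the sketch is reproducible; not a value from the study.
rng = random.Random(2014)
assignments = [assign_condition(rng) for _ in range(517)]
```

Simple randomization of this kind does not guarantee equal cell sizes; a blocked scheme would be needed for that.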

Graph format. In the traditional area chart conditions, data were plotted in an x-y plane, with the filled area representing the distribution of cost projections/waste volume (y-axis) across the time period (x-axis) (see Figure 1). In contrast, each proportional area chart (also known as a bubble chart) included a hierarchical array of circles representing various types of cost projections/waste volume associated with each fuel cycle, the size of which was proportional to the data value (see Figure 2). This graph type was adapted from real data visualizations showing similar information on carbon emissions and budget proposal [Lavelle, 2013; Shan, 2012]. Other visual characteristics, including color themes, font type and size, and layout, were held constant across conditions to rule out any potential confounding impacts.



Figure 1: Dynamic area chart showing the costs of waste storage and disposal for three nuclear fuel cycles between 2000 and 2100.




Figure 2: Dynamic proportional area chart showing the costs of waste storage and disposal for three nuclear fuel cycles between 2000 and 2100.


Interactivity. To manipulate the degree of interactivity, we created dynamic and static versions for both types of graphs. While the static and dynamic area charts contained the same information, participants could retrieve the exact data value only when viewing the dynamic graph. Specifically, for the dynamic area chart, participants could hover their cursors over the plot area to display a pop-up square containing the y-coordinate value (i.e., cost or waste volume) for each x-coordinate (i.e., year) (see Figure 1). When viewing a dynamic bubble chart, participants could adjust an animated slider controlling the timeframe and view data for a specific year (see Figure 2). Differing from traditional area charts, bubble charts represented data for one year at a time rather than showing the overall distribution in a single graphic. For this reason, as it was not possible to represent all of the data in a single static bubble chart (analogous to the complete data displayed in the static traditional area chart), it only contained minimal information (i.e., data for 2000, 2050, and 2100) that allowed participants to answer the comprehension questions.

Data source. Additionally, a data source manipulation was included to prompt participants to ascribe the shown data to different institutions. In the stimuli, we included a logo from either the Massachusetts Institute of Technology (MIT) to represent a university source or the U.S. Department of Energy (DOE) as a governmental source, both institutions likely to be sources for energy related data.

3.4 Measures

Dependent variable. Perceived data credibility was measured using a five-point scale (1 = strongly disagree, 3 = neither agree nor disagree, 5 = strongly agree), asking participants to rate the following statements: “the data are trustworthy,” “the data are produced by a reputable source,” “the data are accurate,” “the data are error-free,” “the data are incorrect” (reverse coded), “the data are unbiased,” and “the data are objective.” We averaged these items to create an index with scores ranging from 1 to 5 (M=3.33, SD=.43, Cronbach’s alpha=.72).
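
The index construction described above (reverse coding, averaging, and a reliability check) might look roughly like the sketch below. The item keys, the function names, and the example responses are hypothetical, and the alpha function uses the standard textbook formula with sample variances, which may differ from the software the authors used.

```python
from statistics import mean, variance


def reverse_code(score, low=1, high=5):
    """Flip a Likert score so that higher values mean higher credibility."""
    return low + high - score


def credibility_index(responses, reverse_items=("incorrect",)):
    """Average one participant's item scores after reverse coding."""
    scores = [
        reverse_code(value) if item in reverse_items else value
        for item, value in responses.items()
    ]
    return mean(scores)


def cronbach_alpha(rows):
    """Cronbach's alpha for per-participant score lists (already reverse coded)."""
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical participant, not data from the study.
example = {
    "trustworthy": 4, "reputable_source": 4, "accurate": 3, "error_free": 2,
    "incorrect": 2, "unbiased": 3, "objective": 3,
}
index = credibility_index(example)
```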

Independent variables. Comprehension was measured by six multiple-choice questions. Three questions asked participants to identify specific data points, such as “(What was the cost of wet storage and dry storage/How much wet storage and dry storage generated) for the Nuclear Fuel Cycle 1 in 2000?” The other three questions asked participants to interpret the graph by comparing data points, such as “Among the three nuclear fuel cycles, which one (costs most/generates the most total waste) in 2000?” and “On average, which nuclear fuel cycle costs most over time? Nuclear Fuel Cycle 1, 2, or 3.” An index (range 0–6) was created based on the cumulative number of correct answers (M=4.77, SD=1.46, Kuder-Richardson Formula 20=.622).
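
For readers unfamiliar with the reliability statistic reported here, a small sketch of the comprehension score and of the Kuder-Richardson Formula 20 follows. It assumes answers are scored 0/1 and uses population variance for the total score, which is one common convention; the function names and the answer key format are hypothetical.

```python
from statistics import pvariance


def comprehension_score(answers, key):
    """Count correct answers, giving a score from 0 to len(key)."""
    return sum(a == k for a, k in zip(answers, key))


def kr20(matrix):
    """KR-20 for a list of per-participant 0/1 item vectors.

    Population variance is used for the total score so that it matches the
    p * (1 - p) item variances (an assumed convention).
    """
    k = len(matrix[0])
    n = len(matrix)
    proportions = [sum(column) / n for column in zip(*matrix)]
    sum_pq = sum(p * (1 - p) for p in proportions)
    total_variance = pvariance([sum(row) for row in matrix])
    return k / (k - 1) * (1 - sum_pq / total_variance)
```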

Self-assessed design quality was measured by seven items using a five-point scale (1 = strongly disagree, 3 = neither agree nor disagree, 5 = strongly agree), asking participants if they think the graph “is interpretable,” “shows a clear picture of the data,” “is easy to understand,” “is readable,” “represents the data well,” “is concise,” and “organizes the data well.” We averaged these items to form an index, ranging from 1 to 5 (M=3.82, SD=.71, Cronbach’s alpha=.91).

Relative trust in university scientists versus governmental agencies was operationalized as the difference in scores between individuals’ trust in university scientists and that in governmental agencies. Participants were asked to indicate their trust in different institutions “to tell the truth about the risks and benefits associated with the nuclear fuel cycle” on a five-point scale (1 = do not trust their information at all, 5 = trust their information very much). A difference score was calculated for each individual by subtracting trust in “federal agencies, such as the U.S. Department of Energy” from that in “university scientists” (M=.36, SD=.96). A breakdown shows that 20.9% of the subjects trusted federal agencies more than university scientists, 24.4% saw them as equally trustworthy, and 54.7% expressed more trust in university scientists.
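
The difference score and the three-way breakdown can be expressed compactly. In this sketch the function names and group labels are illustrative, and the inputs are the two five-point trust ratings described above.

```python
def relative_trust(trust_university, trust_federal):
    """Difference score: positive values mean more trust in university scientists."""
    return trust_university - trust_federal


def trust_group(difference):
    """Classify a difference score into the three groups reported in the breakdown."""
    if difference > 0:
        return "university_higher"
    if difference < 0:
        return "federal_higher"
    return "equal"
```

With two five-point ratings, the difference score ranges from -4 to 4.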

Self-reported graph efficacy was measured based on a modified version of a computer efficacy measure (i.e., individuals’ beliefs about their abilities to competently use computers) [Compeau and Higgins, 1995]. Four items were asked on a five-point scale (1 = strongly disagree, 3 = neither agree nor disagree, 5 = strongly agree), including “I believe I have the ability to (understand data points in a graph/identify trends shown in a graph/make appropriate decisions based on a graph)” and “I could understand a graph even if there was no one around telling me what to do.” These items were averaged to create an index (M=3.79, SD=.70, Cronbach’s alpha=.84).

Subjective numeracy was measured with items adapted from the subjective numeracy scale of Fagerlin et al. [2007]. Three questions asked participants to indicate their agreement with the following statements: “I am good at (working with fractions/working with percentages/calculating a 15% tip)” (1 = strongly disagree, 3 = neither agree nor disagree, 5 = strongly agree); one question asked “when people tell you the chance of something happening, do you prefer that they use words or numbers?” (1 = prefer words, 5 = prefer numbers); and one asked “when you hear a weather forecast, do you prefer predictions using percentages or predictions using only words?” (1 = prefer percentages, 5 = prefer words; reverse coded). An index was created based on the average score (M=3.58, SD=.72, Cronbach’s alpha=.69).
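Reverse coding the weather-forecast item before averaging can be sketched as below. The responses and the column order are assumptions made for illustration; `reverse_code` is a hypothetical helper.

```python
import numpy as np

def reverse_code(x, low=1, high=5):
    """Flip a rating so that the scale's low end becomes its high end."""
    return (low + high) - np.asarray(x)

# Hypothetical raw responses; assumed column order:
# fractions, percentages, tip, words-vs-numbers, weather forecast
raw = np.array([
    [4, 5, 4, 4, 2],
    [2, 3, 3, 2, 5],
])

items = raw.astype(float)
items[:, 4] = reverse_code(raw[:, 4])   # after recoding, 5 = prefers percentages
numeracy_index = items.mean(axis=1)     # average of the five items
```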

Self-reported domain knowledge was measured using a five-point scale (1 = very unfamiliar, 3 = neither familiar nor unfamiliar, 5 = very familiar) asking participants how familiar they felt they were with “nuclear energy production,” “health implications of nuclear energy,” “environmental implications of nuclear energy,” “nuclear waste management,” and “economics of nuclear power-related facilities.” We averaged these items to form an index (M=2.6, SD=1.02, Cronbach’s alpha=.90).

In addition, age, gender, the field of one’s academic major (0 = social sciences/humanities/business/medical sciences, 1 = engineering/natural sciences), and the number of science courses (M=5.02, SD=3.72) taken in college were added as control variables to avoid any potentially confounding effect on the outcome.

3.5 Analytical framework

We analyzed the data using a hierarchical ordinary least squares (OLS) regression model. Independent variables were entered in blocks to determine their relative explanatory power. The first block included three dichotomous variables representing the experimental treatments (i.e., graph format, interactivity, and source attribution). Control variables, including age, gender, major field, and the number of science courses, were added in Block 2. Block 3 contained predispositions, whereas Block 4 included graph comprehension and perceived design quality. To examine the hypothesized interactive effect of source attribution and relative trust on perceived data credibility, we created an interaction term by multiplying source attribution and the standardized score of relative trust (Block 5). Standardizing relative trust helps prevent multicollinearity between the interaction term and its component parts [Cohen and Cohen, 1975].
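The blockwise procedure can be sketched with NumPy alone. Everything below is an assumption made for illustration: the data are simulated rather than taken from the study, the column layout is invented, and `r_squared` is a hypothetical helper. The sketch fits nested models block by block and records the increment in explained variance contributed by each block.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on X; an intercept column is added here."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    centered = y - y.mean()
    return 1 - (resid @ resid) / (centered @ centered)

rng = np.random.default_rng(0)
n = 500

# Simulated stand-ins for the five blocks (not the study's data)
treatments = rng.integers(0, 2, size=(n, 3)).astype(float)  # format, interactivity, source
controls = rng.normal(size=(n, 4))          # age, gender, major, science courses
predispositions = rng.normal(size=(n, 4))   # relative trust assumed in column 0
evaluations = rng.normal(size=(n, 2))       # comprehension, perceived design quality

# Interaction term: source attribution x standardized relative trust
source = treatments[:, 2]
trust = predispositions[:, 0]
trust_z = (trust - trust.mean()) / trust.std(ddof=1)
interaction = (source * trust_z)[:, None]

# Simulated outcome with arbitrary effect sizes
y = (0.3 * source + 0.2 * evaluations[:, 1]
     + 0.25 * source * trust_z + rng.normal(size=n))

blocks = [treatments, controls, predispositions, evaluations, interaction]
r2 = [r_squared(np.column_stack(blocks[:i]), y) for i in range(1, len(blocks) + 1)]
delta_r2 = np.diff([0.0] + r2)   # incremental R^2 attributable to each block
```

Because the models are nested, each entry of `delta_r2` is non-negative, and the entries sum to the R² of the full model.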

4 Results

Overall, the model explained 14.9% of the variation in perceived data credibility (see Table 1). Age was negatively related to perceived credibility of the visualized data (β=−.14, p<.001), indicating that younger participants were more likely than older ones to think the presented data were credible.


Table 1: OLS regression model predicting data credibility.

H1a and H1b addressed the potential influences of graph format and graph interactivity on perceived data credibility. H1a was not supported, as viewers’ perceived credibility did not vary when they were shown different graph formats. Although interactivity was significantly related to the dependent variable (β=−.09, p=.046), the relationship was negative: people were less likely to think the data was credible when viewing a dynamic graph than when viewing a static one, which contradicted our prediction. Therefore, H1b was not supported.

Nonetheless, consistent with H2a, source attribution influenced how people assessed data credibility. In particular, people who viewed data attributed to MIT perceived significantly higher credibility than those who viewed data attributed to DOE (β=.14, p=.001). In addition, H2b, which proposed differentiating effects of source attribution on people with varying levels of trust in data sources, also received substantial support. Results showed that people who trusted university scientists more than governmental agencies were more likely to think the data was credible when it was attributed to MIT than to DOE. For those who assigned equal amounts of trust to both parties or who trusted governmental agencies more, perceived data credibility did not differ across treatment conditions (see Figure 3).



Figure 3: Interactive effect of relative trust and source attribution on perceived data credibility.


H3 proposed that perceived data credibility is positively related to self-assessment of graph quality, which was supported by a significant positive relationship between the two variables (β=.16, p=.001). However, graph comprehension, which measured the accuracy of viewers’ understanding of the stimuli, was not significantly related to perceived credibility. Lastly, graph efficacy, which measured self-reported ability to read and use graphical tools, was positively related to the outcome variable (β=.10, p=.037).

5 Discussion

Science communicators and scholars have expressed increasing interest in leveraging visual techniques to represent complex data-based information about scientific issues, such as climate change and risky technologies. However, despite the growth of such interventions in various contexts, including journalistic reporting [Dixon et al., 2015], classroom teaching [Teoh and Neo, 2007], and user-centered design [Rodríguez Estrada and Davis, 2015], little is known about how people judge the accuracy and trustworthiness of information based on display characteristics and individual predispositions. Drawing on theories developed in fields such as visual cognition, human-computer interaction, marketing, and science communication, we proposed a conceptual framework that captures some of the cognitive processes underlying perceptions of the credibility of displayed scientific data.

Before discussing our findings in detail, we should note a number of methodological considerations. First, we used only one issue (i.e., nuclear fuel cycles) to test the proposed framework, which could potentially limit the generalizability of our findings. Future researchers would need to replicate this study using a variety of other issues to verify the validity of the proposed framework. In particular, individuals’ preexisting attitudes toward the issue might interfere with how they interpret the shown data. Further research needs to examine how people’s preexisting attitudes might play a role in shaping their processing of visualized data.

Contrary to what we expected, visual format was not related to the perception of data credibility, and interactivity was related to it negatively rather than positively. Although we carefully chose the two types of displays (i.e., area graph and proportional area graph) based on their popularity and comparability, they might not differ drastically in how “scientific” they look to our participants, who were a group of college students majoring in both science and non-science fields. Especially given students’ low familiarity with the nuclear fuel cycle, they might lack an intuitive sense of how this type of data is typically presented and hence viewed the two given displays as equivalently legitimate and acceptable. In addition, while the results suggested that perceived credibility of visualized data varies as a function of source attribution, such a relationship might manifest differently for different populations. For example, although student participants found the MIT-sourced data more credible than the DOE-sourced data, the opposite might be true for people working in the nuclear industry.

Second, we manipulated interactivity along two dimensions: animation and precision. Compared with static displays, interactive visualizations allow users to filter out task-irrelevant information while obtaining numerical information with greater precision. However, these are not the only ways in which interactivity can function in real visualization design. The effects of other interactive features, such as animated slideshows and drill-down stories, should be studied in future research. Notably, the proposed framework explained only 15% of the variation in the dependent variable; researchers might want to incorporate additional factors, such as issue involvement and perceived persuasive intent, to develop a more robust model in the future.

With these limitations in mind, our study generated important, two-fold findings. First, individuals with limited knowledge about a scientific issue, such as the nuclear fuel cycle, tend to rely on heuristic cues, such as design quality and source attribution, to judge the credibility of visualized data. Researchers have long contended that design quality serves as a heuristic cue for the viewer to assess the quality and trustworthiness of the information displayed [Champlin et al., 2014; Sillence et al., 2007]. This study demonstrates that, independently of the actual visual format in which data is represented, people ascribe more credibility to data shown in a display judged to provide a clearer and more concise picture of the data. It should be noted that our conceptualization of design quality refers to individuals’ subjective evaluations of design quality, not the actual presence and presentation of design elements, such as color, font, object size and layout [Champlin et al., 2014]. While our manipulation of graph format reflects, to some extent, a different representation of such elements, its effect on the perception of data credibility is minimal. Further research is required to understand the differentiating impact of objective and perceived design quality on the perception of data credibility.

Interestingly, even though it was presented in a form that was peripheral to the central message (i.e., through organizational logos), the source of the data was noticeable to participants. About one quarter of the respondents accurately identified the source of visualized data when it was presented as organizational logos. People responded to source cues differently based on their deeply held attitudes. When certain cues (i.e., logo of a prestigious university) resonated with individuals’ preexisting beliefs (i.e., university scientists are more trustworthy than governmental agencies as information sources), they assigned more credibility to data attributed to their preferred source, even though the content was the same.

Data professionals and designers have previously highlighted the importance of labeling data sources to assure audiences of the credibility and integrity of graphical displays [e.g., Tufte, 1992]. Our study extends this observation by showing that using an iconic label to display the source of data not only cues people about the credibility of graphical displays as a persuasive device, but also influences how they judge the credibility of the shown information.

Notably, while recent voices have proposed leveraging the credibility of scientists through visualizing techniques that invite audiences to comprehend the evidence with autonomy [Hall Jamieson and Hardy, 2014], the link between comprehension and the perception of data credibility did not receive sufficient support from this study. We found no evidence that perceived credibility of visualized data is positively related to viewers’ comprehension of the same data. In other words, we could not confirm that a legitimate interpretation of the shown data leads people to think the data is true or to perceive it as highly credible. In fact, the positive relationship between comprehension and perceived credibility became non-significant only after we entered self-assessed design quality into the equation. Arguably, an intuitive judgement of whether the data is accessible and digestible in its current form plays a more important role in determining viewers’ perception of data credibility than whether they actually understand it.

As an emerging genre of popular discourse, information visualization has been increasingly used to convey scientific data. While some tentative evidence has shown the potential power of visual communication in engaging audiences while diminishing identity-protective cognition, we have lacked a thorough understanding of the underlying mechanism and therefore had little advice to offer for science communication practice. This study took an initial step toward identifying some of the design factors that might come into play and constructing an encompassing framework that accounts for the roles of values and predispositions.

For scientists, data professionals and designers, the major task is not only to meet aesthetic and efficiency goals when creating visualizations, but also to understand audiences’ backgrounds and cognitive needs. For example, to make an information visualization appear credible to target audiences, one might want to investigate which source the target audience deems most trustworthy and incorporate it into the visual narrative. In addition, although modern technologies equip communicators to present data in vivid, innovative, and dynamic formats, they need to ensure that such visuals do not distract or confuse viewers; otherwise, the visuals can be useless or perceived as untrustworthy.

References

Animation World Network (2015). The power of 3D in biomedical visualization. URL: http://www.awn.com/vfxworld/power-3d-biomedical-visualization (visited on 16th February 2016).

Atkinson, L. and Rosenthal, S. (2014). ‘Signaling the green sell: the influence of eco-label source, argument specificity, and product involvement on consumer trust’. Journal of Advertising 43 (1), pp. 33–45. https://doi.org/10.1080/00913367.2013.834803.

Beck, M. and DeFour, M. (13th January 2016). ‘Assembly approves lifting restrictions on new nuclear power plants’. Wisconsin State Journal. URL: http://host.madison.com/wsj/news/local/govt-and-politics/assembly-approves-lifting-restrictions-on-new-nuclear-power-plants/article_a85fbd24-7232-5423-98bd-65b3c6fc83f9.html.

Brossard, D. and Nisbet, M. C. (2007). ‘Deference to Scientific Authority Among a Low Information Public: Understanding U.S. Opinion on Agricultural Biotechnology’. International Journal of Public Opinion Research 19 (1), pp. 24–52. https://doi.org/10.1093/ijpor/edl003.

Card, S. K., Mackinlay, J. D. and Shneiderman, B. (1999). Readings in information visualization: using vision to think. U.S.A.: Morgan Kaufmann.

Champlin, S., Lazard, A., Mackert, M. and Pasch, K. E. (2014). ‘Perceptions of design quality: an eye tracking study of attention and appeal in health advertisements’. Journal of Communication in Healthcare 7 (4), pp. 285–294. https://doi.org/10.1179/1753807614y.0000000065.

Cohen, J. and Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences. Oxford, U.K.: Lawrence Erlbaum.

Compeau, D. R. and Higgins, C. A. (1995). ‘Computer self-efficacy: development of a measure and initial test’. MIS Quarterly 19 (2), p. 189. https://doi.org/10.2307/249688.

Dixon, G. N., McKeever, B. W., Holton, A. E., Clarke, C. and Eosco, G. (2015). ‘The power of a picture: overcoming scientific misinformation by communicating weight-of-evidence information with visual exemplars’. Journal of Communication 65 (4), pp. 639–659. https://doi.org/10.1111/jcom.12159.

Fagerlin, A., Zikmund-Fisher, B. J., Ubel, P. A., Jankovic, A., Derry, H. A. and Smith, D. M. (2007). ‘Measuring numeracy without a math test: development of the subjective numeracy scale’. Medical Decision Making 27 (5), pp. 672–680. https://doi.org/10.1177/0272989x07304449.

Fisher, B., Green, T. M. and Arias-Hernández, R. (2011). ‘Visual analytics as a translational cognitive science’. Topics in Cognitive Science 3 (3), pp. 609–625. https://doi.org/10.1111/j.1756-8765.2011.01148.x.

Flanagan, R. and Schneider, E. (2013). ‘Input visualization for the Cyclus nuclear fuel cycle simulator: CYClus Input Control’. In: GLOBAL 2013: International Nuclear Fuel Cycle Conference — Nuclear Energy at a Crossroads. Salt Lake City, UT, U.S.A.

Fogg, B. J., Soohoo, C., Danielson, D. R., Marable, L., Stanford, J. and Tauber, E. R. (2003). ‘How do users evaluate the credibility of web sites? A study with over 2,500 participants’. In: Proceedings of the 2003 conference on Designing for user experiences — DUX ’03, pp. 1–15. https://doi.org/10.1145/997078.997097.

Galesic, M. and Garcia-Retamero, R. (2011). ‘Graph literacy: a cross-cultural comparison’. Medical Decision Making 31 (3), pp. 444–457. https://doi.org/10.1177/0272989x10373805.

Garcia-Retamero, R. and Galesic, M. (2010). ‘Who profits from visual aids: overcoming challenges in people’s understanding of risks’. Social Science & Medicine 70 (7), pp. 1019–1025. https://doi.org/10.1016/j.socscimed.2009.11.031.

Geissler, G. L., Zinkhan, G. M. and Watson, R. T. (2006). ‘The influence of home page complexity on consumer attention, attitudes and purchase intent’. Journal of Advertising 35 (2), pp. 69–80. https://doi.org/10.1080/00913367.2006.10639232.

Hagtvedt, H. (2011). ‘The impact of incomplete typeface logos on perceptions of the firm’. Journal of Marketing 75 (4), pp. 86–93. https://doi.org/10.1509/jmkg.75.4.86.

Hall Jamieson, K. and Hardy, B. W. (2014). ‘Leveraging scientific credibility about Arctic sea ice trends in a polarized political environment’. Proceedings of the National Academy of Sciences 111 (supplement 4), pp. 13598–13605. https://doi.org/10.1073/pnas.1320868111.

Heer, J. and Robertson, G. (2007). ‘Animated transitions in statistical data graphics’. IEEE Transactions on Visualization and Computer Graphics 13 (6), pp. 1240–1247. https://doi.org/10.1109/tvcg.2007.70539.

Hegarty, M. (2011). ‘The cognitive science of visual-spatial displays: implications for design’. Topics in Cognitive Science 3 (3), pp. 446–474. https://doi.org/10.1111/j.1756-8765.2011.01150.x.

Hegarty, M., Canham, M. S. and Fabrikant, S. I. (2010). ‘Thinking about the weather: how display salience and knowledge affect performance in a graphic inference task’. Journal of Experimental Psychology: Learning, Memory and Cognition 36 (1), pp. 37–53. https://doi.org/10.1037/a0017683.

Johnson, B. B. and Slovic, P. (1995). ‘Presenting uncertainty in health risk assessment: initial studies of its effects on risk perception and trust’. Risk Analysis 15 (4), pp. 485–494. https://doi.org/10.1111/j.1539-6924.1995.tb00341.x.

Kitada, A. (2016). ‘Public opinion changes after the Fukushima Daiichi Nuclear Power Plant accident to nuclear power generation as seen in continuous polls over the past 30 years’. Journal of Nuclear Science and Technology 53 (11), pp. 1686–1700. https://doi.org/10.1080/00223131.2016.1175391.

Lavelle, M. (22nd November 2013). ‘The changing carbon map: how we revised our interactive look at global footprints’. National Geographic. URL: https://www.nationalgeographic.com/environment/great-energy-challenge/2013/the-changing-carbon-map-how-we-revised-our-interactive-look-at-global-footprints/ (visited on 8th June 2018).

Lazard, A. and Mackert, M. (2014). ‘User evaluations of design complexity: the impact of visual perceptions for effective online health communication’. International Journal of Medical Informatics 83 (10), pp. 726–735. https://doi.org/10.1016/j.ijmedinf.2014.06.010.

Li, N., Akin, H., Su, L. Y.-F., Brossard, D., Xenos, M. and Scheufele, D. (2016). ‘Tweeting disaster: an analysis of online discourse about nuclear power in the wake of the Fukushima Daiichi nuclear accident’. JCOM 15 (05), A02.

Li, N., Brossard, D., Scheufele, D. A. and Wilson, P. P. H. (2018). ‘Policymakers and stakeholders’ perceptions of science-driven nuclear energy policy’. Nuclear Engineering and Technology. https://doi.org/10.1016/j.net.2018.03.012.

Lowry, P. B., Wilson, D. W. and Haig, W. L. (2014). ‘A picture is worth a thousand words: source credibility theory applied to logo and website design for heightened credibility and consumer trust’. International Journal of Human-Computer Interaction 30 (1), pp. 63–93. https://doi.org/10.1080/10447318.2013.839899.

Maciejewski, R. (2011). ‘Data Representations, Transformations, and Statistics for Visual Reasoning’. Synthesis Lectures on Visualization 2 (1), pp. 1–85. https://doi.org/10.2200/s00357ed1v01y201105vis002.

O’Neill, S. J. and Smith, N. (2014). ‘Climate change and visual imagery’. Wiley Interdisciplinary Reviews: Climate Change 5 (1), pp. 73–87. https://doi.org/10.1002/wcc.249.

Popkin, S. L. (1991). The reasoning voter: communication and persuasion in presidential campaigns. Chicago, IL, U.S.A.: University of Chicago Press.

Public Service Commission of Wisconsin (December 2013). Nuclear power plants and radioactive waste management in Wisconsin. URL: https://psc.wi.gov/Documents/Brochures/Nuclear%20Power%20Plant.pdf.

Rodríguez Estrada, F. C. and Davis, L. S. (2015). ‘Improving Visual Communication of Science Through the Incorporation of Graphic Design Theories and Practices Into Science Communication’. Science Communication 37 (1), pp. 140–148. https://doi.org/10.1177/1075547014562914.

Scheufele, D. A. and Iyengar, S. (2013). ‘The state of framing research: a call for new direction’. The Oxford Handbook of Political Communication Theories, pp. 1–27. https://doi.org/10.1093/oxfordhb/9780199793471.013.47.

Segel, E. and Heer, J. (2010). ‘Narrative visualization: telling stories with data’. IEEE Transactions on Visualization and Computer Graphics 16 (6), pp. 1139–1148. https://doi.org/10.1109/tvcg.2010.179.

Shah, P., Mayer, R. E. and Hegarty, M. (1999). ‘Graphs as aids to knowledge construction: signaling techniques for guiding the process of graph comprehension’. Journal of Educational Psychology 91 (4), pp. 690–702. https://doi.org/10.1037/0022-0663.91.4.690.

Shan, C. (2012). ‘Four ways to slice Obama’s 2013 budget proposal’. New York Times. URL: http://www.nytimes.com/interactive/2012/02/13/us/politics/2013-budget-proposal-graphic.html (visited on 2nd November 2014).

Sillence, E., Briggs, P., Harris, P. and Fishwick, L. (2007). ‘Health websites that people can trust — the case of hypertension’. Interacting with Computers 19 (1), pp. 32–42. https://doi.org/10.1016/j.intcom.2006.07.009.

Tal, A. and Wansink, B. (2016). ‘Blinded with science: trivial graphs and formulas increase ad persuasiveness and belief in product efficacy’. Public Understanding of Science 25 (1), pp. 117–125. https://doi.org/10.1177/0963662514549688.

Teoh, B. and Neo, T. (2007). ‘Interactive multimedia learning: students’ attitudes and learning impact in an animation course’. The Turkish Online Journal of Educational Technology 6 (4), pp. 28–37. URL: http://www.tojet.net/articles/v6i4/643.pdf.

Thomas, J. J. and Cook, K. A. (2005). Illuminating the path: the research and development agenda for visual analytics. National Visualization and Analytics Center. URL: https://vis.pnnl.gov/pdf/RD_Agenda_VisualAnalytics.pdf.

Treise, D., Walsh-Childers, K., Weigold, M. F. and Friedman, M. (2003). ‘Cultivating the science internet audience’. Science Communication 24 (3), pp. 309–332. https://doi.org/10.1177/1075547002250298.

Tufte, E. R. (1992). The visual display of quantitative information. Graphics Press.

Tversky, B., Morrison, J. B. and Betrancourt, M. (2002). ‘Animation: can it facilitate?’ International Journal of Human-Computer Studies 57 (4), pp. 247–262. https://doi.org/10.1006/ijhc.2002.1017.

van der Linden, S. L., Clarke, C. E. and Maibach, E. W. (2015). ‘Highlighting consensus among medical scientists increases public support for vaccines: evidence from a randomized experiment’. BMC Public Health 15 (1), p. 1207. https://doi.org/10.1186/s12889-015-2541-4.

van der Linden, S. L., Leiserowitz, A. A., Feinberg, G. D. and Maibach, E. W. (2014). ‘How to communicate the scientific consensus on climate change: plain facts, pie charts or metaphors?’ Climatic Change 126 (1–2), pp. 255–262. https://doi.org/10.1007/s10584-014-1190-4.

Wang, R. Y. and Strong, D. M. (1996). ‘Beyond accuracy: what data quality means to data consumers’. Journal of Management Information Systems 12 (4), pp. 5–33. https://doi.org/10.1080/07421222.1996.11518099.

Welles, B. F. and Meirelles, I. (2014). ‘Visualizing computational social science’. Science Communication 37 (1), pp. 34–58. https://doi.org/10.1177/1075547014556540.

Whitfield, S. C., Rosa, E. A., Dan, A. and Dietz, T. (2009). ‘The future of nuclear power: value orientations and risk perception’. Risk Analysis 29 (3), pp. 425–437. https://doi.org/10.1111/j.1539-6924.2008.01155.x.

Wilson, P. P. (2011). Comparing nuclear fuel cycle options: Observations and challenges. A Report for the Reactor & Fuel Cycle Technology Subcommittee of the Blue Ribbon Commission on America’s Nuclear Future. URL: http://cybercemetery.unt.edu/archive/brc/20120620221039/http://brc.gov/sites/default/files/documents/wilson.fuel_.cycle_.comparisons_final.pdf.

World Nuclear Association (March 2018). Nuclear power in the U.S.A. URL: http://www.world-nuclear.org/information-library/country-profiles/countries-t-z/usa-nuclear-power.aspx.

Yee, V. and McGeehan, P. (6th January 2017). ‘Indian Point nuclear power plant could close by 2021’. New York Times. URL: https://www.nytimes.com/2017/01/06/nyregion/indian-point-nuclear-power-plant-shutdown.html.

Authors

Nan Li is assistant professor in the Department of Agricultural Education and Communications at the Texas Tech University. E-mail: nan.li@ttu.edu.

Dominique Brossard is professor and chair in the Department of Life Sciences Communication at the University of Wisconsin-Madison. E-mail: dbrossard@wisc.edu.

Dietram A. Scheufele is the John E. Ross Professor in the Department of Life Sciences Communication at the University of Wisconsin-Madison. E-mail: scheufele@gmail.com.

Paul H. Wilson is the Grainger Professor in the Department of Engineering Physics at the University of Wisconsin-Madison. E-mail: paul.wilson@wisc.edu.

Kathleen M. Rose is a Ph.D. student in the Department of Life Sciences Communication at the University of Wisconsin-Madison. E-mail: kmrose@wisc.edu.

How to cite

Li, N., Brossard, D., Scheufele, D. A., Wilson, P. H. and Rose, K. M. (2018). ‘Communicating data: interactive infographics, scientific data and credibility’. JCOM 17 (02), A06. https://doi.org/10.22323/2.17020206.

Endnotes

1 The manipulation of graph content (i.e. waste cost or waste volume) was added in the model as a control variable.

2 Kuder-Richardson Formula 20 (KR-20) is a measure of internal consistency reliability for measures with dichotomous choices.