Data alert! A bit about graphs, maps, images, risk, statistics and uncertainty

Graphics and visuals like maps, charts, and timelines make information easy to understand and process. What might take paragraphs can be summarized in one image. Now, online graphics can be interactive, allowing readers an opportunity to explore data. Graphics can also be misleading.

Is the time scale appropriate for the trend being presented? Does the graph show all of the data, or a narrow window to convey a skewed picture? There is always more evidence than what is presented or published, but the key issue is whether evidence selection has compromised the true account of the underlying data (Tufte 07).
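One way to see how a narrow window can skew a picture is to fit a straight line to the same series twice, once over all of the data and once over only the most recent points. The sketch below is only an illustration: the yearly values are invented for the example, and `slope` is a hypothetical helper written here, not a function from any library.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Invented yearly values: a long rise followed by a short dip at the end.
years = list(range(2000, 2012))
values = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 18, 17]

full_trend = slope(years, values)               # all twelve years
window_trend = slope(years[-3:], values[-3:])   # only the last three years

print(f"All 12 years: {full_trend:+.2f} per year")
print(f"Last 3 years: {window_trend:+.2f} per year")
```

With these made-up numbers the full series rises while the three-year window falls, so a graph cropped to the window would tell the opposite story from the same data.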

In maps, “large scale” means zoomed-in, detailed. “Small scale” means zoomed out, general. Is the type of map appropriate for the data being presented?
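Map scale is usually written as a ratio such as 1:24,000, meaning one unit on the map stands for 24,000 of the same units on the ground. The short sketch below shows why the smaller denominator is the "larger" scale. The two scales are example values chosen for illustration, and `ground_km_per_map_cm` is a hypothetical helper name.

```python
def ground_km_per_map_cm(scale_denominator):
    """Ground distance in km covered by 1 cm on a map drawn at 1:scale_denominator."""
    cm_per_km = 100_000
    return scale_denominator / cm_per_km

# Example scales (assumed for illustration): a detailed map and a general one.
for label, denom in [("1:24,000", 24_000), ("1:1,000,000", 1_000_000)]:
    km = ground_km_per_map_cm(denom)
    print(f"{label}: 1 cm on the map = {km:g} km on the ground")
```

The map where one centimeter covers less ground is the large-scale (zoomed-in) map.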

Look at the categories and the legend. Maps can be manipulated to show what you want.

Photos are easily mismatched to the text and the headline.

Risk is a possibility that something might happen or bring about some result.

High probability = high predictability (event more likely).

Low probability = low predictability (event less likely).

People (and sometimes the media) tend to overestimate the danger of rare events yet underestimate dangers of more common events. People tend to misjudge the relative risks from food safety issues, for example ranking pesticide residues as posing a much greater threat to human health than harmful microorganisms or an unhealthy lifestyle (lack of exercise, poor diet). Yet the statistics show that people are far more likely to die from lifestyle-related diseases such as coronary heart disease and cancers.

In fact, the top causes of death in the US, according to the Centers for Disease Control and Prevention, are 1. heart disease, 2. cancer, 3. stroke.

Perceptions and knowledge of risk depend on whether the risk is individual, community, or societal. People tend to overestimate the role of forces inside the individual, such as personality, ability, disposition, and motivation, as causes for human behavior and to underestimate the role of environmental or situational factors, such as the varied opportunities and obstacles that exist for people in different social classes. When applied to whole groups, these attribution errors become the basis for stereotypes.

People tend to assume that if they can control a situation they are safer. We fear dying in an airplane crash more than in a car crash, yet the number of traffic accident fatalities is much higher. Perhaps this is why we fear man-made disasters (radiation) more than natural disasters (tsunami). Trace amounts of radioactive iodine are being detected in rain over the US (CA and VT). Each news story is quick to point out that the levels are low and not a risk, but few offer any comparison to everyday risks.

People are more worried by dramatic but infrequent events than by “boring” risks like slipping on a wet floor. And alarmist, dramatic media coverage contributes to false risk perception. Take, for example, the shark attack. Fueled by Jaws and now Shark Week, our fears of sharks are conditioned. Bees, wasps and snakes are responsible for far more fatalities each year. In the United States the annual risk of death from lightning is 30 times greater than that from shark attack. For most people, any shark-human interaction is likely to occur while swimming or surfing in nearshore waters. From a statistical standpoint the chances of dying in this area are markedly higher from many other causes (such as drowning and cardiac arrest) than from shark attack. Many more people are injured and killed on land while driving to and from the beach than by sharks in the water. Shark attack trauma is also less common than such beach-related injuries as spinal damage, dehydration, jellyfish and stingray stings and sunburn. Indeed, many more sutures are expended on sea shell lacerations of the feet than on shark bites! (International Shark Attack File)

Second example: Avian flu caused 200 deaths in 5 years, and an unlikely possible mutation (from guts of birds to lungs of humans) could result in a horrendous pandemic, hence the alarmist media coverage. But as many as 40,000 people die each year from common seasonal flu (Wulf 2010).
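To compare the two flu figures fairly, they first need to be put on the same time base. The arithmetic below uses only the numbers quoted above. It assumes, for the sake of the sketch, that both counts refer to comparable populations, which the text does not state.

```python
# Figures quoted in the text above.
avian_flu_deaths = 200          # total deaths over the period
avian_flu_years = 5             # length of the period in years
seasonal_flu_per_year = 40_000  # upper estimate of annual deaths

# Put both figures on a per-year basis before comparing them.
avian_flu_per_year = avian_flu_deaths / avian_flu_years
ratio = seasonal_flu_per_year / avian_flu_per_year

print(f"Avian flu: about {avian_flu_per_year:.0f} deaths per year")
print(f"Seasonal flu: up to {seasonal_flu_per_year:,} deaths per year")
print(f"Ratio: roughly {ratio:,.0f} to 1")
```

A comparison like this is the kind of everyday yardstick that helps readers place a dramatic risk next to a familiar one.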

Risk is the result of events, conditions and situations, called “risk factors.” Where a risk factor has been consistently linked to an event or situation, the factor is said to “cause” death or illness: HIV causes AIDS, asbestos causes mesothelioma, cigarette smoking causes lung cancer.

With these well-proven exceptions, it is difficult to show that any one thing “causes” cancer because cancer doesn’t appear immediately after exposure, providing time for other factors to come into play. Without a direct cause and effect relationship, there are only associations, strong relationships, between a result/disease and an agent/situation, or risk factor. A risk factor is not a guarantee, not a cause, just an association, like high cholesterol and heart disease.

An association does not, by itself, indicate causation. Additional evidence is needed: the event must come before the result, and other explanations must be considered and ruled out. As humans, we seem wired to look for patterns, to want to explain things, hence our tendency to assume causation. But remember: Correlation does not equal causation.
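A small simulation can show how two things become strongly associated without either causing the other. In the sketch below, everything is invented: a hidden factor drives both `a` and `b`, and neither variable has any effect on the other. `pearson` is a hypothetical helper that computes the ordinary correlation coefficient.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# The hidden factor influences both a and b; a and b never influence each other.
hidden = [rng.gauss(0, 1) for _ in range(n)]
a = [h + rng.gauss(0, 0.5) for h in hidden]
b = [h + rng.gauss(0, 0.5) for h in hidden]

r_raw = pearson(a, b)

# Remove the hidden factor's contribution, then look at what is left.
r_adjusted = pearson(
    [x - h for x, h in zip(a, hidden)],
    [y - h for y, h in zip(b, hidden)],
)

print(f"Correlation of a and b: {r_raw:.2f}")
print(f"After accounting for the hidden factor: {r_adjusted:.2f}")
```

The raw correlation is strong, yet it nearly vanishes once the hidden factor is accounted for. This is what "ruling out other explanations" looks like in miniature.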

Statistics attempt to quantify risk. But statistics are frequently misused and abused. All research involves choosing what to study and how to study it. Statistics, when applied to data, measure the strength of relationships. The greater the significance, the stronger the relationship, or the less chance that some other factor is important in explaining the relationship.

Where we have considerable knowledge of outcomes, we have an objective probability for a given outcome. In a coin toss, we do not know which face will turn up when it is tossed, but we have objective probabilities of what it will likely be. In complex systems, with many interconnected parts, scientists are often uncertain about the extent and magnitude of the connections. As a result, they have to make judgements about their strength, which is a subjective probability (Stephen Schneider, in Friedman et al.).

The most believable results will have certain characteristics: (Cohn)

Replication: They have been successfully repeated

Reevaluation: They have been tested by more than one method (mathematical technique)

Common attacks on statistics create the impression of numerous errors. Something is wrong with every sample, and pointing this out can begin the unraveling of any argument: the data are outdated, unrepresentative, or missing, or they contain outliers. An r-squared value of x percent leaves 100-x percent of the variation unexplained. The scientist chose the wrong model (linear, non-linear, random, etc.). When additional variables are included, the results become insignificant. Other factors can result in the same effect. Any inconsistency or complication in the data has been deliberately obscured or omitted. Each charge casts the perception of doubt (Murray).
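The r-squared attack rests on simple arithmetic: r-squared is the share of the variation in the data that a fitted line accounts for, so whatever is left over can always be presented as "unexplained." The sketch below works this through on five invented data points, using an ordinary least-squares line. The variable names are illustrative only.

```python
# Invented data points for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least-squares line: y = intercept + slope * x
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# r-squared = 1 - (variation left around the line) / (total variation)
ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_total = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_residual / ss_total

print(f"Explained by the line: {r_squared:.0%}")
print(f"Left unexplained: {1 - r_squared:.0%}")
```

The same result can be reported as the explained share or the unexplained share, and which number is quoted depends on who is doing the quoting.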

With our minds and our worlds filled with uncertainties and our days filled with only 24 hours, we often fall back on judgemental shortcuts, called heuristics, to make sense of things. People reconcile what they see and hear with what they already know from personal experience, friends and family, religious beliefs, political orientation, values, etc.

If someone tells us that things are uncertain, we think that means that the science is muddled. Uncertainty is everywhere, and leads to errors in interpretation. All too often, health benefit and risk statements are presented as if they were authoritative, definitive, and based on clear and compelling evidence. The result? An Illusion of Certainty.

Scientists do not just reduce uncertainty, they actively construct it. They look for problems in their own work by asking questions and probing for gaps and alternative explanations. Uncertainty is different than indeterminacy (when all the parameters of a system and their interactions are not known) and ignorance (when it is not known what is not known). Uncertainty means that the parameters are sufficiently known to make a qualitative judgement or attempt a conclusion; there is no such thing as absolute proof. Doubt (or curiosity or skepticism) is crucial to science (to a scientist, claiming or acknowledging uncertainty maintains an appearance of objectivity) but it also makes science vulnerable to misrepresentation. Uncertainty can appear as controversy, because it is easy to take uncertainties out of context and create the impression that everything is unresolved and thus plant seeds of doubt in the reader’s mind (Oreskes and Conway).

Another contributor to the illusion, as we’ve seen, is the habit of the news media to report research as “news,” presenting research findings out of historical and scientific context as new, very preliminary, and potentially groundbreaking. Reports can celebrate the finding, and downplay uncertainty. The accounts of each new project make it appear to readers that scientists are much more certain than they actually are. Today’s news is easily contradicted by tomorrow’s reports. Other reports may emphasize early differences of opinion among scientists, highlighting uncertainty. Science is portrayed as a triumphant quest for certainty: the answer to a question, the solution to a puzzle, keys to unlock the door to knowledge, clues to a mystery. Often, the public is offered a view of the future in which scientific certainty returns: “Researchers hope to be able to predict the behavior of hurricanes more precisely”; “By improving their understanding of X, researchers will solve problem Y.” (Zehr and Stocking, both in Friedman et al.)

Watch out for these phrases, or at least think about them before you use them. This is the challenge: how to communicate the ‘so what’ without claiming future certainty?

– Think about the outlet and the audience, and select your topic carefully. If the so what is a stretch, maybe don’t write the story.

– Interview others. A caution: the presence of multiple voices in a media story about emergent science allows the reader to glimpse the degree of consensus, yet it may be difficult for readers to evaluate. Are the uncertainties so great that reasonable people cannot come to a resolution? Is the finding so novel that other scientists simply have no useful expertise? With the Internet, readers can assemble meaning themselves by cobbling together stories about the same topic from a variety of places and times. If you cannot tell who is telling the truth or where the consensus lies, then the best you can do is accurately capture the message and attribute it. Or, you can present an array of viewpoints and let the reader decide (or feel overwhelmed). “This focus on the journalist as a passive transmitter allows us to make accuracy the most important characteristic of a story and often to bypass issues of validity all together…the objectivity norm urges journalists to leave their own analytical skills at home and to concentrate, instead, on conveying what they see and hear…if journalists are normatively limited to reporting rather than interpreting, then audiences are left to sift through the dueling representations of uncertainty themselves” (Friedman et al.).

– Explain changes in certainty or consensus. This requires historical context and knowledge of particular fields, and may be harder for a science generalist than for someone who specializes in certain subjects.

– Look at why people may be promoting or challenging uncertainty. We will look at this issue in more detail in a few weeks. If you say, ‘There is no evidence’, do you mean, ‘There are no studies done on X’, or, ‘There are lots of studies out there, and they show no risk of X causing Y’?

– Watch the use of anecdotes and false “trendsetting”. Anecdotes can be fine examples, but they are usually poor evidence. To a social scientist, what seems like a great interview with printable quotes is a convenience survey of an unrepresentative sample. Vivid anecdotes can interfere with a person’s judgement of risks (Griffin, in Friedman et al.). Make sure your examples are representative.


Best, Joel. 2001. Damned Lies and Statistics. Berkeley: University of California Press.

Best, Joel. 2004. More Damned Lies and Statistics. Berkeley: University of California Press.

Best, Joel. 2005. Lies, calculations and constructions: beyond How to Lie with Statistics. Statistical Science 20(3):210-214.

Cohn, V. 1989. News and Numbers. Ames, IA: Iowa State University Press.

Cope, Lewis. 2006. Understanding and using statistics, pp. 18-25 in A Field Guide for Science Writers, 2nd edition.

Drum, Kevin. 2010. Statistical Zombies.

Friedman, S.F., S. Dunwoody, and C.L. Rogers. 1999. Communicating Uncertainty. Mahwah, NJ: Lawrence Erlbaum Associates.

Gould, Stephen Jay. The Median Isn’t the Message.

Huff, Darrell. 1954. How to Lie with Statistics. New York: W.W. Norton.

Monmonier, Mark. 1996. How to Lie with Maps (2nd ed.). Chicago: The University of Chicago Press.

Monmonier, Mark. 2005. Lying with maps. Statistical Science 20(3):215-222.

Murray, C. 2005. How to accuse the other guy of lying with statistics. Statistical Science 20(3):239-241.

Niles, Robert.

Oreskes, Naomi, and Erik M. Conway. 2010. Merchants of Doubt. New York: Bloomsbury Press.

Rifkin, Erik, and Edward Bouwer. 2007. The Illusion of Certainty. New York: Springer.

Tufte, Edward R. 1983. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.

Tufte, Edward R. 2006. Beautiful Evidence. Cheshire, CT: Graphics Press.

Tufte, Edward R. 1997. Visual Explanations. Cheshire, CT: Graphics Press.