Words and numbers in journalism: How to tell when your story needs data

Update: A more recent version of this material appears in my book, The Curious Journalist’s Guide To Data.

I’m not convinced that journalists are always aware when they should be thinking about numbers. Usually, by training and habit, they are thinking about words. But there are deep relationships between words and numbers in our everyday language, if you stop to think about them.

A quantity is an amount, something that can be compared, measured or counted — in short, a number. It’s an ancient idea, so ancient that it is deeply embedded in every human language. Words like “less” and “every” are obviously quantitative, but so are more complex concepts like “trend” and “significant.” Quantitative thinking starts with recognizing when someone is talking about quantities.

Consider this sentence from the article Anti-Intellectualism is Killing America, which appeared in Psychology Today:

In a country where a sitting congressman told a crowd that evolution and the Big Bang are “lies straight from the pit of hell,” where the chairman of a Senate environmental panel brought a snowball into the chamber as evidence that climate change is a hoax, where almost one in three citizens can’t name the vice president, it is beyond dispute that critical thinking has been abandoned as a cultural value.

This is pure cultural critique, and it can be interpreted many different ways. To start with, I don’t know of standard and precise meanings for “critical thinking” and “cultural value.” We could also read this paragraph as a rant, an exaggeration for effect, or an account of the author’s personal experience. Maybe it’s art. But journalism is traditionally understood as “non-fiction,” and there is an empirical and quantitative claim at the heart of this language.

We Were Wrong About Giraffes

I was told in grade school that the giraffe’s neck evolved to be long because taller giraffes could reach more tasty tree leaves in times of drought. It’s a lovely example of natural selection, and also completely wrong, as I discovered when researching an edit to the Wikipedia article. Eventually, someone just went and checked: it turns out that during times of drought or food scarcity, giraffes eat from low bushes.

There is an important lesson here about what it means to “explain something.”

Rudyard Kipling wrote a children’s book of myths about the origins of animals titled Just-So Stories. In it he explains the origin of the elephant’s trunk, how the camel got his hump, and where the leopard’s spots came from (they were drawn by an Ethiopian from the leftover black of his own dark skin, so that the leopard would better blend into the background when they hunted zebra together). Clearly, making sense is not the criterion for truth. It’s very easy to forget this when someone gives you a complex explanation and you get that “aha! I understand” feeling. Human beings constantly confuse congruence with truth.

Sensible and false explanations are such a problem in science that the term “just-so story” has come to refer to any sort of explanation that fits the facts, but cannot be verified. Scientific theories are supposed to differ from literary criticism and other forms of creative writing by demanding explanations that are true. This means testing them against reality.

A crucial point here: you can’t test a theory against the same facts that you used to come up with the theory to begin with. Of course a theory is going to fit the facts that inspired it! Instead, a theory (an explanation of something) needs to predict things that haven’t been observed yet. Prediction is the essence of science; it is the ability to say what will happen before it happens that makes it possible to “design” a bicycle rather than just gluing random objects together until they roll. If our aim is to come up with a true theory about evolution, we need to use the length of the giraffe’s neck to make predictions about something else, something we can go check (repeatedly, if we are serious about testing the theory).

This seemingly philosophical notion is incredibly useful for spotting subtle bullshit that sounds like science.

Consider, for example, the trial of a vitamin for preventing the common cold. Let’s say it’s even a controlled trial. One hundred volunteers are given Vitamin Z daily, while another hundred are (unknowingly) given a placebo. At the end of the study, the Vitamin Z group has had the same number of colds as the placebo group. But as the researchers analyze the data, they discover that the Vitamin Z group had fewer headaches. Does this mean Vitamin Z prevents headaches? Not necessarily, because the theory “Vitamin Z prevents headaches” was formulated by noticing a pattern, any pattern, and then making up a story about how that pattern came to be. That doesn’t make the story true. And there will always be patterns. If the volunteers can suffer from hundreds of different ailments, then by sheer dumb chance the Vitamin Z group will almost certainly be found to suffer less from at least one of them. (Applied to controlled experiments, this notion can be made mathematically precise, by the way. See post-hoc analysis.)

Put another way, if you keep turning over rocks you will eventually find something. The whole point of a theory — an explanation, a model, a statement of the causal relationships of reality — is to say what you will find before the rock is turned over. Otherwise you only have a story that fits the facts, a just-so story.
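To put a rough number on how many rocks you need to turn over, here is a minimal sketch of the arithmetic. It rests on assumptions that are mine, not part of the Vitamin Z story: each ailment is checked independently, and each check has a 5% chance of showing a difference by luck alone (the conventional significance threshold). The function name is made up for illustration.

```python
def chance_of_spurious_finding(num_ailments, false_alarm_rate=0.05):
    """Probability that at least one of num_ailments comparisons looks
    like a real effect purely by chance.

    Assumes the comparisons are independent and that each one has the
    same false_alarm_rate; both are simplifying assumptions.
    """
    # The chance that every single comparison comes up empty is
    # (1 - rate) ** n, so the chance of at least one false alarm
    # is the complement of that.
    return 1 - (1 - false_alarm_rate) ** num_ailments


if __name__ == "__main__":
    # Print the chance of a lucky "discovery" as more ailments are checked.
    for n in (1, 10, 100, 300):
        print(n, round(chance_of_spurious_finding(n), 3))
```

Under these assumptions, checking a single ailment gives a 5% chance of a lucky "discovery", while checking a hundred makes one nearly certain. Real ailments are not independent of one another, so treat this as an illustration of the trend rather than a precise figure.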

I have found just-so stories to be most common in alternative medicine, economics, and evolutionary explanations of human behavior. If nothing testable has been predicted, then nothing has been “explained.”

Fight Global Warming on Your Desktop

Or at least help us to understand it. Climateprediction.net is a large-scale scientific computing experiment, relying on individual computer users who donate their computer time for the simulation of tens of thousands of global warming scenarios. This is important because, lacking other Earths to experiment with, computer simulations are really the only way we can validate our existing models of climate change — and then predict the future with models we think are accurate.

The climateprediction.net project comprises three separate experiments – one to explore the model we are using, the second to see how well the models replicate past climate and the third to finally produce a forecast for 21st century climate. Each model that we distribute will be used for all three experiments.

Built upon the BOINC scientific computing framework originally developed for the groundbreaking SETI@Home project, Climateprediction.net relies upon hundreds of thousands of volunteer users who donate their spare computer time. All of these machines together are effectively one of the largest supercomputers in the world, and this allows previously impossible scientific studies. The Climateprediction.net scientific team can run not just one or a few climate prediction simulations, but hundreds of thousands. One study performed this way was the Seasonal Attribution Experiment:

FMRI “Mind Reading” Doesn’t Yet Threaten Humanity

It is now possible to see what a person is looking at by scanning their brain. The technique, published last November by a team of Japanese neuroscientists, uses FMRI to reconstruct a digital image of the picture entering the eye, albeit at very low resolution and only after hundreds of training runs. Still, it’s an awesome development, and many articles covering this research have called it “mind reading” (1, 2, 3, 4, 5). But it really isn’t, and it’s fun to explore what real “mind reading” would imply.

When I hear “mind reading” I want psychic abilities. I want to be able to know what number you’re thinking of, where you were on the night of March 4th, and what you actually think of my souffle. This is the sort of technology that could be badly misused, as the comments on one blog note:

Am I the only one finding this DEEPLY disturbing? It opens the doors to some of the scariest 1984-style total-control future predictions. Imagine you can’t hide your f#&%!ng MIND!

Fortunately, we’re not there yet. Moreover, if we did have the technology to read minds, we’d have much bigger societal issues than privacy to deal with. The existence of “mind reading machines” would imply that we possessed good formal models of the human mind, and that is a can of worms.

What Can We Learn From the Network Structure of Wikipedia Authors?

The edit network for “telephone tapping” shows a bipartite structure, indicating that the topic is controversial (image from Brandes et al.)

An interesting new paper defines the “edit network” of a Wikipedia article by drawing edges to indicate that one person has deleted or restored text written by another. While it’s always fun to look at pictures, the surprise here is that we can verify that the resulting graph structure really does tell us something useful about the article. In this study, articles with a more “bipolar” edit network — meaning that the authors split into basically two camps who routinely undid each other’s edits — were also much more likely to appear on a manually-maintained list of controversial pages.

Although there has been previous work on network mapping of Wikipedia in particular (and of course volumes of work on social networks in general), I find this paper interesting because it tries very carefully to understand whether the pictures mean anything. As in all science, what you find depends on where you look, and the practitioner of network analysis has an absurd amount of freedom to define what a “node” is, what an “edge” is, and how the resulting graph is visually laid out. Since the point of a map is a visual representation, it’s very important that graphical properties such as distance, size, and color have the right sort of metaphorical relationships to the more abstract properties we are trying to understand.
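To make the idea of an edit network concrete, here is a toy sketch. It is not the measure used in the paper: it is a crude stand-in that treats each undo as an undirected edge between two authors, greedily splits the authors into two camps with a breadth-first search, and reports what fraction of undo edges cross between camps. The function name, the input format, and the author names are all hypothetical.

```python
from collections import defaultdict, deque


def two_camp_split(reverts):
    """Greedily split authors into two camps.

    reverts is a list of (author_a, author_b) pairs, each meaning that
    one author undid the other's text. Returns (camp, crossing_fraction),
    where camp maps each author to 0 or 1 and crossing_fraction is the
    share of distinct undo edges that link opposite camps.
    """
    # Build an undirected graph: who has undone whom, in either direction.
    neighbors = defaultdict(set)
    for a, b in reverts:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Breadth-first search, putting each newly reached author in the
    # camp opposite to the author they were reached from.
    camp = {}
    for start in sorted(neighbors):
        if start in camp:
            continue
        camp[start] = 0
        queue = deque([start])
        while queue:
            author = queue.popleft()
            for other in sorted(neighbors[author]):
                if other not in camp:
                    camp[other] = 1 - camp[author]
                    queue.append(other)

    # Count distinct edges (ignoring self-undos) that cross the camps.
    edges = {frozenset((a, b)) for a, b in reverts if a != b}
    if not edges:
        return camp, 0.0
    crossing = sum(1 for edge in edges if len({camp[x] for x in edge}) == 2)
    return camp, crossing / len(edges)


if __name__ == "__main__":
    # Two made-up camps that only ever undo each other's edits.
    example = [("ann", "bob"), ("ann", "cat"), ("dan", "bob"), ("dan", "cat")]
    print(two_camp_split(example))
```

A crossing fraction of 1.0 means the authors split cleanly into two camps that only undo each other, which is the "bipolar" picture; lower values mean the undoing is more tangled. The greedy split is not guaranteed to find the best possible division when the graph is not cleanly two-sided.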

Are They Right?

I’ve been reading StopTheACLU.com, because I want to get into their heads, because I want to avoid the classic mistake of intellectual isolation, and because I want to be challenged. Sure, they’re weirdos, but that doesn’t mean they don’t make sense. But there’s at least one thing in the StopTheACLU worldview that I find very hard to method-act: in their universe, global warming is a myth.

Okay, but how did I end up on this side and not that side?

Science Writing is Hard

Science is sometimes really tricky, which makes writing about it even trickier. No real experiment exists apart from a huge background of assumptions, abstractions, caveats and complexities; the writer’s job is to find a strong narrative that is understandable with little or no prior knowledge, scans well, and catches the reader’s attention.

Recent research on physiological differences between liberal and conservative voters seems like a dream come true if you’re in need of a catchy press release, like this one from the National Science Foundation. I read the actual paper, and it says that people who answer more conservatively on a questionnaire about their politics tend also to have more pronounced “fight-or-flight” reactions to disturbing or surprising stimuli, as measured by skin conductance and startle response.

The press release tells a different story, and I believe that the NSF science writer told the wrong story. I attribute this partially to the politics of publicity, but mostly to the fact that science is actually very subtle, and hard to summarize.

Access to Knowledge and the Banality of Evil

It pisses me off that there is a huge body of very important information that most people can’t get at. I’m not talking about books, the poor paper things, but the world’s academic and scientific journals, which are already online.

Most people don’t even know that the world’s academic journals exist, but this is the master record, the huge source that all those science blogs and mis-representative popular articles draw from. These research journals are the collective output of every professional researcher in the world, in all subjects — only you’re not allowed to read them.

Medicine is the Killer App For Technology

I’ve met quite a few people who feel that civilization was a mistake. Technology in particular, they say, is bad in some way. If they’re an anarcho-primitivist theorist, they’ll tell you it’s alienating: it creates hierarchies, produces psychological illusions of scarcity, and turns us into little more than specialized insects. If they’re less geeky and more hippie, they’ll just expound on how happy they were living in that rural Indian village, how spiritual that life was, how much more natural a world without technology would be.

In the bright Nepali sunshine, sipping chai in a tourist cafe overlooking the lake, I found I could not agree, no matter how cute the dreadlocked girl sitting across from me. I see a lot of idealism and projection in her arguments. I also see an iPod in her bag. But neither could I come up with a concrete reason to insist that technology is fundamentally good, that the human race should invest as heavily in technology as it has. I admit that I really enjoy both the intellectual playground of technology and the fruits it brings, but that’s no way to form a moral imperative.

Until Ethiopia. I was working on a trachoma epidemiology study. This is an ancient, simple disease, and so fragile that the merest hint of civilization will destroy it, though we’re not quite sure why yet. It could be that antibiotics used for other things wipe it out; it could be that just washing your hands daily in clean water prevents its spread. But if left untreated long enough, this feeble disease will make you blind.

I had the cliché moment. I hiked out across the roadless wilderness to that idealized little village, that tiny traditional portion of the way we used to live. The simple folk gathered round us, gazing strangely at our white skin and synthetic fabrics. In turn we stared at their traditional cotton garments and coarse shiny jewelry, artifacts of a society that makes everything with its own hands. We stood a moment in that field, contemplating one another across vast distances of education and context. Then I looked into the scarred corneas of a blind young man and felt suddenly: this sucks. This man cannot see, for no reason at all. Extremely simple medicine could have prevented that.

It’s one of those moments when you realize that you’re not okay with the world as it is.

Medicine is good because health is good. I see no other way to draw this conclusion. And medicine is technological. Antibiotics are in no sense natural, x-rays and heart transplants less so. Medicine is the moral justification for continued technological development and dissemination. It’s the killer app for technology, because it’s not just medical technology that must be known: modern medicine requires an entire technological infrastructure to design and manufacture its many, many inputs. Computers. Polymers. Superconducting magnets. Refrigerators to make the ice to keep cold our collected samples, and enzymes to do the PCR to detect the trachoma DNA, mathematics to do the statistical analysis to determine if our mass antibiotic distribution is actually denting the epidemic. It takes a world to raise a hospital.

That’s the moral reason for continued technological development. That blind man. Go tell his mother that we’d all be happier as hunter-gatherers.

Of course, that’s not why we actually will continue to develop our technology.

In the late afternoon sunlight I lounged against a tree, waiting for the last few villagers to show up so we could test them. They had fed us some (traditional, natural, idealized) beer, and I was sleepy and idle. I extracted my MP3 key from my kit and put the headphones in, leaned back to something relaxed. A kid came up to me, looking expectantly. He must have been about twelve.

“MP3 player?” he said.

“Yeah,” I replied.

“How many gigabytes?” he asked. Then: “I want one.”

I find it hard to disagree with him.