The challenges of distributed investigative journalism

One of the clearest ideas to emerge from the excitement around the new media transformation of journalism is the notion that the audience should participate in the process. This two-way street has been nicely described by Guardian editor Alan Rusbridger as the “mutualization of journalism.” But how to do it? What’s missing from what has been tried so far? Despite many experiments, the territory is still so unexplored that it’s almost impossible to say what will work without trying it. With that caveat, here are some more or less wild speculations about the sorts of tools that “open” investigative journalism might need to work.

There have been many collaborative journalism projects, from the Huffington Post’s landmark “Off The Bus” election campaign coverage to the BBC’s sophisticated “user-generated content hub” to CNN’s iReport. One lesson in all of this is that form matters. Take the lowly comment section. News site owners have long complained, often with good reason, that comments are a mess of trolls and flame wars. But the prompt is supremely important in asking for online collaboration. Do journalists really want “comments”? Or do they want error corrections, smart additions, leads, and evidence that furthers the story?

Which leads me to investigative reporting. It’s considered a specialty within professional journalism, dedicated to getting answers to difficult questions — often answers that are embarrassing to those in power. I don’t claim to be very good at journalistic investigations, but I’ve done enough reporting to understand the basics. Investigative reporting is as much about convincing a source to talk as it is about filing a FOIA request, or running a statistical analysis on a government data feed. At heart, it seems to be a process of assembling widely dispersed pieces of information — connecting the distributed dots. Sounds like a perfect opportunity for collaborative work. How could we support that?

A system for tracking what’s already known
Reporters keep notes. They have files. They write down what was said in conversations, or make recordings. They collect documents. All of this material is typically somewhere on or around a reporter’s desk or sitting on their computer. That means it’s not online, which means no one else can build on it. Even within the same newsroom, notes and source materials are seldom shared. We have long had customer relationship management systems that track every contact with a customer. Why not a “source relationship management” system that tracks every contact with every source by every reporter in the newsroom? Ideally, such a system would be integrated into the reporter’s communications tools: when I make a phone call and hit record (after getting the source’s permission, of course) that recording could be automatically entered into the system’s files, stamped by time, date, and source, then transcribed by machine to make it searchable. Primary documents would also be filed in the system, along with notes and links and comments from everyone working on the story. The entire story of the story could be in one place.
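To make the idea a little more concrete, here is a minimal sketch of what the core of such a system might look like. Everything in it is an assumption made for illustration: the `Contact` and `SourceLog` names, the fields, and the sample entries are hypothetical, and a real system would also need recording, transcription, storage, and permissions, none of which are shown.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Contact:
    """One interaction with a source (all field names are illustrative)."""
    source: str
    reporter: str
    when: datetime
    kind: str   # e.g. "phone", "email", "document"
    text: str   # reporter's notes or a machine transcript


@dataclass
class SourceLog:
    """Hypothetical newsroom-wide record of every contact with every source."""
    contacts: list = field(default_factory=list)

    def record(self, contact):
        self.contacts.append(contact)

    def history(self, source):
        """All contacts with one source, by any reporter, oldest first."""
        hits = [c for c in self.contacts if c.source == source]
        return sorted(hits, key=lambda c: c.when)

    def search(self, term):
        """Case-insensitive substring search over notes and transcripts."""
        term = term.lower()
        return [c for c in self.contacts if term in c.text.lower()]


# Made-up sample data: two reporters talk to the same source.
log = SourceLog()
log.record(Contact("J. Doe", "bob", datetime(2010, 6, 2, 9, 0), "email",
                   "Sent the audit memo as a PDF."))
log.record(Contact("J. Doe", "alice", datetime(2010, 6, 1, 14, 30), "phone",
                   "Says the budget figures were revised twice."))

# Two reporters, one source, one shared timeline.
for c in log.history("J. Doe"):
    print(c.when.date(), c.reporter, c.kind)
```

The point of the sketch is only that the unit of storage is the contact, not the story, so anyone working on any story can ask what the newsroom already knows from a given source.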

There have been experiments in collaborative journalistic files, such as OpenFile.ca or even good local wikis. But I don’t believe there has yet been a major professional newsroom which operated with open files. For that matter, I am not aware of this type of information filing system in existence anywhere in journalism, though I suspect it’s what intelligence services do.

Public verification processes
Journalism aims to be “true,” a goal which requires elaborate verification processes. But in every newsroom I’ve worked with, essential parts of the verification standards are not codified. “At least two sources” is a common maxim, but are there any situations where one is enough? For that matter, who counts as a definitive source? When is a conflict of interest serious enough to disqualify what someone is telling you? The answers to these questions and many more are a matter of professional practice and culture. This is confusing enough for a new reporter joining staff, let alone outsiders who might want to help.

Verification is necessarily contextual. Both the costs of verification and the consequences of being in error vary widely with circumstance, so journalists must make situational choices. How sure do we have to be before we say something is true, how do we measure that certainty, and what would it take to be more sure? Until this sort of nuanced guidance is made public, and the public is provided with experienced support to encourage good calls in complex or borderline cases, it won’t be possible to bring enthusiastic outsiders fully into the reporting process. They simply won’t know what is expected of them if they are to help produce work that meets a given standard. Those standards depend on what accuracy/cost/speed tradeoffs best serve the communities that a newsroom writes for, which means that there is audience input here too.

What is secret, or, who gets to participate?
Traditionally, a big investigative story is kept completely secret until it’s published. This is shifting, as some journalists begin to view investigation as more of a process than a product. However, you may not want the subject of an investigation to know what you already know. It might, for example, make your interview with a bank CEO tricky if they know you’ve already got the goods on them from a former employee. There are also off-the-record interviews, embargoed material, documents which cannot legally be published, and a multitude of concerns around the privacy rights of individuals. I agree with Jay Rosen when he says that “everything a journalist learns that he cannot tell the public alienates him from the public,” but that doesn’t mean that complete openness is the solution in all cases. There are complex tradeoffs here.

So access to at least some files must be controlled, for at least some period of time. Ok then — who gets to see what, when? Is there a private section that only staff can see and a public section for everyone else? Or, what about opening some files up to trusted outsiders? That might be a powerful way to extend investigations outside the boundaries of the newsroom, but it brings in all the classic problems of distributed trust, and more generally, all the issues of “membership” in online communities. I can’t say I know any good answers. But because the open flow of information can be so dramatically productive, I’d prefer to start open and close down only where needed. In other words, probably the fastest way to learn what truly needs to be secret is to blow a few investigations when someone says something they shouldn’t have, then design processes and policies to minimize those failure modes.
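The staff, trusted-outsider, and public split floated above can be sketched as a simple tiered check. This is only an illustration under assumed names (`Tier`, `can_view`, `visible_files`, and the sample file names are all hypothetical), and it deliberately ignores the hard part, which is deciding who counts as trusted.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Assumed access tiers, from most open to most restricted."""
    PUBLIC = 0
    TRUSTED = 1   # vetted outsiders
    STAFF = 2


def can_view(reader_tier, file_tier):
    """A reader sees a file if their tier is at least the file's tier."""
    return reader_tier >= file_tier


def visible_files(files, reader_tier):
    """Names of the files this reader may see, in their original order."""
    return [name for name, tier in files.items() if can_view(reader_tier, tier)]


# Made-up investigation files. "Start open" means PUBLIC is the default
# and a file is moved up a tier only when there is a reason to.
files = {
    "council-budget.pdf": Tier.PUBLIC,
    "interview-notes.txt": Tier.TRUSTED,
    "whistleblower-identity.txt": Tier.STAFF,
}

print(visible_files(files, Tier.TRUSTED))
```

Time limits, such as opening a file once the story runs, would be a natural extension but are left out here.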

There is also a professional cultural shift required here, towards open collaboration. Newsrooms don’t like to get scooped. Fair enough, but my answer to this is to ask what’s more important: being first, or collectively getting as much journalism done as possible?

Safe places for dangerous hypotheses
Investigative journalism requires speculation. “What if?” the reporter must say, then go looking for evidence. (And equally, “what if not?” so as not to fall prey to confirmation bias.) Unfortunately, “what if the district attorney is a child molester?” is not a question that most news organizations can tolerate on their web site. In the worst case, the news organization could be sued for libel. How can we make a safe and civil space — both legally and culturally — for following speculative trains of thought about the wrongdoings of the powerful? One idea, which is probably a good idea for many reasons, is to have very explicit marking of what material is considered “confirmed,” “vetted,” “verified,” etc. and what material is not. For example, iReport has such an endorsement system. A report marked “verified” would of course have been vetted according to the public verification process. In the US, that marking plus CDA section 230 might solve the legal issues.
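As a rough sketch of explicit marking, each claim could carry a status that is shown wherever the claim is shown, and that can only be raised by passing a stated rule. The names below (`Status`, `Claim`, `mark_verified`, `label`) are hypothetical, and the two-source threshold simply borrows the common maxim mentioned earlier as a stand-in for whatever a newsroom's published process actually requires. It says nothing about the legal question.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SPECULATION = "speculation"
    UNVERIFIED = "unverified"
    VERIFIED = "verified"


@dataclass(frozen=True)
class Claim:
    text: str
    status: Status = Status.SPECULATION
    sources: tuple = ()


def mark_verified(claim, sources, min_sources=2):
    """Return a verified copy of the claim, or raise if the rule is not met.

    The rule here (at least `min_sources` distinct sources) is an assumption
    standing in for a newsroom's real, public verification process.
    """
    if len(set(sources)) < min_sources:
        raise ValueError(f"need at least {min_sources} distinct sources")
    return Claim(claim.text, Status.VERIFIED, tuple(sources))


def label(claim):
    """Render the claim with its status up front, so it is never shown bare."""
    return f"[{claim.status.value.upper()}] {claim.text}"


hypothesis = Claim("The contract was awarded without competitive bidding")
print(label(hypothesis))

confirmed = mark_verified(hypothesis, ["procurement memo", "former employee"])
print(label(confirmed))
```

Because a new claim defaults to `SPECULATION`, nothing is presented as verified by accident.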

A proposed design goal: maximum amplification of staff effort
There are very many possible stories, and very few paid journalists. The massive amplification of staff effort that community involvement can provide may be our only hope for getting the quantity and quality of journalism that we want. Consider, for example, Wikipedia. With a paid staff of about 35, it produces millions of near-real-time topic pages in dozens of languages.

But this is also about the usability of the social software designed to facilitate collaborative investigations. We’ll know we have the design right when lots of people want to use it. Also: just how much and what types of journalism could volunteers produce collaboratively? To find out, we could try to get the audience to scale faster than newsroom staff size. To make that happen, communities of all descriptions would need to find the newsroom’s public interface a useful tool for uncovering new information about themselves even when very little staff time is available to help them. Perhaps the best way to design a platform for collaborative investigation would be to imagine it as encouraging and coordinating as many people as possible in the production of journalism in the broader society, with as few full-time staff as possible. These staff would be experts in community management and information curation. I don’t believe that all types of journalism can be produced this way or that anything like a majority of people will contribute to the process of journalism. Likely, only a few percent will. But helping the audience to inform itself on the topics of its choice on a mass scale sounds like civic empowerment to me, which I believe to be a fundamental goal of journalism.

What’s the point of social news?

According to Facebook, social news seems to be mostly about knowing what all my friends are reading. I’m not so sure. But I think there really is something to the idea of “social news” for journalism, and for journalism product design.

I take “social” to mean “interacting with other people.” That’s a fundamental technical possibility of digital media, as basic to the internet as moving pictures are to television. I’m not sure that anyone really knows yet what to do with that possibility, but happily there are already at least two very well-developed uses. Maybe social news isn’t about “friends” at all, but about filtering and news-gathering.

Twitter is really a filter
I get most of both my general and special interest news from Twitter. I rarely go to the home page of a news site, or use a news app. It’s not the tweets themselves that are informative, but the links within them to articles posted elsewhere. I follow a large set of people with varied interests, and some of them work for news organizations, but most do not. My Twitter feed is faster, more diverse, and available across more platforms (all of them) than any one news organization’s output.

This doesn’t mean that Twitter is a perfect news delivery system, but to me it’s proven better than just about anything else at getting me the news mix that I want, and keeping me interested in the world at large. (Admittedly, I follow people I’ve met in other countries, so yeah, travel is way better than Twitter for that.) I am not alone in this opinion. The structure of follower relationships among Twitter users suggests that it’s more of a news network than a social network.

The usefulness of Twitter for news has a lot to do with certain basic design choices. First, a tweet is really as short as you can get and still communicate a complete concept, so it’s basically an extended headline. Second, Twitter differs from Facebook in that relationships can be unidirectional: I don’t need anyone’s permission to follow them, and they may not know or care that I do. Following someone on Twitter also differs from following a blog via RSS because most tweets refer to someone else’s work through a link — Twitter is more about re-publishing than publishing. Retweets also include the name of the original tweeter, which enables discovery of interesting new curators.

Filtering is much more valuable than it used to be, in this era of information overload, and these properties make Twitter an excellent filtering system. There are several news products based almost entirely on displaying links tweeted by the people you follow, such as The Twitter Tim.es and Flipboard. The medium that Twitter invented — global public short messaging with links — has already been endlessly replicated and will be with us forever.

There is a sense in which news organizations have always seen filtering as a big part of their value. One of the duties of the professional editor is to decide what you need to see. But at least one thing has upset that model irretrievably: the internet is not a broadcast medium. While each person reads an identical copy of the Times and watches an identical CNN broadcast, there’s no reason my internet has to look the same as your internet. A small team of human editors can’t personalize the headlines for every reader, so that leaves algorithmic filtering, such as Google News’ personalization features, or social filtering, such as Twitter.

The point is, there’s probably something to learn from how Twitter uses social relationships to route information. As the Nieman Journalism Lab said: “social news isn’t about the people you know so much as the people with whom you share interests.” To put this in terms of the product I wish I had: when I use your news product, I want to be able to follow the recommended reading of other members of the audience, if they so allow. Also, can I follow a particular reporter? And does your product integrate with the other methods I already use for getting information, so I don’t have to choose?
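Here is a minimal sketch of the social filtering described above: rank links by how many of the accounts a reader follows have shared them. The function name, the data shapes, and the sample accounts are all assumptions for illustration. This is not how Twitter or any real product ranks anything.

```python
from collections import Counter


def recommended_links(follows, shared, reader):
    """Links shared by accounts the reader follows, most-shared first.

    follows: dict mapping an account to the set of accounts it follows.
             Following is one-directional, as on Twitter.
    shared:  dict mapping an account to the list of links it has shared.
    """
    counts = Counter()
    for account in follows.get(reader, set()):
        # Count each account at most once per link.
        for link in set(shared.get(account, [])):
            counts[link] += 1
    # Sort by share count (descending), then alphabetically to break ties.
    return sorted(counts, key=lambda link: (-counts[link], link))


# Made-up data: "me" follows two accounts; only one of them follows back.
follows = {"me": {"reporter_a", "reader_b"}, "reporter_a": {"me"}}
shared = {
    "reporter_a": ["example.org/budget", "example.org/floods"],
    "reader_b": ["example.org/floods"],
    "stranger_c": ["example.org/celebrity"],
}

print(recommended_links(follows, shared, "me"))
```

Nothing from `stranger_c` appears because the reader does not follow that account, and `reporter_a` gets an empty list because `me` has shared nothing. The same function works whether the accounts are reporters or other members of the audience.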

Social networks are great for reporting
Audience-journalist collaboration, blah blah blah. If the idea that professionals are no longer the only players in news is new to you, see blogging and Wikipedia. But a news organization probably has to look at this from a different angle. For me, the core idea of social news-gathering is that the audience is, or could be, an extension of the news organization’s source network.

Hopefully, a newsroom knows about interesting developments before anyone else, and then verifies and publicizes them, but that’s getting near impossible when anyone can publish, and when virality can amplify primary sources without the involvement of a media organization. We don’t yet know very much about collective news-gathering, but there are promising directions. It seems like maybe there are two broad categories of breaking news: public events that anyone could have witnessed, and private events initially known only to privileged observers.

Social media is now routinely used to augment reporting of public events. There are entire units in news organizations dedicated to getting stories from the audience, often under the awkward rubric of “user-generated content.” But why sift for events online when you can give your audience the tools to give you the story directly? Right now if I see a plane land in a river, I tweet it. Wouldn’t a news organization prefer that I send my eye-witness photo to the UGC editor instead? To this end, several mobile news apps include the ability to submit pictures. CNN’s iReport app and website is probably the best developed of these. Ideally, I could send that breaking news tweet to the newsroom and to my friends at the same time, within the same application.

Fast reporting of private events has always depended on having the right sources. A well established source may call the reporter or send an email when something newsworthy happens. Someone with a much looser connection to the organization may not, and perhaps this is an opportunity for social news tools. When someone knows something — or can talk about something — you want them to contact the newsroom first. The potential of this weak-tie news sourcing approach hasn’t really been studied, to my knowledge, but I imagine that it would require, at minimum, a trusted brand, an easily-reachable editorial staff, and frictionless communication tools. If it’s easier just to tweet or blog the news, the source will.

There are several other good examples of social news-gathering, on the theme of asking your audience for help. Crowdsourcing is usually thought of as the recruitment of many unspecialized helpers, as the Guardian did with its MP expenses project. But the Guardian also reached out to its audience to find that one specialist attorney who could unravel the mystery of Tony Blair’s tax returns. Hopefully the specialists a newsroom needs to consult are already among the audience, and they will see the call for experts when a reporter sends one out. For that matter, a smart and engaged audience can correct you quickly when you are wrong. Nothing says “we care about accuracy” like a fact check box on every story.

But is it journalism?
Yes, absolutely. The job of journalism is to collect accurate information on an ongoing basis and ensure that the audience for each story learns about that story. Any way you can deliver that service is fair game. People depend on each other for the news all the time, so journalists had better get in on those conversations.