The challenges of distributed investigative journalism

One of the clearest ideas to emerge from the excitement around the new media transformation of journalism is the notion that the audience should participate in the process. This two-way street has been nicely described by Guardian editor Alan Rusbridger as the “mutualization of journalism.” But how to do it? What’s missing from what has been tried so far? Despite many experiments, the territory is still so unexplored that it’s almost impossible to say what will work without trying it. With that caveat, here are some more or less wild speculations about the sorts of tools that “open” investigative journalism might need to work.

There have been many collaborative journalism projects, from the Huffington Post’s landmark “Off The Bus” election campaign coverage to the BBC’s sophisticated “user-generated content hub” to CNN’s iReport. One lesson in all of this is that form matters. Take the lowly comment section. News site owners have long complained, often with good reason, that comments are a mess of trolls and flame wars. But the prompt is supremely important in asking for online collaboration. Do journalists really want “comments”? Or do they want error corrections, smart additions, leads, and evidence that furthers the story?

Which leads me to investigative reporting. It’s considered a specialty within professional journalism, dedicated to getting answers to difficult questions — often answers that are embarrassing to those in power. I don’t claim to be very good at journalistic investigations, but I’ve done enough reporting to understand the basics. Investigative reporting is as much about convincing a source to talk as it is about filing a FOIA request, or running a statistical analysis on a government data feed. At heart, it seems to be a process of assembling widely dispersed pieces of information — connecting the distributed dots. Sounds like a perfect opportunity for collaborative work. How could we support that?

A system for tracking what’s already known
Reporters keep notes. They have files. They write down what was said in conversations, or make recordings. They collect documents. All of this material is typically somewhere on or around a reporter’s desk or sitting on their computer. That means it’s not online, which means no one else can build on it. Even within the same newsroom, notes and source materials are seldom shared. We have long had customer relationship management systems that track every contact with a customer. Why not a “source relationship management” system that tracks every contact with every source by every reporter in the newsroom? Ideally, such a system would be integrated into the reporter’s communications tools: when I make a phone call and hit record (after getting the source’s permission, of course) that recording could be automatically entered into the system’s files, stamped by time, date, and source, then transcribed by machine to make it searchable. Primary documents would also be filed in the system, along with notes and links and comments from everyone working on the story. The entire story of the story could be in one place.
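To make the idea concrete, here is a minimal sketch of what the core of such a “source relationship management” log might look like. Everything in it is an assumption made for illustration: the names `Contact` and `SourceLog`, the fields, and the in-memory storage are hypothetical, and a real system would need persistence, permissions, and actual transcription.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Contact:
    """One logged interaction between a reporter and a source (hypothetical schema)."""
    source: str
    reporter: str
    kind: str   # e.g. "call", "email", "document"
    text: str   # notes, or a machine transcript of a recording
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class SourceLog:
    """In-memory log of every contact, searchable by source or keyword."""

    def __init__(self):
        self.contacts = []

    def record(self, source, reporter, kind, text, when=None):
        # Stamp the entry with the current UTC time unless one is supplied.
        contact = Contact(source, reporter, kind, text,
                          when or datetime.now(timezone.utc))
        self.contacts.append(contact)
        return contact

    def history(self, source):
        """Every contact with one source, oldest first, across all reporters."""
        matches = [c for c in self.contacts if c.source == source]
        return sorted(matches, key=lambda c: c.when)

    def search(self, keyword):
        """Case-insensitive keyword search over notes and transcripts."""
        needle = keyword.lower()
        return [c for c in self.contacts if needle in c.text.lower()]


log = SourceLog()
log.record("Former bank employee", "alice", "call",
           "Says the loan records were altered.")
log.record("Former bank employee", "bob", "email",
           "Sent a copy of the internal audit.")
for contact in log.search("audit"):
    print(contact.when.date(), contact.reporter, contact.text)
```

The point of the sketch is only that the record is shared: any reporter can pull up a source’s full history or search across everyone’s notes, which is exactly what notes on a desk cannot offer.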

There have been experiments in collaborative journalistic files, such as OpenFile.ca or even good local wikis. But I don’t believe there has yet been a major professional newsroom which operated with open files. For that matter, I am not aware of this type of information filing system in existence anywhere in journalism, though I suspect it’s what intelligence services do.

Public verification processes
Journalism aims to be “true,” a goal which requires elaborate verification processes. But in every newsroom I’ve worked with, essential parts of the verification standards are not codified. “At least two sources” is a common maxim, but are there any situations where one is enough? For that matter, who counts as a definitive source? When is a conflict of interest serious enough to disqualify what someone is telling you? The answers to these questions and many more are a matter of professional practice and culture. This is confusing enough for a new reporter joining staff, let alone outsiders who might want to help.

Verification is necessarily contextual. Both the costs of verification and the consequences of being in error vary widely with circumstance, so journalists must make situational choices. How sure do we have to be before we say something is true, how do we measure that certainty, and what would it take to be more sure? Until this sort of nuanced guidance is made public, and the public is provided with experienced support to encourage good calls in complex or borderline cases, it won’t be possible to bring enthusiastic outsiders fully into the reporting process. They simply won’t know what’s expected of them, and so won’t be able to help produce work that meets a given standard. Those standards depend on what accuracy/cost/speed tradeoffs best serve the communities that a newsroom writes for, which means that there is audience input here too.

What is secret, or, who gets to participate?
Traditionally, a big investigative story is kept completely secret until it’s published. This is shifting, as some journalists begin to view investigation as more of a process than a product. However, you may not want the subject of an investigation to know what you already know. It might, for example, make your interview with a bank CEO tricky if they know you’ve already got the goods on them from a former employee. There are also off-the-record interviews, embargoed material, documents which cannot legally be published, and a multitude of concerns around the privacy rights of individuals. I agree with Jay Rosen when he says that “everything a journalist learns that he cannot tell the public alienates him from the public,” but that doesn’t mean that complete openness is the solution in all cases. There are complex tradeoffs here.

So access to at least some files must be controlled, for at least some period of time. Ok then — who gets to see what, when? Is there a private section that only staff can see and a public section for everyone else? Or, what about opening some files up to trusted outsiders? That might be a powerful way to extend investigations outside the boundaries of the newsroom, but it brings in all the classic problems of distributed trust, and more generally, all the issues of “membership” in online communities. I can’t say I know any good answers. But because the open flow of information can be so dramatically productive, I’d prefer to start open and close down only where needed. In other words, probably the fastest way to learn what truly needs to be secret is to blow a few investigations when someone says something they shouldn’t have, then design processes and policies to minimize those failure modes.
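One way to picture “who gets to see what, when” is as a small access rule attached to each file. The sketch below is purely illustrative: the three roles, the `FileEntry` fields, and the `can_view` function are hypothetical names, and the rule encodes just two of the ideas above, namely that files start open by default and that restricted material can open up after a set time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum
from typing import Optional


class Role(IntEnum):
    PUBLIC = 0
    TRUSTED = 1   # vetted outside collaborators
    STAFF = 2


@dataclass
class FileEntry:
    title: str
    min_role: Role = Role.PUBLIC             # start open by default
    public_after: Optional[datetime] = None  # when a restriction lifts, if ever


def can_view(entry, role, now=None):
    """Return True if someone with this role may see the entry right now."""
    now = now or datetime.now(timezone.utc)
    if role >= entry.min_role:
        return True
    # Restricted material opens to everyone once its embargo has passed.
    return entry.public_after is not None and now >= entry.public_after


memo = FileEntry("Leaked memo", Role.STAFF,
                 datetime(2011, 1, 1, tzinfo=timezone.utc))
print(can_view(memo, Role.TRUSTED, datetime(2010, 12, 1, tzinfo=timezone.utc)))
print(can_view(memo, Role.PUBLIC, datetime(2011, 2, 1, tzinfo=timezone.utc)))
```

A rule this simple says nothing about the hard part, which is deciding who counts as trusted. It only shows that “open unless marked otherwise” is cheap to express once that decision has been made.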

There is also a professional cultural shift required here, towards open collaboration. Newsrooms don’t like to get scooped. Fair enough, but my answer to this is to ask what’s more important: being first, or collectively getting as much journalism done as possible?

Safe places for dangerous hypotheses
Investigative journalism requires speculation. “What if?” the reporter must say, then go looking for evidence. (And equally, “what if not?” so as not to fall prey to confirmation bias.) Unfortunately, “what if the district attorney is a child molester?” is not a question that most news organizations can tolerate on their web site. In the worst case, the news organization could be sued for libel. How can we make a safe and civil space — both legally and culturally — for following speculative trains of thought about the wrongdoings of the powerful? One idea, which is probably a good idea for many reasons, is to have very explicit marking of what material is considered “confirmed,” “vetted,” “verified,” etc. and what material is not. For example, iReport has such an endorsement system. A report marked “verified” would of course have been vetted according to the public verification process. In the US, that marking plus CDA section 230 might solve the legal issues.
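The explicit marking described above can be sketched as a status carried by every claim. This is a hypothetical illustration, not a description of how iReport or any newsroom actually works: the two statuses, the `Item` fields, and the rule that a named editor must sign off are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SPECULATION = "speculation"
    VERIFIED = "verified"


@dataclass
class Item:
    claim: str
    status: Status = Status.SPECULATION   # nothing is verified by default
    verified_by: str = ""


def mark_verified(item, editor):
    """Promote a claim only with a named person accountable for the check."""
    if not editor:
        raise ValueError("verification needs a named editor")
    item.status = Status.VERIFIED
    item.verified_by = editor
    return item


def label(item):
    """Prefix shown with the claim wherever it is displayed."""
    return f"[{item.status.value.upper()}] {item.claim}"


item = Item("The contract was awarded without competitive bidding")
print(label(item))
mark_verified(item, "desk editor")
print(label(item))
```

The design choice worth noting is the default: a new claim is speculation until someone takes responsibility for it, so unvetted material can never be displayed as confirmed by accident.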

A proposed design goal: maximum amplification of staff effort
There are very many possible stories, and very few paid journalists. The massive amplification of staff effort that community involvement can provide may be our only hope for getting the quantity and quality of journalism that we want. Consider, for example, Wikipedia. With a paid staff of about 35, they produce millions of near-real-time topic pages in dozens of languages.

But this is also about the usability of the social software designed to facilitate collaborative investigations. We’ll know we have the design right when lots of people want to use it. Also: just how much and what types of journalism could volunteers produce collaboratively? To find out, we could try to get the audience to scale faster than newsroom staff size. To make that happen, communities of all descriptions would need to find the newsroom’s public interface a useful tool for uncovering new information about themselves even when very little staff time is available to help them. Perhaps the best way to design a platform for collaborative investigation would be to imagine it as encouraging and coordinating as many people as possible in the production of journalism in the broader society, with as few full-time staff as possible. These staff would be experts in community management and information curation. I don’t believe that all types of journalism can be produced this way or that anything like a majority of people will contribute to the process of journalism. Likely, only a few percent will. But helping the audience to inform itself on the topics of its choice on a mass scale sounds like civic empowerment to me, which I believe to be a fundamental goal of journalism.