The new structure of stories: a reading list

Different medium, different story form. It’s clear that each new technology — photography, radio, television — has brought with it different ways of constructing a narrative, and different ways those narratives fit into the audience’s lives. Online media are no different, and the step between analog and digital is in many ways much larger than any that has come before, because the internet connects the audience to each other as well as to the newsroom.

Here’s my attempt at a little reading list of recent work on the structure of stories. Pointers to additional material are welcome!

The Big Picture
What’s wrong with the article, anyway? Jeff Jarvis explores this question in “The article as luxury or by-product.” This essay provoked lots of interesting reaction, such as from Mathew Ingram.

So how do we understand ways to fix this? Vadim Lavrusik takes a shot at this question and comes up with the building blocks of context, social, personalization, mobile, participation. It’s a good taxonomy so I’m going to partially steal it for this post.

At the  NYC Hacks Hackers meetup last week, Trei Brundrett took us through SB Nation’s “story stream” product, and Gideon Lichfield of The Economist gave a really nice run through of the “news thing” concept that was fleshed-out collaboratively last month at Spark Camp by Gideon, Matt Thompson, and a room full of others. Very meaty, detailed, up-to-the-minute discussions, for serious news nerds. Video here.

Context
You just can’t do better than Matt Thompson’s “An antidote for web overload.” I also recommend Matt’s wonderful “The three key parts of news stories that are usually missing.” Another good primer is Jay Rosen’s “Future of Context” talk at SXSW.

See also my “Short doesn’t mean shallow,” about hyperlinks as a contextual storytelling form.

For an example of these ideas in action, consider Mother Jone’s Egypt Explainer page — which Gideon Lichfield critiques in the video linked above.

Social
What does it mean for news to be social anyway? Henry Jenkins argues for the power of “spreadable media” as a new distribution model.

In “What’s the point of social news?” I discuss two areas where social media have a huge impact on news: the use of social networks as a personalized filter, and distributed sourcing of tips and material.

Personalization
News is now personalized by a variety of filters, both social and algorithmic. Eli Pariser argues this puts us in a “filter bubble.” He may be right, but research by Pew and others [1,2] consistently shows that when users are allowed to recommend any URL to one other, the “news agenda” that the audience constructs has only 5%-30% of stories in common with mainstream media.

comparison of questions asked of the White House by a Twitter audience vs. by journalists shows a remarkable difference in focus.  All of this this suggests to me that whatever else is happening, personalization meets an audience need that traditional broadcast journalism does not.

Besides, maybe not every person needs to see every story, if we view the goal of journalism as empowerment.

Participation
What do we know and what don’t we know about public participation in the journalism project, and what has worked or failed so far? Jay Rosen has an invaluable summary.

I also recommend the work of Amanda Michel as someone who does crowd-based reporting every day, and my own speculations on distributed investigative reporting.

Structured information
Is the product of journalism narratives or (potentially machine-readable) facts? Adrian Holovaty seems to be the first to have explored this in his 2006 essay “A fundamental way newspaper websites need to change.” This mantle has been more recently taken up by Stijn Debrouwere in his “Information Architecture for News Websites” series, and in Reg Chua’s “structured journalism,” and in a wide-ranging series at Xark.

There are close connections here to semantic web efforts, and occasional overlap between the semweb and journalism communities.

Mobile
I haven’t seen any truly good roundup posts on what mobile will mean for news story form, but there are some bits and pieces. Mobile is by-definition location aware, and Mathew Ingram examines how location is well used by Google News (and not by newsrooms.)

Meanwhile, Zach Seward of the Wall Street Journal has done some interesting news-related things with Foursquare.

Real time
Emily Bell, formerly of the Guardian and now at Columbia, explains why every news organization needs to be real-time.

For a granular look at how informations spreads in real time, consider Mathew Ingram on “Osama bin Laden and the new ecosystem of news.” For a case study of real-time mobile reporting, we have Brian Stelter’s “What I learned in Joplin.”

 

The editorial search engine

It’s impossible to build a computer system that helps people find or filter information without at some point making editorial judgements. That’s because search and collaborative filtering algorithms embody human judgement about what is important to know. I’ve been pointing this out for years, and it seems particularly relevant to the journalism profession today as it grapples with the digital medium. It’s this observation which is the bridge between the front page and the search results page, and it suggests a new generation of digital news products that are far more useful than just online translations of a newspaper.

It’s easy to understand where human judgement enters into information filtering algorithms, if you think about how such things are built. At some point a programmer writes some code for, say, a search engine, and tests it by looking at the output on a variety of different queries. Are the results good? In what way do they fall short of the social goals of the software? How should the code be changed? It’s not possible to write a search engine without a strong concept of what “good” results are, and that is an editorial judgement.

I bring this up now for two reasons. One is an ongoing, active debate over “news applications” — small programs designed with journalistic intent — and their role in journalism. Meanwhile, for several years Google’s public language has been slowly shifting from “our search results are objective” to “our search results represent our opinion.” The transition seems to have been completed a few weeks ago, when Matt Cutts spoke to Wired about Google’s new page ranking algorithm:

In some sense when people come to Google, that’s exactly what they’re asking for — our editorial judgment. They’re expressed via algorithms. When someone comes to Google, the only way to be neutral is either to randomize the links or to do it alphabetically.

There it is, from the mouth of the bot. “Our editorial judgment” is “expressed via algorithms.” Google is saying that they have and employ editorial judgement, and that they write algorithms to embody it. They use algorithms instead of hand-curated lists of links, which was Yahoo’s failed web navigation strategy of the late 1990s, because manual curation doesn’t scale to whole-web sizes and can’t be personalized. Yet hand selection of articles is what human editors do every day in assembling the front page. It is valuable, but can’t fulfill every need.

Informing people takes more than reporting
Like a web search engine, journalism is about getting people the accurate information they need or want. But professional journalism is built upon pre-digital institutions and economic models, and newsrooms are geared around content creation, not getting people information. The distinction is important, and journalism’s lack of attention to information filtering and organization seems like a big omission, an omission that explains why technology companies have become powerful players in news.

I don’t mean to suggest that going out and getting the story — aka “reporting” — isn’t important. Obviously, someone has to provide the original report that then ricochets through the web via social media, links, and endless reblogging. Further, there is evidence that very few people do original reporting. Last year I counted the percentage of news outlets did their own reporting on one big story, and found that only 13 of 121 stories listed on Google News did not simply copy information found elsewhere. A contemporaneous Pew study of the news ecosystem of Baltimore found that most reporting was still done by print newspapers, with very little contributed by “new media,” though this study has been criticized for a number of potentially serious category problems. I’ve also repeatedly experienced the power that a single original report can have, as when I made a few phone calls to discover that Jurgen Habermas is not on Twitter, or worked with AP colleagues to get the first confirmation from network operators that Egypt had dropped off the internet. Working in a newsroom, obsessively watching the news propagate through the web, I see this every day: it’s amazing how few people actually pump original reports into the ecosystem.

But reporting isn’t everything. It’s not nearly enough. Reporting is just one part of ensuring that important public information is available, findable, and known. This is where journalism can learn something from search engines, because I suspect what we really want is a hybrid of human and algorithmic judgement.

As conceived in the pre-digital era, news is a non-personalized, non-interactive stream of updates about a small number of local or global stories. The first and most obvious departure from this model would be the ability to search within a news product for particular stories of interest. But the search function on most news websites is terrible, and mostly fails at the core task of helping people find the best stories about a topic of interest. If you doubt this, try going to your favorite news site and searching for that good story that you read there last month. Partially this is technical neglect. But at root this problem is about newsroom culture: the primary product is seen to be getting the news out, not helping people find what is there. (Also, professional journalism is really bad at linking between stories, and most news orgs don’t do fine-grained tracking of social sharing of their content, which are two of primary signals that search engines use to determine which articles are the most relevant.)

Story-specific news applications
We are seeing signs of a new kind of hybrid journalism that is as much about software as it is about about reporting. It’s still difficult to put names to what is happening, but terms like “news application” are emerging. There has been much recent discussion of the news app, including a session at the National Institute of Computer-Assisted Reporting conference in February, and landmark posts on the topic at Poynter and NiemanLab. Good examples of the genre include ProPublica’s dialysis facility locator, which combines investigative reporting with a search engine built on top of government data, and the Los Angeles Time’s real-time crime map, which plots LAPD data across multiple precincts and automatically detects statistically significant spikes. Both can be thought of as story-specific search engines, optimized for particular editorial purposes.

Yet the news apps of today are just toes in the water. It is no disrespect to all of the talented people currently working in the field say this, because we are at the beginning of something very big. One common thread in recent discussion of news apps has been a certain disappointment at the slow rate of adoption of the journalist-programmer paradigm throughout the industry. Indeed, with Matt Waite’s layoff from Politifact, despite a Pulitzer Prize for his work, some people are wondering if there’s any future at all in the form. My response is that we haven’t even begun to see the full potential of software combined with journalism. We are under-selling the news app because we are under-imagining it.

I want to apply search engine technology to tell stories. “Story” might not even be the right metaphor, because the experience I envision is interactive and non-linear, adapting to the user’s level of knowledge and interest, worth return visits and handy in varied circumstances. I don’t want a topic page, I want a topic app. Suppose I’m interested in — or I have been directed via headline to — the subject of refugees and internal migration. A text story about refugees due to war and other catastrophes is an obvious introduction, especially if it includes maps and other multimedia. And that would typically be the end of  the story by today’s conventions. But we can do deeper. The International Organization for Migration maintains detailed statistics on the topic. We could plot that data, make it searchable and linkable. Now we’re at about the level of a good news app today. Let’s go further by making it live, not a visualization of a data set but a visualization of a data feed, an automatically updating information resource that is by definition evergreen. And then let’s pull in all of the good stories concerning migration, whether or not our own newsroom wrote them. (As a consumer, the reporting supply chain is not my problem, and I’ve argued before that news organizations need to do much more content syndication and sharing.) Let’s build a search engine on top of every last scrap of refugee-related content we can find. We could start with classic keyword search techniques, augment them by link analysis weighted toward sources we trust, and ingest and analyze the social streams of whichever communities deal with the issue. Then we can tune the whole system using our editorial-judgment-expressed-as-algorithms to serve up the most accurate and relevant content not only today, but every day in the future. Licensed content we can show within our product, and all else we can simply link to, but the search engine needs to be a complete index.

Rather than (always, only) writing stories, we should be trying to solve the problem of comprehensively informing the user on a particular topic. Web search is great, and we certainly need top-level “index everything” systems, but I’m thinking of more narrowly focussed projects. Choose a topic and start with traditional reporting, content creation, in-house explainers and multimedia stories. Then integrate a story-specific search engine that gathers together absolutely everything else that can be gathered on that topic, and applies whatever niche filtering, social curation, visualization, interaction and communication techniques are most appropriate. We can shape the algorithms to suit the subject. To really pull this off, such editorially-driven search engines need to be both live in the sense of automatically incorporating new material from external feeds, and comprehensive in the sense of being an interface to as much information on the topic as possible. Comprehensiveness will keep users coming back to your product and not someone else’s, and the idea of covering 100% of a story is itself powerful.

Other people’s content is content too
The brutal economics of online publishing dictate that we meet the needs of our users with as little paid staff time as possible. That drives the production process toward algorithms and outsourced content. This might mean indexing and linking to other people’s work, syndication deals that let a news site run content created by other people, or a blog network that bright people like to contribute to. It’s very hard for the culture of professional journalism to accept this idea, the idea that they should leverage other people’s work as far as they possibly can for as cheap as they can possibly get it, because many journalists and publishers feel burned by aggregation. But aggregation is incredibly useful, while the feelings and job descriptions of newsroom personnel are irrelevant to the consumer. As Sun Microsystems founder Bill Joy put it, “no matter who you are, most of the smartest people work for someone else,” and the idea that a single newsroom can produce the world’s best content on every topic is a damaging myth. That’s the fundamental value proposition of aggregation — all of the best stuff in one place. The word “best” represents editorial judgement in the classic sense, still a key part of a news organization’s brand, and that judgement can be embodied in whatever algorithms and social software are designed to do the aggregation. I realize that there are economic issues around getting paid for producing content, but that’s the sort of thing that needs to be solved by better content marketplaces, not lawsuits and walled gardens.

None of this means that reporters shouldn’t produce regular stories on their beats, or that there aren’t plenty of topics which require lots of original reporting and original content. But asking who did the reporting or made the content misses the point. A really good news application/interactive story/editorial search engine should be able to teach us as much as we care to learn about the topic, regardless of the state of our previous knowledge, and no matter who originally created the most relevant material.

What I am suggesting comes down to this: maybe a digital news product isn’t a collection of stories, but a system for learning about the world. For that to happen, news applications are going to need to do a lot of algorithmically-enhanced organization of content originally created by other people. This idea is antithetical to current newsroom culture and the traditional structure of the journalism industry. But it also points the way to more useful digital news products: more integration of outside sources, better search and personalization, and story-specific news applications that embody whatever combination of original content, human curation, and editorial algorithms will best help the user to learn.

[Updated 27 March with more material on social signals in search, Bill Joy’s maxim, and other good bits.]
[Updated 1 April with section titles.]

How do we know that short stories do better online?

Yesterday I had an interesting Twitter conversation with Alexis Madrigal, now Technology Editor at The Atlantic, after he posted the following:

@loisbeckett @mat for us, there is a positive correlation between word counts and pageviews. I.e. More words = more views

I asked him if he could say more, and he did:

Continue reading How do we know that short stories do better online?

Short doesn’t mean shallow

Writing style needs to change to take advantage of the hyperlink. That’s the message I want to inject into the discussion about whether deep, long-form writing can survive online — especially long-form journalism. Many people assert that articles need to be shorter online than they are in print, and Nic Carr even famously argues that the internet is making us stupid by destroying our attention span.

But I don’t think the web is shallow at all. I think it’s the deepest medium ever invented, with incredible potential for telling complex, irreducible stories — or at least it can be, if you don’t treat it like print.

Here’s how a long story works in print:

The yellow bits are the unique information that can only be found on that page. The other bits of writing are there to provide the context needed to make sense of the whole , or to get the reader up to speed if they don’t know the backstory. In print, this is absolutely necessary, because a print story is a self-contained object. There is no where else the reader can go to look up an unfamiliar term, to check out a sub-plot, or to investigate the history. That is why we have context paragraphs. That is why we have little definitions of terms that might be unfamiliar to some, and explanations of who a source is and why we should believe them.

Online is different. We can move the context, verification, and background out of the main story, paring the piece down to a thing of streamlined beauty — but all the depth is still there, via links, for anyone who wants it.

When I try to understand how the internet changes communication, one of the points I keep coming back to is the personal nature of online media: it’s now possible to present a different experience to every single reader (user? viewer?) Choosing whether or not to follow a link is a simple way for a reader to tailor the presentation to what they already know, and to indulge their own curiosity. And links let you skip the boring bits.

We’re used thinking of “an article” as a self-contained unit of story. It’s not. The component parts of a story might even be written at different times, published on different sites, or created by different authors.

And now our diagram looks like the web actually looks, lots of pages from different people about different aspects of a very complex world. That is the medium we write in, not some simulation of a stack of paper, no matter what your word processor shows you. Of course, the web also allows fully interactive stories, but it’s often forgotten that hypertext itself is an interactive medium — or it can be, when we put the right links in the right places. People have been experimenting with non-linear stories for decades, but given that a generation or two has now grown up with hypertext, it’s probably time to let our storytelling style grow up too.

Online, short doesn’t necessarily mean shallow. We’re just measuring “depth” wrong when we only look at a single article. That’s not how people actually consume the web, and we shouldn’t force them to.