Jonathan Stray

Copyright


This work is licensed under a Creative Commons Attribution-No Derivative Works 3.0 United States License.

Information, culture, and belief

The world cannot be represented in machine-readable form

Posted on April 15, 2010; updated April 18, 2010. Tags: artificial intelligence, information, journalism, knowledge, linked data, metadata, ontology, technology

UPDATE: Debrouwere continues the conversation with a response to the key points here, in the comments to his original post.

Dutch journalist/coder Stijn Debrouwere has written a very thorough post describing the ways in which standard tags, like the ones on this blog or on Flickr, fall short when applied to news articles. There are lots of things we might like to know about a story, such as where and when it happened and who was involved. This additional information, sort of like the index to a book, is known as “metadata”, and there is within the online journalism community a great call for its use, including by Debrouwere:

Each story could function as part of a web of knowledge around a certain topic, but it doesn’t.

So here’s a well-intentioned idea you’ve heard before: journalists should start tagging. Jay Rosen insists that “getting disciplined and strategic about tagging may be one way professional journalism separates itself from the flood of cheap content online.” Tags can show how a news article relates to broader themes and topics. Just the ticket.

News metadata is a major topic, and many people have speculated deeply about the value of creating it at the time of reporting, among them the ever-sarcastic Xark and the thoughtful Martin Belam, who writes about why “linked data” is good for journalism. But I’m going to respond to Debrouwere because I read him today, because he has lovely diagrams that explain his good ideas, and because, in criticizing “tags” as a form of metadata, I think he misses some very important points.

And he’s not alone. My sense is that many of the coder-journalists of today have not learned from the mistakes of generations of technically-minded people who wished to talk about the world in more precise ways.

Moving forward from simple tagging, Debrouwere imagines more sophisticated annotation schemes that start to pick up on what the tags actually mean. For starters, the tags could be drawn from separate “vocabularies.” Does a tag refer to a person, or a place, or perhaps an event? Debrouwere uses the following picture, which I’m going to borrow here because it explains the idea so nicely:

[Figure: Debrouwere’s diagram, “5 types of relationships”]
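
To make the vocabulary idea concrete, here is a minimal Python sketch of tags that each carry the vocabulary they are drawn from. It is not Debrouwere’s schema or any real CMS’s API: the names `Vocabulary`, `Tag`, and `tags_in`, and the sample tags themselves, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Vocabulary(Enum):
    # Hypothetical vocabularies, following the person/place/event question above.
    PERSON = "person"
    PLACE = "place"
    EVENT = "event"


@dataclass(frozen=True)
class Tag:
    label: str              # the human-readable tag text
    vocabulary: Vocabulary  # which vocabulary the tag is drawn from


def tags_in(tags, vocabulary):
    """Return the labels of the tags drawn from one vocabulary, in order."""
    return [t.label for t in tags if t.vocabulary is vocabulary]


# Illustrative tags for an imaginary story; not taken from any real article.
story_tags = [
    Tag("Jay Rosen", Vocabulary.PERSON),
    Tag("Amsterdam", Vocabulary.PLACE),
    Tag("Example Conference 2010", Vocabulary.EVENT),
    Tag("Stijn Debrouwere", Vocabulary.PERSON),
]

people = tags_in(story_tags, Vocabulary.PERSON)
```

A flat tag would be just the string "Amsterdam"; the typed version lets software ask which places a story mentions without guessing what each string refers to.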

But, he says, we can get even more sophisticated. What did the story actually say? If it mentioned a person, what did it say about them? Was it an interview? A profile? Did it criticize them? Here’s the diagram he draws:
