<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jonathan Stray &#187; language</title>
	<atom:link href="http://jonathanstray.com/tag/language/feed" rel="self" type="application/rss+xml" />
	<link>http://jonathanstray.com</link>
	<description>Information, Culture, and Belief</description>
	<lastBuildDate>Tue, 15 May 2012 20:13:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Not Quite Global New Year</title>
		<link>http://jonathanstray.com/not-quite-global-new-year</link>
		<comments>http://jonathanstray.com/not-quite-global-new-year#comments</comments>
		<pubDate>Fri, 01 Jan 2010 08:17:22 +0000</pubDate>
		<dc:creator>Jonathan Stray</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[New Year]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://jonathanstray.com/?p=1408</guid>
		<description><![CDATA[Today I have been keeping a Twitter window open, watching messages tagged #10yearsago scroll by. It&#8217;s striking. This is the sort of grass-roots expression of hopes and dreams that adventurous journalists used to travel the world for, and compile into coffee table books. Now we can all see it live for free. aricaaa #10yearsago boys still [...]]]></description>
			<content:encoded><![CDATA[<p>Today I have been keeping a Twitter window open, watching messages tagged #10yearsago scroll by. It&#8217;s striking. This is the sort of grass-roots expression of hopes and dreams that adventurous journalists used to travel the world for, and compile into coffee table books. Now we can all see it live for free.</p>
<blockquote><p><span class="status-body"><a class="tweet-url screen-name" onclick="pageTracker._trackPageview('/exit/to/aricaaa');" href="http://twitter.com/aricaaa">aricaaa</a> <span id="msgtxt7264131882" class="msgtxt en"><a class="tweet-url hashtag" title="#10yearsago" href="http://twitter.com/search?q=%2310yearsago"><strong>#10yearsago</strong></a> boys still had cooties. ah i miss those days!</span></span></p>
<p><span class="status-body"><a class="tweet-url screen-name" onclick="pageTracker._trackPageview('/exit/to/davidwees');" href="http://twitter.com/davidwees">davidwees</a> <span id="msgtxt7264176721" class="msgtxt en">Happy New Year! <a class="tweet-url hashtag" title="#10yearsago" href="http://twitter.com/search?q=%2310yearsago"><strong>#10yearsago</strong></a> today I was in a dead-end job working in a warehouse.  Now I love what I do and have a great family.</span></span></p>
<p><span class="status-body"><span id="msgtxt7264172915" class="msgtxt en"><a class="tweet-url username" onclick="pageTracker._trackPageview('/exit/to/scottharrison')" href="http://twitter.com/scottharrison">scottharrison</a>: <a class="tweet-url hashtag" title="#10yearsago" href="http://twitter.com/search?q=%2310yearsago"><strong>#10yearsago</strong></a> I was a sycophant and a drunk selling vodka to bankers in clubs. Grateful for God&#8217;s grace and sense of humor.</span></span></p>
<p><span class="status-body"><a class="tweet-url screen-name" onclick="pageTracker._trackPageview('/exit/to/Sirenism');" href="http://twitter.com/Sirenism">Sirenism</a> <span id="msgtxt7264038718" class="msgtxt en"><a class="tweet-url hashtag" title="#10yearsago" href="http://twitter.com/search?q=%2310yearsago"><strong>#10yearsago</strong></a> I was eleven and one of my brothers friends tried to kiss me at midnight. I punched him in the nuts.</span></span></p>
<p><span class="status-body"><a class="tweet-url screen-name" onclick="pageTracker._trackPageview('/exit/to/cosmicjester');" href="http://twitter.com/cosmicjester">cosmicjester</a> <span id="msgtxt7264095755" class="msgtxt en">Holy Shit <a class="tweet-url hashtag" title="#10yearsago" href="http://twitter.com/search?q=%2310yearsago"><strong>#10yearsago</strong></a> I met a girl at a friends birthday party, we both liked Red Dwarf and the Beatles. Then she became the girl.</span></span></p>
<p><span class="status-body"><a class="tweet-url screen-name" onclick="pageTracker._trackPageview('/exit/to/shaunraney');" href="http://twitter.com/shaunraney">shaunraney</a> <span id="msgtxt7264144815" class="msgtxt en"><a class="tweet-url hashtag" title="#10yearsago" href="http://twitter.com/search?q=%2310yearsago"><strong>#10yearsago</strong></a> was the the saddest day of my life.</span></span></p></blockquote>
<p>As striking as this is, I notice that almost all of the traffic is in English. The only other language reasonably well represented is Indonesian. Curious, though Indonesia is the 4th largest country by population, and social media are hugely popular here.</p>
<p>I&#8217;ve also really enjoyed watching the clock strike midnight in different time zones. Here in Jakarta, the NYE conversations of my friends in California &#8212; 15 hours behind &#8212; seem so last night. I&#8217;m nursing a hangover, they&#8217;re working on one.</p>
<p>It&#8217;s so easy to forget the world outside what you know. I hope that global media like Twitter will help us to remember everyone else. The technological means have arrived with a roar, but we&#8217;re still not really talking to one another. What is the next step?</p>
]]></content:encoded>
			<wfw:commentRss>http://jonathanstray.com/not-quite-global-new-year/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How Many World Wide Webs Are There?</title>
		<link>http://jonathanstray.com/how-many-webs</link>
		<comments>http://jonathanstray.com/how-many-webs#comments</comments>
		<pubDate>Wed, 04 Feb 2009 23:53:51 +0000</pubDate>
		<dc:creator>Jonathan Stray</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[belief]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[technology]]></category>
		<category><![CDATA[translation]]></category>

		<guid isPermaLink="false">http://jonathanstray.com/?p=257</guid>
		<description><![CDATA[How much overlap is there between the web in different languages, and what sites act as gateways for information between them? Many people have constructed partial maps of the web (such as the  blogosphere map by Matthew Hurst, above) but as far as I know, the entire web has never been systematically mapped in terms [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://datamining.typepad.com/gallery/blog-map-gallery.html"><img class="alignnone size-medium wp-image-328" title="newblog-crop" src="http://jonathanstray.com/wp-content/uploads/2009/02/newblog-crop-300x274.png" alt="newblog-crop" width="300" height="274" /></a></p>
<p>How much overlap is there between the web in different languages, and what sites act as gateways for information between them? Many people have constructed partial maps of the web (such as the  <a href="http://datamining.typepad.com/gallery/blog-map-gallery.html">blogosphere map</a> by Matthew Hurst, above) but as far as I know, the entire web has never been systematically mapped in terms of language.</p>
<p>Of course, what I actually want to know is, how connected are the different cultures of the world, really? We live in an age where the world seems small, and in a strictly technological sense it is. I have at my command this very instant not one but several enormous international communications networks; I could email, IM, text message, or call someone in any country in the world. And yet I very rarely do.</p>
<p>Similarly, it&#8217;s easy to feel like we&#8217;re surrounded by all the international information we could possibly want, including direct access to foreign news services, but I can only read articles and watch reports in English. As a result, information is firewalled between cultures; there are questions that could very easily be answered by any one of tens or hundreds of millions of native speakers, yet are very difficult for me to answer personally. For example, what is the journalistic slant of <a href="http://www.aljazeera.net/portal">al-Jazeera</a>, the original one in Arabic, not the <a href="http://www.aljazeera.com/index.html">English version</a> which is produced by a completely different staff?  Or, suppose I wanted to know what the average citizen of Indonesia thinks of the sweatshops there, or what is on the front page of the Shanghai Times today&#8211; and does such a newspaper even exist? What is written on the 70% of web pages that are not in English?</p>
<p><span id="more-257"></span>We all live on the same physical planet, but the information worlds we inhabit must be vastly different. There are many reasons for this other than language, but language alone is enough to isolate humanity from itself.</p>
<p>And so, my question: how many islands are there in our multi-cultural information space, and how are they connected? I am willing to bet that a full-scale web map would show several large networks in the <a title="What languages are web pages in?" href="http://www.internetworldstats.com/stats7.htm">main languages of the web</a> &#8212; English, Chinese, Spanish, Japanese, German, etc. &#8212; but only a few connections between them: web sites frequented by bilingual or bi-cultural individuals, who after all are the true gateways between cultures. The structure of the interconnections might tell us something about the relationships between cultures, and the actual number of links might provide some measure of how close or how far apart we actually are. The individual URLs themselves would also be extremely valuable information, representing high-bandwidth links between cultures, the trans-oceanic fiber between continents in the infosphere.</p>
<p>There is a second geography to the world that we&#8217;ve never seen. I don&#8217;t even know what I&#8217;m missing.</p>
<p>Creating such a map would be a trick, but by no means out of the reach of an academic project or a small company. <a href="http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html">Google</a> says there are currently over one trillion (10^12) unique web pages (for their particular definition of &#8220;unique&#8221;, which is more complex than it might seem). Unlike a search engine, a language-based web map does not require the full contents of every page, merely the outgoing URLs and a discrete categorization of the language (which can be <a title="TextCat, mmrow!" href="http://odur.let.rug.nl/~vannoord/TextCat/">automatically determined</a> even without any document meta-data). Assuming that each URL is assigned a unique 32-bit ID, carries another 32 bits for language and other info, and links to an average of 20 other pages (<a href="http://uclue.com/index.php?xq=1015">estimates vary</a>), this is about 100 terabytes of data &#8212; or perhaps $15,000 worth of storage at current prices. This index could be created from a fresh crawl, or by parsing an existing one, such as from the folks at the brand new and very awesome <a title="Sooo Awesome!" href="http://www.dotnetdotcom.org/">DotBot open index of the web</a>.</p>
<p>The next step would be to generate the visualization of such a massive data set. The complete graph could be laid out in two or three dimensions using existing <a href="http://www.informatik.tu-cottbus.de/~an/GD/">clustering methods</a>. The resulting map could be traversed using GPU-accelerated rendering techniques for very large data sets, probably after some sort of hierarchical pre-processing that produces proxies for zoomed-out views of the network. A usable UI would be crucial; the entire map needs to be navigable at multiple scales and composed of live, hyperlinked objects. The right visualization also depends on what you are trying to discover; ultimately, there can be no single map because the choice of visualization is dependent upon usability and aesthetics, as the huge variety of beautiful maps at <a href="http://www.visualcomplexity.com/vc/">Visual Complexity</a> demonstrates.</p>
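<p>The arithmetic behind that storage figure can be sketched in a few lines of Python. The function and parameter names here are illustrative and not taken from any library. One assumption is worth flagging: a 32-bit ID can only distinguish about 4.3 billion pages, so a graph of a full trillion pages would need IDs of at least 40 bits. The sketch computes both cases, and the total stays near 100 terabytes either way.</p>

```python
# Back-of-envelope storage estimate for a language-annotated link graph.
# The page count, field widths, and link count come from the post;
# the function and parameter names are illustrative assumptions.

def link_graph_bytes(pages, id_bits=32, info_bits=32, links_per_page=20):
    """Bytes for one ID, one info field, and the outgoing link IDs of every page."""
    bits_per_page = id_bits + info_bits + links_per_page * id_bits
    return pages * bits_per_page // 8

PAGES = 10**12  # "over one trillion" unique pages, per the Google figure cited above

estimate = link_graph_bytes(PAGES)
print(estimate / 10**12, "TB with 32-bit IDs")

# A 32-bit ID distinguishes only 2**32 (about 4.3 billion) pages, so a
# trillion-page graph would need wider IDs. 40 bits is the assumed fix here.
wider = link_graph_bytes(PAGES, id_bits=40)
print(wider / 10**12, "TB with 40-bit IDs")
```

<p>With the post's numbers, each page costs 88 bytes (4 for its ID, 4 for language and other info, 80 for twenty outgoing links), which is where the figure of roughly 100 terabytes comes from.</p>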
<p>The analysis could go much deeper with more computing power. Machine translation is currently poor, but it is probably good enough to detect whether one document is a translation of another. With this capability, we would actually be able to quantify the percentage of (public) textual information that makes it from one language into another and identify the key organizations that act as conduits. Further study might reveal fascinating things, such as selection biases in the types of news or information that get translated. The implications for differences in belief between cultures are obvious.</p>
<p>Yet even  a &#8220;links only&#8221; data set could still answer some highly revealing questions,  such as &#8220;what percentage of web sites are visited by people from multiple cultures?&#8221; or even &#8220;what is the best gateway between Polish and English film reviews?&#8221; This could be done without visualization, but it would be a mistake not to draw the actual maps.  Not only do pictures engage our spatial reasoning in a way that raw bits never can, but such a map would re-make an obvious point that is too often lost: in terms of communication between cultures, the world is not nearly as small or interconnected as we&#8217;d like to think it is.</p>
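<p>To make the &#8220;links only&#8221; idea concrete, here is a minimal Python sketch of the kind of query such a data set would support. Everything in it is hypothetical: the toy page IDs, the language codes, and the function names are made up for illustration, and a real index would need a far more compact representation. Note also that it counts hyperlinks, which are only a proxy for what people actually visit.</p>

```python
from collections import Counter

# Hypothetical toy data: page ID -> detected language, plus (source, target) links.
page_language = {1: "en", 2: "en", 3: "pl", 4: "pl", 5: "ja"}
links = [(1, 2), (1, 3), (2, 1), (3, 4), (4, 1), (4, 3), (5, 5)]

def language_link_counts(page_language, links):
    """Count links for each (source language, target language) pair."""
    counts = Counter()
    for source, target in links:
        counts[(page_language[source], page_language[target])] += 1
    return counts

def cross_language_fraction(counts):
    """Fraction of all links whose two endpoints are in different languages."""
    total = sum(counts.values())
    crossing = sum(n for (src, dst), n in counts.items() if src != dst)
    return crossing / total if total else 0.0

def gateway_pages(page_language, links, from_lang, to_lang):
    """Pages in from_lang, ranked by how many links they send into to_lang."""
    ranked = Counter(
        source for source, target in links
        if page_language[source] == from_lang and page_language[target] == to_lang
    )
    return ranked.most_common()

counts = language_link_counts(page_language, links)
print(counts)
print(cross_language_fraction(counts))
print(gateway_pages(page_language, links, "pl", "en"))
```

<p>In this toy graph two of the seven links cross a language boundary, and page 4 is the only Polish page linking into English. The same counting over a real crawl would give the language-to-language link matrix and a ranked list of candidate gateway sites.</p>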
]]></content:encoded>
			<wfw:commentRss>http://jonathanstray.com/how-many-webs/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>In Ur Suburb, Selling U Burgers</title>
		<link>http://jonathanstray.com/in-ur-suburb-selling-u-burgers</link>
		<comments>http://jonathanstray.com/in-ur-suburb-selling-u-burgers#comments</comments>
		<pubDate>Tue, 19 Aug 2008 07:04:02 +0000</pubDate>
		<dc:creator>Jonathan Stray</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cats]]></category>
		<category><![CDATA[culture]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[lolspeak]]></category>
		<category><![CDATA[omg!!]]></category>

		<guid isPermaLink="false">http://jonathanstray.com/?p=39</guid>
		<description><![CDATA[OMG!!! The local Burger King Slurpee machine sez: &#8220;cool it with ur fav flav.&#8221; Lolspeak is in ur multi-national! Corporate America can has teh funny? No wai, only wants sell cheezburgers! Co-option of kulcher? Mebbeh. Or mebbeh teh kittehs win! LOL! Mai theweh, let me show you it. Lolspek is awsum meme, like new languish, [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://jonathanstray.com/wp-content/uploads/2008/08/coolitwithurfavflav.jpg"><img class="alignnone size-full wp-image-59 aligncenter" title="coolitwithurfavflav" src="http://jonathanstray.com/wp-content/uploads/2008/08/coolitwithurfavflav.jpg" alt="Burger King has gone lol!!!" width="284" height="241" /></a></p>
<p>OMG!!! The local Burger King Slurpee machine sez: &#8220;cool it with ur fav flav.&#8221; <a href="http://icanhascheezburger.com">Lolspeak</a> is in ur multi-national!</p>
<p>Corporate America can has teh funny? No wai, only wants sell cheezburgers! Co-option of kulcher? Mebbeh. Or mebbeh teh kittehs win!</p>
<p>LOL!</p>
<p>Mai theweh, let me show you it. Lolspek is awsum meme, like new languish, even haz <a href="http://speaklolspeak.com/">dik-shun-ary</a>. Teh hoomans who lieks kittehs lieks cheezburgers too, so teh burger stoar lurnz lolspeaks.</p>
<p>But mebbe if enuf mawket-urs has teh lolspeak, iz not &#8220;authentic&#8221; n e moar? I doan no! Kwestions of &#8220;authenticity&#8221; in kulcher make mai hed asplode!</p>
]]></content:encoded>
			<wfw:commentRss>http://jonathanstray.com/in-ur-suburb-selling-u-burgers/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Americans Have Only Their Own Culture</title>
		<link>http://jonathanstray.com/americans-have-only-their-own-culture</link>
		<comments>http://jonathanstray.com/americans-have-only-their-own-culture#comments</comments>
		<pubDate>Wed, 16 Jul 2008 01:25:10 +0000</pubDate>
		<dc:creator>Jonathan Stray</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[America]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://jonathanstray.com/?p=27</guid>
		<description><![CDATA[The whole world watches Hollywood movies. I once found X-Men 2 on cable in Oman, the sex and violence airing between the preaching Imams. The whole world reads Western books, either in English or translation. The Da Vinci Code graces the dirty blankets of sidewalk booksellers in Mumbai, and Harry Potter is truly global. Those [...]]]></description>
			<content:encoded><![CDATA[<p>The whole world watches Hollywood movies. I once found <em>X-Men 2</em> on cable in Oman, the sex and violence airing between the preaching Imams. The whole world reads Western books, either in English or translation. <em>The Da Vinci Code</em> graces the dirty blankets of sidewalk booksellers in Mumbai, and <em>Harry Potter</em> is truly global.</p>
<p>Those who don&#8217;t live in America are lucky. They have at least two cultures: their own, and the American imports. Those who live within America are impoverished by comparison. Americans have to go well out of their way to consume media made by people who aren&#8217;t like them. We have to go to the &#8220;Foreign&#8221; section of the video store. We have to suffer through languages we don&#8217;t understand, because we are taught only English in schools.</p>
<p>This same effect is repeated on a smaller scale with regional cultural capitals. In Southeast Asia, all the good movies come from Thailand. In Nepal, everything is from India. South Africa produces most of the African media, while Qatar and Egypt supply the Arab world. In every case, media in the minority countries is often much more diverse, drawing from many sources.</p>
<p>Maybe this is imperialism. Maybe this is a bad thing. Maybe every people should be producing their own entertainments just as furiously as Hollywood. Maybe. My point is only this: if you live outside of the Empire, the Empire comes to you. But if you live inside, you have to look to find the rest of the world.</p>
]]></content:encoded>
			<wfw:commentRss>http://jonathanstray.com/americans-have-only-their-own-culture/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

