Evan Williams TED talk on Twitter

February 28th, 2009

Twitter founder Evan Williams gave a TED talk earlier this month on how Twitter’s growth is driven by unexpected uses. His eight-minute talk touched on twittering during dramatic events, political uses, services enabled by their API and the emergence of conventions like @reply and #hashtag.

Ian Davis code{4}lib keynote: data outlasts code

February 27th, 2009

Ian Davis, CTO of Talis, posted the slides from his code4lib2009 keynote talk on slideshare. If you love something… set it free gives a very nicely done description of the motivation behind and hopes for the Semantic Web.

Code{4}lib is a conference series and community focused on the intersection of libraries, technology, and the future. code4lib2009 was held this week in Providence, hosted by the Brown University Library.

Ian’s talk contained three conjectures, the first of which I especially liked:

  • Conjecture 1: Data outlasts code
  • Conjecture 2: There is more structured data in the world than unstructured
  • Conjecture 3: Most of the value in our data will be unexpected and unintended

(h/t Danny Ayers)

Unlocked developer Android G1 hobbled

February 26th, 2009

Macworld reports, in Google blocks paid apps for unlocked G1 users, that Google made a recent change in the capabilities of the unlocked G1 Android phone.

“People who bought an unlocked version of the Android G1 phone are no longer allowed to download new paid applications from the Market, after a change Google made late last week. Google is prohibiting users of the unlocked phones from viewing copy-protected applications, including those that cost to download.”

Gizmodo describes the reason, or at least one very plausible one.

“The problem lies in the phone’s full software permissions. Consumer Android phones download paid content to a private, hidden apps folder, inaccessible to the user. Thing is, as is stands, this normally inaccessible folder is accessible on the dev phones. Not only does this let people flat out copy and redistribute apps—it enables a sort of app laundering scam, in which someone buys an app, copies it to another location, and gets a refund for the app (as per the Marketplace’s 24-hour return policy), only to reinstall the copied version later.”

We purchased an unlocked G1 last month and are using it in several research projects. Not being able to access the paid apps should not be a showstopper, but it would be nice to try some out, so I hope a solution to this problem can be worked out soon.

Twitter-Calais mashup tracks IL-5 election buzz

February 24th, 2009

WindyCitizen.com is “a crowd-powered front page for the Windy City” that “brings Chicagoans the best of the local web by letting them share, rate and discuss their favorite local news, photos, videos and more.”

Their Windy City Twitter Tracker mashup uses Open Calais as a named entity recognizer to track Tweets about candidates in the special election to fill the US House seat for Chicago’s 5th district that Rahm Emanuel vacated. Calais might be overkill for this, since there is a small set of known candidates, but it’s an impressive semantic mashup nonetheless.

“We’re searching Twitter constantly to keep you up to date with the conversation about the IL-5 special election. The graph above lets you track buzz about the candidates over the last two weeks.”

The Windy City Twitter Tracker is probably written to be easily repurposed, judging from the Web site, which describes it as currently tracking the “Race for the 5th”. The mashup is credited to Whattech.
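To illustrate why Calais might be overkill when the candidates are known in advance, here is a minimal sketch of the simpler alternative: plain keyword matching on last names. This is not how the Windy City tracker works; the candidate names, the sample tweets and the count_mentions function are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical candidate names; the tracker's real list is not given in this post.
CANDIDATES = ["Alice Alvarez", "Bob Baker", "Carol Chen"]


def count_mentions(tweets, candidates=CANDIDATES):
    """Count how many tweets mention each candidate's last name (case-insensitive)."""
    patterns = {
        name: re.compile(r"\b" + re.escape(name.split()[-1]) + r"\b", re.IGNORECASE)
        for name in candidates
    }
    counts = Counter({name: 0 for name in candidates})
    for text in tweets:
        for name, pattern in patterns.items():
            if pattern.search(text):
                counts[name] += 1
    return counts


# Made-up sample tweets, standing in for the results of a Twitter search.
sample_tweets = [
    "Just saw Baker speak at the IL-5 forum",
    "alvarez and Baker both polling well",
    "No opinion on the race yet",
]

for name, n in count_mentions(sample_tweets).items():
    print(name, n)
```

Matching on last names breaks down as soon as a name is ambiguous or shared with a common word, which is where a real named entity recognizer starts to earn its keep.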

Facebook is eating our children’s brains

February 24th, 2009

“Facebook harms children’s brains.” That was the alarming meta-headline I saw this morning when I looked at the Guardian’s Web site.

The story, Facebook and Bebo risk ‘infantilising’ the human mind, starts off

“Social network sites risk infantilising the mid-21st century mind, leaving it characterised by short attention spans, sensationalism, inability to empathise and a shaky sense of identity, according to a leading neuroscientist.

The startling warning from Lady Greenfield, professor of synaptic pharmacology at Lincoln college, Oxford, and director of the Royal Institution, has led members of the government to admit their work on internet regulation has not extended to broader issues, such as the psychological impact on children.”

I wish Lady Greenfield would tell us what she really thinks.

I can remember when it was comic books that were corrupting the minds of a new generation. Then it was the music (it’s almost always the music), TV, birth control, hippie values, MTV, consumerism, texting and now this. And Professor Greenfield may not even be aware of Twitter!

I’d like to see some data.

Persistent Identifiers for Earth Science Provenance

February 23rd, 2009

In this week’s ebiquity meeting (10:00am EDT Wed 2/25, ITE 325), Curt Tilmes will talk on “Persistent Identifiers for Earth Science Provenance”.

Historically, published scientific research could include a description of an experiment that an independent party could use to reproduce the experiment with the same results, confirming the research. Modern research in the field of earth science often depends on terabytes of data captured from remote sensing instruments and on complex computer algorithms that undergo numerous changes over the years. A single result could be the product of the work of hundreds of individuals over decades. The representation of the measurements, algorithms and all the other artifacts of experimentation leading to that result becomes a daunting problem. A key to handling this representation is a good scheme for persistent identifiers.

Persistent identifiers seem like a simple problem. Just make a good URL and don’t change it [1]. This sounds good in theory, but is difficult to maintain forever. Many other schemes have been proposed to attack various aspects of the problem of identification, with various advantages and disadvantages. I will introduce this topic and briefly describe some of the concerns with using identifiers specifically in the context described above, and some of the characteristics of various identifier schemes.

The presentation will be streamed live via ustream.tv

References and some identifier schemes

[1] Cool URIs Don’t Change
[2] Naming and Addressing: URIs, URLs, …
[3] Object Identifier (OID)
[4] The Digital Object Identifier (DOI) System
[5] Persistent Uniform Resource Locator
[6] A Universally Unique IDentifier (UUID) URN Namespace
[7] XRI (Extensible Resource Identifier)
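
As a small illustration of one scheme on the list, the UUID URN namespace [6], here is a sketch using Python's standard uuid module. The dataset URL and the two helper functions are hypothetical and only show the difference between a name-based identifier, which can be recomputed from a URL, and a random one, which is independent of any location. It says nothing about which scheme the talk recommends.

```python
import uuid

# Hypothetical dataset URL, used only for illustration.
DATASET_URL = "http://example.org/data/granule-0001"


def name_based_urn(url):
    """Derive a stable version 5 (SHA-1, name-based) UUID URN from a URL."""
    return uuid.uuid5(uuid.NAMESPACE_URL, url).urn


def random_urn():
    """Mint a fresh version 4 (random) UUID URN, unrelated to any URL."""
    return uuid.uuid4().urn


# The same URL always yields the same identifier.
print(name_based_urn(DATASET_URL))
# Each call yields a new identifier that must be recorded somewhere to be useful.
print(random_urn())
```

The name-based form inherits the weakness of the URL it is derived from: if the URL changes, recomputing the identifier gives a different answer, so the mapping still has to be stored.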

Republicans vs. Democrats in Python

February 22nd, 2009

In the US, we’re pretty much locked into a two party system. It’s not that the two parties are founded on two opposing political philosophies — they can and do switch positions — but that once a two party system takes hold, it’s hard to dislodge.

Game theory suggests that a two party system will promote partisanship, especially if there are party-run primary elections. As a result, each party is more likely to elect a more partisan candidate than a centrist.

Peter Norvig has an interesting post, Lieberman, Egg, Sausage and Lieberman, exploring a simple simulation. He was motivated by an earlier post by Nate Silver, Land of a Thousand Liebermans, on the fivethirtyeight blog. Norvig wrote some simple Python code to vary some of the assumptions in Silver’s simple model.

This is a great example of using simple programs in Python to explore ideas. We are switching our CS 101 course to Python in the fall and something like this could make an interesting project.
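
To give a flavor of what such a project might look like, here is a toy simulation of my own. It is not Norvig's code or Silver's model. It assumes voters sit on a one-dimensional ideology scale, that everyone left of zero belongs to one party and everyone else to the other, and that each contest is won by the candidate closest to the median voter of that electorate. All names and parameters are made up.

```python
import random
import statistics


def closest(candidates, target):
    """Return the candidate position nearest to the target position."""
    return min(candidates, key=lambda c: abs(c - target))


def simulate(n_voters=1001, n_candidates=5, seed=0):
    """Return (distance from center with primaries, distance in one open contest)."""
    rng = random.Random(seed)
    voters = [rng.gauss(0, 1) for _ in range(n_voters)]
    left = [v for v in voters if v < 0]
    right = [v for v in voters if v >= 0]

    # Each party fields candidates drawn from its own members.
    left_cands = rng.sample(left, n_candidates)
    right_cands = rng.sample(right, n_candidates)

    # Party-run primaries: each nominee is closest to the party's own median.
    left_nominee = closest(left_cands, statistics.median(left))
    right_nominee = closest(right_cands, statistics.median(right))

    # General election: the winner is closest to the overall median voter.
    center = statistics.median(voters)
    with_primaries = closest([left_nominee, right_nominee], center)

    # Comparison: a single open contest among all ten candidates.
    open_contest = closest(left_cands + right_cands, center)

    return abs(with_primaries - center), abs(open_contest - center)


trials = [simulate(seed=s) for s in range(200)]
print("mean distance from center, with primaries:", statistics.mean(t[0] for t in trials))
print("mean distance from center, open contest:  ", statistics.mean(t[1] for t in trials))
```

Under these assumptions the open contest can never pick a winner farther from the center than the primary system does, since the two nominees are drawn from the same pool of candidates. The interesting part for students would be changing the assumptions (turnout, more parties, different voter distributions) and seeing what survives.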

On the Facebook economy

February 22nd, 2009

Nobody seems to work much on Facebook

Stimulus Watch: propose and vote on shovel ready projects

February 20th, 2009

Stimulus Watch is a new wiki-like site that is intended to “help the new administration keep its pledge to invest stimulus money smartly, and to hold public officials to account for the taxpayer money they spend.”

“We do this by allowing you, citizens around the country with local knowledge about the proposed “shovel-ready” projects in your city, to find, discuss and rate those projects. These projects are not part of the stimulus bill. They are candidates for funding by federal grant programs once the bill passes.”

The site lets you search for programs by keyword or browse by geographic region or project type. When you find a program of interest, you can vote on whether you believe the project is critical or not, post a comment in the conversation about the project and even edit the project’s description and points in favor or against.

The most expensive project proposed is one that suggests building a new energy efficiency industrial zone on 100 acres in Cidra PR for $17.5B. That’s a lot of shovels, but only three percent of voters thought it was critical! The project currently most favored by the Stimulus Watch community is one suggesting that new nursing homes be constructed around the country for veterans. 78% of the voters thought that this was a good way to spend $4.3M and create 310 jobs.

Facebook blinks, reverts to old Terms of Service agreement

February 18th, 2009

Late last night Facebook CEO Mark Zuckerberg announced in a blog post, Update on Terms, that they have rolled back the recent changes to their Terms of Service agreement and restored the previous one.

“Many of us at Facebook spent most of today discussing how best to move forward. One approach would have been to quickly amend the new terms with new language to clarify our positions further. Another approach was simply to revert to our old terms while we begin working on our next version. As we thought through this, we reached out to respected organizations to get their input.

Going forward, we’ve decided to take a new approach towards developing our terms. We concluded that returning to our previous terms was the right thing for now. As I said yesterday, we think that a lot of the language in our terms is overly formal and protective so we don’t plan to leave it there for long.”

The NYT reported the change in a story today, Facebook Withdraws Changes in Data Use.

In his post, Zuckerberg continued by observing that with 175 million members, if Facebook were a country, it would be the sixth most populated one in the world. Of course, sometimes a population revolts and lays claim to certain unalienable rights, among them being life, liberty, pursuit of happiness and ownership of one’s online content.

So, the missing clause is back in the FB TOS:

“You may remove your User Content from the Site at any time. If you choose to remove your User Content, the license granted above will automatically expire, however you acknowledge that the Company may retain archived copies of your User Content.”

This revision is dated 23 September 2008. Curiously, I checked the Internet Archive to review the history of FB’s TOS but found that there are no archived copies after 12 October 2007. I can only imagine that FB asked the Internet Archive to stop saving copies of this public page. I note that the last archived copies of many of their public pages (e.g., privacy policy, developers page, etc.) are also from 2007. These pages are not blocked by the FB robots.txt and are normally accessible to anyone, so it must be by a specific request that they not be archived.
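
For anyone who wants to repeat the robots.txt part of this check, here is a rough sketch using Python's standard urllib.robotparser module. The robots.txt content below is a made-up sample, not Facebook's actual file, and both the example URLs and the use of ia_archiver as the archive crawler's user-agent name are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration; NOT Facebook's actual file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on a placeholder domain.
print(is_allowed(SAMPLE_ROBOTS_TXT, "ia_archiver", "http://example.com/terms.php"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "ia_archiver", "http://example.com/private/page"))
```

To test a live site, RobotFileParser can also load the file directly with set_url() followed by read(). A page that passes this check but is still missing from the archive suggests the exclusion was arranged some other way.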

That’s too bad. Having an easy way to see how the policies of important social sites like FB evolve would be a great resource to those who study online social media as well as to many curious users.

Twitter as the Web stream of consciousness

February 15th, 2009

TechCrunch has a post, Mining The Thought Stream, on why Twitter continues to be hot even though it doesn’t yet have a business case. The argument is that Twitter has found a niche that none of the search engines covers well: providing visibility over the stream of consciousness of the Web. The final graf caught my attention:

“An undifferentiated thought stream of the masses at some point becomes unwieldy. In order to truly mine that data, Twitter needs to figure out how to extract the common sentiments from the noise (something which Summize was originally designed to do, by the way, but it was putting the cart before the horse—you need to be able to do simple searches before you start looking for patterns). But what is the best way to rank real-time search results—by number of followers, retweets, some other variable? It is not exactly clear. But if Twitter doesn’t solve this problem, someone else will and they will make a lot of money if they do it right.”

Akshay looked at the problem of analyzing tweets back in 2007 (see Why We Twitter: Understanding Microblogging Usage and Communities). One difficulty is that tweets are necessarily short and telegraphic. This makes it hard to do any linguistic analysis with good accuracy. But maybe if you can apply some background knowledge…

Facebook owns your content. All of it. Forever.

February 15th, 2009

2/18 Update: FB reverted its TOS to the previous version early on 18 Feb 2009.

Consumerist has a post on a change in Facebook’s Terms of Service agreement that became effective on 4 February: Facebook’s New Terms Of Service: “We Can Do Anything We Want With Your Content. Forever.”

Both the new Facebook TOS and the previous TOS made these aggressive claims on your content.

“You hereby grant Facebook an irrevocable, perpetual, non-exclusive, transferable, fully paid, worldwide license (with the right to sublicense) to (a) use, copy, publish, stream, store, retain, publicly perform or display, transmit, scan, reformat, modify, edit, frame, translate, excerpt, adapt, create derivative works and distribute (through multiple tiers), any User Content you (i) Post on or in connection with the Facebook Service or the promotion thereof subject only to your privacy settings or (ii) enable a user to Post, including by offering a Share Link on your website and (b) to use your name, likeness and image for any purpose, including commercial or advertising, each of (a) and (b) on or in connection with the Facebook Service or the promotion thereof.”

That was bad enough, but at least Facebook relinquished those rights on your content if you dropped out. But no longer. The following clause from the old TOS has been dropped.

“You may remove your User Content from the Site at any time. If you choose to remove your User Content, the license granted above will automatically expire, however you acknowledge that the Company may retain archived copies of your User Content.”

Just to make it absolutely clear how screwed you are, the new TOS also adds the following.

“The following sections will survive any termination of your use of the Facebook Service: Prohibited Conduct, User Content, Your Privacy Practices, Gift Credits, Ownership; Proprietary Rights, Licenses, Submissions, User Disputes; Complaints, Indemnity, General Disclaimers, Limitation on Liability, Termination and Changes to the Facebook Service, Arbitration, Governing Law; Venue and Jurisdiction and Other.”

By the way, if you’ve used Facebook in any way since 4 February, you have already accepted the new TOS.

“We reserve the right, at our sole discretion, to change or delete portions of these Terms at any time without further notice. Your continued use of the Facebook Service after any such changes constitutes your acceptance of the new Terms.”

And if you want to take them to court, Fugetaboutit.

“Except as set forth in the paragraph below, you agree that all claims and disputes between you and Facebook that arise out of or relate in any way to the Terms or your use of the Facebook Service will be resolved either by (a) binding arbitration by a single arbitrator in Santa Clara County, California or (b) binding non-appearance based arbitration conducted by telephone, online or based solely on written submission.”

All your base are belong to Facebook.