Ebiquity PhD student Akshay Java has been collecting Twitter data since March 2007 and has just written a paper analyzing the subset from April and May 2007.
Why We Twitter: Understanding Microblogging Usage and Communities, Akshay Java, Xiaodan Song, Tim Finin, and Belle Tseng, Joint 9th WEBKDD and 1st SNA-KDD Workshop, August 2007.
His dataset included about 1.35 million posts from over 75,000 users. The paper covers the standard statistics you would expect: usage trends, basic network properties, top hubs and authorities, community structure, and geographic distribution. Akshay’s title pays homage to the earlier paper that asked why we blog, but it also reflects the paper’s key contribution: an attempt to tease out users’ intentions in writing tweets, that is, to analyze why people use Twitter.