Twitter Social Network Analysis
By Akshay Java on Thursday, April 19th, 2007 at 1:00 pm.
In the recent series of posts, we have presented a Twitter Google Maps mashup, a Twitter search and buzz-tracking tool called Twitterment, and an analysis of geolocation information from the Twitter dataset. By providing a neat API, Twitter has enabled researchers to get a better understanding of microblogging.
In this post, I have used the Large Graph Layout (LGL) tool to visualize the social network on Twitter. Following is a graph constructed using contacts from about 25K users. A link connects two users if either one has the other as a friend, so the graph is undirected (about 250K edges).
Compare this to the following graph, which is constructed using only users who are mutually acquainted, i.e., A knows B and B also knows A.
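To make the difference between the two constructions concrete, here is a minimal Python sketch. It is not the code used to produce the graphs above; the `friends` mapping (user to the set of users they list as friends), the function names and the toy data are all assumptions made for illustration.

```python
def undirected_edges(friends):
    """Edge between a and b if either one lists the other as a friend."""
    edges = set()
    for user, contacts in friends.items():
        for other in contacts:
            if other != user:
                # frozenset makes (a, b) and (b, a) the same edge
                edges.add(frozenset((user, other)))
    return edges


def mutual_edges(friends):
    """Edge between a and b only if each lists the other as a friend."""
    edges = set()
    for user, contacts in friends.items():
        for other in contacts:
            if other != user and user in friends.get(other, ()):
                edges.add(frozenset((user, other)))
    return edges


# Hypothetical toy data: who lists whom as a friend.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": set(),
    "dave": {"alice"},
}

print(len(undirected_edges(friends)))  # 3 edges: alice-bob, alice-carol, alice-dave
print(len(mutual_edges(friends)))      # 1 edge: alice-bob
```

Even in this tiny example, the mutual graph keeps only a fraction of the edges, which is the same effect that makes the two visualizations look so different.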
I find that visualizing such large graphs is quite a challenge, and gleaning meaningful information from them is even more difficult. However, there are a few insights one can gain from this:
- Interestingly, there are a number of users who are trying to win a popularity contest of some sort! The complete list of users ranked by the number of friends they have is shown here.
- A number of bloggers and (perhaps fake?) celebrity profiles have a huge fan following in Twitter. Here is a list of users ranked by number of followers.
- The two graphs shown above look very different because users with public profiles get a lot of followers whom they might not really know and would hence never add as acquaintances (well, in most cases at least). But to really understand what the differences are, one would need to look at the community structure and properties of the two graphs.
Finally, for completeness, here is a list of users ranked according to their PageRank scores. It is noticeably similar to the rankings generated by Twitterholic. This can be explained by the fact that local metrics (like number of followers) in a social network are a good first-order approximation of rank. Dr. Finin made me aware of research by social network expert Valdis Krebs, who uses “reach” as a measure in human social networks. Here a person’s reach is the number of other people that are within N links in the network, where N is usually 1, 2 or 3 for human networks. So the Twitterholic rank, for example, is the case with N equal to 1.
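As a rough illustration of how reach and PageRank relate, here is a small Python sketch. It is not the code behind the rankings linked above: the `followers` mapping (user to the set of users following them), the toy data, the damping factor of 0.85 and the fixed iteration count are all assumptions. Reach is computed with a breadth-first search limited to N links, and PageRank with a plain power iteration in which each user spreads their score evenly over the people they follow.

```python
from collections import deque


def reach(followers, user, n):
    """Number of other users within n follower links of `user`."""
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == n:
            continue  # do not expand beyond n links
        for follower in followers.get(node, ()):
            if follower not in seen:
                seen.add(follower)
                queue.append((follower, depth + 1))
    return len(seen) - 1  # exclude the user themselves


def pagerank(followers, damping=0.85, iterations=50):
    """Power-iteration PageRank; a follower passes score to those they follow."""
    nodes = set(followers)
    for group in followers.values():
        nodes |= group
    # following_count[u] = number of users that u follows
    following_count = {u: 0 for u in nodes}
    for group in followers.values():
        for follower in group:
            following_count[follower] += 1
    rank = {u: 1.0 / len(nodes) for u in nodes}
    for _ in range(iterations):
        # Users who follow nobody spread their score over everyone.
        dangling = sum(rank[u] for u in nodes if following_count[u] == 0)
        base = (1 - damping + damping * dangling) / len(nodes)
        rank = {
            u: base + damping * sum(
                rank[f] / following_count[f] for f in followers.get(u, ())
            )
            for u in nodes
        }
    return rank


# Hypothetical toy data: user -> set of users who follow them.
followers = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"carol"},
    "carol": {"erin"},
}

print(reach(followers, "alice", 1))  # 3, the plain follower count
print(reach(followers, "alice", 2))  # 4, erin is reached via carol
scores = pagerank(followers)
print(max(scores, key=scores.get))   # alice
```

With N equal to 1, `reach` is just the follower count, which is why a follower-count ranking and a PageRank ranking tend to agree at the top of the list.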
[Thanks Eytan and Matt for suggestions on Graph Visualization tools. Related: Matt, Bruno’s posts on network visualization of Belgian bloggers]




April 7th, 2008 at 8:23 am
I’ve been a fan of twitter from the first time I’ve seen it.
Ever since I’ve learned about the concept, my interest in microblogging and its potential to change the world has been steadily growing.
April 23rd, 2008 at 7:10 pm
Hmm, interesting. Unfortunately, none of your text files seem reachable from the outside world… I get this error message:
“Safari can’t open the page “http://twitterment.umbc.edu/friends.txt” because the server unexpectedly dropped the connection, which sometimes occurs when the server is busy. You might be able to open the page later.”
Would love to see those data points…!
May 28th, 2008 at 12:14 pm
hi Akshay,
Is the link for the users ranked by PageRank an intentionally blank file? I couldn’t see anything…I know the URL is prank.txt, so I’m just wondering if it’s an extension of something Chris is seeing..
July 4th, 2008 at 10:19 am
We had some issues with our Wordpress upgrade and hence the files are no longer reachable. Let me try to correct it and update the appropriate links.