ICWSM-2007 Weblog dataset released
Tim Finin, 1:00pm 8 September 2006
The International Conference on Weblogs and Social Media (26-28 March 2007, Boulder CO, USA) is offering a large blog dataset to conference participants. The release comprises a complete set of weblog posts collected by Nielsen BuzzMetrics for May 2006: about 14M posts in XML format from 3M weblogs, annotated with 1.7M blog-to-blog links. The marked-up fields include date of posting, time of posting, author name, post title, weblog URL, permalink, tags/categories, and outlinks classified by type. The compressed dataset is over 10GB. In addition to the data, the conference organizers hope to release processing code and a shared repository for those making use of the dataset. Details on requesting the dataset are available online.
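For readers wondering what working with such records might look like, here is a minimal Python sketch that reads the fields listed above from a single post. The announcement does not give the XML schema, so every element and attribute name below (post, weblog_url, outlink, type, and so on) and the sample values are invented for illustration; the real files will almost certainly differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the element and attribute names are assumptions made
# for this sketch, not the dataset's actual schema.
SAMPLE = """\
<post>
  <date>2006-05-14</date>
  <time>09:30:00</time>
  <author>example-author</author>
  <title>Example post</title>
  <weblog_url>http://blog.example.com/</weblog_url>
  <permalink>http://blog.example.com/2006/05/example-post</permalink>
  <tags><tag>research</tag><tag>blogs</tag></tags>
  <outlinks>
    <outlink type="blog">http://other.example.org/</outlink>
    <outlink type="news">http://news.example.net/story</outlink>
  </outlinks>
</post>
"""


def parse_post(xml_text):
    """Turn one (hypothetical) post element into a plain dictionary."""
    root = ET.fromstring(xml_text)

    def text(name):
        # Return the stripped text of a child element, or None if absent.
        node = root.find(name)
        return node.text.strip() if node is not None and node.text else None

    return {
        "date": text("date"),
        "time": text("time"),
        "author": text("author"),
        "title": text("title"),
        "weblog_url": text("weblog_url"),
        "permalink": text("permalink"),
        "tags": [t.text for t in root.findall("tags/tag")],
        # Each outlink is kept as a (type, url) pair.
        "outlinks": [(o.get("type"), o.text)
                     for o in root.findall("outlinks/outlink")],
    }


post = parse_post(SAMPLE)
# Keep only the outlinks classified as pointing to other blogs.
blog_links = [url for kind, url in post["outlinks"] if kind == "blog"]
print(post["title"], blog_links)
```

With more than 10GB of compressed data, loading whole files into memory is unlikely to be practical; a streaming parser such as ET.iterparse would probably be a better fit once the actual format is known.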
August 18th, 2008 at 6:21 am
thanks
August 18th, 2008 at 6:22 am
I want to use it in my project.
October 14th, 2009 at 2:38 am
I want to do research on weblog mining, thanks.