SVMs for the Blogosphere: Blog Identification and Splog Detection

Authors: Pranam Kolari, Tim Finin, and Anupam Joshi
Book Title: AAAI Spring Symposium on Computational Approaches to Analysing Weblogs
Date: March 27, 2006

Abstract: Weblogs, or blogs, have become an important new way to publish information, engage in discussions and form communities. The increasing popularity of blogs has given rise to search and analysis engines focusing on the 'blogosphere'. A key requirement of such systems is to identify blogs as they crawl the Web. While this ensures that only blogs are indexed, blog search engines are also often overwhelmed by spam blogs (splogs). Splogs not only incur computational overheads but also reduce user satisfaction. In this paper we first describe our experiments on blog identification using Support Vector Machines (SVM). We compare results of using different feature sets and introduce new features for blog identification. We then report preliminary results on splog detection and identify future work.

Type: InProceedings
Organization: Computer Science and Electrical Engineering
Publisher: University of Maryland, Baltimore County
Note: Also available as technical report TR-CS-05-13
Tags: blog, splog, blogosphere, categorization, metadata, web spam, learning, spam
Google Scholar: EGVbfbEUYT4J
Number of Google Scholar citations: 79
Number of downloads: 6034