Characterizing the Splogosphere

Authors: Pranam Kolari, Akshay Java, and Tim Finin
Book Title: Proceedings of the 3rd Annual Workshop on Weblogging Ecosystem: Aggregation, Analysis and Dynamics, 15th World Wide Web Conference
Date: May 23, 2006

Abstract: Weblogs, or blogs, collectively constitute the Blogosphere, forming an influential and interesting subset of the Web. As with most Internet-enabled applications, the ease of content creation and distribution makes the blogosphere spam-prone. Spam blogs, or splogs, are blogs hosting spam posts, created using machine-generated or hijacked content for the sole purpose of hosting ads or raising the PageRank of target sites. These splogs make up the splogosphere, and are now inundating blog search engines and update ping servers. In this work we characterize splogs by comparing them against authentic blogs. Our analysis is based on a dataset made publicly available by BlogPulse, and employs a machine learning model that detects splogs with an accuracy of 90%. To round off this analysis and to better understand splogs, we also present our study of a popular blog update ping server, and show how it is overwhelmed by pings sent by splogs. This overall study will facilitate finding effective new techniques to detect and weed out splogs from the blogosphere.

Type: InProceedings
Organization: Computer Science and Electrical Engineering
Publisher: University of Maryland, Baltimore County
Google Scholar: lwK1neoG4xkJ
Number of Google Scholar citations: 32
Number of downloads: 2335