Proceedings of the Fifteenth Text REtrieval Conference (TREC 2006)

The BlogVox Opinion Retrieval System

The BlogVox system retrieves opinionated blog posts in response to ad hoc queries. BlogVox was developed for the 2006 TREC blog track by the University of Maryland, Baltimore County and the Johns Hopkins University Applied Physics Laboratory. It uses a novel system to recognize legitimate posts and filter out spam blogs (splogs). It also processes posts to remove extraneous non-content, including blog-rolls, link-rolls, advertisements, and sidebars. After retrieving posts relevant to a topic query, the system processes them to produce a set of independent features that estimate the likelihood that a post expresses an opinion about the topic. These features are combined using an SVM-based system and integrated with the relevance score to rank the results. We evaluate BlogVox's performance against human assessors, and we separately evaluate its splog filtering and non-content removal components.
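The final ranking step described above can be illustrated with a minimal sketch. It is not the BlogVox implementation: the feature values, the weights, the bias, and the mixing parameter `alpha` are all hypothetical, and the linear decision function only stands in for whatever trained SVM the system actually uses. The abstract does not say how the opinion score is integrated with the relevance score, so a simple weighted sum is assumed here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Post:
    post_id: str
    relevance: float            # retrieval score, assumed scaled to [0, 1]
    features: Sequence[float]   # hypothetical per-post opinion features


def opinion_score(features: Sequence[float],
                  weights: Sequence[float],
                  bias: float) -> float:
    """Linear decision value w.x + b, the form a trained linear SVM produces."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def rank_posts(posts: List[Post],
               weights: Sequence[float],
               bias: float,
               alpha: float = 0.5) -> List[Post]:
    """Sort posts by a weighted mix of relevance and opinion score.

    alpha is an assumed mixing weight, not a value from the paper.
    """
    def combined(post: Post) -> float:
        opinion = opinion_score(post.features, weights, bias)
        return alpha * post.relevance + (1 - alpha) * opinion

    return sorted(posts, key=combined, reverse=True)


# Toy data: post "b" is less relevant than "a" but far more opinionated.
posts = [
    Post("a", 0.9, [0.1, 0.0]),
    Post("b", 0.7, [0.8, 0.9]),
    Post("c", 0.4, [0.2, 0.1]),
]
ranked = rank_posts(posts, weights=[1.0, 1.0], bias=-0.5)
print([p.post_id for p in ranked])
```

With these toy numbers the strongly opinionated post overtakes the more relevant but neutral one, which is the effect that combining the two scores is meant to achieve.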


blog, information retrieval, opinion extraction, splogs

InProceedings

UMBC ebiquity