UMBC ebiquity

Blogvox2: A Modular Domain Independent Sentiment Analysis System

Authors: Sandeep Balijepalli

Date: June 07, 2007

Abstract: Bloggers have a huge impact on society by representing and influencing people. Blogging by nature is about expressing and listening to opinion. Good sentiment detection tools for blogs and other social media, tailored to politics, can be useful for today’s society. With the elections around the corner, political blogs are vital to exerting and keeping political influence over society. Currently, no sentiment analysis framework tailored to political blogs exists. Hence, a modular framework built with replicable modules for the analysis of sentiment in political blogs is justified. I propose Blogvox2, an information retrieval based, domain independent sentiment analysis framework that uses customized pattern matching techniques, such as a naive Bayesian filter, bag of words and part of speech tagging, for opinion extraction in blogs. We also developed prototype two-panel and four-panel search applications for the query results. In addition, we analyze opinionated sentences to identify trends on the hot and top topics. The modular framework of Blogvox2 provides a platform where new modules for different domains can be easily plugged in. The framework provides the date of publishing, the permanent link and the URLs of the sentences that express opinions based on the analysis. Based on the analysis of Blogvox2 on the political domain, our system performs well with the unigram approach. We also investigated our framework with pattern matching techniques, the bigram technique, a combination of the unigram and bigram techniques, and part of speech tagging, none of which fared as well as the unigram technique. We also investigated the reasons for the performance degradation or enhancement of each approach. Based on our analysis, we developed different applications to ease the use of our framework.

Type: MastersThesis

Publisher: University of Maryland, Baltimore County

Tags: blog, sentiment analysis system, opinions, trend analysis

Google Scholar: FTq2J-8jE4MJ

Number of Google Scholar citations: 1

Number of downloads: 6044


Available for download as

size: 731022 bytes

size: 2675200 bytes