Blogvox2: A Modular Domain Independent Sentiment Analysis System

by

Thursday, June 7, 2007, 1:00pm - Thursday, June 7, 2007, 3:00pm

325b ITE

blog, opinions, sentiment analysis system, trend analysis

Bloggers have a large impact on society by representing and influencing people. Blogging is by nature about expressing and listening to opinion, and the job of a politician is to both represent and lead the people, so good sentiment detection tools for blogs and other social media, tailored to politics, are a must for today's society. With the elections around the corner, political blogs are vital to exerting and keeping political influence over society. Currently, no sentiment analysis framework tailored to political blogs exists. A modular framework, built with replicable modules, for analyzing sentiment in political blogs is thus justified.

In this paper, we propose Blogvox2: an information retrieval based, modular, domain-independent sentiment analysis framework that uses customized pattern matching, a naive Bayesian filter, bag-of-words, and part-of-speech tagging techniques for opinion extraction in blogs. We also developed prototype two-panel and four-panel search applications for browsing the query results. Trends in the hot and top topics among the opinionated sentences are also analyzed.
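
As an illustration of the naive Bayesian, bag-of-words idea mentioned above, the following is a minimal sketch, not the Blogvox2 implementation. The labels "opinion" and "factual", the toy training sentences, the add-one smoothing, and the names tokenize, NaiveBayes, and model are all assumptions made for this example.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Bag of words: lowercase, split on whitespace, strip punctuation.
    stripped = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in stripped if w]

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training sentences
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        best_label, best_score = None, float("-inf")
        for label, counts in self.word_counts.items():
            # Log prior plus summed log likelihoods of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical toy data; a real system would train on labeled blog sentences.
examples = [
    ("I strongly support this candidate", "opinion"),
    ("This policy is a terrible idea", "opinion"),
    ("The election is held in November", "factual"),
    ("The senate has one hundred members", "factual"),
]
model = NaiveBayes()
model.train(examples)
print(model.classify("I support this idea"))
```

Because each classifier of this kind only needs a train and a classify step, it could in principle be swapped for another module, which is the sort of plug-in design the framework describes.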

The main benefit of Blogvox2 is its modular approach, which provides a platform where new modules for different domains can be easily plugged in. For each sentence that the analysis identifies as expressing an opinion, the framework provides the date of publishing, the permanent link, and the URL. Additionally, tools for trend analysis are presented, which produce hot and top topic identification graphics based on the obtained opinionated sentences.

Based on our analysis of Blogvox2 on the political domain, our system performs well with the unigram approach. We also evaluated the framework with pattern matching techniques, bigram techniques, and part-of-speech tagging, none of which fared as well as the unigram techniques, although combining the unigram and bigram techniques performed similarly to the unigram approach alone. We also investigated the reasons for the performance degradation or enhancement with each approach. Based on our analysis, we also developed different applications to demonstrate the ease of using the framework.
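
To make the unigram, bigram, and combined feature sets concrete, here is a small sketch of how such features might be extracted from a sentence. The function names ngrams and features, and the simple whitespace tokenization, are assumptions for illustration and are not taken from Blogvox2.

```python
def ngrams(tokens, n):
    # Every run of n consecutive tokens, joined into one feature string.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def features(text, use_unigrams=True, use_bigrams=False):
    # Hypothetical feature extractor: lowercase, whitespace-split tokens.
    tokens = text.lower().split()
    feats = []
    if use_unigrams:
        feats += ngrams(tokens, 1)
    if use_bigrams:
        feats += ngrams(tokens, 2)
    return feats

sentence = "not a good plan"
print(features(sentence))                                       # unigrams only
print(features(sentence, use_unigrams=False, use_bigrams=True)) # bigrams only
print(features(sentence, use_bigrams=True))                     # combined
```

Bigrams keep some local word order (for example "not a" and "good plan"), but each individual bigram occurs far less often than its component words, which is one commonly cited reason bigram-only features can be sparser than unigrams on small training sets.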


UMBC ebiquity