Stochastic and Iterative Techniques for Relational Data Clustering

Monday, April 13, 2009, 10:30am - Monday, April 13, 2009, 1:00pm

325b ITE

Dissertation Defense

This research focuses on relational data clustering: the task of organizing objects into logical groups, or clusters, while taking into account the relational links between objects. Relational clustering has received a great deal of attention recently because of the growing variety of social media applications and other relational data sources, such as weblogs, protein interaction networks, social networks, and citation graphs. The contributions of the dissertation fall into three areas: probabilistic algorithms, iterative algorithms, and multi-relational algorithms. The probabilistic algorithms are presented as a general framework that subsumes several prior works; they allow the highest level of expressiveness in developing models and can discover the most novel data phenomena. The iterative algorithm uses an objective function called block modularity; it trades expressiveness for speed and can be applied to much larger data sets, scaling up to several thousand objects. Finally, the multi-relational work focuses on identifying the most relevant relational information from a larger set of different relation types. A summary of each algorithm is presented, along with example applications of data analysis on social networks, a protein interaction network, a citation graph, and an international relations data set.

Committee Members

Dr. Marie desJardins (Chair)
Dr. Tim Finin
Dr. Tim Oates
Dr. Yun Peng
Dr. Lise Getoor

Hosted by: Marie desJardins
