UMBC ebiquity

Data Clustering with a Relational Push-Pull Model

Speaker: Adam Anthony

Start: Monday, April 30, 2007, 10:00AM

End: Monday, April 30, 2007, 12:00PM

Location: 325b ITE

Abstract: Relational data clustering is the task of grouping data objects together when both features and relations between objects are present. I present a new generative model for relational data in which relations between objects can have either a binding or a separating effect. For example, with a group of students separated into gender clusters, a "dating" relation would appear most frequently between the clusters, but a "roommate" relation would appear more often within clusters. In visualizing these relations, one can imagine that the "dating" relation effectively pushes clusters apart, while the "roommate" relation pulls clusters into tighter formations. I use simulated annealing to search for optimal values of the unknown model parameters, where the objective function is a Bayesian score derived from the generative model. Using experiments with artificial data and two real-world data sets (a Hollywood actor database and an ecological food web), I show that assuming relations appear most frequently within clusters can lead to poor performance. The experiments show that push-type relations do exist, so the tendency of relations to pull clusters together cannot be assumed in general.
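
The push-pull idea in the abstract can be illustrated with a small sketch. This is not the speaker's model: the Bayesian score is replaced here by a hand-made toy objective, and the names `score` and `anneal`, the six-student data set, and the cooling schedule are all assumptions made for illustration. The sketch only shows how simulated annealing can search over cluster assignments when some relations are rewarded within clusters ("roommate", pull) and others between clusters ("dating", push).

```python
import math
import random


def score(labels, features, pull_edges, push_edges):
    """Toy objective (an assumption, not the talk's Bayesian score); higher is better."""
    # Feature term: penalize squared distance of each point to its cluster mean.
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(features[i])
    feat = 0.0
    for pts in clusters.values():
        mean = sum(pts) / len(pts)
        feat -= sum((x - mean) ** 2 for x in pts)
    # Relation term: pull edges count when both ends share a cluster,
    # push edges count when the ends fall in different clusters.
    rel = sum(1 for a, b in pull_edges if labels[a] == labels[b])
    rel += sum(1 for a, b in push_edges if labels[a] != labels[b])
    return feat + rel


def anneal(features, pull_edges, push_edges, k=2, steps=5000, t0=2.0, seed=0):
    """Simulated annealing over cluster labels with a linear cooling schedule."""
    rng = random.Random(seed)
    n = len(features)
    labels = [rng.randrange(k) for _ in range(n)]
    current = score(labels, features, pull_edges, push_edges)
    best, best_score = labels[:], current
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-6  # temperature stays positive
        i = rng.randrange(n)
        old = labels[i]
        labels[i] = rng.randrange(k)  # propose a new label for one object
        new = score(labels, features, pull_edges, push_edges)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if new >= current or rng.random() < math.exp((new - current) / t):
            current = new
            if current > best_score:
                best, best_score = labels[:], current
        else:
            labels[i] = old  # reject the move
    return best, best_score


# Hypothetical data: students 0-2 and 3-5 form the two true groups.
features = [0.1, 0.0, 0.2, 1.0, 1.1, 0.9]    # one made-up numeric feature
roommate = [(0, 1), (1, 2), (3, 4), (4, 5)]  # pull: within groups
dating = [(0, 3), (1, 4), (2, 5)]            # push: between groups

labels, best_score = anneal(features, roommate, dating)
print(labels, best_score)
```

Treating the "dating" edges as pull relations instead (passing them in `pull_edges`) would reward merging the two groups, which is the kind of mistaken assumption the abstract warns about.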

Tags: learning

Host: Marie desJardins