Cost-Sensitive Information Acquisition for Prediction


Wednesday, April 21, 2010, 12:00pm - 1:00pm

346 ITE


Machine learning systems are increasingly used in our day-to-day activities. Just a few examples include handwritten character recognition systems, product recommendation systems, face detection features of cameras, speech recognition in hands-free devices, document ranking by search engines, fraudulent activity detection for credit card transactions, spam detection, and medical diagnosis. A critical component of a machine learning system is the "information" needed to develop and use it. Speech and handwritten digit recognition systems need to be trained on a representative and diverse set of examples, product recommendation systems need to be provided with example ratings, emails need to be tagged as spam or not, laboratory experiments need to be run for medical diagnosis, etc.

Even though the necessary information can be freely available in some cases, gathering it is costly (in time, money, etc.) in most cases; users are willing to rate only a few items, tag only a few emails, and train the speech recognition algorithm with only a few examples. It is essential to gather user and expert feedback on the right examples and not waste their effort. A system that requires a tremendous amount of user input and labeled data is impractical, while a system that provides an unacceptable rate of incorrect predictions is useless, if not harmful. It is thus imperative to develop systems that can provide correct predictions with the least amount of information and feedback possible.

In this talk, I will mainly discuss two techniques aimed at reducing the amount of information required to provide correct predictions. The techniques that I will present are based on decision-theoretic analysis of the value of information and on predicting which examples the underlying model is most likely to be incorrect about. I'll also briefly talk about various interesting projects that I've been participating in, such as information visualization and video analysis.

Marie desJardins

UMBC ebiquity