A Game Theoretic Framework for Distributed Multi-Party Privacy Preserving Data Mining

by

Monday, November 19, 2007, 10:00am - Monday, November 19, 2007, 12:00pm

Privacy protection is increasingly becoming an important issue in many data mining applications, particularly in the area of security and surveillance. However, privacy preserving data analysis is a non-trivial problem because of many reasons. First of all, privacy is a social concept. In most multi-party data mining scenarios participants have varying interests, objectives and expectations about data privacy. Enforcing a single model of privacy with strong assumptions regarding the behavior of the participants is often not practical.

In this research we propose a new approach toward privacy preserving data mining. We first describe a game-theoretic framework for multi-party privacy preserving data mining where each participant tries to maximize its benefit or utility score by optimally choosing the strategies for communication, computation and privacy breach during the execution of the protocol. The advantage of this framework over existing privacy models is two fold: (i) it does not bind participants to one uniform definition of privacy and gives them the freedom to define their own privacy criteria in their local objective functions and act accordingly, and (ii) it gets rid of some of the unrealistic assumptions regarding participant behavior that exist in most privacy preserving data mining algorithms till date.

In this research we also consider a real-life application for our privacy framework. We explore an interest based web community formation application that uses the user's browser cache data to determine if two users have similar browsing patterns. We develop a distributed top-k inner product identification algorithm for this and show how using an off-the-shelf privacy preserving technique such as homomorphic encryption degrades the performance of the algorithm and does away with the very essence of a heterogeneous concept of privacy in a distributed and completely decentralized algorithm. Future work includes complete development of the game theoretic framework, integration of various models of privacy in this framework, and development of the peer-to-peer web mining application using the Distributed Data MiningToolkit (DDMT) developed by the DIADIC laboratory at UMBC.

Hillol Kargupta

OWL Tweet

UMBC ebiquity