ACM International Workshop on Privacy and Annonymity for Very Large Datasets

Applying Differential Privacy to Search Queries in a Policy Based Interactive Framework


Web search logs are of growing importance to researchers as they help understanding search behavior and search engine performance. However, search logs typically contain sensitive information about users and therefore considerable caution must be exercised when considering releasing the logs to the research community. Current approaches to releasing search logs focus on either protecting the privacy of users or enhancing the utility of data to researchers. In this work, we address the privacy-utility tradeoff by providing safe access to search logs, instead of releasing them. We propose a policy based safe interactive framework built on semantic policies and differential privacy to allow researchers access to search logs, while maintaining the privacy of the users. Semantic policies are used to infer the higher levels of information that can be mined from a dataset based on the fields accessed by a researcher. The accessed fields are then used to build research profile(s) that guide the amount of privacy to be enforced using differential privacy. We show the additional utility that can be obtained in our framework by two demonstrative experiments that involve access to user level information. Our results indicate that valid research can be conducted in our framework without forgoing the privacy of individuals.

  • 407606 bytes

differential privacy, semantic policy




Downloads: 2579 downloads

Google Scholar Citations: 2 citations

UMBC ebiquity