UMBC ebiquity
Does AOL search data compromise privacy?

Tim Finin, 1:00pm 6 August 2006

At the end of last week, AOL Research announced that it was releasing several datasets from its search engine for research purposes, including query streams for 500K users over three months. Adam D’Angelo points out that this could compromise the privacy of AOL users. The data has been anonymized, of course, by replacing the user IDs within a query session with a unique number. But some query streams might contain enough information (names, places, or other personal details typed into the search box) to allow someone to make a good guess at the user’s identity. This sort of query data is one of the things that Google refused to provide to the Department of Justice last spring. On the other hand, Microsoft was offering researchers similar query data earlier this year. I think it’s a close call.
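
To make the concern concrete, here is a minimal Python sketch of this style of anonymization. It is not AOL's actual procedure, which I have not seen; the function names, the screen names, and the queries are all made up for illustration. The point is only that swapping an ID for a number leaves every query by the same person linked together.

```python
from collections import defaultdict

def pseudonymize(log):
    """Replace each user id with an arbitrary number, leaving queries untouched.

    Hypothetical helper: an assumption about how such a release might work.
    """
    ids = {}
    released = []
    for user, query in log:
        # The same user always gets the same number, so their queries stay linked.
        pseudo = ids.setdefault(user, len(ids) + 1)
        released.append((pseudo, query))
    return released

def streams(released):
    """Group the released queries by pseudonymous id."""
    grouped = defaultdict(list)
    for pseudo, query in released:
        grouped[pseudo].append(query)
    return dict(grouped)

# Made-up log entries, for illustration only.
log = [
    ("alice_screen_name", "dentists in springfield"),
    ("bob_screen_name", "cheap flights"),
    ("alice_screen_name", "jane q. example family tree"),
    ("alice_screen_name", "homes for sale elm street springfield"),
]

released = pseudonymize(log)
by_user = streams(released)

for pseudo, queries in by_user.items():
    print(pseudo, queries)
```

No screen name appears in the released data, yet the stream for one number still combines a town, a street, and a full name, which is the kind of information that could support a good guess at who was typing.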

Update: As of 10:00pm Sunday night, the query stream data link is no longer there.

Related posts:

  1. AOL research releases Web search engine datasets
  2. Proposed: a consortium to address search query privacy policy
  3. Jiawei Han: Research Challenges In Data Mining, 10am 4/22 LH8 UMBC
  4. Analyzing AOL search data shows click through rates for search rank
  5. Faceted search for DBLP bibliographic data
