AOL Research releases Web search engine datasets

August 5th, 2006

AOL Research has released some interesting data collections, including:

  • 20K hand-labeled, classified queries
  • 3.5M web Q/A queries (who, what, where, when …)
  • Query streams for 500K users over three months (20M queries)
  • Query arrival rates for queuing analysis
  • 2M queries against US Government domains

Additional datasets are promised in the future.

A paper describing some measurements over this (or related?) data is available: A Picture of Search by G. Pass, A. Chowdhury and C. Torgeson.