2007 TREC blog track
Tim Finin, 1:00pm 12 March 2007
The Text REtrieval Conference (TREC) is a series of workshops intended to encourage research within the IR community. Last year TREC featured a track on opinion extraction from blogs. Some details for the 2007 TREC blog track are now available. One task will be based on the 2006 opinion extraction task, extended to identify polarity.
We propose to add a related subtask, namely a text classification-related task, requiring participants to determine the polarity (or orientation) of the opinions in the retrieved documents, namely whether the opinions are positive or negative.
In addition to a refined version of the 2006 TREC Blog opinion track, 2007 will also have a blog distillation (feed search) task intended to find blogs “with a principle, recurring interest” in a topic described by a query.
Blog search users often wish to identify blogs about a given topic, which they can subscribe to and read on a regular basis. This user task is most often manifested in two scenarios:
- Filtering: The user subscribes to a repeating search in their RSS reader.
- Distillation: The user searches for blogs with a recurring central interest, and then adds these to their RSS reader.
For TREC 2007, we are recommending that the TREC Blog track investigates the latter scenario – Blog Distillation. The Blog Distillation Task can be summarized as: “Find me a blog with a principle, recurring interest in X.” For a given area X, systems should suggest feeds that are principally devoted to X over the timespan of the feed, and that a user would be recommended to subscribe to as an interesting feed about X (i.e., a user may be interested in adding it to their RSS reader).
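To make the distillation idea concrete, here is a minimal sketch of one naive baseline: score each feed by the fraction of its posts that mention a topic term, so that feeds devoted to X over their whole timespan outrank feeds that mention X only occasionally. This is not the track's method or any participant's system. The function name `rank_feeds`, the `min_posts` cutoff, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_feeds(posts, topic_terms, min_posts=2):
    """Rank feeds by the share of their posts that mention a topic term.

    posts: iterable of (feed_id, post_text) pairs (hypothetical input shape).
    topic_terms: words describing the topic X.
    min_posts: feeds with fewer posts are skipped, since a single post says
               little about a *recurring* interest (an assumed cutoff).
    """
    terms = {t.lower() for t in topic_terms}
    total = defaultdict(int)      # posts seen per feed
    on_topic = defaultdict(int)   # posts per feed that mention the topic

    for feed_id, text in posts:
        total[feed_id] += 1
        words = set(text.lower().split())
        if terms & words:
            on_topic[feed_id] += 1

    scores = {
        feed: on_topic[feed] / total[feed]
        for feed in total
        if total[feed] >= min_posts
    }
    # Highest share first; ties broken by feed id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data, invented for this example.
posts = [
    ("feedA", "New telescope images of Mars"),
    ("feedA", "Mars rover update"),
    ("feedA", "Mars mission delayed"),
    ("feedB", "My cat photos"),
    ("feedB", "Mars bar recipe"),
    ("feedC", "Mars once"),
]
ranking = rank_feeds(posts, ["mars"])
```

A real system would need proper retrieval scoring rather than exact word matching (note that the "Mars bar recipe" post counts as on-topic here), but the aggregation step from post-level evidence to a feed-level ranking is the part that distinguishes distillation from ordinary post search.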