Reuters Calais to support Semantic Web Linked Data in next release

November 14th, 2008

Thomson Reuters announced on their blog (Life in the Linked Data Cloud: Calais Release 4) that the next release of the Calais web-based information extraction services will support linked data.

“In that release we’ll go beyond the ability to extract semantic data from your content. We will link that extracted semantic data to datasets from dozens of other information sources, from Wikipedia to Freebase to the CIA World Fact Book. In short – instead of being limited to the contents of the document you’re processing, you’ll be able to develop solutions that leverage a large and rapidly growing information asset: the Linked Data Cloud.”

The new capabilities will be available in Release 4, which is expected out on 09 January 2009.

The change is based on Calais returning dereferenceable URIs for the entities it finds. Accessing those URIs will produce RDF with links to corresponding entities in DBpedia, Freebase and other sources of “Semantic Web” data. It will be very interesting to see how well their system does at mapping document entities (e.g., “secretary Rice”) to entities in the LOD cloud, such as the matching DBpedia resource. Accessing such a URI with a request for content type application/rdf+xml returns RDF containing the assertions that DBpedia extracted from Wikipedia.
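To make the dereferencing step concrete, here is a minimal Python sketch of what a client might do: ask for application/rdf+xml through the HTTP Accept header, then pull the outbound links from the returned RDF. It is an illustration under stated assumptions, not the Calais API. The function names are hypothetical, the example.org URIs are placeholders, and the use of owl:sameAs to express the links to DBpedia or Freebase is an assumption that the announcement does not confirm.

```python
import urllib.request
import xml.etree.ElementTree as ET

RDF_XML = "application/rdf+xml"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"


def build_rdf_request(uri):
    """Build a GET request that asks for RDF/XML via content negotiation."""
    return urllib.request.Request(uri, headers={"Accept": RDF_XML})


def fetch_rdf(uri, timeout=10.0):
    """Dereference an entity URI and return the response body as text."""
    with urllib.request.urlopen(build_rdf_request(uri), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)


def same_as_links(rdf_xml):
    """Return the targets of owl:sameAs statements found in an RDF/XML document.

    Assumption: links to other datasets are expressed as owl:sameAs with an
    rdf:resource attribute. Other predicates or serializations are ignored.
    """
    root = ET.fromstring(rdf_xml)
    links = []
    for element in root.iter("{%s}sameAs" % OWL_NS):
        target = element.get("{%s}resource" % RDF_NS)
        if target is not None:
            links.append(target)
    return links
```

A caller would pass an entity URI to fetch_rdf and hand the result to same_as_links to list the linked resources. A full RDF library would be a better fit for anything beyond this simple case, since RDF/XML allows several equivalent ways to write the same statement.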