UMBC ebiquity

Generating Linked Data by Inferring the Semantics of Tables

Authors: Varish Mulwad, Tim Finin, and Anupam Joshi

Book Title: Proceedings of the First International Workshop on Searching and Integrating New Web Data Sources

Date: September 03, 2011

Abstract: Vast amounts of information are encoded in structured tables found in documents, on the Web, and in spreadsheets or databases. Integrating or searching over this information benefits from understanding its intended meaning. Evidence for a table's meaning can be found in its column headers, cell values, implicit relations between columns, caption, and surrounding text, but interpreting that evidence also requires general and domain-specific background knowledge. We represent a table's meaning by mapping columns to classes in an appropriate ontology, linking cell values to literal constants, implied measurements, or entities in the linked data cloud (existing or new), and discovering or identifying relations between columns. We describe techniques grounded in graphical models and probabilistic reasoning to infer the meaning (semantics) associated with a table. Using background knowledge from the Linked Open Data cloud, we jointly infer the semantics of column headers, table cell values (e.g., strings and numbers), and relations between columns, and represent the inferred meaning as a graph of RDF triples. We motivate the value of this approach using tables from the medical domain, discussing some of the challenges presented by these tables and describing techniques to tackle them.
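To make the target representation concrete, the following is a minimal Python sketch of the final step only: turning an already-interpreted table into RDF triples. It is not the authors' implementation and performs no inference; the column-to-class mapping, cell-to-entity links, and column relations are assumed to have been produced by an earlier step. The function name `table_to_triples`, the example table, and every `example.org` URI are hypothetical and chosen for illustration.

```python
# Sketch: serialize an interpreted table as RDF triples (N-Triples-style terms).
# Assumption: the semantic interpretation (classes, entity links, relations)
# is supplied as plain dictionaries; nothing here infers it.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(uri):
    """Wrap a URI in angle brackets, as in N-Triples."""
    return f"<{uri}>"


def literal(value):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def table_to_triples(rows, column_classes, cell_links, relations):
    """Build (subject, predicate, object) triples for a table.

    rows:           list of dicts mapping column header -> cell string
    column_classes: column header -> ontology class URI
    cell_links:     cell string -> entity URI (unlinked cells stay literals)
    relations:      (subject header, object header) -> property URI
    """
    triples, seen = [], set()

    def add(triple):
        # Keep first-seen order while dropping duplicates.
        if triple not in seen:
            seen.add(triple)
            triples.append(triple)

    for row in rows:
        # Type each linked cell with the class assigned to its column.
        for header, value in row.items():
            entity = cell_links.get(value)
            cls = column_classes.get(header)
            if entity and cls:
                add((iri(entity), iri(RDF_TYPE), iri(cls)))
        # Relate the cells of each column pair that has a property.
        for (s_col, o_col), prop in relations.items():
            subject = cell_links.get(row[s_col])
            if subject is None:
                continue  # a triple's subject must be an entity
            obj_value = row[o_col]
            obj_entity = cell_links.get(obj_value)
            obj = iri(obj_entity) if obj_entity else literal(obj_value)
            add((iri(subject), iri(prop), obj))
    return triples


# Hypothetical one-row table and interpretation.
EX = "http://example.org/"
rows = [{"Drug": "Aspirin", "Treats": "Headache", "Dose (mg)": "500"}]
column_classes = {"Drug": EX + "ontology/Drug", "Treats": EX + "ontology/Condition"}
cell_links = {"Aspirin": EX + "resource/Aspirin", "Headache": EX + "resource/Headache"}
relations = {
    ("Drug", "Treats"): EX + "ontology/treats",
    ("Drug", "Dose (mg)"): EX + "ontology/doseMg",
}

triples = table_to_triples(rows, column_classes, cell_links, relations)
for s, p, o in triples:
    print(s, p, o, ".")
```

In this sketch, cells linked to an entity become URI nodes typed by their column's class, while unlinked cells such as the dose value remain plain string literals; a fuller treatment would attach datatypes or units to such measurements.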

Type: InProceedings

Note: Co-located with VLDB 2011



 

 

Related Projects:

Tables to Linked Data (past project)