UMBC ebiquity

T2LD – An automatic framework for extracting, interpreting and representing tables as linked data

Speaker: Varish Mulwad

Start: Tuesday, June 29, 2010, 09:30AM

End: Tuesday, June 29, 2010, 11:30AM

Location: ITE 325 - B

Abstract:

MS Thesis Defense

We present an automatic framework for extracting, interpreting and generating linked data from tables. To represent a table as linked data, we assign every column header a class label from an appropriate ontology, link table cells (where appropriate) to entities from the Linked Open Data cloud, and identify relations between columns, which together yield an overall interpretation of the table. Using the limited evidence a table provides, namely its headers and the data in its rows and columns, we adopt a novel approach of querying existing knowledge bases such as Wikitology and DBpedia to determine the class labels for table headers. For entity linking, besides querying knowledge bases, we use machine learning algorithms such as SVM and SVM-rank, which learn to rank the entities in a candidate set, to link a table cell to an entity. We further use the class labels, linked entities and information from the knowledge bases to identify relations between columns. We prototyped a system to evaluate our approach on tables obtained from Google Squared, from Wikipedia, and from a dataset that Google shared with us.
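As a rough illustration of the output such a pipeline might produce, the sketch below takes a two-column table, picks a class label for the first column by a simple majority vote over candidate types, and emits N-Triples. It is not the T2LD implementation: the candidate types and entity links are hard-coded stand-ins for knowledge-base queries, the majority vote is a simplification of the approach described in the abstract, and the function names, the `isPartOf` relation and the sample rows are all assumptions made for the example.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"


def vote_column_class(candidate_types):
    """Return the type that occurs most often across the cells of one column.

    candidate_types holds one list of candidate type names per cell.
    This majority vote is an assumption, not the method from the thesis.
    """
    counts = Counter(t for types in candidate_types for t in types)
    return counts.most_common(1)[0][0] if counts else None


def to_ntriples(rows, links, subject_class, relation):
    """Emit N-Triples lines for a two-column table.

    rows: (subject_cell, object_cell) string pairs.
    links: maps a cell string to a resource name (hypothetical linker output).
    """
    lines = []
    for subj_cell, obj_cell in rows:
        subj, obj = links.get(subj_cell), links.get(obj_cell)
        if subj is None:
            continue  # unlinked subject cell: nothing to assert for this row
        lines.append(f"<{DBR}{subj}> <{RDF_TYPE}> <{DBO}{subject_class}> .")
        if obj is not None:
            lines.append(f"<{DBR}{subj}> <{DBO}{relation}> <{DBR}{obj}> .")
    return lines


# Illustrative data only; a real system would query a knowledge base.
rows = [("Baltimore", "Maryland"), ("Boston", "Massachusetts")]
candidates = [["City", "PopulatedPlace"], ["City", "Band"]]
links = {cell: cell for row in rows for cell in row}

column_class = vote_column_class(candidates)
triples = to_ntriples(rows, links, column_class, "isPartOf")
print("\n".join(triples))
```

Each row yields one type triple for the subject cell and one relation triple linking it to the object cell; cells that cannot be linked are skipped rather than guessed.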


Committee Members:
  • Dr. Tim Finin (Chair)
  • Dr. Anupam Joshi
  • Dr. Tim Oates
  • Dr. Evelyne Viegas (Microsoft Research)

Tags: semantic web, linked data

Host: Tim Finin

 

Assertions:

  1. (Event) T2LD – An automatic framework for extracting, interpreting and representing tables as linked data has the associated publication (Publication) T2LD - An automatic framework for extracting, interpreting and representing tables as Linked Data