UMBC ebiquity

Text Based Similarity and Delta for Semantic Web Graphs

Authors: Krishnamurthy Viswanathan and Tim Finin

Date: August 07, 2010

Abstract: Recognizing that two Semantic Web documents or graphs are similar and characterizing their differences is useful in many tasks, including retrieval, updating, version control, and knowledge base editing. We describe a number of text based similarity metrics that characterize the relation between Semantic Web graphs and evaluate these metrics for three specific cases of similarity that we have identified: similarity in classes and properties used while differing only in literal content, difference only in base-URI, and versioning relationship. When one graph is judged to be a version of another, we generate a “delta” consisting of triples to be added or removed from one graph to make them equivalent. This method takes into account the text of the RDF graph’s serialization as a document, rather than relying solely on the document URI. We have prototyped these techniques in a system that we call Similis and evaluated its performance on several tasks using a collection of graphs from the archive of the Swoogle Semantic Web search engine.

Type: TechReport

Address: 1000 Hilltop Circle

Organization: Computer Science and Electrical Engineering

Institution: University of Maryland, Baltimore County

Tags: semantic web graphs, similarity metrics, delta, semantic web, rdf, information retrieval
