Text-Based Similarity Metrics and Deltas for Semantic Web Graphs
Tuesday, October 5, 2010, 11:00am - 12:00pm
ITE 325
Recognizing that two Semantic Web documents or graphs are similar, and characterizing their differences, is useful in many tasks, including retrieval, updating, version control, and knowledge base editing. I will describe several text-based similarity metrics that characterize the relation between Semantic Web graphs and evaluate these metrics for three specific cases of similarity: similarity in classes and properties, similarity disregarding differences in base URIs, and versioning relationships. I have applied these techniques to a specific use case: generating a delta between versions of a Semantic Web graph. The system has been evaluated on several tasks using a collection of graphs from the archive of the Swoogle Semantic Web search engine.
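
To make the idea concrete, here is a minimal sketch of what a text-based comparison of two graphs might look like. It is not the system described in the talk: the abstract does not specify the metrics, so this sketch assumes the graphs are serialized as N-Triples (one triple per line), uses Jaccard overlap of the triple lines as a stand-in similarity metric, and treats the delta as a plain set difference. The helper names (`triples`, `strip_base`, `jaccard`, `delta`) and the example.org URIs are hypothetical, and blank nodes are ignored.

```python
import re

def triples(ntriples_text):
    """Return the set of triple lines, skipping blank lines and comments."""
    lines = (line.strip() for line in ntriples_text.splitlines())
    return {line for line in lines if line and not line.startswith("#")}

def strip_base(triple_line):
    """Reduce each <URI> to its local name (the text after the last '/' or '#').

    This is an assumed, simplistic way to disregard base-URI differences.
    """
    return re.sub(r"<[^>]*[/#]([^/#>]+)>", r"<\1>", triple_line)

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def delta(old, new):
    """Triples removed from and added to `old` to obtain `new`."""
    return {"removed": sorted(old - new), "added": sorted(new - old)}

# Two hypothetical versions of a graph that differ in base URI and one triple.
OLD = """
<http://example.org/v1#alice> <http://example.org/v1#knows> <http://example.org/v1#bob> .
<http://example.org/v1#alice> <http://example.org/v1#name> "Alice" .
"""
NEW = """
<http://example.org/v2#alice> <http://example.org/v2#knows> <http://example.org/v2#bob> .
<http://example.org/v2#alice> <http://example.org/v2#name> "Alice" .
<http://example.org/v2#bob> <http://example.org/v2#name> "Bob" .
"""

raw_old, raw_new = triples(OLD), triples(NEW)
norm_old = {strip_base(t) for t in raw_old}
norm_new = {strip_base(t) for t in raw_new}

raw_similarity = jaccard(raw_old, raw_new)     # no line matches verbatim
norm_similarity = jaccard(norm_old, norm_new)  # 2 shared of 3 distinct triples
version_delta = delta(norm_old, norm_new)
```

In this toy case the raw comparison finds no overlap because every URI changed its base, while the normalized comparison shares two of three triples and the delta contains the single added triple about `bob`. A practical system would also need to handle blank nodes and alternative serializations, which this sketch leaves out.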