PhD proposal: On Boosting Semantic Web Data Access

Speaker: Li Ding

Start: Wednesday, January 19, 2005, 03:00PM

Location: ITE 325B

Abstract: The Semantic Web can be viewed as a collection of RDF graphs serialized as RDF documents that are distributed on the Web. Its utility depends on three issues: availability (existence of data), accessibility (users can retrieve the data they want), and quality (users can judge the quality of the retrieved data). While more data are becoming available in the Semantic Web, the latter two issues are often ignored or circumscribed due to a lack of tools and mechanisms. This dissertation proposes an ontology-based approach to these two issues so as to boost the utility of the Semantic Web. For accessibility, we identified three critical challenges: i) there are few links to (and almost no descriptions of) RDF documents; ii) it is hard to query the Semantic Web since users are not familiar with the semantic web vocabulary (i.e. the URIrefs), which has over 150,000 unique entries; and iii) it is unrealistic to query the entire Semantic Web without an effective data access service. In order to address these challenges, we proposed the Web of Belief (WOB) ontology to model the Semantic Web and its context (i.e. the web and the agent world), and developed the Swoogle system, which digests and searches semantic web data using the WOB ontology. In particular, Swoogle helps publishers by ranking the properties of a given class, and supports information consumers by estimating query complexity and searching for the URLs of relevant RDF documents. For quality, we first clarified the dimensions of quality (e.g. consistency, completeness, precision, importance and trustworthiness) for different concepts in WOB (e.g. RDF graph, web page, and agent). We then proposed a quality extension to the WOB ontology for representing users' quality judgments (esp. trust judgments) explicitly.
Finally, we propose a series of semantic web navigation models and corresponding ranking algorithms for ontological terms and RDF documents, and a series of algorithms for evaluating the trustworthiness of a given RDF graph according to the availability of background knowledge. The contributions of this dissertation are the following: i) the WOB ontology, which is one of the first attempts to make the Semantic Web self-descriptive in OWL semantics; ii) Swoogle, which is one of the first web-scale data access services that digest and search the Semantic Web; iii) semantic web navigation models and ranking algorithms; and iv) RDF graph trustworthiness evaluation mechanisms. The WOB ontology and Swoogle-like systems, we believe, will bring emergent properties to the Semantic Web: the utility of the web-scale Semantic Web will be reinforced when users are less hassled in finding useful data and are more aware of data quality.

Tags: semantic web, ontology, swoogle, search, information retrieval, rdf, owl



