Semantic Discovery: Discovering Complex Relationships in Semantic Web
Status: Past project
Our research will focus on the design, prototyping and evaluation of a system called SemDIS (Semantic Discovery), which supports indexing and querying of complex semantic relationships and is driven by notions of information trust and provenance and by models of hypotheses and arguments under investigation.
From a scientific perspective, we face the challenges of formally defining and representing meaningful and interesting relationships (which we call semantic associations), and of defining a notion of result quality similar to the familiar metrics of precision, recall and document ranking. Another challenge is the (semi-)automatic construction of argument structures built on these relationships to validate or refute a given hypothesis. Additional scientific and engineering challenges include the scale of storing large metadata sets and processing complex queries over them, with correspondingly more complex data structures to represent entities and relationships; the need to use context to select relevant subsets of metadata to process; and new techniques that use information provenance and trust to improve the ranking of relationships. These challenges call for a fresh look at indexing, query processing and ranking, as well as tractable and scalable graph algorithms that exploit heuristics. Our work proposes to address these challenges by building on our preliminary results in semantic metadata extraction, practical domain-specific ontology creation, defining semantic associations, main-memory query processing, using distributed trust to enforce security policies, and knowledge representation and reasoning on the semantic web. Scientific results from SemDIS will involve detailed scenarios and an evaluation testbed, and will be measured in terms of novel techniques as well as measures of quality, scalability and performance for computing complex semantic relationships. Corresponding to the breadth and depth of the topics involved in the challenge undertaken, ours is a collaborative proposal involving researchers at UGA and UMBC, covering the areas of information modeling and knowledge representation, storage and database management, information retrieval and artificial intelligence.
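To make the idea of ranking semantic associations more concrete, the following is a minimal sketch and not the SemDIS implementation. It assumes, purely for illustration, that a semantic association can be read as a bounded-length path of labeled edges between two entities, and that each edge carries a trust value whose product scores the path. The toy triples, the function names and the scoring rule are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy metadata: (subject, predicate, object, trust) tuples.
TRIPLES = [
    ("alice", "authorOf", "paper1", 0.9),
    ("bob", "authorOf", "paper1", 0.8),
    ("alice", "memberOf", "labA", 0.6),
    ("carol", "memberOf", "labA", 0.5),
    ("carol", "knows", "bob", 0.4),
]


def build_graph(triples):
    """Index triples as an adjacency list, adding an inverse edge for each triple."""
    graph = defaultdict(list)
    for subj, pred, obj, trust in triples:
        graph[subj].append((pred, obj, trust))
        graph[obj].append((pred + "^-1", subj, trust))  # traverse edges both ways
    return graph


def find_associations(graph, source, target, max_hops=3):
    """Enumerate simple paths from source to target using at most max_hops edges."""
    results = []

    def dfs(node, path, visited):
        if node == target and path:
            results.append(list(path))
            return
        if len(path) == max_hops:
            return
        for pred, nxt, trust in graph.get(node, []):
            if nxt in visited:
                continue
            path.append((node, pred, nxt, trust))
            visited.add(nxt)
            dfs(nxt, path, visited)
            visited.remove(nxt)
            path.pop()

    dfs(source, [], {source})
    return results


def score(path):
    """Assumed scoring rule: product of edge trust values along the path."""
    total = 1.0
    for _, _, _, trust in path:
        total *= trust
    return total


def ranked_associations(triples, source, target, max_hops=3):
    """Return candidate associations ordered from highest to lowest score."""
    graph = build_graph(triples)
    paths = find_associations(graph, source, target, max_hops)
    return sorted(paths, key=score, reverse=True)


if __name__ == "__main__":
    for path in ranked_associations(TRIPLES, "alice", "bob"):
        print(round(score(path), 3), [(s, p, o) for s, p, o, _ in path])
```

On this toy data the co-authorship path between alice and bob ranks above the longer path through labA and carol, because a product of trust values penalizes both length and low-trust edges. Exhaustive path enumeration like this does not scale to large metadata sets, which is one reason the text above calls for indexing and heuristic graph algorithms.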
Our effort will have broader impacts beyond the education and training of graduate students and the publication of research findings. Results from our research will be integrated with courses we teach, both existing and new. We will use institutional mechanisms in place to seek participation of students from underrepresented groups. Datasets used for testbed evaluations and some of the targeted tools will be made public or open source, and new measures for relevance and ranking of semantic associations will provide input to future work on comparing various approaches and techniques. Our work will also gain from several academic-industry collaborations of the investigators. We will have the opportunity to leverage commercial infrastructure and raw metadata provided by Semagix and IBM and, when appropriate, technology licensing will be encouraged. The researchers will collaborate with industry, and the students will be encouraged to intern at collaborating industrial labs. Within a broader social context, emerging knowledge-centric technologies raise legitimate privacy and civil liberties concerns. Building upon past policy-making experience, we will comment on potential implications of our scientific progress. This research is supported in part by NSF award ITR 0325172, and is a collaborative effort with colleagues at U. Georgia and Wright State University. Some demos associated with our efforts to find trust using DBLP and FOAF data can be found at the project's web page. We have also developed related software systems, supported by this and other awards, such as the semantic web search engine Swoogle and community, sentiment and trust detection systems such as Feeds that matter and PolVox, amongst others.
Start Date: October 2003
End Date: October 2008
There are 37 associated publications:
There is 1 associated resource:
1. SEMDIS poster (June 2004), Poster.