UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments
Semantic Web

Provenance Tracking in Science Data Processing Systems

May 28th, 2008, by Tim Finin, posted in Semantic Web

Maybe we should think of data provenance as being like a recipe. A recipe for preparing food is more than just a list of ingredients: it specifies, often in great detail, how the ingredients are combined, cooked and served, as well as the cooking implements and their settings.

Curt Tilmes presented his PhD dissertation proposal yesterday on “Provenance Tracking in Science Data Processing Systems”. Curt works at the NASA Goddard Space Flight Center and is responsible for managing the data processing of earth science climate research data. Curt has some very good ideas about how to capture all of the relevant provenance data for sophisticated scientific data. He’s using, of course, the Semantic Web languages (RDF and OWL) to express and share the provenance data.

Part of the problem is that you have to capture not just the inputs to a dataset, but how the inputs were processed to produce the dataset, including (ideally) the algorithms, software and hardware. As an easily grasped illustration, he referred to a recent post by Ray Pierrehumbert on the RealClimate blog, How to cook a graph in three easy lessons. This post demonstrates how Roy Spencer processes inputs from two common climate datasets (the Southern Oscillation and Pacific Decadal Oscillation indexes) to get results that support the conclusion that global warming is due to natural causes and not human activity.
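To make the recipe idea concrete, here is a minimal sketch of a provenance record expressed as subject-predicate-object triples, the data model underlying RDF. It is not Curt's design: the `http://example.org/` namespace, the property names (`derivedFrom`, `usedAlgorithm`, `usedSoftware`, `ranOnHardware`) and all of the resource URIs are hypothetical, chosen only to show inputs and processing steps recorded side by side.

```python
from typing import List, NamedTuple

# Hypothetical vocabulary for illustration only; not a published ontology.
EX = "http://example.org/provenance#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def describe_dataset(dataset: str, inputs: List[str], algorithm: str,
                     software: str, hardware: str) -> List[Triple]:
    """Record the 'recipe' for a dataset: its ingredients plus how they were processed."""
    triples = [Triple(dataset, EX + "derivedFrom", source) for source in inputs]
    triples.append(Triple(dataset, EX + "usedAlgorithm", algorithm))
    triples.append(Triple(dataset, EX + "usedSoftware", software))
    triples.append(Triple(dataset, EX + "ranOnHardware", hardware))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize the triples in N-Triples style, one statement per line."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ." for t in triples)


# All URIs below are made up for the example.
recipe = describe_dataset(
    dataset="http://example.org/data/smoothed-index",
    inputs=["http://example.org/data/soi", "http://example.org/data/pdo"],
    algorithm="http://example.org/algorithms/running-mean",
    software="http://example.org/software/smoother-1.0",
    hardware="http://example.org/hosts/cluster-a",
)
print(to_ntriples(recipe))
```

The point of the sketch is that the processing steps are first-class statements, so two datasets built from the same inputs by different algorithms have visibly different provenance.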

Faviki uses Wikipedia and DBpedia for semantic tagging

May 26th, 2008, by Tim Finin, posted in AI, Semantic Web, Social media

Faviki is a new social bookmarking system that uses Wikipedia articles for tags. It actually uses URLs in the DBpedia namespace that correspond to Wikipedia pages. The immediate benefits of this approach are several:

  • Users select tags from a large, common tag space. The ‘meaning’ of each tag can be understood by reading the associated Wikipedia page. This makes it more likely that resources that share a tag, even if assigned by different people, are actually related.
  • Since the universe of tags is derived from Wikipedia, it is generated, kept current and maintained by a large and diverse set of people.
  • The tags have structured information associated with them and are part of a broader-than/narrower-than lattice. It is not clear to me how much reasoning Faviki does with the linked data or when. But there is clearly a lot of potential here.
  • There is an opportunity to make the tagging system multi-lingual, since Wikipedia has articles in multiple languages and supports a way to link equivalent articles expressed in different languages.
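As a rough sketch of the common tag space idea, the snippet below maps Wikipedia article titles to URIs in the DBpedia resource namespace and groups bookmarks by those URIs. It is not Faviki's implementation, which I have not seen; the function names, the bookmark tuples and the example.org pages are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def tag_uri(wikipedia_title: str) -> str:
    """Map a Wikipedia article title to the corresponding DBpedia resource URI."""
    # DBpedia resource names follow Wikipedia titles, with underscores for spaces.
    return DBPEDIA_RESOURCE + quote(wikipedia_title.strip().replace(" ", "_"))


def group_by_tag(bookmarks: Iterable[Tuple[str, str, str]]) -> Dict[str, Set[str]]:
    """Group bookmarked pages by tag URI, regardless of which user assigned the tag."""
    pages_by_tag: Dict[str, Set[str]] = defaultdict(set)
    for _user, page_url, title in bookmarks:
        pages_by_tag[tag_uri(title)].add(page_url)
    return dict(pages_by_tag)


# Hypothetical bookmarks: (user, bookmarked page, Wikipedia title used as the tag).
bookmarks = [
    ("alice", "http://example.org/rdf-primer", "Semantic Web"),
    ("bob", "http://example.org/owl-intro", "Semantic Web"),
    ("bob", "http://example.org/recipes", "Cooking"),
]
shared = group_by_tag(bookmarks)
```

Because both users picked the same article, their bookmarks land under one URI, which is what makes a shared tag more likely to mean the same thing.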

The downside, of course, is that you lose the freedom and ease of most open tagging approaches — using the words and phrases that come immediately to mind.

The Faviki system is related to our own Wikitology project, which is exploring the use of Wikipedia terms as an ontology, and also to Harry Chen’s Gnizr tagging system, which is an RDF-based social tagging system. Our current Wikitology work is focused on mapping text and entities from text into a set of terms derived from Wikipedia and salted with additional data from DBpedia and Freebase.

One interesting research question is whether it’s possible to combine the ease of using user-generated tags with the power of mapping them into tags in a structured or semi-structured knowledge base.

Deriving knowledge bases from Wikipedia and using them in innovative ways is a very exciting topic that is sure to receive a lot of work in the coming years.

(spotted on ReadWriteWeb)

Int. Semantic Web Conf. workshop details

May 23rd, 2008, by Tim Finin, posted in Semantic Web, iswc

The 7th International Semantic Web Conference (ISWC) has an exciting program of thirteen one-day workshops that will be held on October 26 and 27. The deadlines for submitting papers vary. See the individual workshop pages for detailed information on their scope and structure and for information on submitting papers and participating.

The final scheduling of the workshops, assigning them to the 26th or 27th, has not yet been done.

PhD proposal: Context and Policies in Declarative Networked Systems

May 19th, 2008, by Tim Finin, posted in Semantic Web

UMBC PhD student Palanivel Kodeswaran will present his dissertation proposal on Use of Context and Policies in Declarative Networked Systems at 3:30 on Tuesday May 20 in ITE 325. Dissertation proposals are public and visitors are welcome. If you are a PhD student and are (or should be!) working on your own proposal, going to these is a good way to prepare. You can see what’s involved, what works and what doesn’t, and what kind of questions you can expect. See the link above for the full abstract, but here is a teaser.

“In this thesis, we propose to build a declarative framework that can reason over the requirements of applications, the current network context, operator policies, and appropriately configure the network to provide better network support for applications. … In particular, the contributions of this thesis are (i) Developing a framework for using context and policies in declarative networked systems (ii) Runtime adaptation of network configuration based on application requirements and node/operator policy (iii) Formalize cross layer interactions as opposed to ad hoc optimizations (iv) Simulation and test bed implementations to validate and evaluate proposed approach.”





