| UMBC ebiquity |
Data Provenance Management for Earth Science Reproducibility
Speaker: Curt Tilmes
Start: Wednesday, March 24, 2010, 12:00PM
End: Wednesday, March 24, 2010, 01:30PM
Location: 325b ITE
Abstract: A fundamental aspect of all science is reproducibility. In the past
few decades, Earth Science has been increasingly based on remote
sensing platforms (aircraft, satellites, ocean buoy sensors, etc.) that have
produced tremendous volumes of data. There is often a long chain of
complex processing steps that ultimately leads to published science.
Understanding the processing chain and maintaining scientific
reproducibility of results is a major challenge.
We are constructing a model of scientific data processing that captures
and maintains the provenance of all of the artifacts of processing.
These include the data transformation algorithms and all data in the
system, both inputs from external sources and data produced within the
system. Other artifacts include the hardware and software of the
processing framework, the source instruments and satellites,
scientific literature and documentation, and people and
organizations. The origin of any data or algorithm is recorded, and
the entire history of the processing chains is stored so that a
researcher can understand the entire data flow. Provenance is
captured in a form that allows the system to provide basic scientific
reproducibility of any data product it distributes, even in cases where
the physical data products themselves have been deleted due to space constraints.
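
The idea of reproducing a deleted data product from its provenance can be illustrated with a minimal sketch. This is not the speaker's system: the `Archive` and `Provenance` classes, the product identifiers, and the algorithm names are all hypothetical, and the sketch assumes that algorithms are deterministic and that external inputs are never deleted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Provenance:
    """Hypothetical record of how one data product was produced."""
    algorithm: str            # name of a registered algorithm version
    inputs: Tuple[str, ...]   # identifiers of the input products


class Archive:
    """Toy archive: provenance is kept permanently, data may be purged."""

    def __init__(self) -> None:
        self.algorithms: Dict[str, Callable[..., list]] = {}
        self.provenance: Dict[str, Provenance] = {}
        self.data: Dict[str, list] = {}

    def ingest(self, product_id: str, values: list) -> None:
        """Store an input from an external source (no provenance record here)."""
        self.data[product_id] = list(values)

    def derive(self, product_id: str, algorithm: str, *inputs: str) -> list:
        """Record provenance for a new product, then compute it."""
        self.provenance[product_id] = Provenance(algorithm, tuple(inputs))
        self.data.pop(product_id, None)  # discard any stale copy
        return self.fetch(product_id)

    def purge(self, product_id: str) -> None:
        """Delete the physical data of a derived product, keeping its provenance."""
        if product_id not in self.provenance:
            raise ValueError(f"{product_id} is an external input and cannot be regenerated")
        self.data.pop(product_id, None)

    def fetch(self, product_id: str) -> list:
        """Return the product, regenerating it from provenance if it was purged."""
        if product_id in self.data:
            return self.data[product_id]
        record = self.provenance.get(product_id)
        if record is None:
            raise KeyError(f"no data and no provenance for {product_id}")
        # Recursively fetch (and, if needed, regenerate) every input.
        args = [self.fetch(input_id) for input_id in record.inputs]
        result = self.algorithms[record.algorithm](*args)
        self.data[product_id] = result
        return result


# Usage with made-up algorithms and values.
archive = Archive()
archive.algorithms["calibrate_v1"] = lambda raw: [2 * x for x in raw]
archive.algorithms["mean_v1"] = lambda cal: [sum(cal) / len(cal)]

archive.ingest("raw", [1, 2, 3])
archive.derive("calibrated", "calibrate_v1", "raw")
archive.derive("daily_mean", "mean_v1", "calibrated")

# Free space by deleting both derived products, then request one again.
archive.purge("calibrated")
archive.purge("daily_mean")
regenerated = archive.fetch("daily_mean")
```

In this sketch, fetching `daily_mean` after the purge walks the provenance chain back to `raw` and reruns both algorithms. A real system would also need to record the other artifacts the abstract lists, such as the hardware and software of the processing framework.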