Joint Workshops at 49th International Conference on Very Large Data Bases (VLDBW’23) — TaDA’23: Tabular Data Analysis Workshop

Knowledge Graph-driven Tabular Data Discovery from Scientific Documents


Synthesizing information from collections of tables embedded within scientific and technical documents is increasingly critical to emerging knowledge-driven applications. Because such tables are structurally heterogeneous, highly domain-specific, and surrounded by diffuse context, a precise semantic understanding of them is traditionally better inferred by linking tabular content to concepts and entities in reference knowledge graphs. However, existing tabular data discovery systems are not designed to adequately exploit these explicit, human-interpretable semantic linkages. Moreover, given the prevalence of misinformation, the level of confidence in the reliability of tabular information has become an important but often overlooked factor in discovery over open datasets. We describe a preliminary implementation of a discovery engine that enables table-based semantic search and retrieval of tabular information from a linked knowledge graph of scientific tables. We discuss the viability of semantics-guided tabular data analysis operations, including on-the-fly table generation under reliability constraints, within discovery scenarios motivated by intelligence production from documents.

Keywords: data fusion, on-the-fly table generation, scientific tables, semantic tabular data discovery


CEUR Workshop Proceedings

Figure 2: System Architecture


UMBC ebiquity