Joint Workshops at the 49th International Conference on Very Large Data Bases (VLDBW’23), TaDA’23: Tabular Data Analysis Workshop
Knowledge Graph-driven Tabular Data Discovery from Scientific Documents
September 1, 2023
Synthesizing information from collections of tables embedded within scientific and technical documents is increasingly critical to emerging knowledge-driven applications. Because such tables are structurally heterogeneous, highly domain-specific in content, and diffuse in context, a precise semantic understanding of them is traditionally better inferred by linking tabular content to concepts and entities in reference knowledge graphs. However, existing tabular data discovery systems are not designed to adequately exploit these explicit, human-interpretable semantic linkages. Moreover, given the prevalence of misinformation, the level of confidence in the reliability of tabular information has become an important but often overlooked factor in discovery over open datasets. We describe a preliminary implementation of a discovery engine that enables table-based semantic search and retrieval of tabular information from a linked knowledge graph of scientific tables. We discuss the viability of semantics-guided tabular data analysis operations, including on-the-fly table generation under reliability constraints, within discovery scenarios motivated by intelligence production from documents.
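To make the idea of semantic retrieval under a reliability constraint concrete, the following is a minimal sketch and not the system described in the paper. It assumes each table has already been linked to a set of knowledge graph concepts and carries a reliability score in [0, 1]. The names `LinkedTable`, `discover`, `min_reliability`, the `kg:` identifiers, and the use of Jaccard overlap as the ranking function are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedTable:
    """A table plus its (hypothetical) knowledge graph links."""
    table_id: str
    concepts: frozenset   # identifiers of linked KG concepts
    reliability: float    # assumed confidence score in [0, 1]


def jaccard(a, b):
    """Jaccard overlap between two concept sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def discover(query_concepts, tables, min_reliability=0.0, top_k=5):
    """Rank tables by concept overlap with the query.

    Tables below the reliability threshold are excluded before ranking,
    and tables sharing no concept with the query are dropped.
    """
    query = frozenset(query_concepts)
    scored = [
        (jaccard(query, t.concepts), t)
        for t in tables
        if t.reliability >= min_reliability
    ]
    scored = [(s, t) for s, t in scored if s > 0]
    # Highest score first; ties broken by table id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1].table_id))
    return [(t.table_id, round(s, 3)) for s, t in scored[:top_k]]


# Illustrative data only; these tables and concepts are made up.
TABLES = [
    LinkedTable("t1", frozenset({"kg:Alloy", "kg:TensileStrength"}), 0.9),
    LinkedTable("t2", frozenset({"kg:Alloy", "kg:MeltingPoint"}), 0.4),
    LinkedTable("t3", frozenset({"kg:Polymer"}), 0.95),
]

if __name__ == "__main__":
    query = {"kg:Alloy", "kg:TensileStrength"}
    print(discover(query, TABLES))
    print(discover(query, TABLES, min_reliability=0.5))
```

In this sketch, raising `min_reliability` removes low-confidence tables from the candidate set before any ranking takes place, which is one simple way to read "discovery under reliability constraints"; a real engine would likely use richer graph-based similarity than set overlap.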
See paper, slides, poster, and presentation video.
InProceedings
CEUR Workshop Proceedings
Figure 2: System Architecture