Entity Disambiguation for Wild Big Data Using Multi-Level Clustering

Jennifer Sleeman

Doctoral Consortium, 14th International Semantic Web Conference


October 12, 2015

When RDF instances represent the same entity, they are said to corefer. For example, two nodes from different RDF graphs may both refer to the same individual, the musical artist James Brown. Disambiguating entities is essential for knowledge base population and for other tasks that integrate or link data. Often, however, entity instance data originates from different sources and can be represented using different schemas or ontologies. In the age of Big Data, data can have other characteristics, such as originating from sources that are schema-less or lack ontological structure. Our work involves researching new ways to process this type of data in order to perform entity disambiguation. Our approach uses multi-level clustering and includes fine-grained entity type recognition, contextualization of entities, and online processing, which can be supported by a parallel architecture.
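As a rough illustration of what clustering at more than one level could look like, the following Python sketch first groups instances by an entity type label and then links instances within each type whose attribute tokens overlap. It is a toy under stated assumptions, not the method of the paper: the `type` and `tokens` fields, the Jaccard measure, the 0.5 threshold, the single-link grouping, and the sample instances are all hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def multi_level_cluster(instances, threshold=0.5):
    """Two-level grouping: by type label, then by token overlap."""
    # Level 1: coarse grouping by a (hypothetical) entity type label.
    by_type = defaultdict(list)
    for inst in instances:
        by_type[inst["type"]].append(inst)

    clusters = []
    for group in by_type.values():
        # Level 2: single-link grouping inside one type, via union-find.
        parent = {inst["id"]: inst["id"] for inst in group}

        def find(x):
            # Follow parent pointers to the root, compressing the path.
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for a, b in combinations(group, 2):
            if jaccard(a["tokens"], b["tokens"]) >= threshold:
                parent[find(a["id"])] = find(b["id"])

        merged = defaultdict(set)
        for inst in group:
            merged[find(inst["id"])].add(inst["id"])
        clusters.extend(merged.values())
    return clusters


# Made-up instances: identifiers, types, and tokens are illustrative only.
instances = [
    {"id": "g1:n1", "type": "MusicalArtist",
     "tokens": {"james", "brown", "singer", "soul"}},
    {"id": "g2:n7", "type": "MusicalArtist",
     "tokens": {"james", "brown", "soul", "funk"}},
    {"id": "g3:n2", "type": "Athlete",
     "tokens": {"james", "brown", "football"}},
]
clusters = multi_level_cluster(instances)
```

In this sketch the two `MusicalArtist` nodes end up in one cluster, while the `Athlete` node with the same name stays separate because it never passes the first level. Restricting pairwise comparison to instances of the same type is one reason a type-recognition step can reduce the work of the second level.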



Tags: clustering, coreference, entity disambiguation, entity type recognition, lda, topic modeling

Type: InProceedings

Publisher: ISWC 2015 Doctoral Consortium CEUR Proceedings
