Doctoral Consortium, 14th International Semantic Web Conference

Entity Disambiguation for Wild Big Data Using Multi-Level Clustering

When RDF instances represent the same entity, they are said to corefer. For example, two nodes from different RDF graphs both refer to the same individual, the musical artist James Brown. Disambiguating entities is essential for knowledge base population and other tasks that result in the integration or linking of data. Often, however, entity instance data originates from different sources and can be represented using different schemas or ontologies. In the age of Big Data, data can have other characteristics, such as originating from sources that are schema-less or without ontological structure. Our work involves researching new ways to process this type of data in order to perform entity disambiguation. Our approach uses multi-level clustering and includes fine-grained entity type recognition, contextualization of entities, and online processing, all of which can be supported by a parallel architecture.



clustering, coreference, entity disambiguation, entity type recognition, LDA, topic modeling

InProceedings

ISWC 2015 Doctoral Consortium CEUR Proceedings


UMBC ebiquity