Proceedings of the 33rd International FLAIRS Conference

Gazetteer Generation for Neural Named Entity Recognition

Chan Hee Song, Dawn Lawrie, Tim Finin, and James Mayfield

We present a way to generate gazetteers from the Wikidata knowledge graph and use the resulting lists to improve a neural NER system by adding an input feature that indicates whether a word is part of a name in the gazetteer. We show empirically that the approach yields performance gains in two distinct languages: English, a high-resource, word-based language, and Chinese, a high-resource, character-based language. We also apply the approach to a low-resource language, Russian, using a new annotated Russian NER corpus from Reddit tagged with four core and eleven extended types, and report a baseline score.
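As a rough illustration of the gazetteer input feature described above, the following Python sketch marks each token that falls inside a gazetteer name with a per-type binary flag. It is not the authors' implementation: the function names `build_index` and `gazetteer_features`, the lowercase whitespace tokenization, the longest-match strategy, and the toy gazetteer entries are all assumptions made for this example. It also assumes the name lists have already been extracted from Wikidata.

```python
from typing import Dict, List, Set, Tuple

Index = Dict[Tuple[str, ...], Set[str]]


def build_index(gazetteers: Dict[str, List[str]]) -> Tuple[Index, int]:
    """Map each lowercased, whitespace-tokenized name to its entity types.

    Returns the index and the length (in tokens) of the longest name.
    """
    index: Index = {}
    max_len = 0
    for entity_type, names in gazetteers.items():
        for name in names:
            key = tuple(name.lower().split())
            if not key:
                continue  # skip empty entries
            index.setdefault(key, set()).add(entity_type)
            max_len = max(max_len, len(key))
    return index, max_len


def gazetteer_features(
    tokens: List[str], index: Index, max_len: int, types: List[str]
) -> List[List[int]]:
    """Return one binary vector per token, with one slot per entity type.

    A slot is 1 when the token lies inside a span that matches a gazetteer
    name of that type. At each start position only the longest match is used.
    """
    features = [[0] * len(types) for _ in tokens]
    lowered = [t.lower() for t in tokens]
    for start in range(len(tokens)):
        longest = min(max_len, len(tokens) - start)
        for length in range(longest, 0, -1):
            matched = index.get(tuple(lowered[start:start + length]))
            if matched:
                for entity_type in matched:
                    col = types.index(entity_type)
                    for i in range(start, start + length):
                        features[i][col] = 1
                break  # keep only the longest match from this start
    return features


# Toy gazetteer (hypothetical entries, not taken from the paper's data).
TYPES = ["PER", "GPE"]
INDEX, MAX_LEN = build_index({"PER": ["Barack Obama"], "GPE": ["New York"]})
FEATURES = gazetteer_features(
    "Barack Obama visited New York".split(), INDEX, MAX_LEN, TYPES
)
```

In a neural tagger, vectors like these would typically be concatenated with each token's embedding before the encoder. For a character-based language such as Chinese, the same lookup would run over characters rather than whitespace-separated words.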


natural language processing




A longer version of this paper is: Chan Hee Song, Dawn Lawrie, Tim Finin, James Mayfield, Improving Neural Named Entity Recognition with Gazetteers, arXiv:2003.03072, March 2020.


UMBC ebiquity