A New Approach for Automatic Thesaurus Generation
by Lushan Han
Tuesday, May 4, 2010, 10:15am - 11:15am
Distributional similarity has long been used to measure how similar two words are, including in automatic thesaurus generation. Such measures, however, do not always work well for finding synonyms in a text corpus, because synonyms do not necessarily have the most similar contexts. We have developed a novel alternative approach to automatic thesaurus generation that uses pointwise mutual information (PMI) and exploits co-occurrence patterns of synonyms. It performs competitively with distributional similarity and leaves room for further improvement by combining the two approaches.
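
The abstract does not say how PMI is estimated or which co-occurrence patterns are used. As a rough illustration only, here is a minimal sketch of the standard definition, PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), computed from co-occurrence counts inside a small token window. The function name pmi_scores, the window size, and the toy corpus are hypothetical and are not taken from the talk.

```python
import math
from collections import Counter


def pmi_scores(sentences, window=2):
    """Return {(word_a, word_b): PMI} for words co-occurring within `window` tokens.

    Assumption: probabilities are plain relative frequencies, with no smoothing
    and no frequency cutoff. This is a textbook estimate, not the talk's method.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, w in enumerate(tokens):
            # Pair w with each of the next `window` tokens to its right.
            for v in tokens[i + 1 : i + 1 + window]:
                if v != w:
                    # Sort so that (a, b) and (b, a) share one key.
                    pair_counts[tuple(sorted((w, v)))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (w, v), n in pair_counts.items():
        p_wv = n / total_pairs
        p_w = word_counts[w] / total_words
        p_v = word_counts[v] / total_words
        scores[(w, v)] = math.log2(p_wv / (p_w * p_v))
    return scores


# Toy corpus (made up for illustration).
corpus = [
    ["big", "and", "large", "houses"],
    ["large", "or", "big", "cities"],
    ["small", "houses", "and", "cities"],
]
for pair, score in sorted(pmi_scores(corpus).items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(score, 3))
```

A high PMI means two words appear together more often than their individual frequencies would predict. A real thesaurus pipeline would need a much larger corpus, and probably smoothing, since PMI is known to overrate rare pairs.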