A New Approach for Automatic Thesaurus Generation

Speaker: Lushan Han

Start: Tuesday, May 04, 2010, 10:15AM

End: Tuesday, May 04, 2010, 11:15AM

Location: ITE 325B

Abstract: Distributional similarity has long been used to measure how similar two words are and has been applied to automatic thesaurus generation. Such distributional similarity measures, however, do not always work well for finding synonyms in a text corpus, because synonyms do not necessarily have the most similar contexts. We have developed a novel alternative approach to automatic thesaurus generation that uses pointwise mutual information (PMI) and exploits the co-occurrence patterns of synonyms. It performs competitively with distributional similarity and leaves room for further improvement by combining the two approaches.
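
For readers unfamiliar with PMI, it is commonly defined as PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), where p(x, y) is the probability that words x and y co-occur and p(x), p(y) are their individual probabilities. The following is a minimal sketch of that standard definition only, not the speaker's method: the function name `pmi_table`, the fixed co-occurrence window, and the simple count-based probability estimates are all assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_table(sentences, window=2):
    """Return PMI scores for word pairs that co-occur within `window` tokens.

    `sentences` is a list of token lists. The windowing scheme and the
    count-based probability estimates are illustrative assumptions.
    """
    word_counts = Counter()
    pair_counts = Counter()

    for tokens in sentences:
        word_counts.update(tokens)
        for i, w in enumerate(tokens):
            # Pair each word with the next `window` tokens to its right.
            for v in tokens[i + 1 : i + 1 + window]:
                if v != w:
                    # Sort the pair so (a, b) and (b, a) share one key.
                    pair_counts[tuple(sorted((w, v)))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (w, v), n in pair_counts.items():
        p_wv = n / total_pairs               # estimated joint probability
        p_w = word_counts[w] / total_words   # estimated marginal for w
        p_v = word_counts[v] / total_words   # estimated marginal for v
        scores[(w, v)] = math.log2(p_wv / (p_w * p_v))
    return scores


if __name__ == "__main__":
    toy_corpus = [
        ["the", "car", "is", "fast"],
        ["the", "automobile", "is", "fast"],
        ["a", "quick", "car"],
    ]
    # Print pairs from highest to lowest PMI.
    for pair, score in sorted(pmi_table(toy_corpus).items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 3))
```

A high PMI means two words co-occur more often than their individual frequencies would predict. How such scores are turned into synonym candidates is the subject of the talk and is not reproduced here.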

Tags: automatic thesaurus generation, pmi, synonyms

Host: Tim Finin