Workshop on the Evaluation of Natural Language Processing Systems

Martha Palmer; Tim Finin; Sharon M Walter

RADC-TR-89-302

December 1, 1988

In the past few years, the computational linguistics research community has begun to wrestle with the problem of how to evaluate its progress in developing natural language processing systems. With the exception of natural language interfaces, few working systems exist, and those that do tend to focus on very different tasks and use equally different techniques. There has been little agreement in the field about training sets and test sets, or about clearly defined subsets of problems that constitute standards for different levels of performance. Even those groups that have attempted a measure of self-evaluation have often been reduced to discussing a system's performance in isolation, comparing its current performance to its previous performance rather than to that of another system. As this technology begins to move slowly into the marketplace, the lack of useful evaluation techniques is becoming increasingly and painfully obvious. To make progress in the difficult area of natural language evaluation, a Workshop on the Evaluation of Natural Language Processing Systems was held in December 1988 at the Wayne Hotel in Wayne, Pennsylvania.

Tags: evaluation, language understanding, natural language processing

Type: TechReport

Publisher: Air Force Systems Command

Organization: Rome Air Development Center

Note: RADC-TR-89-302, Final Technical Report, December 1989
