Information Extraction via Automatic Generation of Semantic Classifiers
by Zareen Syed
Tuesday, September 16, 2008, 10:30am - 12:00pm
This talk introduces a new "model" for generating training data with minimal manual intervention. Our approach uses the structured data available in the Encarta encyclopedia, whose articles are categorized and linked to related articles by experts. We harvest this structured data and use it to generate classifiers automatically. The classifiers were applied to the following information extraction tasks:
- Entity Classification
- Entity Clustering
- Relation Extraction
The talk will also cover the challenges faced in using the Encarta and MindNet resources and give an overview of promising directions for future work.
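
As a rough illustration of the general idea of deriving labeled training data from expert-assigned categories, the Python sketch below treats each article's category as its label and trains a small naive Bayes text classifier on the result. It is not the speaker's method and uses no real Encarta data: the `ARTICLES` records, the category names, and the functions `make_training_data`, `train_naive_bayes`, and `classify` are all hypothetical and chosen only for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for categorized encyclopedia articles.
# Real Encarta records are not reproduced here.
ARTICLES = [
    {"title": "Mozart", "category": "Person",
     "text": "composer born in Salzburg who wrote symphonies"},
    {"title": "Einstein", "category": "Person",
     "text": "physicist born in Ulm who developed relativity"},
    {"title": "Paris", "category": "Place",
     "text": "capital city of France located on the Seine river"},
    {"title": "Nile", "category": "Place",
     "text": "river located in Africa flowing through Egypt"},
]


def make_training_data(articles):
    """Turn categorized articles into (tokens, label) pairs with no hand labeling."""
    return [(a["text"].lower().split(), a["category"]) for a in articles]


def train_naive_bayes(examples):
    """Collect the per-label document and word counts a naive Bayes model needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the most probable label, using add-one smoothing over the vocabulary."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(make_training_data(ARTICLES))
print(classify("a physicist born in Vienna", model))
print(classify("a river located in the north", model))
```

The point of the sketch is only that the labels come for free from the existing categorization; the choice of naive Bayes is an assumption made to keep the example dependency-free, and any other classifier could be substituted.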