UMBC ebiquity
PhD Proposal: Understanding the Logical and Semantic Structure of Large Documents


Tim Finin, 8:54am 9 December 2016


Dissertation Proposal

Understanding the Logical and Semantic
Structure of Large Documents

Muhammad Mahbubur Rahman

11:00-1:00 Monday, 12 December 2016, ITE325b, UMBC

Current language understanding approaches focus mostly on small documents such as newswire articles, blog posts, product reviews and discussion forum entries. Understanding and extracting information from large documents such as legal documents, reports, business opportunities, proposals and technical manuals remains a challenging task, because such documents may be complex, multi-themed and cover diverse topics.

We aim to automatically identify and classify a document’s sections and subsections, infer their structure and annotate them with semantic labels in order to understand the semantic structure of the document. Understanding document structure in this way will benefit and inform a variety of applications such as information extraction and retrieval, document categorization and clustering, document summarization, fact and relation extraction, text analysis and question answering.

Committee: Drs. Tim Finin (Chair), Anupam Joshi, Tim Oates, Cynthia Matuszek, James Mayfield (JHU)

