Ph.D. Dissertation Defense
Modeling and Extracting Information about Cybersecurity Events from Text
9:30-11:30 Monday, 18 November, 2019, ITE346?
People now rely on the Internet to carry out many of their daily activities, such as banking, ordering food, and socializing with family and friends. The technology makes our lives easier, but it also brings many problems, including cybercrime, stolen data, and identity theft. As the number of online transactions carried out every day grows, so does the frequency of cybercrime events. Because security-related events are too numerous for manual review and monitoring, we need to train machines to detect and gather data about potential cyber threats. To support machines that can identify and understand threats, we need standard models for storing cybersecurity information, along with information extraction systems that can populate those models with data collected from text.
This dissertation makes two significant contributions. First, we define a rich cybersecurity event schema and annotate a news corpus following that schema. Our schema consists of event type definitions, semantic roles, and event arguments. Second, we present CASIE, a cybersecurity event extraction system. CASIE detects cybersecurity events, identifies event participants and their roles, and specifies realis values. It also groups events that are coreferent. CASIE produces its output in an easy-to-use format, as a JSON object.
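As a rough illustration of the kind of event frame described above, the following Python sketch models an event with a type, a trigger, a realis value, role-labeled arguments, and a coreference cluster, and serializes it to JSON. All field names, role labels, and sample values here are hypothetical and chosen only for illustration; they are not CASIE's actual schema or output format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class EventArgument:
    text: str  # span of text filling the argument
    role: str  # semantic role label (hypothetical label set)


@dataclass
class EventFrame:
    event_type: str     # hypothetical event type name
    trigger: str        # word or phrase that signals the event
    realis: str         # hypothetical realis value, e.g. "actual"
    coref_cluster: int  # mentions of the same event share an id
    arguments: List[EventArgument] = field(default_factory=list)


def to_json(frames: List[EventFrame]) -> str:
    """Serialize event frames to a JSON string."""
    return json.dumps([asdict(f) for f in frames], indent=2)


def group_by_cluster(frames: List[EventFrame]) -> Dict[int, List[EventFrame]]:
    """Group event mentions by their coreference cluster id."""
    clusters: Dict[int, List[EventFrame]] = {}
    for frame in frames:
        clusters.setdefault(frame.coref_cluster, []).append(frame)
    return clusters


# Two made-up mentions of the same made-up event.
frames = [
    EventFrame(
        event_type="data_breach",
        trigger="stole",
        realis="actual",
        coref_cluster=0,
        arguments=[
            EventArgument(text="a retail company", role="victim"),
            EventArgument(text="customer records", role="compromised_data"),
        ],
    ),
    EventFrame(
        event_type="data_breach",
        trigger="the breach",
        realis="actual",
        coref_cluster=0,
    ),
]

if __name__ == "__main__":
    print(to_json(frames))
```

Keeping the coreference cluster as a plain id on each mention keeps the JSON flat; a nested representation, with one object per cluster, would be an equally reasonable choice.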
We believe that this work will be useful for cybersecurity management in the future. CASIE can quickly extract cybersecurity event information from unstructured text and fill in the corresponding event frames, helping us keep up with the large number of cybersecurity events that happen every day.
Committee: Drs. Tim Finin (chair), Anupam Joshi, Tim Oates, Karuna Pande Joshi, Francis Ferraro