Knowledge Graphs and Reinforcement Learning: A Hybrid Approach for Cybersecurity Problems

With the explosion of available data and computational power, machine learning and deep learning techniques are increasingly used to solve problems. The domain of cybersecurity is no exception: multiple recent papers use data-driven machine learning approaches for different tasks. Rule-based and supervised machine learning-based approaches are often brittle in detecting attacks, can be defeated by adversaries that adapt, and cannot use the knowledge of experts. To address this problem, we propose a novel approach for cybersecurity tasks that leverages ‘explicit’ knowledge expressed as Knowledge Graphs, together with data-driven machine learning approaches that ascertain ‘tacit’ knowledge. The inspiration for this hybrid model is drawn from the manner in which security analysts work, combining their background knowledge with observed data from host and network-based sensors. First, we extract background knowledge from unstructured textual descriptions of malware. Next, we focus on the tasks of malware detection and mitigation policy generation. We are specifically interested in synthesizing novel ways in which malware may carry out an attack, which can then be used to detect unknown attacks. We combine the background, or explicit, knowledge with explorations in the “action space” of malware detection and malware mitigation agents. Analysts choose response strategies for novel malware based on their past experience, while also exploring new strategies through “trial and error”. In this dissertation, we describe techniques for constructing knowledge graphs from open-source text, as well as several Reinforcement Learning (RL) based algorithms, guided by explicit background knowledge encoded in a knowledge graph, that best mimic the approach of security analysts. We observe that the efficiency of the RL algorithms increases by 4% when prior knowledge is incorporated.
We also observe that in simulated environments RL policies are able to generate more precise mitigation strategies with the help of prior knowledge. Certain parameters that measure the precision of mitigation actions, such as network availability, show an increase of 50% when prior knowledge is used.
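
As a rough illustration of how explicit knowledge might guide exploration in an agent's action space, the following minimal sketch biases epsilon-greedy exploration toward mitigation actions that a small knowledge graph associates with a malware type. It is not the dissertation's algorithm: the triples, the `mitigated_by` relation, the action names, and the function names are all hypothetical, and the update is a simple bandit-style rule without bootstrapping.

```python
import random

# Hypothetical knowledge-graph triples (subject, relation, object).
# These are illustrative assumptions, not data from the dissertation.
TRIPLES = [
    ("ransomware", "mitigated_by", "isolate_host"),
    ("ransomware", "mitigated_by", "restore_backup"),
    ("worm", "mitigated_by", "block_port"),
    ("worm", "mitigated_by", "isolate_host"),
]

# Hypothetical mitigation action space.
ACTIONS = ["isolate_host", "restore_backup", "block_port", "shutdown_network"]


def suggested_actions(malware, triples=TRIPLES):
    """Return the actions the graph links to this malware via 'mitigated_by'."""
    return {o for s, r, o in triples if s == malware and r == "mitigated_by"}


def choose_action(q_values, malware, epsilon, rng):
    """Epsilon-greedy choice whose exploration step is limited to
    graph-suggested actions (falls back to all actions if none exist)."""
    prior = suggested_actions(malware)
    candidates = [a for a in ACTIONS if a in prior] or ACTIONS
    if rng.random() < epsilon:
        return rng.choice(candidates)
    # Exploit: pick the action with the highest current estimate.
    return max(ACTIONS, key=lambda a: q_values.get((malware, a), 0.0))


def q_update(q_values, state, action, reward, alpha=0.5):
    """Move the estimate for (state, action) toward the observed reward."""
    old = q_values.get((state, action), 0.0)
    q_values[(state, action)] = old + alpha * (reward - old)
```

With `epsilon=1.0`, every exploratory choice for a known malware type comes from the graph-suggested subset, while an unknown type falls back to the full action space. This is one simple way prior knowledge can narrow "trial and error"; other options, such as reward shaping, would fit the same description.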


University of Maryland, Baltimore County

UMBC ebiquity