October 9th, 2018
DAbR: Dynamic Attribute-based Reputation Scoring for Malicious IP Address Detection
To effectively identify and filter out attacks from known sources like botnets, spammers, virus infected systems etc., organizations increasingly procure services that determine the reputation of IP addresses. Adoption of encryption techniques like TLS 1.2 and 1.3 aggravate this cause, owing to the higher cost of decryption needed for examining traffic contents. Currently, most IP reputation services provide blacklists by analyzing malware and spam records. However, newer but similar IP addresses used by the same attackers need not be present in such lists and attacks from them will get bypassed. In this paper, we present Dynamic Attribute based Reputation (DAbR), a Euclidean distance-based technique, to generate reputation scores for IP addresses by assimilating meta-data from known bad IP addresses. This approach is based on our observation that many bad IP’s share similar attributes and the requirement for a lightweight technique for reputation scoring. DAbR generates reputation scores for IP addresses on a 0-10 scale which represents its trustworthiness based on known bad IP address attributes. The reputation scores when used in conjunction with a policy enforcement module, can provide high performance and non-privacy-invasive malicious traffic filtering. To evaluate DAbR, we calculated reputation scores on a dataset of 87k IP addresses and used them to classify IP addresses as good/bad based on a threshold. An F-1 score of 78% in this classification task demonstrates our technique’s performance.
October 1st, 2018
The CCS Dashboard’s sections provide information on sources and targets of network events, file operations monitored and sub-events that are part
of the APT kill chain. An alert is generated when a likely complete APT is detected after reasoning over events.
Early Detection of Cybersecurity Threats Using Collaborative Cognition
The early detection of cybersecurity events such as attacks is challenging given the constantly evolving threat landscape. Even with advanced monitoring, sophisticated attackers can spend more than 100 days in a system before being detected. This paper describes a novel, collaborative framework that assists a security analyst by exploiting the power of semantically rich knowledge representation and reasoning integrated with different machine learning techniques. Our Cognitive Cybersecurity System ingests information from various textual sources and stores them in a common knowledge graph using terms from an extended version of the Unified Cybersecurity Ontology. The system then reasons over the knowledge graph that combines a variety of collaborative agents representing host and network-based sensors to derive improved actionable intelligence for security administrators, decreasing their cognitive load and increasing their confidence in the result. We describe a proof of concept framework for our approach and demonstrate its capabilities by testing it against a custom-built ransomware similar to WannaCry.
July 24th, 2018
Ontology-Grounded Topic Modeling for Climate Science Research
Jennifer Sleeman, Milton Halem and Tim Finin, Ontology-Grounded Topic Modeling for Climate Science Research
, Semantic Web for Social Good Workshop, Int. Semantic Web Conf., Monterey, Oct. 2018. (Selected as best paper), to appear, Emerging Topics in Semantic Technologies, E. Demidova, A.J. Zaveri, E. Simperl (Eds.), AKA Verlag Berlin, 2018.
In scientific disciplines where research findings have a strong impact on society, reducing the amount of time it takes to understand, synthesize and exploit the research is invaluable. Topic modeling is an effective technique for summarizing a collection of documents to find the main themes among them and to classify other documents that have a similar mixture of co-occurring words. We show how grounding a topic model with an ontology, extracted from a glossary of important domain phrases, improves the topics generated and makes them easier to understand. We apply and evaluate this method to the climate science domain. The result improves the topics generated and supports faster research understanding, discovery of social networks among researchers, and automatic ontology generation.
November 18th, 2017
Cybersecurity Challenges to American Local Governments
In this paper we examine data from the first ever nationwide survey of cybersecurity among American local governments. We are particularly interested in understanding the threats to local government cybersecurity, their level of preparedness to address the threats, the barriers these governments encounter when deploying cybersecurity, the policies, tools and practices that they employ to improve cybersecurity and, finally, the extent of awareness of and support for high levels of cybersecurity within their organizations. We found that local governments are under fairly constant cyberattack and are periodically breached. They are not especially well prepared to prevent cyberattacks or to recover when breached. The principal barriers to local cybersecurity are financial and organizations. Although a number of policies, tools and practices to improve cybersecurity, few local governments are making wide use of them. Last, local governments suffer from too little awareness of and support for cybersecurity within their organizations.
May 15th, 2017
Jennifer Sleeman, Milton Halem, Tim Finin, and Mark Cane, Modeling the Evolution of Climate Change Assessment Research Using Dynamic Topic Models and Cross-Domain Divergence Maps, AAAI Spring Symposium on AI for Social Good, AAAI Press, March, 2017.
Climate change is an important social issue and the subject of much research, both to understand the history of the Earth’s changing climate and to foresee what changes to expect in the future. Approximately every five years starting in 1990 the Intergovernmental Panel on Climate Change (IPCC) publishes a set of reports that cover the current state of climate change research, how this research will impact the world, risks, and approaches to mitigate the effects of climate change. Each report supports its findings with hundreds of thousands of citations to scientific journals and reviews by governmental policy makers. Analyzing trends in the cited documents over the past 30 years provides insights into both an evolving scientific field and the climate change phenomenon itself. Presented in this paper are results of dynamic topic modeling to model the evolution of these climate change reports and their supporting research citations over a 30 year time period. Using this technique shows how the research influences the assessment reports and how trends based on these influences can affect future assessment reports. This is done by calculating cross-domain divergences between the citation domain and the assessment report domain and by clustering documents between domains. This approach could be applied to other social problems with similar structure such as disaster recovery.
May 13th, 2017
Sudip Mittal, Aditi Gupta, Karuna Pande Joshi, Claudia Pearce and Anupam Joshi, A Question and Answering System for Management of Cloud Service Level Agreements, IEEE International Conference on Cloud Computing, June 2017.
One of the key challenges faced by consumers is to efficiently manage and monitor the quality of cloud services. To manage service performance, consumers have to validate rules embedded in cloud legal contracts, such as Service Level Agreements (SLA) and Privacy Policies, that are available as text documents. Currently this analysis requires significant time and manual labor and is thus inefficient. We propose a cognitive assistant that can be used to manage cloud legal documents by automatically extracting knowledge (terms, rules, constraints) from them and reasoning over it to validate service performance. In this paper, we present this Question and Answering (Q&A) system that can be used to analyze and obtain information from the SLA documents. We have created a knowledgebase of Cloud SLAs from various providers which forms the underlying repository of our Q&A system. We utilized techniques from natural language processing and semantic web (RDF, SPARQL and Fuseki server) to build our framework. We also present sample queries on how a consumer can compute metrics such as service credit.