paper: Knowledge Graph Fact Prediction via Knowledge-Enriched Tensor Factorization

February 7th, 2019

Knowledge Graph Fact Prediction via
Knowledge-Enriched Tensor Factorization

Ankur Padia, Kostantinos Kalpakis, Francis Ferraro and Tim Finin, Knowledge Graph Fact Prediction via Knowledge-Enriched Tensor Factorization, Journal of Web Semantics, to appear, 2019

We present a family of novel methods for embedding knowledge graphs into real-valued tensors. These tensor-based embeddings capture the ordered relations that are typical in the knowledge graphs represented by semantic web languages like RDF. Unlike many previous models, our methods can easily use prior background knowledge provided by users or extracted automatically from existing knowledge graphs. In addition to providing more robust methods for knowledge graph embedding, we provide a provably-convergent, linear tensor factorization algorithm. We demonstrate the efficacy of our models for the task of predicting new facts across eight different knowledge graphs, achieving between 5% and 50% relative improvement over existing state-of-the-art knowledge graph embedding techniques. Our empirical evaluation shows that all of the tensor decomposition models perform well when the average degree of an entity in a graph is high, with constraint-based models doing better on graphs with a small number of highly similar relations and regularization-based models dominating for graphs with relations of varying degrees of similarity.
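As context for the tensor-based embeddings described above, the sketch below shows a generic RESCAL-style tensor factorization in Python. It is an illustrative stand-in, not the paper's knowledge-enriched algorithm: the toy graph, rank, learning rate, and gradient-descent fitting are all invented for the example. A knowledge graph is stored as a binary tensor with one slice per relation, and each slice is approximated as `A @ R[k] @ A.T`, where `A` holds entity embeddings and `R[k]` is a relation interaction matrix; candidate facts are then scored with the learned factors.

```python
import numpy as np

# Toy knowledge graph: 4 entities, 2 relations, stored as a binary
# tensor X with one slice per relation (X[k][i, j] = 1 iff the triple
# (entity i, relation k, entity j) is observed).
n_ent, n_rel, rank = 4, 2, 2
rng = np.random.default_rng(0)
X = np.zeros((n_rel, n_ent, n_ent))
X[0, 0, 1] = X[0, 1, 2] = 1.0          # observed triples for relation 0
X[1, 0, 3] = X[1, 2, 3] = 1.0          # observed triples for relation 1

# Factor model: X[k] ~ A @ R[k] @ A.T, fit by plain gradient descent
# on squared reconstruction error with L2 regularization.
A = 0.5 * rng.standard_normal((n_ent, rank))
R = 0.5 * rng.standard_normal((n_rel, rank, rank))
lr, lam = 0.02, 0.01

def sq_error():
    return sum(float(np.sum((A @ R[k] @ A.T - X[k]) ** 2))
               for k in range(n_rel))

err_before = sq_error()
for _ in range(2000):
    gA = lam * A
    for k in range(n_rel):
        E = A @ R[k] @ A.T - X[k]      # residual for relation slice k
        gA += E @ A @ R[k].T + E.T @ A @ R[k]
        R[k] -= lr * (A.T @ E @ A + lam * R[k])
    A -= lr * gA
err_after = sq_error()

# Fact prediction: score a candidate triple (head 0, relation 0, tail 1).
score_01 = A[0] @ R[0] @ A[1]
```

The knowledge-enriched models in the paper extend this basic idea with constraints or regularizers derived from prior background knowledge about relation similarity.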

paper: Quantum Annealing Based Binary Compressive Sensing with Matrix Uncertainty

January 13th, 2019

Quantum Annealing Based Binary Compressive Sensing with Matrix Uncertainty

Ramin Ayanzadeh, Seyedahmad Mousavi, Milton Halem and Tim Finin, Quantum Annealing Based Binary Compressive Sensing with Matrix Uncertainty, arXiv:1901.00088 [cs.IT], 1 January 2019.

Compressive sensing is a novel approach that linearly samples sparse or compressible signals at a rate well below the Nyquist-Shannon sampling rate and outperforms traditional signal processing techniques in acquiring and reconstructing such signals. Compressive sensing with matrix uncertainty is an extension of the standard compressive sensing problem that appears in various applications, including but not limited to cognitive radio sensing, antenna calibration, and deconvolution. The original compressive sensing problem is NP-hard, so traditional techniques, such as convex and nonconvex relaxations and greedy algorithms, apply stringent constraints on the measurement matrix to indirectly handle this problem in the realm of classical computing.

We propose well-posed approaches, tractable by quantum annealers, for both the binary compressive sensing problem and the binary compressive sensing problem with matrix uncertainty. Our approach formulates an Ising model whose ground state represents a sparse solution to the binary compressive sensing problem, and then employs an alternating minimization scheme to tackle the matrix uncertainty variant. This setting requires only that the considered problem have a unique solution for the recovery process to succeed, so the required conditions on the measurement matrix are notably looser. As a proof of concept, we demonstrate the applicability of the proposed approach on D-Wave quantum annealers; however, our method can be adapted to other modern computing paradigms, such as adiabatic quantum computers in general, CMOS annealers, optical parametric oscillators, and neuromorphic computing.
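To make the Ising/QUBO formulation concrete, the sketch below casts the basic binary compressive sensing problem, min ||Ax - y||^2 over x in {0,1}^n, as a QUBO whose ground state is the sparse solution. This is a simplified illustration: the measurement matrix, signal, and sizes are invented, an exhaustive search stands in for the D-Wave annealer, and the matrix-uncertainty extension and alternating minimization scheme are not shown.

```python
import itertools
import numpy as np

# Recover a sparse binary signal x from y = A x with fewer
# measurements (m) than unknowns (n).
rng = np.random.default_rng(1)
n, m = 6, 4
x_true = np.array([1, 0, 0, 1, 0, 0])
A = rng.standard_normal((m, n))
y = A @ x_true

# QUBO matrix: ||Ax - y||^2 = x^T (A^T A) x - 2 (A^T y)^T x + const.
# Since x_i is binary, x_i^2 = x_i, so the linear term folds into the
# diagonal of the quadratic form.
Q = A.T @ A
Q[np.diag_indices(n)] -= 2 * (A.T @ y)

def qubo_energy(x):
    return x @ Q @ x

# Exhaustive "annealing": find the lowest-energy binary configuration.
# A quantum annealer would search this energy landscape physically.
best = min((np.array(bits) for bits in itertools.product([0, 1], repeat=n)),
           key=qubo_energy)
```

Because the objective equals ||Ax - y||^2 minus a constant, the ground state coincides with the least-squares binary solution; as the abstract notes, only uniqueness of that solution is needed for exact recovery.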

paper: DAbR: Dynamic Attribute-based Reputation scoring for Malicious IP Address Detection

October 9th, 2018

DAbR: Dynamic Attribute-based Reputation Scoring for Malicious IP Address Detection

Arya Renjan, Karuna Pande Joshi, Sandeep Nair Narayanan and Anupam Joshi, DAbR: Dynamic Attribute-based Reputation Scoring for Malicious IP Address Detection, IEEE Intelligence and Security Informatics, November 2018.


To effectively identify and filter out attacks from known sources like botnets, spammers, and virus-infected systems, organizations increasingly procure services that determine the reputation of IP addresses. The adoption of encryption protocols like TLS 1.2 and 1.3 heightens this need, owing to the higher cost of decrypting traffic in order to examine its contents. Currently, most IP reputation services provide blacklists by analyzing malware and spam records. However, newer but similar IP addresses used by the same attackers need not be present in such lists, and attacks from them will be missed. In this paper, we present Dynamic Attribute-based Reputation (DAbR), a Euclidean distance-based technique that generates reputation scores for IP addresses by assimilating meta-data from known bad IP addresses. This approach is based on our observation that many bad IPs share similar attributes, and on the need for a lightweight reputation scoring technique. DAbR generates reputation scores for IP addresses on a 0-10 scale that represent their trustworthiness based on known bad IP address attributes. The reputation scores, when used in conjunction with a policy enforcement module, can provide high-performance and non-privacy-invasive malicious traffic filtering. To evaluate DAbR, we calculated reputation scores on a dataset of 87k IP addresses and used them to classify IP addresses as good or bad based on a threshold. An F1 score of 78% on this classification task demonstrates our technique's performance.
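A minimal sketch of a DAbR-style Euclidean score is shown below. The attribute names, the known-bad vectors, and the exact scaling to the 0-10 range are invented for illustration; the paper assimilates real meta-data from known bad IP addresses and may normalize differently. The idea is simply that an IP address close (in Euclidean distance) to known bad IPs in attribute space earns a low reputation score.

```python
import numpy as np

# Hypothetical attribute vectors for known bad IPs; each column is an
# invented numeric attribute in [0, 1] (e.g. spam volume, scan rate,
# blacklist co-occurrence).
known_bad = np.array([
    [0.9, 0.8, 0.7],
    [0.8, 0.9, 0.6],
])

def dabr_score(ip_attrs, bad=known_bad):
    """Map an IP's attribute vector to a 0-10 reputation score:
    0 = indistinguishable from known bad IPs, 10 = far from all of them."""
    d = np.linalg.norm(bad - ip_attrs, axis=1).min()  # distance to nearest bad IP
    max_dist = np.sqrt(bad.shape[1])                  # attributes assumed in [0, 1]
    return round(10 * min(d / max_dist, 1.0), 2)

suspect = np.array([0.85, 0.85, 0.65])  # near known bad IPs -> low score
benign = np.array([0.05, 0.0, 0.1])     # far from known bad IPs -> high score
```

A policy enforcement module could then compare these scores against a threshold, matching the good/bad classification used in the paper's evaluation.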

paper: Early Detection of Cybersecurity Threats Using Collaborative Cognition

October 1st, 2018

The CCS Dashboard’s sections provide information on sources and targets of network events, file operations monitored and sub-events that are part of the APT kill chain. An alert is generated when a likely complete APT is detected after reasoning over events.


Early Detection of Cybersecurity Threats Using Collaborative Cognition

Sandeep Narayanan, Ashwinkumar Ganesan, Karuna Joshi, Tim Oates, Anupam Joshi and Tim Finin, Early Detection of Cybersecurity Threats Using Collaborative Cognition, 4th IEEE International Conference on Collaboration and Internet Computing, Philadelphia, October 2018.


The early detection of cybersecurity events such as attacks is challenging given the constantly evolving threat landscape. Even with advanced monitoring, sophisticated attackers can spend more than 100 days in a system before being detected. This paper describes a novel, collaborative framework that assists a security analyst by exploiting the power of semantically rich knowledge representation and reasoning integrated with different machine learning techniques. Our Cognitive Cybersecurity System ingests information from various textual sources and stores it in a common knowledge graph using terms from an extended version of the Unified Cybersecurity Ontology. The system then reasons over the knowledge graph, which combines a variety of collaborative agents representing host- and network-based sensors, to derive improved actionable intelligence for security administrators, decreasing their cognitive load and increasing their confidence in the result. We describe a proof-of-concept framework for our approach and demonstrate its capabilities by testing it against custom-built ransomware similar to WannaCry.

talk: Design and Implementation of an Attribute Based Access Controller using OpenStack Services

September 23rd, 2018

Design and Implementation of an Attribute Based Access Controller using OpenStack Services

Sharad Dixit, Graduate Student, UMBC
10:30am Monday, 24 September 2018, ITE346

With the advent of cloud computing, industries began a paradigm shift from traditional computing toward cloud computing, as it fulfills organizations' present requirements such as on-demand resource allocation, lower capital expenditure, scalability and flexibility; but with that it brought a variety of security and user data breach issues. To address these issues, organizations have started to implement hybrid clouds, where the underlying cloud infrastructure is set up by the organization and is accessible from anywhere in the world because of the distinct security edges it provides. However, most cloud platforms provide a Role Based Access Controller, which is not adequate for complex organizational structures. We propose a novel mechanism using OpenStack services and semantic web technologies: a module that evaluates a user's and a project's multi-varied attributes and runs them against the access policy rules defined by an organization before granting access to the user. An organization can thus deploy our module to obtain robust and trustworthy access control based on multiple attributes of a user and the project the user has requested in a hybrid cloud platform like OpenStack.
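The core idea of evaluating multi-varied attributes against policy rules can be sketched as follows. This is a schematic illustration only: the attribute names, rules, and entities are invented, and the actual module is built on OpenStack services and semantic web technologies rather than plain Python predicates.

```python
# Attribute-based access control (ABAC) sketch: access is granted only
# if every organization-defined policy rule is satisfied by the
# combined user and project attributes.

def check_access(user_attrs, project_attrs, policy_rules):
    """Return True iff all policy rules hold for the merged attributes."""
    attrs = {**user_attrs, **project_attrs}
    return all(rule(attrs) for rule in policy_rules)

# Invented example policy: the user's department must match the
# project's, and the user's clearance must meet the project's minimum.
policy = [
    lambda a: a.get("department") == a.get("project_department"),
    lambda a: a.get("clearance", 0) >= a.get("required_clearance", 0),
]

alice = {"department": "engineering", "clearance": 3}
secure_project = {"project_department": "engineering", "required_clearance": 2}
```

In contrast, a role-based controller would collapse all of these attributes into a single role label, which is what makes it inadequate for complex organizational structures.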

AAAI Symposium on Privacy-Enhancing AI and HLT Technologies

July 31st, 2018

PAL: Privacy-Enhancing AI and Language Technologies

AAAI Spring Symposium
25-27 March 2019, Stanford University

This symposium will bring together researchers in privacy and researchers in either artificial intelligence (AI) or human language technologies (HLTs), so that we may collectively assess the state of the art in this growing intersection of interests. Privacy remains an evolving and nuanced concern of computer users, as new technologies that use the web, smartphones, and the internet of things (IoT) collect a myriad of personal information. Rather than viewing AI and HLT as problems for privacy, the goal of this symposium is to “flip the script” and explore how AI and HLT can help meet users’ desires for privacy when interacting with computers.

It will focus on two loosely-defined research questions:

  • How can AI and HLT preserve or protect privacy in challenging situations?
  • How can AI and HLT help interested parties (e.g., computer users, companies, regulatory agencies) understand privacy in the status quo and what people want?

The symposium will consist of invited speakers, oral presentations of submitted papers, a poster session, and panel discussions. This event is a successor to Privacy and Language Technologies (“PLT”), a 2016 AAAI Fall Symposium. Submissions are due 2 November 2018.  For more information, see the symposium site.

Sherman receives $5.4M in funding for cybersecurity research and scholarships

July 26th, 2018

UMBC receives $5.4m in funding for new cybersecurity projects

NSF and NSA Fund Three Cybersecurity Projects by Prof. Alan Sherman 

Professor Alan Sherman and colleagues were recently awarded more than $5.4 million in three new grants to support cybersecurity research and education at UMBC, including two from the National Science Foundation (NSF) and one from the National Security Agency (NSA). Dr. Sherman leads UMBC’s Center for Information Security and Assurance, which was responsible for UMBC’s designation as a National Center of Academic Excellence in Cybersecurity Research and Education.

This summer, NSF funded Sherman’s second CyberCorps Scholarship for Service (SFS) grant (Richard Forno, CoPI) that will fund 34 cybersecurity scholars over five years and support research at UMBC and in the Cyber Defense Lab (CDL). The $5 million award supports scholarships for BS, MS, MPS, and PhD students to study cybersecurity through UMBC degree programs in computer science, computer engineering, cyber, or information systems. SFS scholars receive tuition, books, health benefits, professional expenses, and an annual stipend ($22,500 for undergraduates, $34,000 for graduate students). In return, each scholar must engage in a summer internship and work for government (federal, state, local, or tribal) for one year for each year of support. The program is highly competitive and many of the graduates now work for the NSA.

A novel aspect of UMBC’s SFS program is that it builds connections with two nearby community colleges—Montgomery College (MC) and Prince George's Community College (PGCC). Each year, one student from each of these schools is selected for a scholarship. Upon graduation from community college, the student transfers to UMBC to complete their four-year degree. In doing so, UMBC taps into a significant pool of talent and increases the number of cybersecurity professionals who will enter government service. Each January, all SFS scholars from UMBC, MC, and PGCC engage in a one-week research study. Working collaboratively, they analyze a targeted aspect of the security of the UMBC computer system. The students enjoy the hands-on experience while helping to improve UMBC’s computer security. Students interested in applying for an SFS scholarship should consult the CISA SFS page and contact Professor Sherman. The next application deadline is November 15.

With $310,000 of support from NSF, Sherman and his CoPIs, Drs. Dhananjay Phatak and Linda Oliva, are developing educational Cybersecurity Assessment Tools (CATS) to measure student understanding of cybersecurity concepts. In particular, they are developing and validating two concept inventories: one for any first course in cybersecurity, and one for college graduates beginning a career in cybersecurity. These inventories will provide science-based criteria by which different approaches to cybersecurity education can be assessed (e.g., competition, gaming, hands-on exercises, and traditional classroom). This project is collaborative with the University of Illinois at Urbana-Champaign.

With $97,000 of support from NSA, Sherman is developing a virtual Protocol Analysis Lab that uses state-of-the-art tools to analyze cryptographic protocols for structural weaknesses. Protocols are the structured communications that take place when computers interact with each other, as happens, for example, when a browser visits a web page. Experience has shown that protocols are so complicated to analyze that there is tremendous value in studying them using formal methods. Sherman and his graduate students are making it easier to use existing tools, including CPSA, Maude-NPA, and Tamarin, applying them to analyze particular protocols, and developing associated educational materials.

paper: Ontology-Grounded Topic Modeling for Climate Science Research

July 24th, 2018


Ontology-Grounded Topic Modeling for Climate Science Research


Jennifer Sleeman, Milton Halem and Tim Finin, Ontology-Grounded Topic Modeling for Climate Science Research, Semantic Web for Social Good Workshop, Int. Semantic Web Conf., Monterey, Oct. 2018. (Selected as best paper), to appear, Emerging Topics in Semantic Technologies, E. Demidova, A.J. Zaveri, E. Simperl (Eds.), AKA Verlag Berlin, 2018.


In scientific disciplines where research findings have a strong impact on society, reducing the amount of time it takes to understand, synthesize and exploit the research is invaluable. Topic modeling is an effective technique for summarizing a collection of documents to find the main themes among them and to classify other documents that have a similar mixture of co-occurring words. We show how grounding a topic model with an ontology, extracted from a glossary of important domain phrases, improves the topics generated and makes them easier to understand. We apply and evaluate this method to the climate science domain. The result improves the topics generated and supports faster research understanding, discovery of social networks among researchers, and automatic ontology generation.
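The grounding idea can be illustrated with a toy sketch: seed each topic with phrases drawn from a domain glossary, then relate documents to the topic whose seed terms they mention most. This is a deliberately simple stand-in for the paper's ontology-grounded topic model (which builds on probabilistic topic modeling); the topics, seed phrases, and document are invented.

```python
from collections import Counter

# Invented ontology fragments: topic labels mapped to glossary phrases.
ontology_topics = {
    "sea_ice": {"sea ice", "albedo", "ice sheet"},
    "carbon": {"carbon dioxide", "emissions", "carbon cycle"},
}

def ground_topic(doc, topics=ontology_topics):
    """Assign a document to the topic whose seed phrases it mentions
    most often, or None if no seed phrase occurs at all."""
    text = doc.lower()
    counts = Counter({label: sum(text.count(p) for p in seeds)
                      for label, seeds in topics.items()})
    label, hits = counts.most_common(1)[0]
    return label if hits else None

doc = "Rising carbon dioxide emissions alter the global carbon cycle."
```

Anchoring topics to glossary phrases in this way is what makes the resulting themes easier to interpret than unconstrained bags of co-occurring words.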

paper: Understanding and representing the semantics of large structured documents

July 23rd, 2018

Understanding and representing the semantics of large structured documents


Muhammad Mahbubur Rahman and Tim Finin, Understanding and representing the semantics of large structured documents, Proceedings of the 4th Workshop on Semantic Deep Learning (SemDeep-4, ISWC), 8 October 2018.


Understanding large, structured documents, like scholarly articles, requests for proposals or business reports, is a complex and difficult task. It involves discovering a document’s overall purpose and subject(s), understanding the function and meaning of its sections and subsections, and extracting low-level entities and facts about them. In this research, we present a deep learning-based document ontology to capture the general-purpose semantic structure and domain-specific semantic concepts from a large number of academic articles and business documents. The ontology is able to describe different functional parts of a document, which can be used to enhance semantic indexing for a better understanding by human beings and machines. We evaluate our models through extensive experiments on datasets of scholarly articles from arXiv and Request for Proposal documents.

MS defense: Open Information Extraction for Code-Mixed Hindi-English Social Media Data

July 1st, 2018

MS Thesis Defense

Open Information Extraction for Code-Mixed Hindi-English Social Media Data

Mayur Pate

1:00pm Monday, 2 July 2018, ITE 325b, UMBC

Open domain relation extraction (Angeli, Premkumar, & Manning 2015) is the process of finding relation triples. While a number of systems are available for open information extraction (Open IE) in a single language, traditional Open IE systems are not well suited to content that mixes multiple languages in a single utterance. In this thesis, we extended an existing code-mixed corpus (Das, Jamatia, & Gambäck 2015) by finding and annotating relation triples in Open IE fashion. Using this newly annotated corpus, we experimented with a seq2seq neural network (Zhang, Duh, & Durme 2017) for finding relation triples. As prerequisites for the relation extraction pipeline, we developed a part-of-speech tagger and a named entity and predicate recognizer for code-mixed content. We experimented with various approaches, such as Conditional Random Fields (CRFs), averaged perceptrons and deep neural networks. To the best of our knowledge, this is the first relation extraction system for any code-mixed natural language. We achieved promising results for all of the components, and they could be improved in the future with more code-mixed data.

Committee: Drs. Frank Ferraro (Chair), Tim Finin, Hamed Pirsiavash, Bryan Wilkinson

paper: Attribute Based Encryption for Secure Access to Cloud Based EHR Systems

June 4th, 2018

Attribute Based Encryption for Secure Access to Cloud Based EHR Systems

Maithilee Joshi, Karuna Joshi and Tim Finin, Attribute Based Encryption for Secure Access to Cloud Based EHR Systems, IEEE International Conference on Cloud Computing, San Francisco CA, July 2018


Medical organizations find it challenging to adopt cloud-based electronic medical records services, due to the risk of data breaches and the resulting compromise of patient data. Existing authorization models follow a patient-centric approach for EHR management, where the responsibility of authorizing data access is handled at the patient's end. This, however, creates a significant overhead for the patient, who has to authorize every access of their health record. This is not practical given the multiple personnel involved in providing care, and at times the patient may not be in a state to provide this authorization. Hence there is a need to develop a proper authorization delegation mechanism for safe, secure and easy cloud-based EHR management. We have developed a novel, centralized, attribute-based authorization mechanism that uses Attribute Based Encryption (ABE) and allows for delegated secure access to patient records. This mechanism transfers the service management overhead from the patient to the medical organization and allows easy delegation of access authority over cloud-based EHRs to medical providers. In this paper, we describe this novel ABE approach as well as the prototype system that we have created to illustrate it.

PhD defense: Understanding the Logical and Semantic Structure of Large Documents

May 29th, 2018

Dissertation Defense

Understanding the Logical and Semantic Structure of Large Documents

Muhammad Mahbubur Rahman

11:00am Wednesday, 30 May 2018, ITE 325b

Understanding and extracting information from large documents, such as business opportunities, academic articles, medical documents and technical reports, poses challenges not present in short documents, because large documents may be multi-themed, complex, noisy and cover diverse topics. This dissertation describes a framework that can analyze large documents and help people and computer systems locate desired information in them. It aims to automatically identify and classify different sections of documents and understand their purpose within the document. A key contribution of this research is modeling and extracting the logical and semantic structure of electronic documents using deep learning techniques. The effectiveness and robustness of the framework are evaluated through extensive experiments on arXiv and requests-for-proposals datasets.

Committee Members: Drs. Tim Finin (Chair), Anupam Joshi, Tim Oates, Cynthia Matuszek, James Mayfield (JHU)