The rise in popularity of Internet of Things (IoT) devices has opened doors for privacy and security breaches in cyber-physical systems like smart homes, smart vehicles, and smart grids that affect our daily existence. IoT systems are also a source of big data that gets shared via the cloud. IoT systems in a smart home environment have sensitive access control issues since they are deployed in a personal space. The collected data can also be of a highly personal nature. Therefore, it is critical to build access control models that govern who, under what circumstances, can access which sensed data or actuate a physical system. Traditional access control mechanisms are not expressive enough to handle such complex access control needs, warranting the incorporation of new methodologies for privacy and security. In this paper, we propose the creation of the PALS system, which builds upon existing work in an attribute-based access control model, captures physical context collected from sensed data (attributes), and performs dynamic reasoning over these attributes and context-driven policies using Semantic Web technologies to execute access control decisions. Our mechanism reasons over user context, details of the information collected by the cloud service provider, and device type to generate access control decisions. Our system’s access control decisions are supplemented by another sub-system that detects intrusions into smart home systems based on both network and behavioral data. The combined approach serves to determine indicators that a smart home system is under attack, as well as to limit what data breach such attacks can achieve.
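The general idea of an attribute-based decision over context can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Rule` class, the `decide` function, and the attribute names (`role`, `location`, `device_type`) are hypothetical, and plain Python predicates stand in for the Semantic Web reasoning that the abstract describes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Attributes = Dict[str, Any]


@dataclass
class Rule:
    """Hypothetical context-driven rule: permit when every condition holds."""
    action: str
    conditions: List[Callable[[Attributes], bool]]


def decide(rules: List[Rule], action: str, attrs: Attributes) -> str:
    """Return "permit" if some rule for the action is satisfied, else "deny"."""
    for rule in rules:
        if rule.action == action and all(cond(attrs) for cond in rule.conditions):
            return "permit"
    return "deny"


# Illustrative (assumed) rule: a resident may view the camera feed only at home.
rules = [
    Rule(
        action="view_camera",
        conditions=[
            lambda a: a.get("role") == "resident",
            lambda a: a.get("location") == "home",
            lambda a: a.get("device_type") == "camera",
        ],
    )
]

request = {"role": "resident", "location": "home", "device_type": "camera"}
print(decide(rules, "view_camera", request))
```

Changing a single context attribute, for example setting `location` to `"away"`, flips the decision to `"deny"`, which is the kind of context sensitivity that static role-based rules cannot express.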
Automating GDPR Compliance using Policy Integrated Blockchain
Early Detection of Cybersecurity Threats Using Collaborative Cognition
The early detection of cybersecurity events such as attacks is challenging given the constantly evolving threat landscape. Even with advanced monitoring, sophisticated attackers can spend more than 100 days in a system before being detected. This paper describes a novel, collaborative framework that assists a security analyst by exploiting the power of semantically rich knowledge representation and reasoning integrated with different machine learning techniques. Our Cognitive Cybersecurity System ingests information from various textual sources and stores it in a common knowledge graph using terms from an extended version of the Unified Cybersecurity Ontology. The system then reasons over the knowledge graph that combines a variety of collaborative agents representing host and network-based sensors to derive improved actionable intelligence for security administrators, decreasing their cognitive load and increasing their confidence in the result. We describe a proof-of-concept framework for our approach and demonstrate its capabilities by testing it against custom-built ransomware similar to WannaCry.
Ontology-Grounded Topic Modeling for Climate Science Research
In scientific disciplines where research findings have a strong impact on society, reducing the amount of time it takes to understand, synthesize and exploit the research is invaluable. Topic modeling is an effective technique for summarizing a collection of documents to find the main themes among them and to classify other documents that have a similar mixture of co-occurring words. We show how grounding a topic model with an ontology, extracted from a glossary of important domain phrases, improves the topics generated and makes them easier to understand. We apply this method to the climate science domain and evaluate it there. The result improves the topics generated and supports faster research understanding, discovery of social networks among researchers, and automatic ontology generation.
Understanding and representing the semantics of large structured documents
Understanding large, structured documents like scholarly articles, requests for proposals or business reports is a complex and difficult task. It involves discovering a document’s overall purpose and subject(s), understanding the function and meaning of its sections and subsections, and extracting low level entities and facts about them. In this research, we present a deep learning based document ontology to capture the general purpose semantic structure and domain specific semantic concepts from a large number of academic articles and business documents. The ontology is able to describe different functional parts of a document, which can be used to enhance semantic indexing for a better understanding by human beings and machines. We evaluate our models through extensive experiments on datasets of scholarly articles from arXiv and Request for Proposal documents.
Attribute Based Encryption for Secure Access to Cloud Based EHR Systems
Medical organizations find it challenging to adopt cloud-based electronic medical records services, due to the risk of data breaches and the resulting compromise of patient data. Existing authorization models follow a patient-centric approach for EHR management where the responsibility of authorizing data access is handled at the patients’ end. This, however, creates a significant overhead for the patient, who has to authorize every access of their health record. This is not practical given the multiple personnel involved in providing care and that at times the patient may not be in a state to provide this authorization. Hence there is a need to develop a proper authorization delegation mechanism for safe, secure and easy cloud-based EHR management. We have developed a novel, centralized, attribute-based authorization mechanism that uses Attribute Based Encryption (ABE) and allows for delegated secure access of patient records. This mechanism transfers the service management overhead from the patient to the medical organization and allows easy delegation of access authority over cloud-based EHRs to the medical providers. In this paper, we describe this novel ABE approach as well as the prototype system that we have created to illustrate it.
Understanding the Logical and Semantic Structure of Large Documents
Muhammad Mahbubur Rahman
11:00am Wednesday, 30 May 2018, ITE 325b
Understanding and extracting information from large documents, such as business opportunities, academic articles, medical documents and technical reports, poses challenges not present in short documents. Large documents may be multi-themed, complex, noisy and cover diverse topics. This dissertation describes a framework that can analyze large documents and help people and computer systems locate desired information in them. It aims to automatically identify and classify different sections of documents and understand their purpose within the document. A key contribution of this research is modeling and extracting the logical and semantic structure of electronic documents using deep learning techniques. The effectiveness and robustness of the framework are evaluated through extensive experiments on arXiv and requests for proposals datasets.
Committee Members: Drs. Tim Finin (Chair), Anupam Joshi, Tim Oates, Cynthia Matuszek, James Mayfield (JHU)
Cleaning Noisy Knowledge Graphs
My dissertation research is developing an approach to identify and explain errors in a knowledge graph constructed by extracting entities and relations from text. Information extraction systems can automatically construct knowledge graphs from a large collection of documents, which might be drawn from news articles, Web pages, social media posts or discussion forums. The language understanding task is challenging and current extraction systems introduce many kinds of errors. Previous work on improving the quality of knowledge graphs uses additional evidence from background knowledge bases or Web searches. Such approaches are difficult to apply when emerging entities are present and/or only one knowledge graph is available. To address the problem, I am using multiple complementary techniques including entity linking, common sense reasoning, and linguistic analysis.
W3C Recommendation: Time Ontology in OWL
The Spatial Data on the Web Working Group has published a W3C Recommendation of the Time Ontology in OWL specification. The ontology provides a vocabulary for expressing facts about relations among instants and intervals, together with information about durations, and about temporal position including date-time information. Time positions and durations may be expressed using either the conventional Gregorian calendar and clock, or using another temporal reference system such as Unix-time, geologic time, or different calendars.
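To make the idea of relations among intervals concrete, the sketch below classifies how two proper intervals relate, using names modeled on the ontology's interval properties (such as `intervalBefore`, `intervalMeets` and `intervalDuring`). It is plain Python over `datetime` values rather than OWL, and the function name `interval_relation` and the sample intervals are illustrative assumptions.

```python
from datetime import datetime
from typing import Tuple

Interval = Tuple[datetime, datetime]


def interval_relation(a: Interval, b: Interval) -> str:
    """Name the relation that holds from proper interval a to proper interval b."""
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("each interval must start before it ends")
    if a_end < b_start:
        return "intervalBefore"
    if a_end == b_start:
        return "intervalMeets"
    if b_end < a_start:
        return "intervalAfter"
    if b_end == a_start:
        return "intervalMetBy"
    if a_start == b_start and a_end == b_end:
        return "intervalEquals"
    if a_start == b_start:
        return "intervalStarts" if a_end < b_end else "intervalStartedBy"
    if a_end == b_end:
        return "intervalFinishes" if a_start > b_start else "intervalFinishedBy"
    if b_start < a_start and a_end < b_end:
        return "intervalDuring"
    if a_start < b_start and b_end < a_end:
        return "intervalContains"
    # Remaining case: the intervals partially overlap.
    return "intervalOverlaps" if a_start < b_start else "intervalOverlappedBy"


# Sample intervals (assumed for illustration only).
morning = (datetime(2018, 1, 8, 9), datetime(2018, 1, 8, 12))
lunch = (datetime(2018, 1, 8, 12), datetime(2018, 1, 8, 13))
meeting = (datetime(2018, 1, 8, 10), datetime(2018, 1, 8, 11))

print(interval_relation(morning, lunch))
print(interval_relation(meeting, morning))
```

In an actual knowledge graph these relations would be asserted or inferred as triples between interval resources; the function only shows the comparisons that each relation implies.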
2018 Ontology Summit: Ontologies in Context
The Ontology Summit is an annual series of online and in-person events that involve the ontology community and communities related to each year’s topic. The topic chosen for the 2018 Ontology Summit is Ontologies in Context, which the summit describes as follows.
“In general, a context is defined to be the circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood and assessed. Some examples of synonyms include circumstances, conditions, factors, state of affairs, situation, background, scene, setting, and frame of reference. There are many meanings of “context” in general, and also for ontologies in particular. The summit this year will survey these meanings and identify the research problems that must be solved so that contexts can succeed in achieving the full understanding and assessment of an ontology.”
Each year’s Summit comprises a series of both online and face-to-face events that span about three months. These include a vigorous three-month online discourse on the theme, online panel discussions, and research activities, which culminate in a two-day face-to-face workshop and symposium.
Over the next two months, there will be a sequence of weekly online meetings to discuss, plan and develop the 2018 topic. The summit itself will start in January with weekly online sessions of invited speakers. Visit the 2018 Ontology Summit site for more information and to see how you can participate in the planning sessions.
Context-Dependent Privacy and Security Management on Mobile Devices
There are ongoing security and privacy concerns regarding mobile platforms that are being used by a growing number of citizens. Security and privacy models typically used by mobile platforms use one-time permission acquisition mechanisms. However, modifying access rights after initial authorization in mobile systems is often too tedious and complicated for users. User studies show that a typical user does not understand permissions requested by applications or is too eager to use the applications to care to understand the permission implications. For example, the Brightest Flashlight application was reported to have logged precise locations and unique user identifiers, which have nothing to do with a flashlight application’s intended functionality, but more than 50 million users used a version of this application which would have forced them to allow this permission. Given the penetration of mobile devices into our lives, a fine-grained context-dependent security and privacy control approach needs to be created.
We have created Mithril as an end-to-end mobile access control framework that allows us to capture access control needs for specific users by observing violations of known policies. The framework studies mobile application executables to better inform users of the risks associated with using certain applications. The policy capture process involves an iterative user feedback process that captures policy modifications required to mediate observed violations. Precision of policy is used to determine convergence of the policy capture process. Policy rules in the system are written using Semantic Web technologies and the Platys ontology to define a hierarchical notion of context. Policy rule antecedents are composed of context elements derived using the Platys ontology, employing a query engine, an inference mechanism and mobile sensors. We performed a user study that demonstrates the feasibility of using our violation-driven policy capture process to gather user-specific policy modifications.
We contribute to the static and dynamic study of mobile applications by defining “application behavior” as a possible way of understanding mobile applications and creating access control policies for them. Our user study also shows that unlike our behavior-based policy, a “deny by default” mechanism hampers usability of access control systems. We also show that inclusion of crowd-sourced policies leads to further reduction in user burden and need for engagement while capturing context-based access control policy. We enrich knowledge about mobile “application behavior” and expose this knowledge through the Mobipedia knowledge base. We also extend context synthesis for semantic presence detection on mobile devices by combining Bluetooth Low Energy beacons and Nearby Messaging services from Google.
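A hierarchical notion of context means that a rule written for a broad context also covers the narrower contexts beneath it. The sketch below illustrates that one idea in plain Python. Everything in it is assumed for illustration: the `PARENT` hierarchy, the `Rule` fields, the app names and the `default` argument are hypothetical, and the code does not reproduce the Platys ontology or the rule language the framework actually uses.

```python
from typing import Dict, List, NamedTuple, Optional

# Hypothetical context hierarchy: each context maps to its broader parent.
PARENT: Dict[str, Optional[str]] = {
    "lab": "work",
    "office": "work",
    "work": "anywhere",
    "home": "anywhere",
    "anywhere": None,
}


def is_within(context: str, ancestor: str) -> bool:
    """True if context equals ancestor or lies beneath it in the hierarchy."""
    current: Optional[str] = context
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


class Rule(NamedTuple):
    app: str
    permission: str
    context: str
    effect: str  # "allow" or "deny"


def decide(rules: List[Rule], app: str, permission: str,
           current_context: str, default: str = "deny") -> str:
    """Return the effect of the first rule whose context covers the current one."""
    for rule in rules:
        if (rule.app, rule.permission) == (app, permission) and is_within(
            current_context, rule.context
        ):
            return rule.effect
    return default


# Illustrative (assumed) policy.
rules = [
    Rule("flashlight", "location", "anywhere", "deny"),
    Rule("maps", "location", "work", "allow"),
]

print(decide(rules, "maps", "location", "lab"))
print(decide(rules, "maps", "location", "home"))
```

The `default` argument is exposed deliberately, since the choice of fallback behavior is a design decision in its own right rather than something this sketch settles.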
Ph.D. Dissertation Defense
Dynamic Data Assimilation for Topic Modeling
9:00am Thursday, 29 June 2017, ITE 325b, UMBC
Understanding how a particular discipline such as climate science evolves over time has received renewed interest. By understanding this evolution, predicting the future direction of that discipline becomes more achievable. Dynamic Topic Modeling (DTM) has been applied to a number of disciplines to model topic evolution as a means to learn how a particular scientific discipline and its underlying concepts are changing. Understanding how a discipline evolves, and its internal and external influences, can be complicated by how the information retrieved over time is integrated. There are different techniques used to integrate sources of information; however, less research has been dedicated to understanding how to integrate these sources over time. The method of data assimilation is commonly used in a number of scientific disciplines to both understand and make predictions of various phenomena, using numerical models and assimilated observational data over time.
In this dissertation, I introduce a novel algorithm for scientific data assimilation, called Dynamic Data Assimilation for Topic Modeling (DDATM), which uses a new cross-domain divergence method (CDDM) and DTM. By using DDATM, observational data in the form of full-text research papers can be assimilated over time starting from an initial model. DDATM can be used as a way to integrate data from multiple sources and, due to its robustness, can exploit the assimilating observational information to better tolerate missing model information. When compared with a DTM model, the assimilated model is shown to have better performance using standard topic modeling measures, including perplexity and topic coherence. The DDATM method is suitable for prediction and results in higher likelihood for subsequent documents. DDATM is able to overcome missing information during the assimilation process when compared with a DTM model. CDDM generalizes as a method that can also bring together multiple disciplines into one cohesive model enabling the identification of related concepts and documents across disciplines and time periods. Finally, grounding the topic modeling process with an ontology improves the quality of the topics and enables a more granular understanding of concept relatedness and cross-domain influence.
The results of this dissertation are demonstrated and evaluated by applying DDATM to 30 years of reports from the Intergovernmental Panel on Climate Change (IPCC) along with more than 150,000 documents that they cite to show the evolution of the physical basis of climate change.
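Perplexity, one of the standard measures mentioned above, is the exponential of the negative average log-likelihood per word, so lower values mean the model predicts held-out text better. The sketch below computes it for a simplified unigram model. This is an assumption made for brevity: a real topic model derives each word probability from a per-document mixture of topics, and the `floor` parameter for unseen words is likewise a hypothetical convenience.

```python
import math
from typing import Dict, List


def perplexity(doc: List[str], word_probs: Dict[str, float],
               floor: float = 1e-12) -> float:
    """exp(-(1/N) * sum of log p(w)) over the N words of a held-out document."""
    if not doc:
        raise ValueError("document must contain at least one word")
    # Unseen words get a tiny floor probability so the logarithm is defined.
    log_likelihood = sum(
        math.log(max(word_probs.get(word, 0.0), floor)) for word in doc
    )
    return math.exp(-log_likelihood / len(doc))


# Toy models (assumed): a uniform model and one that favors the word "ice".
uniform = {"ice": 0.25, "sea": 0.25, "carbon": 0.25, "ocean": 0.25}
skewed = {"ice": 0.7, "sea": 0.1, "carbon": 0.1, "ocean": 0.1}
held_out = ["ice", "ice", "ice", "sea"]

print(perplexity(held_out, uniform))
print(perplexity(held_out, skewed))
```

A uniform model over four words has a perplexity of four on any text drawn from those words, while the model that puts more mass on the frequent word scores lower on this sample.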
Committee Members: Drs. Tim Finin (co-advisor), Milton Halem (co-advisor), Anupam Joshi, Tim Oates, Cynthia Matuszek, Mark Cane, Rafael Alonso