Archive for the 'Ontologies' Category
February 17th, 2016, by Tim Finin, posted in Ontologies, Security, Semantic Web
Botnet attacks turn susceptible victim computers into bots that perform various malicious activities while under the control of a botmaster. Examples of the damage they cause include denial of service, click fraud, spamware, and phishing. These attacks vary in their architecture and communication protocol, both of which might be modified during the botnet's lifespan. Intrusion detection and prevention systems (IDPSs) are one way to safeguard the cyber-physical systems we use, but they have difficulty detecting new or modified attacks, including botnets. Most of these systems can discover only known attacks whose signatures have been identified and stored in some form. Also, traditional IDPSs are point-based solutions that cannot use information from multiple data sources, and they have difficulty discovering new or more complex attacks. To address these issues, we are developing a semantic approach to intrusion detection that uses a variety of sensors collaboratively. Leveraging information from these heterogeneous sources leads to a more robust, situationally aware IDPS that is better equipped to detect complicated attacks such as botnets.
December 16th, 2015, by Tim Finin, posted in cybersecurity, KR, Ontologies, Semantic Web
Zareen Syed, Ankur Padia, Tim Finin, Lisa Mathews and Anupam Joshi, UCO: Unified Cybersecurity Ontology, AAAI Workshop on Artificial Intelligence for Cyber Security (AICS), February 2016.
In this paper we describe the Unified Cybersecurity Ontology (UCO) that is intended to support information integration and cyber situational awareness in cybersecurity systems. The ontology incorporates and integrates heterogeneous data and knowledge schemas from different cybersecurity systems and the most commonly used cybersecurity standards for information sharing and exchange. The UCO ontology has also been mapped to a number of existing cybersecurity ontologies as well as concepts in the Linked Open Data cloud. Similar to DBpedia, which serves as the core for general knowledge in the Linked Open Data cloud, we envision UCO serving as the core for the cybersecurity domain, which would evolve and grow over time as additional cybersecurity data sets become available. We also present a prototype system and concrete use cases supported by the UCO ontology. To the best of our knowledge, this is the first cybersecurity ontology that has been mapped to general world ontologies to support broader and more diverse security use cases. We compare the resulting ontology with previous efforts, discuss its strengths and limitations, and describe potential future work directions.
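The idea of mapping an ontology to concepts in the Linked Open Data cloud can be illustrated with a small sketch. It assumes a handful of made-up triples and hypothetical identifiers (the `ex:`, `dbr:` and `dbo:` names below are placeholders, not terms taken from UCO), and shows how an `owl:sameAs` link lets a query about a cybersecurity entity pick up facts stated about its counterpart in a general-knowledge source.

```python
OWL_SAME_AS = "owl:sameAs"

# Illustrative triples only. Every identifier here is a hypothetical
# placeholder chosen for the example, not a term defined by UCO.
TRIPLES = {
    ("ex:vuln-1", "ex:affectsProduct", "ex:product-A"),
    ("ex:product-A", OWL_SAME_AS, "dbr:Product_A"),
    ("dbr:Product_A", "dbo:developer", "dbr:Vendor_X"),
}


def same_as_closure(node, triples):
    """Return node plus every identifier reachable through owl:sameAs links."""
    seen, frontier = {node}, [node]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            # owl:sameAs is symmetric, so follow the link in both directions.
            if s == current:
                other = o
            elif o == current:
                other = s
            else:
                continue
            if other not in seen:
                seen.add(other)
                frontier.append(other)
    return seen


def facts_about(node, triples):
    """Collect (predicate, object) facts stated about node or any of its aliases."""
    aliases = same_as_closure(node, triples)
    return {(p, o) for s, p, o in triples if s in aliases and p != OWL_SAME_AS}
```

Asking for `facts_about("ex:product-A", TRIPLES)` returns the developer fact even though it was asserted only about the linked general-knowledge identifier, which is the kind of integration that such mappings are meant to enable.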
November 8th, 2015, by Tim Finin, posted in cybersecurity, Ontologies, Semantic Web
In this report, we describe the Unified Cyber Security ontology (UCO) to support situational awareness in cyber security systems. The ontology is an effort to incorporate and integrate heterogeneous information available from different cyber security systems and the most commonly used cyber security standards for information sharing and exchange. The ontology has also been mapped to a number of existing cyber security ontologies as well as concepts in the Linked Open Data cloud. Similar to DBpedia, which serves as the core for the Linked Open Data cloud, we envision UCO serving as the core for a specialized cyber security Linked Open Data cloud, which would evolve and grow over time as additional cyber security data sets become available. We also present a prototype system and concrete use cases supported by the UCO ontology. To the best of our knowledge, this is the first cyber security ontology that has been mapped to general world ontologies to support broader and more diverse security use cases. We compare the resulting ontology with previous efforts, discuss its strengths and limitations, and describe potential future work directions.
November 5th, 2015, by Tim Finin, posted in NLP, Ontologies, Semantic Web
Extracting Structured Summaries
from Text Documents
Dr. Zareen Syed
Research Assistant Professor, UMBC
10:30am, Monday, 9 November 2015, ITE 346, UMBC
In this talk, Dr. Syed will present unsupervised approaches for automatically extracting structured summaries composed of slots and fillers (attributes and values) and important facts from articles, thus reducing the time and effort people spend gathering intelligence with traditional keyword-based search. The approach first extracts important concepts from text documents and links them to unique concepts in the Wikitology knowledge base. It then exploits the types associated with the linked concepts to discover candidate slots and fillers. Finally, it applies specialized approaches for ranking and filtering slots to select the most relevant ones to include in the structured summary.
Compared with the state of the art, Dr. Syed’s approach is unrestricted, i.e., it does not require a manually crafted catalogue of slots or relations of interest, which may vary across domains. Unlike Natural Language Processing (NLP) based approaches that require well-formed sentences, the approach can be applied to semi-structured text. Furthermore, NLP-based approaches for fact extraction extract lexical facts and sentences that require further processing to disambiguate and link them to unique entities and concepts in a knowledge base, whereas in Dr. Syed’s approach, concept linking is done as a first step in the discovery process. Linking concepts to a knowledge base provides the additional advantage that the terms can be explicitly linked or mapped to semantic concepts in other ontologies and are thus available for reasoning in more sophisticated language understanding systems.
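The three steps described in the abstract (link concepts, use their types to propose slots and fillers, then rank the slots) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the tiny `TOY_KB` dictionary stands in for a knowledge base such as Wikitology, substring matching stands in for concept linking, and ranking by number of fillers stands in for the specialized ranking and filtering mentioned in the talk.

```python
from collections import defaultdict

# Hypothetical stand-in for a knowledge base: surface form -> (concept, type).
# The entries are illustrative and are not drawn from Wikitology.
TOY_KB = {
    "baltimore": ("Baltimore", "City"),
    "umbc": ("UMBC", "University"),
    "maryland": ("Maryland", "State"),
    "johns hopkins": ("Johns_Hopkins_University", "University"),
}


def link_concepts(text, kb):
    """Step 1: find known surface forms in the text and link them to concepts."""
    lowered = text.lower()
    return [kb[form] for form in sorted(kb) if form in lowered]


def candidate_slots(linked):
    """Step 2: treat the type of each linked concept as a candidate slot name."""
    slots = defaultdict(list)
    for concept, concept_type in linked:
        slots[concept_type].append(concept)
    return dict(slots)


def rank_slots(slots, top_k=2):
    """Step 3: keep the slots with the most fillers (a simple stand-in ranking)."""
    ranked = sorted(slots.items(), key=lambda item: (-len(item[1]), item[0]))
    return dict(ranked[:top_k])


def summarize(text, kb=TOY_KB, top_k=2):
    """Run the three steps in order and return a slot -> fillers summary."""
    return rank_slots(candidate_slots(link_concepts(text, kb)), top_k)
```

Because linking happens first, the fillers in the resulting summary are already knowledge base concepts rather than raw strings, which is the property the abstract contrasts with sentence-based fact extraction.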
September 29th, 2015, by Tim Finin, posted in NLP, Ontologies, RDF, Semantic Web
Clare Grasso, Anupam Joshi and Eliot Siegel, Beyond NER: Towards Semantics in Clinical Text, Biomedical Data Mining, Modeling, and Semantic Integration (BDM2I); co-located with the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA.
While clinical text NLP systems have become very effective in recognizing named entities in clinical text and mapping them to standardized terminologies in the normalization process, there remains a gap in the ability of extractors to combine entities together into a complete semantic representation of medical concepts that contain multiple attributes each of which has its own set of allowed named entities or values. Furthermore, additional domain knowledge may be required to determine the semantics of particular tokens in the text that take on special meanings in relation to this concept. This research proposes an approach that provides ontological mappings of the surface forms of medical concepts that are of the UMLS semantic class signs/symptoms. The mappings are used to extract and encode the constituent set of named entities into interoperable semantic structures that can be linked to other structured and unstructured data for reuse in research and analysis.
June 8th, 2015, by Tim Finin, posted in AI, KR, NLP, Ontologies
Cold Start is a task in the NIST Text Analysis Conference’s Knowledge Base Population suite that combines entity linking and slot filling to populate an empty knowledge base using a predefined ontology for the facts and relations. This paper describes a system developed by the Human Language Technology Center of Excellence at Johns Hopkins University for the 2014 Cold Start task.
Tim Finin, Paul McNamee, Dawn Lawrie, James Mayfield and Craig Harman, Hot Stuff at Cold Start: HLTCOE participation at TAC 2014, 7th Text Analysis Conference, National Institute of Standards and Technology, Nov. 2014.
The JHU HLTCOE participated in the Cold Start task in this year’s Text Analysis Conference Knowledge Base Population evaluation. This is our third year of participation in the task, and we continued our research with the KELVIN system. We submitted experimental variants that explore use of forward-chaining inference, slightly more aggressive entity clustering, refined multiple within-document coreference, and prioritization of relations extracted from news sources.
June 8th, 2015, by Tim Finin, posted in AI, KR, Machine Learning, Mobile Computing, Ontologies
The NSF-sponsored Platys project explored the idea that places are more than just GPS coordinates. They are concepts rich with semantic information, including people, activities, roles, functions, time and purpose. Our mobile phones can learn to recognize the places we are in and use information about them to provide better services.
Laura Zavala, Pradeep K. Murukannaiah, Nithyananthan Poosamani, Tim Finin, Anupam Joshi, Injong Rhee and Munindar P. Singh, Platys: From Position to Place-Oriented Mobile Computing, AI Magazine, v36, n2, 2015.
The Platys project focuses on developing a high-level, semantic notion of location called place. A place, unlike a geospatial position, derives its meaning from a user’s actions and interactions in addition to the physical location where it occurs. Our aim is to enable the construction of a large variety of applications that take advantage of place to render relevant content and functionality and, thus, improve user experience. We consider elements of context that are particularly related to mobile computing. The main problems we have addressed to realize our place-oriented mobile computing vision are representing places, recognizing places, and engineering place-aware applications. We describe the approaches we have developed for addressing these problems and related subproblems. A key element of our work is the use of collaborative information sharing where users’ devices share and integrate knowledge about places. Our place ontology facilitates such collaboration. Declarative privacy policies allow users to specify contextual features under which they prefer to share or not share their information.
May 11th, 2015, by Tim Finin, posted in Machine Learning, NLP, Ontologies, Semantic Web
Information Extraction from Dirty Notes
for Clinical Decision Support
10:00am Tuesday, 12 May 2015, ITE346
The term clinical decision support (CDS) refers broadly to providing clinicians or patients with computer-generated clinical knowledge and patient-related information, intelligently filtered or presented at appropriate times, to enhance patient care. It is estimated that at least 50% of the clinical information describing a patient’s current condition and stage of therapy resides in the free-form text portions of the Electronic Health Record (EHR). Both linguistic and statistical natural language processing (NLP) models assume the presence of a formal underlying grammar in the text. Yet clinical notes are often filled with overloaded and nonstandard abbreviations, sentence fragments, and creative punctuation that make it difficult for grammar-based NLP systems to work effectively. This research focuses on investigating scalable machine learning and semantic techniques that do not rely on an underlying grammar to extract medical concepts from the text in order to apply them in CDS on commodity hardware and software systems. Additionally, by packaging the extracted data within a semantic knowledge representation, the facts can be combined with other semantically encoded facts and reasoned over to help inform clinicians in their decision making.
April 27th, 2015, by Tim Finin, posted in NLP, Ontologies, OWL, RDF, Semantic Web
In this week's ebiquity lab meeting, Ankur Padia will talk about ontology learning and the work he did for his MS thesis at 10:00am in ITE 346 at UMBC.
10:00am Tuesday, Apr. 28, 2015, ITE 346
Ontology Learning has been the subject of intensive study for the past decade. Researchers in this field have been motivated by the possibility of automatically building a knowledge base on top of text documents so as to support reasoning-based knowledge extraction. While most work in this field has been primarily statistical (known as light-weight Ontology Learning), few attempts have been made at axiomatic Ontology Learning (called Formal Ontology Learning) from natural language text documents. The presentation will focus on the relationship between Description Logic and natural language (limited to IS-A) for Formal Ontology Learning.
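As a rough illustration of what relating IS-A statements in natural language to Description Logic can look like, the sketch below turns sentences of the form "Every A is a B" into a subclass axiom written in OWL functional-style syntax. The single regular expression and the helper name `sentence_to_axiom` are assumptions made for this example; this is not the method developed in the thesis.

```python
import re

# Matches simple IS-A sentences such as "Every dog is a mammal."
# A deliberately narrow pattern used only for illustration.
IS_A_PATTERN = re.compile(
    r"^(?:every|each|an?)\s+(\w+)\s+is\s+an?\s+(\w+)\.?$",
    re.IGNORECASE,
)


def sentence_to_axiom(sentence):
    """Return a SubClassOf axiom for a simple IS-A sentence, or None if no match."""
    match = IS_A_PATTERN.match(sentence.strip())
    if match is None:
        return None
    sub_class, super_class = (word.capitalize() for word in match.groups())
    return f"SubClassOf(:{sub_class} :{super_class})"
```

A sentence that does not fit the pattern yields `None`, which reflects how much of natural language falls outside such a simple IS-A reading.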
April 25th, 2015, by Tim Finin, posted in AI, Ontologies, OWL, Semantic Web
Ph.D. Dissertation Defense
A Semantic Resolution Framework for Integrating
Manufacturing Service Capability Data
10:00am Monday 27 April 2015, ITE 217b
Building flexible manufacturing supply chains requires availability of interoperable and accurate manufacturing service capability (MSC) information of all supply chain participants. Today, MSC information, which is typically published either on the supplier’s web site or registered at an e-marketplace portal, has been shown to fall short of interoperability and accuracy requirements. The issue of interoperability can be addressed by annotating the MSC information using shared ontologies. However, this ontology-based approach faces three main challenges: (1) lack of an effective way to automatically extract a large volume of MSC instance data hidden in the web sites of manufacturers that need to be annotated; (2) difficulties in accurately identifying semantics of these extracted data and resolving semantic heterogeneities among individual sources of these data while integrating them under shared formal ontologies; (3) difficulties in the adoption of ontology-based approaches by the supply chain managers and users because of their unfamiliarity with the syntax and semantics of formal ontology languages such as the web ontology language (OWL).
The objective of our research is to address the main challenges of ontology-based approaches by developing an innovative approach that is able to extract MSC instances from a broad range of manufacturing web sites that may present MSC instances in various ways, accurately annotate MSC instances with formally defined semantics on a large scale, and integrate these annotated MSC instances into formal manufacturing domain ontologies to facilitate the formation of supply chains of manufacturers. To achieve this objective, we propose a semantic resolution framework (SRF) that consists of three main components: an MSC instance extractor, an MSC instance annotator and a semantic resolution knowledge base. The instance extractor builds a local semantic model that we call an instance description model (IDM) for each target manufacturer web site. The innovative aspect of the IDM is that it captures the intended structure of the target web site and associates each extracted MSC instance with a context that describes possible semantics of that instance. The instance annotator starts the semantic resolution by identifying the most appropriate class from one or more manufacturing domain ontologies (MDOs) to annotate each instance, based on the mappings established between the context of that instance and the vocabularies (i.e., classes and properties) defined in the MDO. The primary goal of the semantic resolution knowledge base (SR-KB) is to resolve semantic heterogeneity that may occur in the instance annotation process and thus improve the accuracy of the annotated MSC instances. The experimental results demonstrate that the instance extractor and the instance annotator can effectively discover and annotate MSC instances, while the SR-KB improves both the precision and recall of annotated instances and reduces human involvement as the knowledge base evolves.
Committee: Drs. Yun Peng (Chair), Tim Finin, Yaacov Yesha, Matthew Schmill and Boonserm Kulvatunyou
January 14th, 2015, by Tim Finin, posted in Agents, AI, Big data, Ontologies, Semantic Web, Web
The theme of the 2015 Ontology Summit is Internet of Things: Toward Smart Networked Systems and Societies. The Ontology Summit is an annual series of events (first started by Ontolog and NIST in 2006) that involve the ontology community and communities related to each year’s theme.
The 2015 Summit will hold a virtual discourse over the next three months via mailing lists and online panel sessions held over augmented conference calls. The Summit will culminate in a two-day face-to-face workshop on 13-14 April 2015 in Arlington, VA. The Summit’s goal is to explore how ontologies can play a significant role in the realization of smart networked systems and societies in the Internet of Things.
The Summit’s initial launch session will take place from 12:30pm to 2:00pm EST on Thursday, January 15th and will include overview presentations from each of the four technical tracks. See the 2015 Ontology Summit for more information, the schedule and details on how to participate in these free and open events.
December 29th, 2014, by Tim Finin, posted in KR, Machine Learning, NLP, Ontologies, Semantic Web
TABEL — A Domain Independent and Extensible Framework
for Inferring the Semantics of Tables
8:00am Thursday, 8 January 2015, ITE325b
Tables are an integral part of documents, reports and Web pages in many scientific and technical domains, compactly encoding important information that can be difficult to express in text. Table-like structures outside documents, such as spreadsheets, CSV files, log files and databases, are widely used to represent and share information. However, tables remain beyond the scope of regular text processing systems which often treat them like free text.
This dissertation presents TABEL, a domain-independent and extensible framework to infer the semantics of tables and represent them as RDF Linked Data. TABEL captures the intended meaning of a table by mapping header cells to classes, data cell values to existing entities and pairs of columns to relations from a given ontology and knowledge base. The core of the framework consists of a module that represents a table as a graphical model to jointly infer the semantics of headers, data cells and relations between headers. We also introduce a novel Semantic Message Passing scheme, which incorporates semantics into message passing, to perform joint inference over the probabilistic graphical model. We also develop and explore a “human-in-the-loop” paradigm, presenting plausible models of user interaction with our framework and its impact on the quality of inferred semantics.
We present techniques that are both extensible and domain agnostic. Our framework supports the addition of preprocessing modules without affecting existing ones, making TABEL extensible. It also allows background knowledge bases to be adapted and changed based on the domains of the tables, thus making it domain independent. We demonstrate the extensibility and domain independence of our techniques by developing an application of TABEL in the healthcare domain. We develop a proof of concept for an application to generate meta-analysis reports automatically, which is built on top of the semantics inferred from tables found in medical literature.
A thorough evaluation with experiments over datasets of tables from the Web and medical research reports presents promising results.
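To make the target representation concrete, here is a minimal sketch of the final step only: given an interpretation of a table (a class per column, an entity per cell and a relation between two columns), emit RDF-style triples. It does not attempt the joint inference over a graphical model described in the abstract. The function name `table_to_triples`, its parameters and the `ex:` identifiers are all hypothetical and chosen for illustration.

```python
RDF_TYPE = "rdf:type"


def table_to_triples(rows, column_classes, column_relation, entity_links):
    """Turn an already-interpreted table into a set of RDF-style triples.

    rows:            list of rows, each a list of cell strings
    column_classes:  {column index: class identifier} for header cells
    column_relation: (subject column, relation identifier, object column)
    entity_links:    {cell string: entity identifier} for linked data cells
    """
    subject_col, relation, object_col = column_relation
    triples = set()
    for row in rows:
        # Cells with no known entity are represented as None and skipped.
        linked = [entity_links.get(cell) for cell in row]
        for col, entity in enumerate(linked):
            if entity is not None and col in column_classes:
                triples.add((entity, RDF_TYPE, column_classes[col]))
        subj, obj = linked[subject_col], linked[object_col]
        if subj is not None and obj is not None:
            triples.add((subj, relation, obj))
    return triples
```

In this sketch, a row contributes a relation triple only when both of its cells were linked to entities, so unlinked values never produce dangling statements.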
Committee: Drs. Tim Finin (chair), Tim Oates, Anupam Joshi, Yun Peng, Indrajit Bhattacharya (IBM Research) and L. V. Subramaniam (IBM Research)