Archive for the 'Web' Category
March 17th, 2017, by Tim Finin, posted in AI, KR, NLP, NLP, Ontologies, OWL, RDF, Semantic Web
The Semantics Toolkit
Paul Cuddihy and Justin McHugh
GE Global Research Center, Niskayuna, NY
rescheduled due to weather
10:30-11:30 Tuesday, 4 April 2017, ITE 346, UMBC
Paul Cuddihy is a senior computer scientist and software systems architect in AI and Learning Systems at the GE Global Research Center in Niskayuna, NY. He earned an M.S. in Computer Science from Rochester Institute of Technology. The focus of his twenty-year career at GE Research has ranged from machine learning for medical imaging equipment diagnostics, monitoring and diagnostic techniques for commercial aircraft engines, modeling techniques for monitoring seniors living independently in their own homes, to parallel execution of simulation and prediction tasks, and big data ontologies. He is one of the creators of the open source software “Semantics Toolkit” (SemTk) which provides a simplified interface to the semantic tech stack, opening its use to a broader set of users by providing features such as drag-and-drop query generation and data ingestion. Paul has holds over twenty U.S. patents.
Justin McHugh is computer scientist and software systems architect working in the AI and Learning Systems group at GE Global Research in Niskayuna, NY. Justin attended the State University of New York at Albany where he earned an M.S in computer science. He has worked as a systems architect and programmer for large scale reporting, before moving into the research sector. In the six years since, he has worked on complex system integration, Big Data systems and knowledge representation/querying systems. Justin is one of the architects and creators of SemTK (the Semantics Toolkit), a toolkit aimed at making the power of the semantic web stack available to programmers, automation and subject matter experts without their having to be deeply invested in the workings of the Semantic Web.
March 4th, 2017, by Tim Finin, posted in KR, Ontologies, OWL, RDF, Semantic Web
SADL – Semantic Application Design Language
Dr. Andrew W. Crapo
GE Global Research
10:00 Tuesday, 7 March 2017
The Web Ontology Language (OWL) has gained considerable acceptance over the past decade. Building on prior work in Description Logics, OWL has sufficient expressivity to be useful in many modeling applications. However, its various serializations do not seem intuitive to subject matter experts in many domains of interest to GE. Consequently, we have developed a controlled-English language and development environment that attempts to make OWL plus rules more accessible to those with knowledge to share but limited interest in studying formal representations. The result is the Semantic Application Design Language (SADL). This talk will review the foundational underpinnings of OWL and introduce the SADL constructs meant to capture, validate, and maintain semantic models over their lifecycle.
Dr. Crapo has been part of GE’s Global Research staff for over 35 years. As an Information Scientist he has built performance and diagnostic models of mechanical, chemical, and electrical systems, and has specialized in human-computer interfaces, decision support systems, machine reasoning and learning, and semantic representation and modeling. His work has included a graphical expert system language (GEN-X), a graphical environment for procedural programming (Fuselet Development Environment), and a semantic-model-driven user-interface for decision support systems (ACUITy). Most recently Andy has been active in developing the Semantic Application Design Language (SADL), enabling GE to leverage worldwide advances and emerging standards in semantic technology and bring them to bear on diverse problems from equipment maintenance optimization to information security.
November 29th, 2016, by Tim Finin, posted in KR, Machine Learning, NLP, NLP, Semantic Web
Dealing with Dubious Facts
in Knowledge Graphs
1:00-3:00pm Wednesday, 30 November 2016, ITE 325b, UMBC
Knowledge graphs are structured representations of facts where nodes are real-world entities or events and edges are the associations among the pair of entities. Knowledge graphs can be constructed using automatic or manual techniques. Manual techniques construct high quality knowledge graphs but are expensive, time consuming and not scalable. Hence, automatic information extraction techniques are used to create scalable knowledge graphs but the extracted information can be of poor quality due to the presence of dubious facts.
An extracted fact is dubious if it is incorrect, inexact or correct but lacks evidence. A fact might be dubious because of the errors made by NLP extraction techniques, improper design consideration of the internal components of the system, choice of learning techniques (semi-supervised or unsupervised), relatively poor quality of heuristics or the syntactic complexity of underlying text. A preliminary analysis of several knowledge extraction systems (CMU’s NELL and JHU’s KELVIN) and observations from the literature suggest that dubious facts can be identified, diagnosed and managed. In this dissertation, I will explore approaches to identify and repair such dubious facts from a knowledge graph using several complementary approaches, including linguistic analysis, common sense reasoning, and entity linking.
Committee: Drs. Tim Finin (Chair), Anupam Joshi, Tim Oates, Paul McNamee (JHU), Partha Talukdar (IISc, India)
May 24th, 2016, by Tim Finin, posted in cloud computing, cybersecurity, Privacy, Security, Semantic Web
Vaishali Narkhede, Karuna Pande Joshi, Tim Finin, Seung Geol Choi, Adam Aviv and Daniel S. Roche, Managing Cloud Storage Obliviously
, International Conference on Cloud Computing, IEEE Computer Society, June 2016.
Consumers want to ensure that their enterprise data is stored securely and obliviously on the cloud, such that the data objects or their access patterns are not revealed to anyone, including the cloud provider, in the public cloud environment. We have created a detailed ontology describing the oblivious cloud storage models and role based access controls that should be in place to manage this risk. We have developed an algorithm to store cloud data using oblivious data structure defined in this paper. We have also implemented the ObliviCloudManager application that allows users to manage their cloud data by validating it before storing it in an oblivious data structure. Our application uses role-based access control model and collection based document management to store and retrieve data efficiently. Cloud consumers can use our system to define policies for storing data obliviously and manage storage on untrusted cloud platforms even if they are unfamiliar with the underlying technology and concepts of oblivious data structures.
May 22nd, 2016, by Tim Finin, posted in cloud computing, KR, Ontologies, Semantic Web
With the increase in the number of cloud services and service providers, manual analysis of Service Level Agreements (SLA), comparison between different service offerings and conformance regulation has become a difficult task for customers. Cloud SLAs are policy documents describing the legal agreement between cloud providers and customers. SLA specifies the commitment of availability, performance of services, penalties associated with violations and procedure for customers to receive compensations in case of service disruptions. The aim of our research is to develop technology solutions for automated cloud service management using Semantic Web and Text Mining techniques. In this paper we discuss in detail the challenges in automating cloud services management and present our preliminary work in extraction of knowledge from SLAs of different cloud services. We extracted two types of information from the SLA documents which can be useful for end users. First, the relationship between the service commitment and financial credit. We represented this information by enhancing the existing Cloud service ontology proposed by us in our previous research. Second, we extracted rules in the form of obligations and permissions from SLAs using modal and deontic logic formalizations. For our analysis, we considered six publicly available SLA documents from different cloud computing service providers.
May 20th, 2016, by Tim Finin, posted in Social media
Profile linking is the ability to connect profiles of a user on different social networks. Linked profiles can help companies like Disney to build psychographics of potential customers and segment them for targeted marketing in a cost-effective way. Existing methods link profiles by observing high similarity between most recent (current) values of the attributes like name and username. However, for a section of users observed to evolve their attributes over time and choose dissimilar values across their profiles, these current values have low similarity. Existing methods then falsely conclude that profiles refer to different users. To reduce such false conclusions, we suggest to gather rich history of values assigned to an attribute over time and compare attribute histories to link user profiles across networks. We believe that attribute history highlights user preferences for creating attribute values on a social network. Co-existence of these preferences across profiles on different social networks result in alike attribute histories that suggests profiles potentially refer to a single user. Through a focused study on username, we quantify the importance of username history for profile linking on a dataset of real-world users with profiles on Twitter, Facebook, Instagram and Tumblr. We show that username history correctly links 44% more profile pairs with non-matching current values that are incorrectly unlinked by existing methods. We further explore if factors such as longevity and availability of username history on either profiles affect linking performance. To the best of our knowledge, this is the first study that explores viability of using an attribute history to link profiles on social networks.
May 2nd, 2016, by Tim Finin, posted in KR, Ontologies, Semantic Web
He’s dead, Jim.
Google recently shut down the query interface to Freebase. All that is left of this innovative service is the ability to download a few final data dumps.
Freebase was launched nine years ago by Metaweb as an online source of structured data collected from Wikipedia and many other sources, including individual, user-submitted uploads and edits. Metaweb was acquired by Google in July 2010 and Freebase subsequently grew to have more than 2.4 billion facts about 44 million subjects. In December 2014, Google announced that it was closing Freebase and four months later it became read-only. Sometime this week the query interface was shut down.
I’ve enjoyed using Freebase in various projects in the past two years and found that it complemented DBpedia in many ways. Although its native semantics differed from that of RDF and OWL, it was close enough to allow all of Freebase to be exported as RDF. Its schema was larger than DBpedia’s and the data tended to be a bit cleaner.
Google generously decided to donate the data to the Wikidata project, which began migrating Freebase’s data to Wikidata in 2015. The Freebase data also lives on as part of Google’s Knowledge Graph. Google recently allowed very limited querying of its knowledge graph and my limited experimenting with it suggests that has Freebase data at its core.
May 1st, 2016, by Tim Finin, posted in KR, NLP, Ontologies, Semantic Web
Representing and Reasoning with Temporal
Properties/Relations in OWL/RDF
10:30-11:30 Monday, 2 May 2016, ITE346
OWL ontologies offer the means for modeling real-world domains by representing their high-level concepts, properties and interrelationships. These concepts and their properties are connected by means of binary relations. However, this assumes that the model of the domain is either a set of static objects and relationships that do not change over time, or a snapshot of these objects at a particular point in time. In general, relationships between objects that change over time (dynamic properties) are not binary relations, since they involve a temporal interval in addition to the object and the subject. Representing and querying information evolving in time requires careful consideration of how to use OWL constructs to model dynamic relationships and how the semantics and reasoning capabilities within that architecture are affected.
April 18th, 2016, by Tim Finin, posted in IoT, Policy, Semantic Web
Prajit Kumar Das, Sandeep Nair, Nitin Kumar Sharma, Anupam Joshi, Karuna Pande Joshi, and Tim Finin, Context-Sensitive Policy Based Security in Internet of Things
, 1st IEEE Workshop on Smart Service Systems
, co-located with IEEE Int. Conf. on Smart Computing, St. Louis, 18 May 2016.
According to recent media reports, there has been a surge in the number of devices that are being connected to the Internet. The Internet of Things (IoT), also referred to as Cyber-Physical Systems, is a collection of physical entities with computational and communication capabilities. The storage and computing power of these devices is often limited and their designs currently focus on ensuring functionality and largely ignore other requirements, including security and privacy concerns. We present the design of a framework that allows IoT devices to capture, represent, reason with, and enforce information sharing policies. We use Semantic Web technologies to represent the policies, the information to be shared or protected, and the IoT device context. We discuss use-cases where our design will help in creating an “intelligent” IoT device and ensuring data security and privacy using context-sensitive information sharing policies.
April 3rd, 2016, by Tim Finin, posted in cybersecurity, Ontologies, OWL, RDF, Security, Semantic Web
Policies For Oblivious Cloud Storage
Using Semantic Web Technologies
10:30am, Monday, 4 April 2016, ITE 346, UMBC
Consumers want to ensure that their enterprise data is stored securely and obliviously on the cloud, such that the data objects or their access patterns are not revealed to anyone, including the cloud provider, in the public cloud environment. We have created a detailed ontology describing the oblivious cloud storage models and role based access controls that should be in place to manage this risk. We have also implemented the ObliviCloudManager application that allows users to manage their cloud data using oblivious data structures. This application uses role based access control model and collection based document management to store and retrieve data efficiently. Cloud consumers can use our system to define policies for storing data obliviously and manage storage on untrusted cloud platforms, even if they are not familiar with the underlying technology and concepts of the oblivious data structure.
February 17th, 2016, by Tim Finin, posted in Ontologies, Security, Semantic Web
Botnet attacks turn susceptible victim computers into bots that perform various malicious activities while under the control of a botmaster. Some examples of the damage they cause include denial of service, click fraud, spamware, and phishing. These attacks can vary in the type of architecture and communication protocol used, which might be modified during the botnet lifespan. Intrusion detection and prevention systems are one way to safeguard the cyber-physical systems we use, but they have difficulty detecting new or modified attacks, including botnets. Only known attacks whose signatures have been identified and stored in some form can be discovered by most of these systems. Also, traditional IDPSs are point-based solutions incapable of utilizing information from multiple data sources and have difficulty discovering new or more complex attacks. To address these issues, we are developing a semantic approach to intrusion detection that uses a variety of sensors collaboratively. Leveraging information from these heterogeneous sources leads to a more robust, situational-aware IDPS that is better equipped to detect complicated attacks such as botnets.
December 16th, 2015, by Tim Finin, posted in cybersecurity, KR, Ontologies, Semantic Web
Zareen Syed, Ankur Padia, Tim Finin, Lisa Mathews and Anupam Joshi, UCO: Unified Cybersecurity Ontology
, AAAI Workshop on Artificial Intelligence for Cyber Security (AICS), February 2016.
In this paper we describe the Unified Cybersecurity Ontology (UCO) that is intended to support information integration and cyber situational awareness in cybersecurity systems. The ontology incorporates and integrates heterogeneous data and knowledge schemas from different cybersecurity systems and most commonly used cybersecurity standards for information sharing and exchange. The UCO ontology has also been mapped to a number of existing cybersecurity ontologies as well as concepts in the Linked Open Data cloud. Similar to DBpedia which serves as the core for general knowledge in Linked Open Data cloud, we envision UCO to serve as the core for cybersecurity domain, which would evolve and grow with the passage of time with additional cybersecurity data sets as they become available. We also present a prototype system and concrete use cases supported by the UCO ontology. To the best of our knowledge, this is the first cybersecurity ontology that has been mapped to general world ontologies to support broader and diverse security use cases. We compare the resulting ontology with previous efforts, discuss its strengths and limitations, and describe potential future work directions.