Archive for the 'AI' Category
December 12th, 2017, by Tim Finin, posted in AI, Earth science, Machine Learning, NLP, NLP
Jennifer Sleeman receives AI for Earth grant from Microsoft
Visiting Assistant Professor Jennifer Sleeman (Ph.D. ’17) has been awarded a grant from Microsoft as part of its ‘AI for Earth’ program. Dr. Sleeman will use the grant to continue her research on developing algorithms to model how scientific disciplines such as climate change evolve and predict future trends by analyzing the text of articles and reports and the papers they cite.
AI for Earth is a Microsoft program aimed at empowering people and organizations to solve global environmental challenges by increasing access to AI tools and educational opportunities, while accelerating innovation. Via the Azure for Research AI for Earth award program, Microsoft provides selected researchers and organizations access to its cloud and AI computing resources to accelerate, improve and expand work on climate change, agriculture, biodiversity and/or water challenges.
UMBC is among the first grant recipients of AI for Earth, first launched in July 2017. The grant process was a competitive and selective process and was awarded in recognition of the potential of the work and power of AI to accelerate progress.
As part of her dissertation research, Dr. Sleeman developed algorithms using dynamic topic modeling to understand influence and predict future trends in a scientific discipline. She applied this to the field of climate change and used assessment reports of the Intergovernmental Panel on Climate Change (IPCC) and the papers they cite. Since 1990, an IPCC report has been published every five years that includes four separate volumes, each of which has many chapters. Each report cites tens of thousands of research papers, which comprise a correlated dataset of temporally grounded documents. Her custom dynamic topic modeling algorithm identified topics for both datasets and apply cross-domain analytics to identify the correlations between the IPCC chapters and their cited documents. The approach reveals both the influence of the cited research on the reports and how previous research citations have evolved over time.
Dr. Sleeman’s award is part of an inaugural set of 35 grants in more than ten countries for access to Microsoft Azure and AI technology platforms, services and training. In an post on Monday, AI for Earth can be a game-changer for our planet, Microsoft announced its intent to put $50 million over five years into the program, enabling grant-making and educational trainings possible at a much larger scale.
More information about AI for Earth can be found on the Microsoft AI for Earth website.
November 28th, 2017, by Tim Finin, posted in NLP, OWL, Semantic Web
Automated Knowledge Extraction from the Federal Acquisition Regulations System (FARS)
With increasing regulation of Big Data, it is becoming essential for organizations to ensure compliance with various data protection standards. The Federal Acquisition Regulations System (FARS) within the Code of Federal Regulations (CFR) includes facts and rules for individuals and organizations seeking to do business with the US Federal government. Parsing and gathering knowledge from such lengthy regulation documents is currently done manually and is time and human intensive.Hence, developing a cognitive assistant for automated analysis of such legal documents has become a necessity. We have developed semantically rich approach to automate the analysis of legal documents and have implemented a system to capture various facts and rules contributing towards building an ef?cient legal knowledge base that contains details of the relationships between various legal elements, semantically similar terminologies, deontic expressions and cross-referenced legal facts and rules. In this paper, we describe our framework along with the results of automating knowledge extraction from the FARS document (Title48, CFR). Our approach can be used by Big Data Users to automate knowledge extraction from Large Legal documents.
November 20th, 2017, by Tim Finin, posted in AI
Voices in AI – Episode 20: A Conversation with Marie desJardins
Byron Reese interviewed UMBC CSEE Professor Marie desJardins as part of his Voices in AI podcast series on Gigaom. In the episode, they talk about the Turing test, Watson, autonomous vehicles, and language processing. Visit the Voices in AI site to listen to the podcast and read the interview transcript.
Here’s the start of the wide-ranging, hour-long interview.
Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today I’m excited that our guest is Marie des Jardins. She is an Associate Dean for Engineering and Information Technology as well as a professor of Computer Science at the University of Maryland, Baltimore County. She got her undergrad degree from Harvard, and a Ph.D. in computer science from Berkeley, and she’s been involved in the National Conference of the Association for the Advancement of Artificial Intelligence for over 12 years. Welcome to the show, Marie.
Marie des Jardins: Hi, it’s nice to be here.
I often open the show with “What is artificial intelligence?” because, interestingly, there’s no consensus definition of it, and I get a different kind of view of it from everybody. So I’ll start with that. What is artificial intelligence?
Sure. I’ve always thought about artificial intelligence as just a very broad term referring to trying to get computers to do things that we would consider intelligent if people did them. What’s interesting about that definition is it’s a moving target, because we change our opinions over time about what’s intelligent. As computers get better at doing things, they no longer seem that intelligent to us.
We use the word “intelligent,” too, and I’m not going to dwell on definitions, but what do you think intelligence is at its core?
So, it’s definitely hard to pin down, but I think of it as activities that human beings carry out, that we don’t know of lower order animals doing, other than some of the higher primates who can do things that seem intelligent to us. So intelligence involves intentionality, which means setting goals and making active plans to carry them out, and it involves learning over time and being able to react to situations differently based on experiences and knowledge that we’ve gained over time. The third part, I would argue, is that intelligence includes communication, so the ability to communicate with other beings, other intelligent agents, about your activities and goals.
Well, that’s really useful and specific. Let’s look at some of those things in detail a little bit. You mentioned intentionality. Do you think that intentionality is driven by consciousness? I mean, can you have intentionality without consciousness? Is consciousness therefore a requisite for intelligence?
I think that’s a really interesting question. I would decline to answer it mainly because I don’t think we ever can really know what consciousness is. We all have a sense of being conscious inside our own brains—at least I believe that. But of course, I’m only able to say anything meaningful about my own sense of consciousness. We just don’t have any way to measure consciousness or even really define what it is. So, there does seem to be this idea of self-awareness that we see in various kinds of animals—including humans—and that seems to be a precursor to what we call consciousness. But I think it’s awfully hard to define that term, and so I would be hesitant to put that as a prerequisite on intentionality.
November 17th, 2017, by Tim Finin, posted in Data Science, Earth science, KR, Machine Learning, NLP
Discovering Scientific Influence using Cross-Domain Dynamic Topic Modeling
We describe an approach using dynamic topic modeling to model influence and predict future trends in a scientific discipline. Our study focuses on climate change and uses assessment reports of the Intergovernmental Panel on Climate Change (IPCC) and the papers they cite. Since 1990, an IPCC report has been published every five years that includes four separate volumes, each of which has many chapters. Each report cites tens of thousands of research papers, which comprise a correlated dataset of temporally grounded documents. We use a custom dynamic topic modeling algorithm to generate topics for both datasets and apply crossdomain analytics to identify the correlations between the IPCC chapters and their cited documents. The approach reveals both the influence of the cited research on the reports and how previous research citations have evolved over time. For the IPCC use case, the report topic model used 410 documents and a vocabulary of 5911 terms while the citations topic model was based on 200K research papers and a vocabulary more than 25K terms. We show that our approach can predict the importance of its extracted topics on future IPCC assessments through the use of cross domain correlations, Jensen-Shannon divergences and cluster analytics.
November 14th, 2017, by Tim Finin, posted in AI, Data Science, Machine Learning, talks
ACM Tech Talk Series
A Practitioner’s Introduction to Deep Learning
Ashwin Kumar Ganesan, PhD student
1:00-2:00pm Friday, 17 November 2017?, ITE325, UMBC
In recent years, Deep Neural Networks have been highly successful at performing a number of tasks in computer vision, natural language processing and artificial intelligence in general. The remarkable performance gains have led to universities and industries investing heavily in this space. This investment creates a thriving open source ecosystem of tools & libraries that aid the design of new architectures, algorithm research as well as data collection.
This talk (and hands-on session) introduce people to some of the basics of machine learning, neural networks and discusses some of the popular neural network architectures. We take a dive into one of the popular libraries, Tensorflow, and an associated abstraction library Keras.
To participate in the hands-on aspects of the workshop, bring a laptop computer with Python installed and install the following libraries using pip. For windows or (any other OS) consider doing an installation of anaconda that has all the necessary libraries.
- numpy, scipy & scikit-learn
- tensorflow / tensoflow-gpu (The first one is the GPU version)
- matplotlib for visualizations (if necessary)
- jupyter & ipython (We will use python2.7 in our experiments)
Following are helpful links:
Contact Nisha Pillai (NPillai1 at umbc.edu) with any questions regarding this event.
October 26th, 2017, by Tim Finin, posted in KR, Ontologies, OWL, Semantic Web
W3C Recommendation: Time Ontology in OWL
The Spatial Data on the Web Working Group has published a W3C Recommendation of the Time Ontology in OWL specification. The ontology provides a vocabulary for expressing facts about relations among instants and intervals, together with information about durations, and about temporal position including date-time information. Time positions and durations may be expressed using either the conventional Gregorian calendar and clock, or using another temporal reference system such as Unix-time, geologic time, or different calendars.
October 22nd, 2017, by Tim Finin, posted in AI, cybersecurity, Data Science, Ebiquity, Machine Learning, meetings, talks
Multi-observable Session Reputation Scoring System
11:00-12:00 Monday, 23 October 2017, ITE 346
With increasing adoption of Cloud Computing, cyber attacks have become one of the most effective means for adversaries to inflict damage. To overcome limitations of existing blacklists and whitelists, our research focuses to develop a dynamic reputation scoring model for sessions based on a variety of observable and derived attributes of network traffic. Here we propose a technique to greylist sessions using observables like IP, Domain, URL and File Hash by scoring them numerically based on the events in the session. This enables automatic labeling of possible malicious hosts or users that can help in enriching the existing whitelists or blacklists.
September 12th, 2017, by Tim Finin, posted in events, KR, Ontologies, Semantic Web
2018 Ontology Summit: Ontologies in Context
The OntologySummit is an annual series of online and in-person events that involves the ontology community and communities related to each year’s topic. The topic chosen for the 2018 Ontology Summit will be Ontologies in Context, which the summit describes as follows.
“In general, a context is defined to be the circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood and assessed. Some examples of synonyms include circumstances, conditions, factors, state of affairs, situation, background, scene, setting, and frame of reference. There are many meanings of “context” in general, and also for ontologies in particular. The summit this year will survey these meanings and identify the research problems that must be solved so that contexts can succeed in achieving the full understanding and assessment of an ontology.”
Each year’s Summit comprises of a series of both online and face-to-face events that span about three months. These include a vigorous three-month online discourse on the theme, and online panel discussions, research activities which will culminate in a two-day face-to-face workshop and symposium.
Over the next two months, there will be a sequence of weekly online meetings to discuss, plan and develop the 2018 topic. The summit itself will start in January with weekly online sessions of invited speakers. Visit the the 2018 Ontology Summit site for more information and to see how you can participate in the planning sessions.
September 10th, 2017, by Tim Finin, posted in cybersecurity, Machine Learning, Mobile Computing, Ontologies, Semantic Web
Context-Dependent Privacy and Security Management on Mobile Devices
There are ongoing security and privacy concerns regarding mobile platforms that are being used by a growing number of citizens. Security and privacy models typically used by mobile platforms use one-time permission acquisition mechanisms. However, modifying access rights after initial authorization in mobile systems is often too tedious and complicated for users. User studies show that a typical user does not understand permissions requested by applications or are too eager to use the applications to care to understand the permission implications. For example, the Brightest Flashlight application was reported to have logged precise locations and unique user identifiers, which have nothing to do with a flashlight application’s intended functionality, but more than 50 million users used a version of this application which would have forced them to allow this permission. Given the penetration of mobile devices into our lives, a fine-grained context-dependent security and privacy control approach needs to be created.
We have created Mithril as an end-to-end mobile access control framework that allows us to capture access control needs for specific users, by observing violations of known policies. The framework studies mobile application executables to better inform users of the risks associated with using certain applications. The policy capture process involves an iterative user feedback process that captures policy modifications required to mediate observed violations. Precision of policy is used to determine convergence of the policy capture process. Policy rules in the system are written using Semantic Web technologies and the Platys ontology to define a hierarchical notion of context. Policy rule antecedents are comprised of context elements derived using the Platys ontology employing a query engine, an inference mechanism and mobile sensors. We performed a user study that proves the feasibility of using our violation driven policy capture process to gather user-specific policy modifications.
We contribute to the static and dynamic study of mobile applications by defining “application behavior” as a possible way of understanding mobile applications and creating access control policies for them. Our user study also shows that unlike our behavior-based policy, a “deny by default” mechanism hampers usability of access control systems. We also show that inclusion of crowd-sourced policies leads to further reduction in user burden and need for engagement while capturing context-based access control policy. We enrich knowledge about mobile “application behavior” and expose this knowledge through the Mobipedia knowledge-base. We also extend context synthesis for semantic presence detection on mobile devices by combining Bluetooth, low energy beacons and Nearby Messaging services from Google.
September 5th, 2017, by Tim Finin, posted in KR, Machine Learning, NLP
Cognitive Assistance for Automating the Analysis of the Federal Acquisition Regulations System
Government regulations are critical to understanding how to do business with a government entity and receive other bene?ts. However, government regulations are also notoriously long and organized in ways that can be confusing for novice users. Developing cognitive assistance tools that remove some of the burden from human users is of potential bene?t to a variety of users. The volume of data found in United States federal government regulation suggests a multiple-step approach to process the data into machine readable text, create an automated legal knowledge base capturing various facts and rules, and eventually building a legal question and answer system to acquire understanding from various regulations and provisions. Our work discussed in this paper represents our initial efforts to build a framework for Federal Acquisition Regulations System (Title 48, Code of Federal Regulations) in order to create an efficient legal knowledge base representing relationships between various legal elements, semantically similar terminologies, deontic expressions and cross-referenced legal facts and rules.
September 2nd, 2017, by Tim Finin, posted in KR
Generating Digital Twin models using Knowledge Graphs for Industrial Production Lines
Digital Twin models are computerized clones of physical assets that can be used for in-depth analysis. Industrial production lines tend to have multiple sensors to generate near real-time status information for production. Industrial Internet of Things datasets are difficult to analyze and infer valuable insights such as points of failure, estimated overhead. etc. In this paper we introduce a simple way of formalizing knowledge as digital twin models coming from sensors in industrial production lines. We present a way on to extract and infer knowledge from large scale production line data, and enhance manufacturing process management with reasoning capabilities, by introducing a semantic query mechanism. Our system primarily utilizes a graph-based query language equivalent to conjunctive queries and has been enriched with inference rules.
July 16th, 2017, by Tim Finin, posted in Data Science, Machine Learning, NLP, Semantic Web
Deep Representation of Lyrical Style and Semantics for Music Recommendation
Abhay L. Kashyap
11:00-1:00 Thursday, 20 July 2017, ITE 346
In the age of music streaming, the need for effective recommendations is important for music discovery and a personalized user experience. Collaborative filtering based recommenders suffer from popularity bias and cold-start which is commonly mitigated by content features. For music, research in content based methods have mainly been focused in the acoustic domain while lyrical content has received little attention. Lyrics contain information about a song’s topic and sentiment that cannot be easily extracted from the audio. This is especially important for lyrics-centric genres like Rap, which was the most streamed genre in 2016. The goal of this dissertation is to explore and evaluate different lyrical content features that could be useful for content, context and emotion based models for music recommendation systems.
With Rap as the primary use case, this dissertation focuses on featurizing two main aspects of lyrics; its artistic style of composition and its semantic content. For lyrical style, a suite of high level rhyme density features are extracted in addition to literary features like the use of figurative language, profanity and vocabulary strength. In contrast to these engineered features, Convolutional Neural Networks (CNN) are used to automatically learn rhyme patterns and other relevant features. For semantics, lyrics are represented using both traditional IR techniques and the more recent neural embedding methods.
These lyrical features are evaluated for artist identification and compared with artist and song similarity measures from a real-world collaborative filtering based recommendation system from Last.fm. It is shown that both rhyme and literary features serve as strong indicators to characterize artists with feature learning methods like CNNs achieving comparable results. For artist and song similarity, a strong relationship was observed between these features and the way users consume music while neural embedding methods significantly outperformed LSA. Finally, this work is accompanied by a web-application, Rapalytics.com, that is dedicated to visualizing all these lyrical features and has been featured on a number of media outlets, most notably, Vox, attn: and Metro.
Committee: Drs. Tim Finin (chair), Anupam Joshi, Tim Oates, Cynthia Matuszek and Pranam Kolari (Walmart Labs)