January 27th, 2018, by Tim Finin, posted in KR, Machine Learning, NLP, Ontologies
Cleaning Noisy Knowledge Graphs
My dissertation research is developing an approach to identify and explain errors in a knowledge graph constructed by extracting entities and relations from text. Information extraction systems can automatically construct knowledge graphs from large collections of documents, which might be drawn from news articles, Web pages, social media posts, or discussion forums. The language understanding task is challenging, and current extraction systems introduce many kinds of errors. Previous work on improving the quality of knowledge graphs uses additional evidence from background knowledge bases or Web searches. Such approaches are difficult to apply when emerging entities are present and/or only one knowledge graph is available. To address the problem, I am using multiple complementary techniques, including entity linking, common sense reasoning, and linguistic analysis.
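One simple way to see the flavor of this kind of error detection: many extraction errors violate the argument types a relation expects. The sketch below is a toy illustration of that idea, not the dissertation's actual method; the relation signatures and entity names are all hypothetical.

```python
# Toy sketch: flag likely extraction errors in a knowledge graph by checking
# that each relation's arguments have the entity types the relation expects.

# Expected (subject_type, object_type) signatures for a few sample relations.
RELATION_SIGNATURES = {
    "born_in": ("PERSON", "LOCATION"),
    "works_for": ("PERSON", "ORGANIZATION"),
    "located_in": ("LOCATION", "LOCATION"),
}

def find_type_violations(triples, entity_types):
    """Return triples whose argument types conflict with the relation's signature."""
    violations = []
    for subj, rel, obj in triples:
        expected = RELATION_SIGNATURES.get(rel)
        if expected is None:
            continue  # unknown relation: nothing to check
        subj_ok = entity_types.get(subj) == expected[0]
        obj_ok = entity_types.get(obj) == expected[1]
        if not (subj_ok and obj_ok):
            violations.append((subj, rel, obj))
    return violations

entity_types = {"Alice": "PERSON", "Baltimore": "LOCATION", "UMBC": "ORGANIZATION"}
triples = [
    ("Alice", "born_in", "Baltimore"),   # type-consistent
    ("Baltimore", "works_for", "UMBC"),  # a LOCATION cannot work_for: likely an error
]
print(find_type_violations(triples, entity_types))  # flags the second triple
```

In practice the hard part is deciding which violations are extractor mistakes versus gaps in the type system, which is where the entity linking and linguistic evidence come in.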
December 16th, 2017, by Tim Finin, posted in iswc, Semantic Web
Videos of almost all of the talks from the 16th International Semantic Web Conference (ISWC) held in Vienna in 2017 are online at videolectures.net. They include 89 research presentations, two keynote talks, the one-minute madness event and the opening and closing ceremonies.
December 12th, 2017, by Tim Finin, posted in AI, Earth science, Machine Learning, NLP
Jennifer Sleeman receives AI for Earth grant from Microsoft
Visiting Assistant Professor Jennifer Sleeman (Ph.D. ’17) has been awarded a grant from Microsoft as part of its ‘AI for Earth’ program. Dr. Sleeman will use the grant to continue her research on developing algorithms to model how scientific disciplines such as climate change evolve and predict future trends by analyzing the text of articles and reports and the papers they cite.
AI for Earth is a Microsoft program aimed at empowering people and organizations to solve global environmental challenges by increasing access to AI tools and educational opportunities, while accelerating innovation. Via the Azure for Research AI for Earth award program, Microsoft provides selected researchers and organizations access to its cloud and AI computing resources to accelerate, improve and expand work on climate change, agriculture, biodiversity and/or water challenges.
UMBC is among the first grant recipients of AI for Earth, which launched in July 2017. The selection process was highly competitive, and the grant was awarded in recognition of the potential of the work and the power of AI to accelerate progress.
As part of her dissertation research, Dr. Sleeman developed algorithms using dynamic topic modeling to understand influence and predict future trends in a scientific discipline. She applied this to the field of climate change and used assessment reports of the Intergovernmental Panel on Climate Change (IPCC) and the papers they cite. Since 1990, an IPCC report has been published every five years that includes four separate volumes, each of which has many chapters. Each report cites tens of thousands of research papers, which comprise a correlated dataset of temporally grounded documents. Her custom dynamic topic modeling algorithm identified topics for both datasets and apply cross-domain analytics to identify the correlations between the IPCC chapters and their cited documents. The approach reveals both the influence of the cited research on the reports and how previous research citations have evolved over time.
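A stripped-down way to see the underlying intuition of tracking a topic across temporally grounded documents: measure how strongly a topic's vocabulary shows up in each time slice. This toy sketch is not Dr. Sleeman's algorithm (which learns the topics themselves dynamically); the terms and documents are made up.

```python
# Toy illustration: how prevalent is a fixed set of topic terms in each
# year's documents? Real dynamic topic models learn evolving topics instead.
from collections import Counter

def topic_strength(docs_by_year, topic_terms):
    """Fraction of tokens in each year's documents that belong to topic_terms."""
    strengths = {}
    for year, docs in docs_by_year.items():
        tokens = [w.lower() for doc in docs for w in doc.split()]
        counts = Counter(tokens)
        hits = sum(counts[t] for t in topic_terms)
        strengths[year] = hits / max(len(tokens), 1)
    return strengths

docs_by_year = {
    1990: ["warming trends in surface temperature records"],
    1995: ["sea level rise and surface warming acceleration"],
}
print(topic_strength(docs_by_year, {"warming", "sea", "level"}))
```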
Dr. Sleeman’s award is part of an inaugural set of 35 grants in more than ten countries for access to Microsoft Azure and AI technology platforms, services and training. In a post on Monday, “AI for Earth can be a game-changer for our planet,” Microsoft announced its intent to put $50 million over five years into the program, making grant-making and educational training possible at a much larger scale.
More information about AI for Earth can be found on the Microsoft AI for Earth website.
December 4th, 2017, by Tim Finin, posted in Blockchain, Policy, Privacy
Link Before You Share: Managing Privacy Policies through Blockchain
November 28th, 2017, by Tim Finin, posted in NLP, OWL, Semantic Web
Automated Knowledge Extraction from the Federal Acquisition Regulations System (FARS)
With increasing regulation of Big Data, it is becoming essential for organizations to ensure compliance with various data protection standards. The Federal Acquisition Regulations System (FARS) within the Code of Federal Regulations (CFR) includes facts and rules for individuals and organizations seeking to do business with the US Federal government. Parsing and gathering knowledge from such lengthy regulation documents is currently done manually and is time- and labor-intensive. Hence, developing a cognitive assistant for automated analysis of such legal documents has become a necessity. We have developed a semantically rich approach to automate the analysis of legal documents and have implemented a system to capture various facts and rules, contributing towards building an efficient legal knowledge base that contains details of the relationships between various legal elements, semantically similar terminologies, deontic expressions, and cross-referenced legal facts and rules. In this paper, we describe our framework along with the results of automating knowledge extraction from the FARS document (Title 48, CFR). Our approach can be used by Big Data users to automate knowledge extraction from large legal documents.
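One of the subtasks the abstract mentions is capturing deontic expressions, i.e., obligations, permissions, and prohibitions. A minimal sketch of how such markers can be spotted with pattern matching is below; the modal-verb lists are illustrative and much simpler than the paper's actual rules.

```python
# Minimal sketch: classify sentences from regulatory text by the deontic
# modal markers they contain. Real systems also need parsing and context.
import re

DEONTIC_PATTERNS = {
    "prohibition": re.compile(r"\b(shall not|must not|may not)\b", re.IGNORECASE),
    "obligation": re.compile(r"\b(shall|must)\b", re.IGNORECASE),
    "permission": re.compile(r"\b(may)\b", re.IGNORECASE),
}

def classify_deontic(sentence):
    """Return the deontic categories whose markers appear in the sentence."""
    labels = []
    text = sentence
    # Check prohibitions first so "shall not" is not misread as an obligation.
    if DEONTIC_PATTERNS["prohibition"].search(text):
        labels.append("prohibition")
        text = DEONTIC_PATTERNS["prohibition"].sub("", text)
    for label in ("obligation", "permission"):
        if DEONTIC_PATTERNS[label].search(text):
            labels.append(label)
    return labels

print(classify_deontic("The contractor shall submit a report."))
print(classify_deontic("The contracting officer may not waive this requirement."))
```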
November 20th, 2017, by Tim Finin, posted in AI
Voices in AI – Episode 20: A Conversation with Marie desJardins
Byron Reese interviewed UMBC CSEE Professor Marie desJardins as part of his Voices in AI podcast series on Gigaom. In the episode, they talk about the Turing test, Watson, autonomous vehicles, and language processing. Visit the Voices in AI site to listen to the podcast and read the interview transcript.
Here’s the start of the wide-ranging, hour-long interview.
Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today I’m excited that our guest is Marie des Jardins. She is an Associate Dean for Engineering and Information Technology as well as a professor of Computer Science at the University of Maryland, Baltimore County. She got her undergrad degree from Harvard, and a Ph.D. in computer science from Berkeley, and she’s been involved in the National Conference of the Association for the Advancement of Artificial Intelligence for over 12 years. Welcome to the show, Marie.
Marie des Jardins: Hi, it’s nice to be here.
I often open the show with “What is artificial intelligence?” because, interestingly, there’s no consensus definition of it, and I get a different kind of view of it from everybody. So I’ll start with that. What is artificial intelligence?
Sure. I’ve always thought about artificial intelligence as just a very broad term referring to trying to get computers to do things that we would consider intelligent if people did them. What’s interesting about that definition is it’s a moving target, because we change our opinions over time about what’s intelligent. As computers get better at doing things, they no longer seem that intelligent to us.
We use the word “intelligent,” too, and I’m not going to dwell on definitions, but what do you think intelligence is at its core?
So, it’s definitely hard to pin down, but I think of it as activities that human beings carry out, that we don’t know of lower order animals doing, other than some of the higher primates who can do things that seem intelligent to us. So intelligence involves intentionality, which means setting goals and making active plans to carry them out, and it involves learning over time and being able to react to situations differently based on experiences and knowledge that we’ve gained over time. The third part, I would argue, is that intelligence includes communication, so the ability to communicate with other beings, other intelligent agents, about your activities and goals.
Well, that’s really useful and specific. Let’s look at some of those things in detail a little bit. You mentioned intentionality. Do you think that intentionality is driven by consciousness? I mean, can you have intentionality without consciousness? Is consciousness therefore a requisite for intelligence?
I think that’s a really interesting question. I would decline to answer it mainly because I don’t think we ever can really know what consciousness is. We all have a sense of being conscious inside our own brains—at least I believe that. But of course, I’m only able to say anything meaningful about my own sense of consciousness. We just don’t have any way to measure consciousness or even really define what it is. So, there does seem to be this idea of self-awareness that we see in various kinds of animals—including humans—and that seems to be a precursor to what we call consciousness. But I think it’s awfully hard to define that term, and so I would be hesitant to put that as a prerequisite on intentionality.
November 18th, 2017, by Tim Finin, posted in cybersecurity, Data Science, Security
M.S. Thesis Defense
Internal Penetration Test of a Simulated Automotive Ethernet Environment
Kenneth Owen Truex
11:15 Tuesday, 21 November 2017, ITE325, UMBC
The capabilities of modern-day automobiles have far exceeded what Robert Bosch GmbH could have imagined when it proposed the Controller Area Network (CAN) bus back in 1986. Over time, drivers wanted more functionality, comfort, and safety in their automobiles, creating a burden for automotive manufacturers. With these driver demands came many innovations to the in-vehicle network core protocol. Modern automobiles that have a video-based infotainment system or any type of camera-assisted functionality, such as an Advanced Driver Assistance System (ADAS), use Ethernet as their network backbone. This is because the original CAN specification only allowed for up to 8 bytes of data per message on a bus rated at 1 Mbps, which is far less than the requirements of more advanced video-based automotive systems. The Ethernet protocol allows for 1500 bytes of data per packet on a network rated for up to 100 Mbps. This led the automotive industry to adopt Ethernet as the core protocol, overcoming most of the limitations posed by the CAN protocol. By adopting Ethernet as the protocol for automotive networks, certain attack vectors are now available for black hat hackers to exploit in order to put the vehicle in an unsafe condition. I will create a simulated automotive Ethernet environment using the CANoe network simulation platform by Vector GmbH. Then, a penetration test will be conducted on the simulated environment in order to discover attacks that pose a threat to automotive Ethernet networks. These attacks will strictly follow a comprehensive threat model in order to narrowly focus the attack surface. If exploited successfully, these attacks will cover all three sides of the Confidentiality, Integrity, Availability (CIA) triad.
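The payload figures in the abstract make the motivation for Ethernet concrete. Using an illustrative 30 KB camera image (the image size is an assumption, not from the thesis), the frame counts work out as follows:

```python
# Back-of-the-envelope comparison using the payload limits quoted above:
# classic CAN carries at most 8 bytes per frame, Ethernet up to 1500 bytes.
import math

IMAGE_BYTES = 30_000   # hypothetical camera frame size
CAN_PAYLOAD = 8        # classic CAN payload limit, bytes
ETH_PAYLOAD = 1500     # Ethernet MTU payload, bytes

can_frames = math.ceil(IMAGE_BYTES / CAN_PAYLOAD)  # 3750 frames
eth_frames = math.ceil(IMAGE_BYTES / ETH_PAYLOAD)  # 20 packets
print(can_frames, eth_frames)
```

Even ignoring per-frame overhead and the 1 Mbps vs. 100 Mbps bus speeds, a single image needs orders of magnitude fewer Ethernet packets than CAN frames.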
I will then propose a new and innovative mitigation strategy that can be implemented on current industry standard ECUs and run successfully under strict time and resource limitations. This new strategy can help to limit the attack surface that exists on modern day automobiles and help to protect the vehicle and its occupants from malicious adversaries.
Committee: Drs. Anupam Joshi (chair), Richard Forno, Charles Nicholas, Nilanjan Banerjee
November 18th, 2017, by Tim Finin, posted in Paper, Security
Cybersecurity Challenges to American Local Governments
In this paper we examine data from the first ever nationwide survey of cybersecurity among American local governments. We are particularly interested in understanding the threats to local government cybersecurity, their level of preparedness to address the threats, the barriers these governments encounter when deploying cybersecurity, the policies, tools and practices that they employ to improve cybersecurity and, finally, the extent of awareness of and support for high levels of cybersecurity within their organizations. We found that local governments are under fairly constant cyberattack and are periodically breached. They are not especially well prepared to prevent cyberattacks or to recover when breached. The principal barriers to local cybersecurity are financial and organizational. Although a number of policies, tools and practices to improve cybersecurity exist, few local governments are making wide use of them. Last, local governments suffer from too little awareness of and support for cybersecurity within their organizations.
November 17th, 2017, by Tim Finin, posted in Data Science, Earth science, KR, Machine Learning, NLP
Discovering Scientific Influence using Cross-Domain Dynamic Topic Modeling
We describe an approach using dynamic topic modeling to model influence and predict future trends in a scientific discipline. Our study focuses on climate change and uses assessment reports of the Intergovernmental Panel on Climate Change (IPCC) and the papers they cite. Since 1990, an IPCC report has been published every five years that includes four separate volumes, each of which has many chapters. Each report cites tens of thousands of research papers, which comprise a correlated dataset of temporally grounded documents. We use a custom dynamic topic modeling algorithm to generate topics for both datasets and apply cross-domain analytics to identify the correlations between the IPCC chapters and their cited documents. The approach reveals both the influence of the cited research on the reports and how previous research citations have evolved over time. For the IPCC use case, the report topic model used 410 documents and a vocabulary of 5911 terms, while the citations topic model was based on 200K research papers and a vocabulary of more than 25K terms. We show that our approach can predict the importance of its extracted topics on future IPCC assessments through the use of cross-domain correlations, Jensen-Shannon divergences and cluster analytics.
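Jensen-Shannon divergence, one of the measures the abstract names, compares two probability distributions (here, for instance, the topic mixture of a report chapter against that of a cited paper). A self-contained sketch, with made-up example distributions:

```python
# Jensen-Shannon divergence between two topic distributions.
# With base-2 logs, the value is symmetric and bounded in [0, 1].
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """JS divergence: average KL of p and q to their midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

chapter_topics = [0.7, 0.2, 0.1]  # hypothetical topic mixture of a chapter
paper_topics = [0.6, 0.3, 0.1]    # hypothetical topic mixture of a cited paper
print(round(js_divergence(chapter_topics, paper_topics), 4))
```

Low divergence between a chapter and a cited paper indicates closely aligned topic mixtures, which is what makes it usable as a cross-domain correlation signal.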
November 14th, 2017, by Tim Finin, posted in AI, Data Science, Machine Learning, talks
ACM Tech Talk Series
A Practitioner’s Introduction to Deep Learning
Ashwin Kumar Ganesan, PhD student
1:00-2:00pm Friday, 17 November 2017, ITE325, UMBC
In recent years, Deep Neural Networks have been highly successful at performing a number of tasks in computer vision, natural language processing and artificial intelligence in general. The remarkable performance gains have led universities and industries to invest heavily in this space. This investment has created a thriving open source ecosystem of tools and libraries that aid the design of new architectures, algorithm research, and data collection.
This talk (and hands-on session) introduces some of the basics of machine learning and neural networks, and discusses some of the popular neural network architectures. We take a dive into one of the popular libraries, TensorFlow, and its associated abstraction library, Keras.
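To give a feel for what these libraries compute, here is a library-free sketch of the basic building block of a neural network: one dense (fully connected) layer followed by a ReLU activation. The weights and inputs are made-up numbers; Keras's Dense layer does the same computation with learned weights.

```python
# One dense layer plus ReLU, written out by hand: y = relu(W x + b).
def relu(x):
    """Elementwise rectified linear unit: negative values become zero."""
    return [max(0.0, v) for v in x]

def dense(inputs, weights, biases):
    """Weighted sum per output unit; weights is one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

x = [1.0, 2.0]
W = [[0.5, -1.0],   # weights for output unit 0
     [1.0, 1.0]]    # weights for output unit 1
b = [0.5, -1.0]
print(relu(dense(x, W, b)))  # dense -> [-1.0, 2.0], relu -> [0.0, 2.0]
```

Stacking layers like this, and learning `W` and `b` from data, is what the deep learning frameworks below automate.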
To participate in the hands-on aspects of the workshop, bring a laptop computer with Python installed and install the following libraries using pip. For Windows (or any other OS), consider installing Anaconda, which includes all the necessary libraries.
- numpy, scipy & scikit-learn
- tensorflow / tensorflow-gpu (the second is the GPU version)
- matplotlib for visualizations (if necessary)
- jupyter & ipython (we will use Python 2.7 in our experiments)
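The list above can be installed in one command (this assumes the CPU build of TensorFlow; substitute tensorflow-gpu if you have a CUDA-capable GPU):

```shell
pip install numpy scipy scikit-learn tensorflow matplotlib jupyter ipython
```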
Contact Nisha Pillai (NPillai1 at umbc.edu) with any questions regarding this event.
November 6th, 2017, by Tim Finin, posted in GENERAL, UMBC
UMBC upgrades High Performance Computing Facility with new NSF grant
The National Science Foundation recently awarded UMBC a Major Research Instrumentation (MRI) award totaling more than $550,000 to expand the university’s High Performance Computing Facility (HPCF). The funding will go toward upgraded hardware and increased computing speeds for the interdisciplinary core facility, which supports scientific computing and other complex, data-intensive research across disciplines, university-wide. As part of the NSF grant, UMBC is required to contribute 30 percent of the amount that NSF is providing to further support the project, meaning a total new investment of more than $780,000 in UMBC’s High Performance Computing Facility.
Meilin Yu, assistant professor of mechanical engineering, is the principal investigator on the grant. He replaced Matthias Gobbert, professor of mathematics, who served as principal investigator on previous grants for the core facility in 2008, 2012 and 2017 on behalf of the 51 faculty investigators from academic departments and research centers across all three colleges. Co-principal investigators on the grant are Professors Marc Olano, Jianwu Wang and Daniel Lobo.
Adapted from a UMBC news article by Megan Hanks
October 26th, 2017, by Tim Finin, posted in KR, Ontologies, OWL, Semantic Web
W3C Recommendation: Time Ontology in OWL
The Spatial Data on the Web Working Group has published a W3C Recommendation of the Time Ontology in OWL specification. The ontology provides a vocabulary for expressing facts about relations among instants and intervals, together with information about durations, and about temporal position including date-time information. Time positions and durations may be expressed using either the conventional Gregorian calendar and clock, or using another temporal reference system such as Unix-time, geologic time, or different calendars.
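The ontology's interval relations follow Allen's interval algebra, with properties such as time:intervalBefore and time:intervalOverlaps. A minimal sketch of those two relations over stdlib datetimes (the meeting times are made up; this illustrates the semantics, not the OWL encoding):

```python
# Two of the OWL-Time interval relations, expressed over Python datetimes.
from datetime import datetime

def before(a_end, b_start):
    """time:intervalBefore - interval A ends strictly before interval B starts."""
    return a_end < b_start

def overlaps(a_start, a_end, b_start, b_end):
    """time:intervalOverlaps - A starts first, B starts inside A, A ends inside B."""
    return a_start < b_start < a_end < b_end

meeting_a = (datetime(2017, 11, 1, 9, 0), datetime(2017, 11, 1, 10, 0))
meeting_b = (datetime(2017, 11, 1, 9, 30), datetime(2017, 11, 1, 10, 30))
print(overlaps(*meeting_a, *meeting_b))  # the meetings overlap: True
print(before(meeting_a[1], meeting_b[0]))  # A does not end before B starts: False
```

In the ontology itself these relations are object properties between time:Interval individuals, so the same facts can be asserted and reasoned over in OWL rather than computed procedurally.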