UMBC ebiquity
Tim Finin

Author Archive

paper: Cleaning Noisy Knowledge Graphs

January 27th, 2018, by Tim Finin, posted in KR, Machine Learning, NLP, Ontologies

Cleaning Noisy Knowledge Graphs

Ankur Padia, Cleaning Noisy Knowledge Graphs, Proceedings of the Doctoral Consortium at the 16th International Semantic Web Conference, October 2017.

My dissertation research is developing an approach to identify and explain errors in a knowledge graph constructed by extracting entities and relations from text. Information extraction systems can automatically construct knowledge graphs from a large collection of documents, which might be drawn from news articles, Web pages, social media posts or discussion forums. The language understanding task is challenging and current extraction systems introduce many kinds of errors. Previous work on improving the quality of knowledge graphs uses additional evidence from background knowledge bases or Web searches. Such approaches are difficult to apply when emerging entities are present and/or only one knowledge graph is available. In order to address the problem I am using multiple complementary techniques including entity linking, common sense reasoning, and linguistic analysis.

 

Videos of ISWC 2017 talks

December 16th, 2017, by Tim Finin, posted in iswc, Semantic Web

Videos of almost all of the talks from the 16th International Semantic Web Conference (ISWC) held in Vienna in 2017 are online at videolectures.net. They include 89 research presentations, two keynote talks, the one-minute madness event and the opening and closing ceremonies.

Jennifer Sleeman receives AI for Earth grant from Microsoft

December 12th, 2017, by Tim Finin, posted in AI, Earth science, Machine Learning, NLP

Jennifer Sleeman receives AI for Earth grant from Microsoft

Visiting Assistant Professor Jennifer Sleeman (Ph.D. ’17) has been awarded a grant from Microsoft as part of its ‘AI for Earth’ program. Dr. Sleeman will use the grant to continue her research on developing algorithms to model how scientific disciplines such as climate change evolve and predict future trends by analyzing the text of articles and reports and the papers they cite.

AI for Earth is a Microsoft program aimed at empowering people and organizations to solve global environmental challenges by increasing access to AI tools and educational opportunities, while accelerating innovation. Via the Azure for Research AI for Earth award program, Microsoft provides selected researchers and organizations access to its cloud and AI computing resources to accelerate, improve and expand work on climate change, agriculture, biodiversity and/or water challenges.

UMBC is among the first grant recipients of AI for Earth, first launched in July 2017. The grant process was competitive and selective, and the grant was awarded in recognition of the potential of the work and the power of AI to accelerate progress.

As part of her dissertation research, Dr. Sleeman developed algorithms using dynamic topic modeling to understand influence and predict future trends in a scientific discipline. She applied this to the field of climate change and used assessment reports of the Intergovernmental Panel on Climate Change (IPCC) and the papers they cite. Since 1990, an IPCC report has been published every five years that includes four separate volumes, each of which has many chapters. Each report cites tens of thousands of research papers, which comprise a correlated dataset of temporally grounded documents. Her custom dynamic topic modeling algorithm identified topics for both datasets and applied cross-domain analytics to identify the correlations between the IPCC chapters and their cited documents. The approach reveals both the influence of the cited research on the reports and how previous research citations have evolved over time.

Dr. Sleeman’s award is part of an inaugural set of 35 grants in more than ten countries for access to Microsoft Azure and AI technology platforms, services and training. In a post on Monday, “AI for Earth can be a game-changer for our planet,” Microsoft announced its intent to put $50 million over five years into the program, making grant-making and educational training possible at a much larger scale.

More information about AI for Earth can be found on the Microsoft AI for Earth website.

Link Before You Share: Managing Privacy Policies through Blockchain

December 4th, 2017, by Tim Finin, posted in Blockchain, Policy, Privacy

Link Before You Share: Managing Privacy Policies through Blockchain

Agniva Banerjee and Karuna Pande Joshi, Link Before You Share: Managing Privacy Policies through Blockchain, 4th International Workshop on Privacy and Security of Big Data (PSBD 2017), in conjunction with 2017 IEEE International Conference on Big Data, 4 December 2017.

With the advent of numerous online content providers, utilities and applications, each with their own specific version of privacy policies and its associated overhead, it is becoming increasingly difficult for concerned users to manage and track the confidential information that they share with the providers. We have developed a novel framework to automatically track details about how a user’s PII is stored, used and shared by the provider. We have integrated our data privacy ontology with the properties of blockchain, to develop an automated access-control and audit mechanism that enforces users’ data privacy policies when sharing their data across third parties. We have also validated this framework by implementing a working system, LinkShare. In this paper, we describe our framework in detail along with the LinkShare system. Our approach can be adopted by big data users to automatically apply their privacy policy on data operations and track the flow of that data across various stakeholders.
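The abstract does not describe how LinkShare records events, so the following is only a generic sketch of the tamper-evident, hash-linked log that underlies blockchain-style auditing. It is not the authors' implementation; the function names and the event fields (`actor`, `action`, `field`) are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body):
    """SHA-256 over a canonical (sorted-key) JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(chain, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"prev": prev, "event": event}
    chain.append({"prev": prev, "event": event, "hash": entry_hash(body)})
    return chain


def verify(chain):
    """Return True if every entry's hash and back-link are intact."""
    prev = GENESIS
    for entry in chain:
        body = {"prev": entry["prev"], "event": entry["event"]}
        if entry["prev"] != prev or entry["hash"] != entry_hash(body):
            return False
        prev = entry["hash"]
    return True
```

Because each entry's hash covers the previous entry's hash, editing an earlier data-sharing event changes its hash and breaks every later link, which is what makes such a log usable for audit.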

paper: Automated Knowledge Extraction from the Federal Acquisition Regulations System

November 28th, 2017, by Tim Finin, posted in NLP, OWL, Semantic Web

Automated Knowledge Extraction from the Federal Acquisition Regulations System (FARS)

Srishty Saha and Karuna Pande Joshi, Automated Knowledge Extraction from the Federal Acquisition Regulations System (FARS), 2nd International Workshop on Enterprise Big Data Semantic and Analytics Modeling, IEEE Big Data Conference, December 2017.

With increasing regulation of Big Data, it is becoming essential for organizations to ensure compliance with various data protection standards. The Federal Acquisition Regulations System (FARS) within the Code of Federal Regulations (CFR) includes facts and rules for individuals and organizations seeking to do business with the US Federal government. Parsing and gathering knowledge from such lengthy regulation documents is currently done manually and is time and human intensive. Hence, developing a cognitive assistant for automated analysis of such legal documents has become a necessity. We have developed a semantically rich approach to automate the analysis of legal documents and have implemented a system to capture various facts and rules contributing towards building an efficient legal knowledge base that contains details of the relationships between various legal elements, semantically similar terminologies, deontic expressions and cross-referenced legal facts and rules. In this paper, we describe our framework along with the results of automating knowledge extraction from the FARS document (Title 48, CFR). Our approach can be used by Big Data users to automate knowledge extraction from large legal documents.

Voices in AI, Episode 20: A Conversation with Marie desJardins

November 20th, 2017, by Tim Finin, posted in AI

Voices in AI – Episode 20: A Conversation with Marie desJardins

Byron Reese interviewed UMBC CSEE Professor Marie desJardins as part of his Voices in AI podcast series on Gigaom. In the episode, they talk about the Turing test, Watson, autonomous vehicles, and language processing. Visit the Voices in AI site to listen to the podcast and read the interview transcript.

Here’s the start of the wide-ranging, hour-long interview.

Byron Reese: This is Voices in AI, brought to you by Gigaom. I’m Byron Reese. Today I’m excited that our guest is Marie des Jardins. She is an Associate Dean for Engineering and Information Technology as well as a professor of Computer Science at the University of Maryland, Baltimore County. She got her undergrad degree from Harvard, and a Ph.D. in computer science from Berkeley, and she’s been involved in the National Conference of the Association for the Advancement of Artificial Intelligence for over 12 years. Welcome to the show, Marie.

Marie des Jardins: Hi, it’s nice to be here.

I often open the show with “What is artificial intelligence?” because, interestingly, there’s no consensus definition of it, and I get a different kind of view of it from everybody. So I’ll start with that. What is artificial intelligence?

Sure. I’ve always thought about artificial intelligence as just a very broad term referring to trying to get computers to do things that we would consider intelligent if people did them. What’s interesting about that definition is it’s a moving target, because we change our opinions over time about what’s intelligent. As computers get better at doing things, they no longer seem that intelligent to us.

We use the word “intelligent,” too, and I’m not going to dwell on definitions, but what do you think intelligence is at its core?

So, it’s definitely hard to pin down, but I think of it as activities that human beings carry out, that we don’t know of lower order animals doing, other than some of the higher primates who can do things that seem intelligent to us. So intelligence involves intentionality, which means setting goals and making active plans to carry them out, and it involves learning over time and being able to react to situations differently based on experiences and knowledge that we’ve gained over time. The third part, I would argue, is that intelligence includes communication, so the ability to communicate with other beings, other intelligent agents, about your activities and goals.

Well, that’s really useful and specific. Let’s look at some of those things in detail a little bit. You mentioned intentionality. Do you think that intentionality is driven by consciousness? I mean, can you have intentionality without consciousness? Is consciousness therefore a requisite for intelligence?

I think that’s a really interesting question. I would decline to answer it mainly because I don’t think we ever can really know what consciousness is. We all have a sense of being conscious inside our own brains—at least I believe that. But of course, I’m only able to say anything meaningful about my own sense of consciousness. We just don’t have any way to measure consciousness or even really define what it is. So, there does seem to be this idea of self-awareness that we see in various kinds of animals—including humans—and that seems to be a precursor to what we call consciousness. But I think it’s awfully hard to define that term, and so I would be hesitant to put that as a prerequisite on intentionality.

MS defense: Internal Penetration Test of a Simulated Automotive Ethernet Environment, 11/21

November 18th, 2017, by Tim Finin, posted in cybersecurity, Data Science, Security

M.S. Thesis Defense

Internal Penetration Test of a Simulated Automotive Ethernet Environment

Kenneth Owen Truex

11:15 Tuesday, 21 November 2017, ITE325, UMBC

The capabilities of modern day automobiles have far exceeded what Robert Bosch GmbH could have imagined when it proposed the Controller Area Network (CAN) bus back in 1986. Over time, drivers wanted more functionality, comfort, and safety in their automobiles — creating a burden for automotive manufacturers. With these driver demands came many innovations to the in-vehicle network core protocol. Modern automobiles that have a video based infotainment system or any type of camera assisted functionality such as an Advanced Driver Assistance System (ADAS) use ethernet as their network backbone. This is because the original CAN specification only allowed for up to 8 bytes of data per message on a bus rated at 1 Mbps. This is far less than the requirements of more advanced video-based automotive systems. The ethernet protocol allows for 1500 bytes of data per packet on a network rated for up to 100 Mbps. This led the automotive industry to adopt ethernet as the core protocol, overcoming most of the limitations posed by the CAN protocol. By adopting ethernet as the protocol for automotive networks, certain attack vectors are now available for black hat hackers to exploit in order to put the vehicle in an unsafe condition. I will create a simulated automotive ethernet environment using the CANoe network simulation platform by Vector GmbH. Then, a penetration test will be conducted on the simulated environment in order to discover attacks that pose a threat to automotive ethernet networks. These attacks will strictly follow a comprehensive threat model in order to narrowly focus the attack surface. If exploited successfully, these attacks will cover all three sides of the Confidentiality, Integrity, Availability (CIA) triad.

I will then propose a new and innovative mitigation strategy that can be implemented on current industry standard ECUs and run successfully under strict time and resource limitations. This new strategy can help to limit the attack surface that exists on modern day automobiles and help to protect the vehicle and its occupants from malicious adversaries.
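The payload and bit-rate figures quoted in the abstract can be put side by side with a little arithmetic. The sketch below uses only those four numbers; it ignores frame headers, arbitration and inter-frame gaps, so the times are lower bounds, and the function names are made up for illustration.

```python
import math

# Figures taken from the abstract above.
CAN_DATA_BYTES = 8             # data bytes per message in the original CAN spec
CAN_BITRATE_BPS = 1000000      # 1 Mbps
ETH_DATA_BYTES = 1500          # data bytes per Ethernet packet
ETH_BITRATE_BPS = 100000000    # 100 Mbps


def messages_needed(payload_bytes, max_data_bytes):
    """Number of messages required to carry payload_bytes of data."""
    return int(math.ceil(payload_bytes / float(max_data_bytes)))


def min_transfer_seconds(payload_bytes, bitrate_bps):
    """Lower bound on transfer time: payload bits only, no protocol overhead."""
    return payload_bytes * 8 / float(bitrate_bps)
```

For example, `messages_needed(1500, CAN_DATA_BYTES)` shows that the data in a single full Ethernet packet would have to be split across 188 CAN messages.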

Committee: Drs. Anupam Joshi (chair), Richard Forno, Charles Nicholas, Nilanjan Banerjee

New paper: Cybersecurity Challenges to American Local Governments

November 18th, 2017, by Tim Finin, posted in Paper, Security

Cybersecurity Challenges to American Local Governments

Donald F. Norris, Laura Mateczun, Anupam Joshi and Tim Finin, Cybersecurity Challenges to American Local Governments, 17th European Conf. on Digital Government, pp 110-117, June 2017.

In this paper we examine data from the first ever nationwide survey of cybersecurity among American local governments. We are particularly interested in understanding the threats to local government cybersecurity, their level of preparedness to address the threats, the barriers these governments encounter when deploying cybersecurity, the policies, tools and practices that they employ to improve cybersecurity and, finally, the extent of awareness of and support for high levels of cybersecurity within their organizations. We found that local governments are under fairly constant cyberattack and are periodically breached. They are not especially well prepared to prevent cyberattacks or to recover when breached. The principal barriers to local cybersecurity are financial and organizational. Although a number of policies, tools and practices to improve cybersecurity exist, few local governments are making wide use of them. Last, local governments suffer from too little awareness of and support for cybersecurity within their organizations.

new paper: Discovering Scientific Influence using Cross-Domain Dynamic Topic Modeling

November 17th, 2017, by Tim Finin, posted in Data Science, Earth science, KR, Machine Learning, NLP

Discovering Scientific Influence using Cross-Domain Dynamic Topic Modeling

Jennifer Sleeman, Milton Halem, Tim Finin and Mark Cane, Discovering Scientific Influence using Cross-Domain Dynamic Topic Modeling, International Conference on Big Data, IEEE, December 2017.

We describe an approach using dynamic topic modeling to model influence and predict future trends in a scientific discipline. Our study focuses on climate change and uses assessment reports of the Intergovernmental Panel on Climate Change (IPCC) and the papers they cite. Since 1990, an IPCC report has been published every five years that includes four separate volumes, each of which has many chapters. Each report cites tens of thousands of research papers, which comprise a correlated dataset of temporally grounded documents. We use a custom dynamic topic modeling algorithm to generate topics for both datasets and apply cross-domain analytics to identify the correlations between the IPCC chapters and their cited documents. The approach reveals both the influence of the cited research on the reports and how previous research citations have evolved over time. For the IPCC use case, the report topic model used 410 documents and a vocabulary of 5911 terms while the citations topic model was based on 200K research papers and a vocabulary of more than 25K terms. We show that our approach can predict the importance of its extracted topics on future IPCC assessments through the use of cross-domain correlations, Jensen-Shannon divergences and cluster analytics.
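For readers unfamiliar with the Jensen-Shannon divergence mentioned in the abstract, here is a minimal sketch of the standard definition for two discrete distributions, such as two topics' word distributions over a shared vocabulary. It illustrates the measure only; how the paper applies it to its topic models is not shown here, and the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms where p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi, 2) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions, in bits.

    p and q must have the same length and each sum to 1.
    """
    # m is the pointwise average; it is positive wherever p or q is positive,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

With base-2 logarithms the value is symmetric in its arguments and lies between 0 (identical distributions) and 1 (distributions with no overlap), which makes it convenient for comparing topics drawn from two different models.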

A Practitioner’s Introduction to Deep Learning, 1pm Fri 11/17

November 14th, 2017, by Tim Finin, posted in AI, Data Science, Machine Learning, talks

ACM Tech Talk Series

A Practitioner’s Introduction to Deep Learning

Ashwin Kumar Ganesan, PhD student

1:00-2:00pm Friday, 17 November 2017, ITE325, UMBC

In recent years, Deep Neural Networks have been highly successful at performing a number of tasks in computer vision, natural language processing and artificial intelligence in general. The remarkable performance gains have led to universities and industries investing heavily in this space. This investment creates a thriving open source ecosystem of tools & libraries that aid the design of new architectures, algorithm research as well as data collection.

This talk (and hands-on session) introduces people to some of the basics of machine learning and neural networks and discusses some of the popular neural network architectures. We take a dive into one of the popular libraries, TensorFlow, and an associated abstraction library, Keras.

To participate in the hands-on aspects of the workshop, bring a laptop computer with Python installed and install the following libraries using pip. For Windows (or any other OS), consider installing Anaconda, which has all the necessary libraries.

  • numpy, scipy & scikit-learn
  • tensorflow / tensorflow-gpu (the second one is the GPU version)
  • matplotlib for visualizations (if necessary)
  • jupyter & ipython (We will use python2.7 in our experiments)
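One way to confirm the setup before the session is to check that each library can be imported. This is a minimal sketch, not part of the workshop materials; the helper name `missing_packages` is made up for illustration, and it is written to run under both Python 2.7 and Python 3. Note that scikit-learn is imported as `sklearn` and ipython as `IPython`.

```python
import importlib

# Import names for the libraries listed above.
REQUIRED = ["numpy", "scipy", "sklearn", "tensorflow", "matplotlib", "IPython"]


def missing_packages(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_packages(REQUIRED)` returns an empty list when everything is in place; any names it returns still need to be installed with pip.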

Contact Nisha Pillai (NPillai1 at umbc.edu) with any questions regarding this event.

UMBC to upgrade High Performance Computing Facility with new NSF MRI grant

November 6th, 2017, by Tim Finin, posted in GENERAL, UMBC

 

UMBC upgrades High Performance Computing Facility with new NSF grant

 

The National Science Foundation recently awarded UMBC a Major Research Instrumentation (MRI) award totaling more than $550,000 to expand the university’s High Performance Computing Facility (HPCF). The funding will go toward upgraded hardware and increased computing speeds for the interdisciplinary core facility, which supports scientific computing and other complex, data-intensive research across disciplines, university-wide. As part of the NSF grant, UMBC is required to contribute 30 percent of the amount that NSF is providing to further support the project, meaning a total new investment of more than $780,000 in UMBC’s High Performance Computing Facility.

Meilin Yu, assistant professor of mechanical engineering, is the principal investigator on the grant. He replaced Matthias Gobbert, professor of mathematics, who served as principal investigator on previous grants for the core facility in 2008, 2012 and 2017 on behalf of the 51 faculty investigators from academic departments and research centers across all three colleges. Co-Principal investigators on the grant are Professors Marc Olano, Jianwu Wang and Daniel Lobo.

Adapted from a UMBC news article by Megan Hanks

W3C Recommendation: Time Ontology in OWL

October 26th, 2017, by Tim Finin, posted in KR, Ontologies, OWL, Semantic Web

W3C Recommendation: Time Ontology in OWL

The Spatial Data on the Web Working Group has published a W3C Recommendation of the Time Ontology in OWL specification. The ontology provides a vocabulary for expressing facts about relations among instants and intervals, together with information about durations, and about temporal position including date-time information. Time positions and durations may be expressed using either the conventional Gregorian calendar and clock, or using another temporal reference system such as Unix-time, geologic time, or different calendars.
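As a small illustration of the vocabulary, the sketch below builds a Turtle description of an interval whose beginning and end are instants with Gregorian date-time positions. The terms `time:Interval`, `time:Instant`, `time:hasBeginning`, `time:hasEnd` and `time:inXSDDateTimeStamp` come from the ontology; the `ex:` namespace, the helper name and the sample event are assumptions made up for the example.

```python
TIME_NS = "http://www.w3.org/2006/time#"


def interval_turtle(subject, begin, end):
    """Return Turtle describing ex:<subject> as a time:Interval.

    `begin` and `end` are xsd:dateTimeStamp strings (date-time with time zone).
    """
    return (
        '@prefix time: <{ns}> .\n'
        '@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n'
        '@prefix ex: <http://example.org/> .\n'  # hypothetical namespace
        '\n'
        'ex:{s} a time:Interval ;\n'
        '    time:hasBeginning [ a time:Instant ;\n'
        '        time:inXSDDateTimeStamp "{b}"^^xsd:dateTimeStamp ] ;\n'
        '    time:hasEnd [ a time:Instant ;\n'
        '        time:inXSDDateTimeStamp "{e}"^^xsd:dateTimeStamp ] .\n'
    ).format(ns=TIME_NS, s=subject, b=begin, e=end)


# A made-up one-hour event, expressed with time zone offsets.
print(interval_turtle("techTalk",
                      "2017-11-17T13:00:00-05:00",
                      "2017-11-17T14:00:00-05:00"))
```

The same interval could instead be positioned in another temporal reference system, which is the flexibility the Recommendation adds over date-time-only encodings.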