UMBC ebiquity
UMBC eBiquity Blog

Beyond NER: Towards Semantics in Clinical Text

Tim Finin, 9:14am 29 September 2015

Clare Grasso, Anupam Joshi and ELior Siegel, Beyond NER: Towards Semantics in Clinical Text, Biomedical Data Mining, Modeling, and Semantic Integration (BDM2I); co-located with the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA.

While clinical text NLP systems have become very effective in recognizing named entities in clinical text and mapping them to standardized terminologies in the normalization process, there remains a gap in the ability of extractors to combine entities together into a complete semantic representation of medical concepts that contain multiple attributes each of which has its own set of allowed named entities or values. Furthermore, additional domain knowledge may be required to determine the semantics of particular tokens in the text that take on special meanings in relation to this concept. This research proposes an approach that provides ontological mappings of the surface forms of medical concepts that are of the UMLS semantic class signs/symptoms. The mappings are used to extract and encode the constituent set of named entities into interoperable semantic structures that can be linked to other structured and unstructured data for reuse in research and analysis.


talk: Is your personal data at risk? App analytics to the rescue

Tim Finin, 5:15pm 26 September 2015

Is your personal data at risk?
App analytics to the rescue

Prajit Kumar Das

10:30am Monday, 28 September 28 2015, ITE346

According to Virustotal, a prominent virus and malware tool, the Google Play Store has a few thousand apps from major malware families. Given such a revelation, access control systems for mobile data management, have reached a state of critical importance. We propose the development of a system which would help us detect the pathways using which user’s data is being stolen from their mobile devices. We use a multi layered approach which includes app meta data analysis, understanding code patterns and detecting and eventually controlling dynamic data flow when such an app is installed on a mobile device. In this presentation we focus on the first part of our work and discuss the merits and flaws of our unsupervised learning mechanism to detect possible malicious behavior from apps in the Google Play Store.


talk: Attribute-based Fine Grained Access Control for Triple Stores

Tim Finin, 8:01am 12 September 2015


In the 14-09-2015 ebiquity meeting, Ankur Padia will talk about his recent work aimed at providing access control for an RDF triple store.

Attribute-based Fine Grained Access Control for Triple Stores

Ankur Padia, UMBC

The maturation of semantic web standards and associated web-based data representations like have made RDF a popular model for representing graph data and semi-structured knowledge. However, most existing SPARQL endpoint supports simple access control mechanism preventing its use for many applications. To protect the data stored in RDF stores, we describe a framework to support attribute-based fine grained access control and explore its feasibility. We implemented a prototype of the system and used it to carry out an initial analysis on the relation between access control policies, query execution time, and size of the RDF dataset.

For more information, see: Ankur Padia Tim Finin and Anupam Joshi, Attribute-based Fine Grained Access Control for Triple Stores, 3rd Society, Privacy and the Semantic Web – Policy and Technology workshop (PrivOn 2015), 14th Int. Semantic Web Conf., Oct. 2015.


SVM-CASE: An SVM-based Context Aware Security Framework for Vehicular Ad-hoc Networks

Tim Finin, 6:18pm 1 September 2015


Wenjia Li, Anupam Joshi and Tim Finin, SVM-CASE: An SVM-based Context Aware Security Framework for Vehicular Ad-hoc Networks, IEEE 82nd Vehicular Technology Conf., Boston, Sept. 2015.

Vehicular Ad-hoc Networks (VANETs) are known to be very susceptible to various malicious attacks. To detect and mitigate these malicious attacks, many security mechanisms have been studied for VANETs. In this paper, we propose a context aware security framework for VANETs that uses the Support Vector Machine (SVM) algorithm to automatically determine the boundary between malicious nodes and normal ones. Compared to the existing security solutions for VANETs, The proposed framework is more resilient to context changes that are common in VANETs, such as those due to malicious nodes altering their attack patterns over time or rapid changes in environmental factors, such as the motion speed and transmission range. We compare our framework to existing approaches and present evaluation results obtained from simulation studies.


Hot Stuff at ColdStart

Tim Finin, 9:17pm 8 June 2015

Cold Start

Coldstart is a task in the NIST Text Analysis Conference’s Knowledge Base Population suite that combines entity linking and slot filling to populate an empty knowledge base using a predefined ontology for the facts and relations. This paper describes a system developed by the Human Language Technology Center of Excellence at Johns Hopkins University for the 2014 Coldstart task.

Tim Finin, Paul McNamee, Dawn Lawrie, James Mayfield and Craig Harman, Hot Stuff at Cold Start: HLTCOE participation at TAC 2014, 7th Text Analysis Conference, National Institute of Standards and Technology, Nov. 2014.

The JHU HLTCOE participated in the Cold Start task in this year’s Text Analysis Conference Knowledge Base Population evaluation. This is our third year of participation in the task, and we continued our research with the KELVIN system. We submitted experimental variants that explore use of forward-chaining inference, slightly more aggressive entity clustering, refined multiple within-document conference, and prioritization of relations extracted from news sources.


Platys: From Position to Place-Oriented Mobile Computing

Tim Finin, 7:48am 8 June 2015

The NSF-sponsored Platys project explored the idea that places are more than just GPS coordinates. They are concepts rich with semantic information, including people, activities, roles, functions, time and purpose. Our mobile phones can learn to recognize the places we are in and use information about them to provide better services.

Laura Zavala, Pradeep K. Murukannaiah, Nithyananthan Poosamani, Tim Finin, Anupam Joshi, Injong Rhee and Munindar P. Singh, Platys: From Position to Place-Oriented Mobile Computing, AI Magazine, v36, n2, 2015.

The Platys project focuses on developing a high-level, semantic notion of location called place. A place, unlike a geospatial position, derives its meaning from a user’s actions and interactions in addition to the physical location where it occurs. Our aim is to enable the construction of a large variety of applications that take advantage of place to render relevant content and functionality and, thus, improve user experience. We consider elements of context that are particularly related to mobile computing. The main problems we have addressed to realize our place-oriented mobile computing vision are representing places, recognizing places, and engineering place-aware applications. We describe the approaches we have developed for addressing these problems and related subproblems. A key element of our work is the use of collaborative information sharing where users’ devices share and integrate knowledge about places. Our place ontology facilitates such collaboration. Declarative privacy policies allow users to specify contextual features under which they prefer to share or not share their information.


UMBC Schema Free Query system on ESWC Schema-agnostic Queries over Linked Data

Tim Finin, 8:58am 7 June 2015

This year’s ESWC Semantic Web Evaluation Challenge track had a task on Schema-agnostic Queries over Linked Data: SAQ-2015. The idea is to support a SPARQL-like query language that does not require knowing the underlying graph schema nor the URIs to use for terms and individuals, as in the follwing examples.

 SELECT ?y {BillClinton hasDaughter ?x. ?x marriedTo ?y.}

 SELECT ?x {?x isA book. ?x by William_Goldman.
            ?x has_pages ?p. FILTER (?p > 300)}

We adapted our Schema Free Querying system to the task as described in the following paper.

Zareen Syed, Lushan Han, Muhammad Mahbubur Rahman, Tim Finin, James Kukla and Jeehye Yun, UMBC_Ebiquity-SFQ: Schema Free Querying System, ESWC Semantic Web Evaluation Challenge, Extended Semantic Web Conference, June 2015.

Users need better ways to explore large complex linked data resources. Using SPARQL requires not only mastering its syntax and semantics but also understanding the RDF data model, the ontology and URIs for entities of interest. Natural language question answering systems solve the problem, but these are still subjects of research. The Schema agnostic SPARQL queries task defined in SAQ-2015 challenge consists of schema-agnostic queries following the syntax of the SPARQL standard, where the syntax and semantics of operators are maintained, while users are free to choose words, phrases and entity names irrespective of the underlying schema or ontology. This combination of query skeleton with keywords helps to remove some of the ambiguity. We describe our framework for handling schema agnostic or schema free queries and discuss enhancements to handle the SAQ-2015 challenge queries. The key contributions are the robust methods that combine statistical association and semantic similarity to map user terms to the most appropriate classes and properties used in the underlying ontology and type inference for user input concepts based on concept linking.


Interactive Knowledge Base Population

Tim Finin, 3:19pm 6 June 2015

Travis Wolfe, Mark Dredze, James Mayfield, Paul McNamee, Craig Harman, Tim Finin and Benjamin Van Durme, Interactive Knowledge Base Population, arXiv:1506.00301 [cs.AI], May 2015.

Most work on building knowledge bases has focused on collecting entities and facts from as large a collection of documents as possible. We argue for and describe a new paradigm where the focus is on a high-recall extraction over a small collection of documents under the supervision of a human expert, that we call Interactive Knowledge Base Population (IKBP).


Querying RDF Data with Text Annotated Graphs

Tim Finin, 9:26am 6 June 2015

New paper: Lushan Han, Tim Finin, Anupam Joshi and Doreen Cheng, Querying RDF Data with Text Annotated Graphs, 27th International Conference on Scientific and Statistical Database Management, San Diego, June 2015.

Scientists and casual users need better ways to query RDF databases or Linked Open Data. Using the SPARQL query language requires not only mastering its syntax and semantics but also understanding the RDF data model, the ontology used, and URIs for entities of interest. Natural language query systems are a powerful approach, but current techniques are brittle in addressing the ambiguity and complexity of natural language and require expensive labor to supply the extensive domain knowledge they need. We introduce a compromise in which users give a graphical “skeleton” for a query and annotates it with freely chosen words, phrases and entity names. We describe a framework for interpreting these “schema-agnostic queries” over open domain RDF data that automatically translates them to SPARQL queries. The framework uses semantic textual similarity to find mapping candidates and uses statistical approaches to learn domain knowledge for disambiguation, thus avoiding expensive human efforts required by natural language interface systems. We demonstrate the feasibility of the approach with an implementation that performs well in an evaluation on DBpedia data.


Discovering and Querying Hybrid Linked Data

Tim Finin, 9:00am 5 June 2015


New paper: Zareen Syed, Tim Finin, Muhammad Rahman, James Kukla and Jeehye Yun, Discovering and Querying Hybrid Linked Data, Third Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data, held in conjunction with the 12th Extended Semantic Web Conference, Portoroz Slovenia, June 2015.

In this paper, we present a unified framework for discovering and querying hybrid linked data. We describe our approach to developing a natural language query interface for a hybrid knowledge base Wikitology, and present that as a case study for accessing hybrid information sources with structured and unstructured data through natural language queries. We evaluate our system on a publicly available dataset and demonstrate improvements over a baseline system. We describe limitations of our approach and also discuss cases where our system can complement other structured data querying systems by retrieving additional answers not available in structured sources.


Initial impressions: Android M permissions

Prajit Kumar Das, 10:18am 29 May 2015

Google I/O 2015 was a very important day for privacy researchers. For the first time Google acknowledged a need for better privacy control. Researchers and Developers working with Android for sometime probably know that their was a feature called AppOps. This feature was introduced in Android 4.3 and later removed in 4.4.2. The reasons stated for its inclusion and removal have been discussed extensively. However, the only conclusion we could clearly draw from all the discussion was that there was a demand for such a feature. Our friends from over at Apple have repeatedly mentioned how Apple has always cared for User Privacy more than Google. As a result of this, it was only a matter of time and a pleasant development for Android enthusiasts to see this new feature in Android.

We installed the new Android M OS on a Nexus 5. The first thing we wanted to see was the permissions feature. Listed below are our impressions of what we thought of this new feature from a Privacy researcher’s perspective.

The feature is not easy to find
We had to weed through the settings of our phone and we were not able to find it straightaway. There was no menu item for Privacy. How do you access it then? You will have to click on the phone’s setting and then click on “Apps” and then select a particular app whose permission access you wish to control. Following this you will have to click on “Permissions” for that app. At this point you get the menu which allows you to toggle the permissions.

The Permission control is essentially useless till your Apps upgrade
Now, Google stated yesterday that the behavior of apps which do not upgrade to the new API version will remain the same as before. Therefore, even with this feature present you cannot actually stop an app from accessing the restricted data. What you do see is a warning dialog stating the obvious.

Warning message for apps using pre Android M SDK

Warning message for apps using pre Android M SDK

Not all permissions shows up in the list
The granularity of permissions that will be available in this new feature is still uncertain. If you check the Facebook permission list in the Google Play Store, you will see that it requests a lot of permissions.

Permissions description

Permissions description

Permissions description

Permissions description

Permissions description

Permissions description

Permissions description

Permissions description

But when you check out the permission control menu, you will see just a few of these permissions here.

App permissions list

App permissions list

We can assume that Google is grouping the permissions into logical groups. However, that means that the primary issue that a lot of researchers have raised about granular access control is still not being addressed by Google. We have been doing research with fine-grained permission control for sometime now. In our work, we have created a system that is capable of controlling the access to data on a mobile device based on the context of the user. Such an intelligent system would not only know what data to give access to but also when to do so. That goal still remains to be completely realized.

Obviously, we must not forget that Something is always better than nothing! Google is taking steps to improve the means by which it protects a user’s privacy and provides security. It is an iterative process and it’s still far from the goal. It is getting closer to that goal though.


talk: Amit Sheth on Transforming Big data into Smart Data, 11a Tue 5/26

Tim Finin, 2:43pm 17 May 2015

Transforming big data into smart data:
deriving value via harnessing volume, variety
and velocity using semantics and semantic web

Professor Amit Sheth
Wright State University

11:00am Tuesday, 26 May 2015, ITE 325, UMBC

Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, "How is her current health, and what are the risk of having an asthma attack in her current situation (now and today), especially if that risk has changed?" As I will show, Smart Data that gives such personalized and actionable information will need to utilize multimodal data and their metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on Machine Learning and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models. I will present a couple of Smart Data applications in development at Kno.e.sis from the domains of personalized health, health informatics, social data for social good, energy, disaster response, and smart city.

Amit Sheth is an Educator, Researcher and Entrepreneur. He is the LexisNexis Ohio Eminent Scholar, an IEEE Fellow, and the executive director of Kno.e.sis – the Ohio Center of Excellence in Knowledge-enabled Computing a Wright State University. In World Wide Web (WWW), it is placed among the top ten universities in the world based on 10-year impact. Prof. Sheth is a well cited computer scientists (h-index = 87, >30,000 citations), and appears among top 1-3 authors in World Wide Web (Microsoft Academic Search). He has founded two companies, and several commercial products and deployed systems have resulted from his research. His students are exceptionally successful; ten out of 18 past PhD students have 1,000+ citations each.

Host: Yelena Yesha,