UMBC ebiquity
UMBC eBiquity Blog

Semantics for Privacy and Shared Context

Tim Finin, 12:01pm 15 December 2014

Roberto Yus, Primal Pappachan, Prajit Das, Tim Finin, Anupam Joshi, and Eduardo Mena, Semantics for Privacy and Shared Context, Workshop on Society, Privacy and the Semantic Web-Policy and Technology, held at Int. Semantic Web Conf., Oct. 2014.

Capturing, maintaining, and using context information helps mobile applications provide better services and generates data useful in specifying information sharing policies. Obtaining the full benefit of context information requires a rich and expressive representation that is grounded in shared semantic models. We summarize some of our past work on representing and using context models and briefly describe Triveni, a system for cross-device context discovery and enrichment. Triveni represents context in RDF and OWL and reasons over context models to infer additional information and detect and resolve ambiguities and inconsistencies. A unique feature, its ability to create and manage “contextual groups” of users in an environment, enables their members to share context information using wireless ad-hoc networks. Thus, it enriches the information about a user’s context by creating mobile ad hoc knowledge networks.


 

UMBC seeks nine new computing faculty

Tim Finin, 10:23am 13 December 2014

UMBC has a total of nine open full-time positions for computing faculty including five tenure track professors, a professor of the practice and three lecturers.

UMBC’s Computer Science and Electrical Engineering department is seeking to fill five positions for the coming year. They include two tenure track positions in Computer Science and up to three full-time lecturers. See the CSEE jobs page for more information.

The College of Engineering and Information Technology has a position for a full-time lecturer or Professor of Practice to focus on the needs of incoming computing majors through teaching, advising, and helping develop programs in computing. This person will work closely with faculty in the Computer Science and Electrical Engineering Department and Information Systems Department.

UMBC’s Information Systems department is accepting applications for three tenure track faculty positions in data science, software engineering and human-centered computing.


 

Amir Karami on fuzzy approach topic models for medical corpora

Tim Finin, 9:48am 2 December 2014

In this week’s Ebiquity meeting (10am Wed 12/3 in ITE346), Amir Karami will talk about “Fuzzy Approach Topic Models for Medical Corpus”.

Abstract: Looking for ways to automatically retrieve the enormous amount of medical knowledge has always been an intriguing topic. The massive flow of medical documents, including scholarly publications and clinical notes, has benefited experts by providing easy access to a huge amount of text data. However, due to this amount of data, medical experts are finding it increasingly difficult to locate information of interest. As a consequence, finding relevant documents has become more difficult. Effective text mining systems should be able to extract and exploit not only explicitly stated information but also implied and inferred data. Using bag-of-words leads to a sparse, high-dimensional problem that has low performance and a higher cost of computation. Dimension reduction techniques, especially topic models, are among the useful techniques for overcoming the problems of bag-of-words. This research proposes a novel approach for topic modeling using fuzzy clustering. To evaluate our model, we experiment with two text datasets of medical documents. The evaluation metrics, carried out through document classification, document modeling, and document clustering, show that our approach produces better performance than LDA, the most-cited topic model in Google Scholar, indicating that fuzzy set theory can improve the performance of topic models in the medical domain. Our approach solves the redundancy issue in the medical domain and can discover the relations between topics in a document. In addition, previous research on fuzzy clustering can help to solve challenges of topic modeling such as defining the number of topics.
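
For readers new to fuzzy clustering, here is a minimal sketch of its central idea: a document belongs to every cluster to some degree, instead of to exactly one. It uses the standard fuzzy c-means membership formula, which is an assumption on our part and not necessarily the model presented in the talk. The function name, the two-dimensional document vectors and the topic centers are all hypothetical.

```python
import math


def fuzzy_memberships(points, centers, m=2.0):
    """Return, for each point, its degree of membership in each cluster.

    Uses the standard fuzzy c-means rule with fuzzifier m > 1:
    u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1))
    """
    exponent = 2.0 / (m - 1.0)
    result = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on one or more centers:
            # share full membership among those centers only.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            result.append([h / total for h in hits])
            continue
        row = [
            1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists
        ]
        result.append(row)
    return result


# Hypothetical 2-D document vectors and two hypothetical topic centers.
docs = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.0)]
topics = [(0.0, 0.0), (1.0, 0.0)]
memberships = fuzzy_memberships(docs, topics)
```

Each row of `memberships` sums to one, so a document lying halfway between two topic centers is split evenly between them instead of being forced into a single topic.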


 

Anupam Joshi named an IEEE Fellow

Tim Finin, 9:38am 27 November 2014

CSEE Professor Anupam Joshi has been named an IEEE Fellow, recognized for his contributions to security, privacy and data management in mobile and pervasive systems. This designation is conferred by the IEEE Board of Directors on individuals with an outstanding record of accomplishments in any of the IEEE fields of interest and is recognized by the technical community as a prestigious honor and an important career achievement. No more than 0.1% of the total IEEE voting membership can be selected in a year.

Dr. Joshi joined UMBC’s faculty in 1998 and currently is the Director of the UMBC Center for Cybersecurity. He previously held faculty appointments at the University of Missouri, Columbia and Purdue University. He received a Ph.D. in Computer Science from Purdue University and a B. Tech in Electrical Engineering from the Indian Institute of Technology, Delhi. While at UMBC he has taught both undergraduate and graduate courses in operating systems, mobile computing and security. He developed and teaches an Honors College seminar on “Privacy and Security in a Mobile social world”. He has mentored nine Ph.D. graduates and a large number of M.S. students.

Joshi has made many contributions to the design, analysis and development of intelligent systems for mobile, social and secure computing. Twenty years ago he was one of a handful of researchers who recognized that mobility introduced new challenges for data management, security and privacy over and above those brought about by wireless connectivity. His key insight was to model mobile and pervasive systems as distributed systems that are both open, in that they do not pre-identify a set of known participants, and dynamic, in that the participants change regularly.

He observed that applications on mobile devices require greater degrees of decision making and autonomy as they become increasingly sophisticated and intelligent and can’t always assume connectivity to central servers. Entities in these pervasive computing systems must exchange information about the data and services offered and sought and their associated security and privacy policies, negotiate for information and resource sharing, be aware of their context, and monitor for and report on suspicious or anomalous behavior. Dr. Joshi has addressed these challenges across the stack, from network protocols to data management to policy controlled interactions between autonomous entities.

Much of his research has been done in collaboration with colleagues in industry such as IBM, Microsoft, Northrop Grumman and Qualcomm. It has been funded by not just them, but also NSF, DARPA, AFOSR, ARL, NIST and other federal agencies. Joshi has published prolifically with more than 200 publications in refereed journals and conferences, many of which are highly cited. He has served as the General or Program Chair of many key conferences including the IEEE International Conference on Intelligence and Security Informatics which will be held in Baltimore in May 2015.

The IEEE is the world’s leading professional association for advancing technology for humanity. Through its 400,000 members in 160 countries, it is a leading authority on a wide variety of areas ranging from aerospace systems, computers and telecommunications to biomedical engineering, electric power and consumer electronics. Dedicated to the advancement of technology, the IEEE publishes 30 percent of the world’s literature in the electrical and electronics engineering and computer science fields, and has developed more than 900 active industry standards.


 

Wild Big Data

Tim Finin, 11:34pm 11 November 2014

In this week’s Ebiquity meeting (10am Wed Nov. 12), Jennifer Sleeman will talk about “Taming Wild Big Data”.

Wild Big Data is data that is hard to extract, understand, and use due to its heterogeneous nature and volume. It typically comes without a schema, is obtained from multiple sources and provides a challenge for information extraction and integration. We describe a way of subduing Wild Big Data that uses techniques and resources that are popular for processing natural language text. The approach is applicable to data that is presented as a graph of objects and relations between them and to tabular data that can be transformed into such a graph. We start by applying topic models to contextualize the data and then use the results to identify the potential types of the graph’s nodes by mapping them to known types found in large open ontologies such as Freebase and DBpedia. The results allow us to assemble coarse clusters of objects that can then be used to interpret the links and perform entity disambiguation and record linking.


 

Wikidata article in CACM

Tim Finin, 7:51pm 12 October 2014

I just noticed that Denny Vrandecic and Markus Krötzsch have an article on Wikidata in the latest CACM. Good work! Even better, it’s available without subscription.

Wikidata: a free collaborative knowledgebase, Denny Vrandecic and Markus Krötzsch, Communications of the ACM, v57, n10 (2014), pp 78-85.

“This collaboratively edited knowledgebase provides a common source of data for Wikipedia, and everyone else.

Unnoticed by most of its readers, Wikipedia continues to undergo dramatic changes, as its sister project Wikidata introduces a new multilingual “Wikipedia for data” (http://www.wikidata.org) to manage the factual information of the popular online encyclopedia. With Wikipedia’s data becoming cleaned and integrated in a single location, opportunities arise for many new applications.”


 

Responsive design with Twitter Bootstrap: a tutorial and demonstration

Tim Finin, 9:02am 11 October 2014

In this week’s ebiquity meeting (10am Wed. Oct. 15 in ITE346), Abhay Kashyap will give a tutorial and demonstration of responsive web design using Twitter Bootstrap. A draft of the slides is available.

Does your webpage look like it’s from the 90’s? Does it scale well on mobile devices or do you have to finger skate on your mobile device to navigate the page? Or worse! Do you still not have a personal portfolio page? If the answer is yes, then it’s time to create or update your webpage! With a front end framework like Twitter Bootstrap, you can make a quick upgrade to a beautiful, modern and, most importantly, responsive design for your portfolio page.

The presentation will start with a brief overview of responsive design and Twitter Bootstrap and then continue with a live demonstration. We will download a Bootstrap starter template, customize the theme and content, build a static portfolio page and host it on a UMBC server.

Disclaimer: The presenter is a front end noob and will very likely look like a deer in headlights when confronted with advanced css/js questions.


 

Infoboxer: using statistical semantic knowledge to help create Wikipedia infoboxes

Tim Finin, 7:56pm 29 September 2014


In this week’s ebiquity meeting (10am Tue. Oct 1 in ITE346), Varish Mulwad will present Infoboxer, a prototype tool he developed with Roberto Yus that overcomes these challenges using statistical and semantic knowledge from linked data sources to ease the process of creating Wikipedia infoboxes.

Wikipedia infoboxes serve as input in the creation of knowledge bases such as DBpedia, Yago, and Freebase. Current creation of Wikipedia infoboxes is manual and based on templates that are created and maintained collaboratively. However, these templates pose several challenges:

  • Different communities use different infobox templates for the same category of articles
  • Attribute names differ (e.g., date of birth vs. birthdate)
  • Templates are restricted to a single category, making it harder to find a template for an article that belongs to multiple categories (e.g., actor and politician)
  • Templates are free form in nature and no integrity check is performed on whether the value filled in by the user is of the appropriate type for the given attribute

Infoboxer creates dynamic and semantic templates by suggesting attributes common for similar articles and controlling the expected values semantically. We will give an overview of our approach and demonstrate how Infoboxer can be used to create infoboxes for new Wikipedia articles as well as update erroneous values in existing infoboxes. We will also discuss our proposed extensions to the project.
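
The two ideas in the paragraph above, suggesting attributes that are common among similar articles and checking that a value has the expected type, can be illustrated with a small sketch. This is not Infoboxer's code: the real tool draws its statistics from linked data sources, while this example assumes a few in-memory infoboxes. The function names, sample infoboxes, attribute names and the `min_share` threshold are all hypothetical.

```python
from collections import Counter
from datetime import date


def suggest_attributes(similar_infoboxes, min_share=0.5):
    """Return attributes used by at least min_share of the similar
    articles, most frequently used first (ties in alphabetical order)."""
    counts = Counter(
        attr for box in similar_infoboxes for attr in set(box)
    )
    total = len(similar_infoboxes)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [attr for attr, n in ranked if n / total >= min_share]


def check_value(attr, value, expected_types):
    """Return True if the value has the type expected for the attribute.
    Attributes with no known expected type are accepted as they are."""
    expected = expected_types.get(attr)
    return expected is None or isinstance(value, expected)


# Hypothetical infoboxes of articles similar to the one being written.
similar = [
    {"name": "A", "birth_date": date(1950, 1, 1), "party": "X"},
    {"name": "B", "birth_date": date(1960, 2, 2), "occupation": "actor"},
    {"name": "C", "party": "Y"},
]
# Hypothetical expected types, one per attribute.
expected_types = {"name": str, "birth_date": date, "party": str}

suggested = suggest_attributes(similar)
date_ok = check_value("birth_date", date(1970, 3, 3), expected_types)
text_ok = check_value("birth_date", "sometime in 1970", expected_types)
```

Here `suggested` contains the attributes that appear in at least half of the sample infoboxes, and `text_ok` is false because a free-form string was supplied where a date is expected.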

Visit http://ebiq.org/p/668 for more information about Infoboxer. A demo can be found here.


 

Rafiki: A Semantic and Collaborative Approach to Community Health-Care in Underserved Areas

Tim Finin, 7:26am 19 September 2014

Primal Pappachan, Roberto Yus, Anupam Joshi and Tim Finin, Rafiki: A Semantic and Collaborative Approach to Community Health-Care in Underserved Areas, 10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing, 22-25 October 2014, Miami.

Community Health Workers (CHWs) act as liaisons between health-care providers and patients in underserved or un-served areas. However, the lack of information sharing and training support impedes the effectiveness of CHWs and their ability to correctly diagnose patients. In this paper, we propose and describe a system for mobile and wearable computing devices called Rafiki which assists CHWs in decision making and facilitates collaboration among them. Rafiki can infer possible diseases and treatments by representing the diseases, their symptoms, and patient context in OWL ontologies and by reasoning over this model. The use of semantic representation of data makes it easier to share knowledge related to disease, symptom, diagnosis guidelines, and patient demography, between various personnel involved in health-care (e.g., CHWs, patients, health-care providers). We describe the Rafiki system with the help of a motivating community health-care scenario and present an Android prototype for smart phones and Google Glass.


 

Taming Wild Big Data

Tim Finin, 8:36pm 17 September 2014

Jennifer Sleeman and Tim Finin, Taming Wild Big Data, AAAI Fall Symposium on Natural Language Access to Big Data, Nov. 2014.

Wild Big Data is data that is hard to extract, understand, and use due to its heterogeneous nature and volume. It typically comes without a schema, is obtained from multiple sources and provides a challenge for information extraction and integration. We describe a way of subduing Wild Big Data that uses techniques and resources that are popular for processing natural language text. The approach is applicable to data that is presented as a graph of objects and relations between them and to tabular data that can be transformed into such a graph. We start by applying topic models to contextualize the data and then use the results to identify the potential types of the graph’s nodes by mapping them to known types found in large open ontologies such as Freebase and DBpedia. The results allow us to assemble coarse clusters of objects that can then be used to interpret the links and perform entity disambiguation and record linking.


 

Rapalytics! Where Rap Meets Data Science

Tim Finin, 4:34pm 14 September 2014

UMBC Ebiquity Research Meeting

Rapalytics! Where Rap Meets Data Science

Abhay Kashyap

10:00am Wednesday, Sept. 17, 2014, ITE 346

For the Hip-Hop Fans: Remember the times when you had those long arguments with your friends about who the better rapper is? Remember how it always ended up in a stalemate because there was no evidence to back your argument? Well, look no further! Rapalytics is a one-stop site dedicated to extracting and presenting all the important analytics from Rap lyrics that separate a good rapper from a great one!

For the Data Science Nerds: Remember how indestructible your trained NLP tools were? Want to see how they act under pressure from text they have never seen before? Come take a look at how traditional NLP tools fare against text as complex as Rap and explore opportunities to design and build systems that handle much more than well-formed English text.


 

Kelvin: Extracting Knowledge from Large Text Collections

Tim Finin, 8:59pm 8 September 2014

Preprint: James Mayfield, Paul McNamee, Craig Harman, Tim Finin and Dawn Lawrie, KELVIN: Extracting Knowledge from Large Text Collections, AAAI Fall Symposium on Natural Language Access to Big Data, 2014.

We describe the KELVIN system for extracting entities and relations from large text collections and its use in the TAC Knowledge Base Population Cold Start task run by the U.S. National Institute of Standards and Technology. The Cold Start task starts with an empty knowledge base defined by an ontology of entity types, properties and relations. Evaluations in 2012 and 2013 were done using a collection of text from local Web and news to de-emphasize the linking of entities to a background knowledge base such as Wikipedia. Interesting features of KELVIN include a cross-document entity coreference module based on entity mentions, removal of suspect intra-document coreference chains, a slot value consolidator for entities, the application of inference rules to expand the number of asserted facts and a set of analysis and browsing tools supporting development.