UMBC eBiquity Blog
Tim Finin, 11:18pm 26 February 2014
Google is offering a free, online MOOC-style course on ‘Making Sense of Data’ from March 18 to April 4, taught by Amit Deutsch (Google) and Joe Hellerstein (Berkeley).
Interestingly, it doesn’t require programming or database skills: “Basic familiarity with spreadsheets and comfort using a web browser is recommended. Knowledge of statistics and experience with programming are not required.” The course will use Google’s Fusion Tables service for managing and visualizing data.
Tim Finin, 1:12pm 26 February 2014
The next Central MD Semantic Web Meetup will be held at 6:00pm on Thursday, February 27, 2014 at Inovex Information Systems (7240 Parkway Dr., Suite 140, Hanover MD). Michael Grove, the Chief Software Architect at Clark & Parsia, will talk on their Stardog triple store technology. The meetup is a good way to meet and network with others working on or with semantic technologies in Maryland.
“Our speaker, Michael Grove, is the Chief Software Architect at Clark & Parsia, where he also serves as the lead developer of Stardog, the leader in RDF databases featuring fast query performance and unmatched OWL & SWRL support.
A graduate in Computer Science at the University of Maryland, College Park, Michael first got started with semantic technologies in 2002 as a research assistant under Dr. Jim Hendler at the University of Maryland with the MINDSWAP group. Before joining the team at Clark & Parsia, he worked at Fujitsu Research Labs as the lead developer for the Task Computing project, an effort to bring the semantic web to pervasive computing environments.
Michael is also active in open source, where he is a contributor to Pellet, the leading OWL DL reasoner, and maintains Empire, an implementation of JPA backed by semantic technologies. Additionally, he is a contributor to the Sesame project and active on the Jena development list.”
Tim Finin, 11:53am 8 February 2014
In the first Ebiquity meeting of the semester, Vlad Korolev will talk about his work on using RDF to capture, represent and use provenance information for big data experiments.
PROB: A tool for Tracking Provenance and Reproducibility of Big Data Experiments
10-11:30am, ITE346, UMBC
Reproducibility of computations and data provenance are very important goals to achieve in order to improve the quality of one’s research. Unfortunately, despite some efforts made in the past, it is still very hard to reproduce computational experiments with a high degree of certainty. The Big Data phenomenon in recent years makes this goal even harder to achieve. In this work, we propose a tool that helps researchers improve the reproducibility of their experiments through automated keeping of provenance records.
Tim Finin, 10:34am 30 January 2014
Today’s online meeting (Jan 30, 12:30-2:30 EST) in the 2014 Ontology Summit series is part of the Tools, Services, and Techniques track and features presentations by
- Dr. Chris Welty (IBM Research) on “Inside the Mind of Watson – a Natural Language Question Answering Service Powered by the Web of Data and Ontologies”
- Prof. Alan Rector (U. Manchester) on “Axioms & Templates: Distinctions and Transformations amongst Ontologies, Frames, & Information Models”
- Prof. Till Mossakowski (U. Magdeburg) on “Challenges in Scaling Tools for Ontologies to the Semantic Web: Experiences with Hets and OntoHub”
Audio via phone (206-402-0100) or Skype. See the session page for details and access to slides.
Tim Finin, 8:48am 23 January 2014
The first online session of the 2014 Ontology Summit on “Big Data and Semantic Web Meet Applied Ontology” takes place today (Thursday, January 23) from 12:30pm to 2:30pm (EST, UTC-5) on the topic “Common Reusable Semantic Content: The Problems and Efforts to Address Them.” The session will include four presentations followed by discussion.
Audio connection is via phone (206-402-0100, 141184#) or Skype with a shared screen and participant chatroom. See the session page for more details.
Tim Finin, 1:56am 18 January 2014
A free PDF version of the new second edition of Mining of Massive Datasets by Anand Rajaraman, Jure Leskovec and Jeffrey Ullman is available. New chapters on mining large graphs, dimensionality reduction, and machine learning have been added. Related material from Professor Leskovec’s recent Stanford course on Mining Massive Data Sets is also available.
Tim Finin, 10:30am 14 January 2014
The ninth Ontology Summit starts on Thursday, January 16 with the theme “Big Data and Semantic Web Meet Applied Ontology.” The event kicks off a three-month series of weekly online meetings on Thursdays that feature presentations from expert panels and discussions with all of the participants. The series will culminate with a two-day symposium on April 28-29 in Arlington VA. The sessions are free and open to all, including researchers, practitioners and students.
The first virtual meeting will be held 12:30-2:30 (EST) on Thursday, January 16 and will introduce the nine different topical tracks in the series, their goals and organizers. Audio connection is via phone (206-402-0100, 141184#) or Skype with a shared screen and participant chatroom. See the session page for more details.
This year’s Ontology Summit is an opportunity for building bridges between the Semantic Web, Linked Data, Big Data, and Applied Ontology communities. On the one hand, the Semantic Web, Linked Data, and Big Data communities can bring a wide array of real problems (such as performance and scalability challenges and the variety problem in Big Data) and technologies (automated reasoning tools) that can make use of ontologies. On the other hand, the Applied Ontology community can bring a large body of common reusable content (ontologies) and ontological analysis techniques. Identifying and overcoming ontology engineering bottlenecks is critical for all communities.
The 2014 Ontology Summit is chaired by Michael Gruninger and Leo Obrst.
Tim Finin, 5:24pm 9 January 2014
Computer Science and Electrical Engineering
University of Maryland, Baltimore County
Ph.D. Dissertation Proposal
Functional Reference Ontology Development:
a Design Pattern Approach
1:00pm Friday, January 10, 2014, ITE325b, UMBC
The next generation of smart manufacturing systems will be developed by composing advanced manufacturing components and IT services introducing new technologies. These new technologies can lead to dramatic improvements in the ability to monitor, control, and optimize all aspects of manufacturing. The ability to compose advanced manufacturing components and IT services enhances agility, resiliency, and productivity of a manufacturing system. In order to make the composition possible, functional knowledge of manufacturing components and IT services should be captured and shared explicitly. Recent research has shown that a semantically precise and rich reference functional ontology enables effective composition. However, since domains of factories and production networks are large, evolving, and heterogeneous, developing a reference functional ontology is a challenging task. Specifically, conceptual functionality modeling that characterizes various features of manufacturing components and IT services at different levels of abstraction is a difficult task. Even if the reference functional ontology is developed successfully, there will certainly be interoperability issues between the reference functional ontology and local proprietary information models. First, conceptual conflict issues may arise primarily from the fact that the reference functional ontology does not reflect actual users’ or providers’ conceptualizations. Second, structural conflict issues may arise from diverse modeling choices in local, proprietary information models.
The objective of our research is to assess the utility of design patterns, specifically OWL ontology design patterns (ODPs), in addressing the issues in reference functional ontology development. To achieve the objective, we will assess inductive approaches to identifying the ODPs, and explore development of a methodology for resolving structural differences between the reference functional ontology and local proprietary information models. The key potential contributions of this work include 1) a new method to identify information patterns of functionalities in manufacturing components and IT services, 2) a new inductive ODP development process that starts with the pattern definition of the specific functionality concepts, with subsequent grouping of these patterns into more general patterns, and 3) an ODP-based ontology transformation to resolve structural conflicts between the reference functional ontology and local proprietary information models.
Committee: Drs. Yun Peng (chair), Tim Finin, Yelena Yesha, Milton Halem, Nenad Ivezic (NIST) and Boonserm Kulvatunyou (NIST)
Tim Finin, 11:43am 1 January 2014
“The app uses browser’s geolocation feature to find user’s location and displays a map of interesting objects that can be found nearby (within 50 000 ft). It uses the Freebase Search API to find relevant objects. When user clicks on one of the markers, the app calls the Freebase Topic API to fetch more information about that object. Once the information is retrieved, it populates a purejs template to display a knowledge card for the user.”
This sort of application has been done many times before with RDF and the Google approach can be adapted to query an arbitrary RDF resource for custom knowledge bases.
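As a rough illustration of the "nearby objects" step in such an app, here is a minimal Python sketch of a radius filter over a custom list of places. It does not call the Freebase APIs or any RDF store; the `places` records, the `distance_ft` and `nearby` helpers, and the sample coordinates are all assumptions made up for this example. Only the 50,000 ft radius comes from the description quoted above.

```python
import math

FEET_PER_METER = 3.28084
EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in meters


def distance_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) * FEET_PER_METER


def nearby(user_lat, user_lon, places, radius_ft=50_000):
    """Return the places within radius_ft of the user, nearest first.

    `places` is a hypothetical list of dicts with 'name', 'lat' and 'lon'
    keys, e.g. rows pulled from a custom knowledge base.
    """
    scored = [
        (distance_ft(user_lat, user_lon, p["lat"], p["lon"]), p)
        for p in places
    ]
    return [p for d, p in sorted(scored, key=lambda dp: dp[0]) if d <= radius_ft]


# Hypothetical sample data: approximate coordinates, for illustration only.
places = [
    {"name": "Inner Harbor", "lat": 39.2858, "lon": -76.6131},
    {"name": "Washington Monument (DC)", "lat": 38.8895, "lon": -77.0353},
]
for place in nearby(39.2557, -76.7110, places):
    print(place["name"])
```

In a real adaptation, the `places` list would presumably be filled from a query against the knowledge base of interest, and the filter could be pushed into that query if the store supports geospatial constraints.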
Tim Finin, 3:07pm 27 December 2013
At the ISWC Privon workshop in October, Neel Guha talked about his Spy Watch Google Chrome extension that keeps track of the third parties tracking the web pages you visit. Unlike Ghostery, it only collects information and cannot block tracking sites, but it logs more information about how your Web behavior is being observed and gives good insight into the nature and scope of the Web tracking phenomenon.
When you view a page like www.nytimes.com you expect it to know that you visited the site. It may even know personal information (e.g., name, address, age, sex) if you ever divulged it to the site, perhaps when setting up an account. Spy Watch reports that my recent visit to the NYT site was also observed by 24 other sites, including doubleclick.com, brightcove.com, googleapis.com and sothebysrealty.com. And this is with an ad blocker enabled; 28 third parties observed me when I disabled it.
Each of these third parties also knows the page on the NYT site I just visited. But I don’t have an account on most of them, so they don’t know who I really am, right? Well, some can easily discover my identity. Doubleclick, for example, knows I just read that Times article on how to cook a duck and, since it’s part of Google, can potentially integrate the information with all of the other information Google has about me.
I’ve been running Spy Watch for about two months and it reports that 1,533 third-party sites have (potentially) collected data about the 12,000 distinct URLs I’ve visited during this time. It also notes that, on average, every page I’ve visited has been watched by 3.7 third parties. As you might expect, the distribution follows a power law with a long tail of sites that only observed a few of my visits (about 2/3 of them saw three or fewer). Here are the top twenty third-party trackers in my two months of data.
Note that Google (red), Facebook (dark blue) and Twitter (green) are the three companies who potentially know the most about what you do on the Web.
Spy Watch can also show how many and which pages have been observed by a tracker. Facebook observed me viewing 2208 pages across 509 sites (via FB like and visit buttons) and now knows that I read reviews for Sharp and LG microwave ovens on toptenreviews.com earlier this month and frequently visit the cra.org site.
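The kinds of numbers quoted above (trackers per page, the long tail, pages seen per tracker) can be derived from a simple log of observations. The Python sketch below assumes a hypothetical log of `(page_url, tracker_domain)` pairs; it is not Spy Watch's actual data format or code, and the `tracker_stats` helper and sample log are invented for illustration.

```python
from collections import defaultdict


def tracker_stats(observations, tail_threshold=3):
    """Summarize a log of (page_url, tracker_domain) pairs.

    Returns the trackers ranked by distinct pages observed, the mean number
    of trackers per logged page, and the fraction of trackers that saw
    tail_threshold or fewer pages. Pages with no trackers never appear in
    such a log, so the mean is over logged pages only.
    """
    pages_by_tracker = defaultdict(set)
    trackers_by_page = defaultdict(set)
    for page, tracker in observations:
        pages_by_tracker[tracker].add(page)
        trackers_by_page[page].add(tracker)

    # Rank by page count (descending), breaking ties alphabetically.
    ranked = sorted(
        ((t, len(pages)) for t, pages in pages_by_tracker.items()),
        key=lambda tc: (-tc[1], tc[0]),
    )
    n_pages = len(trackers_by_page)
    n_trackers = len(pages_by_tracker)
    mean = sum(len(t) for t in trackers_by_page.values()) / n_pages if n_pages else 0.0
    tail = (
        sum(1 for _, count in ranked if count <= tail_threshold) / n_trackers
        if n_trackers else 0.0
    )
    return {"ranked": ranked, "mean_trackers_per_page": mean, "tail_fraction": tail}


# Hypothetical sample log, for illustration only.
log = [
    ("a.example/1", "x.example"),
    ("a.example/1", "y.example"),
    ("a.example/2", "x.example"),
    ("b.example/1", "x.example"),
    ("b.example/1", "z.example"),
]
stats = tracker_stats(log, tail_threshold=1)
print(stats["ranked"][:20])  # top trackers by pages observed
```

Using sets keeps repeat visits to the same page from inflating a tracker's count, which matches the "distinct URLs" framing used in the post.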
You can get and install Spy Watch from the Google Web store, which describes it like this.
Spy Watch is a privacy extension that aims to create transparency in online internet tracking by third party sites. When a user visits a page, Spy Watch lets the user see every site that knows the user visited that page. And for each of these sites, the user can find out what other information the site has gathered about the user’s browsing history. After you install the extension, continue to browse normally. After some time, click on the extension to see who’s watching you! Disclaimer: User data is stored in the browser and is not accessible by the creator of this extension.
Tim Finin, 9:14am 4 December 2013
A post on Google’s research blog lists the major datasets for NLP and KB processing that Google has released in the past year. They include datasets to help in entity linking, relation extraction, concept spotting and syntactic analysis. Subscribe to the Knowledge Data Releases mailing list for updates.
Tim Finin, 7:51am 16 October 2013
The UMBC Computer Science and Electrical Engineering department is searching for new full-time faculty: two in Computer Science, one in Electrical and Computer Engineering, one Computer Science professor of the practice, and one Computer Science/Information Systems lecturer. See the CSEE Jobs page for detailed information on the positions, preferred specializations and the application process.