UK semantic technology company True Knowledge has released Evi, a mobile app that competes with Siri.
The mobile app is available on the Android Market and on iTunes. You can pose queries to either by speaking or typing. The Android app uses Google’s ASR speech technology and the iTunes app uses Nuance.
True Knowledge has been developing a natural language question answering system since 2007. You can query True Knowledge online via a Web interface. Try the following links for some examples:
The Evi app has a number of additional features beyond the Web-based True Knowledge QA system and these will probably be expanded on in the months to come.
The Third International Workshop on the role of the Semantic Web in Provenance Management will be held in conjunction with the Ninth Extended Semantic Web Conference (ESWC-2012) on May 27 or 28 in Heraklion, Greece. The workshop’s objectives are to explore opportunities offered by the Semantic Web technologies in the context of the management and exploitation of provenance and document the role of provenance in real-world Semantic Web applications.
The one-day workshop will include presentations of full research papers, short position papers, a panel on the W3C provenance working group proposals, and demonstrations of prototypes and working systems. Submit papers and demonstration proposals by 4 March 2012.
WWW, ISWC and WebDB are the top Web conferences based on Microsoft Academic Search citation data.
Last week HCI researcher Antti Oulasvirta published an interesting post on ranking HCI conferences using the average citations per paper based on data from Microsoft Academic Search (MAS). Some of the results surprised him, including that the venerable CHI was not the top conference in this group. His ranking metric for conference significance is essentially the impact factor (IF) used for journals, a measure of the average number of citations a paper in a given journal receives in a time period. The IF metric has become widely used in the scholarly journal publication industry since it was defined by Eugene Garfield and first implemented by the company he founded, the Institute for Scientific Information.
Microsoft Academic Search provides citation and publication numbers for conferences in sixteen different subject domains and a number of sub-domains for each. For computer science, there are 24 sub-domains including one for “World Wide Web” conferences. Following Oulasvirta, I ranked Web technology conferences using the average number of citations received in the last ten years. Starting with 68 Web technology conferences in the MAS collection (not a complete list, btw), I narrowed the set to those that had at least 100 papers in the past ten years and some papers in the past five. This resulted in 26 conferences, eliminating many series that only ran a few times or have stopped. Here are the results.
The results should only be taken as a rough estimate of conference impact. One reason is that IF is only one measure and does not take into account all aspects of scientific importance. For example, as computed here, all citations count equally, including those from high- and low-ranking sources. Another is that while Thomson Reuters (née ISI) journal citation data is carefully collected and curated, the Microsoft Academic Search data is the result of a largely automated process that starts with data from Bing. When I tried using the citation information from the past five years, for example, I noted that it reported only 23 papers in the past five years for Adaptive Hypermedia and Adaptive Web-Based Systems. This is because the conference merged with User Modeling in 2009 to become User Modeling, Adaptation, and Personalization. Yet another shortcoming is that the MAS list of Web conferences is not complete, for example, omitting the popular ESWC, which has been running since 2004.
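The filtering and ranking steps described above can be sketched in a few lines of Python. This is only an illustration of the procedure: the `Conference` record, the `rank_conferences` function and all of the numbers are made up for the example and are not MAS data.

```python
from dataclasses import dataclass


@dataclass
class Conference:
    name: str
    papers_10y: int      # papers published in the past ten years
    citations_10y: int   # citations received by those papers
    papers_5y: int       # papers published in the past five years


def rank_conferences(confs, min_papers=100):
    """Rank by average citations per paper, keeping only sizable, active series."""
    eligible = [c for c in confs if c.papers_10y >= min_papers and c.papers_5y > 0]
    return sorted(eligible, key=lambda c: c.citations_10y / c.papers_10y, reverse=True)


# Invented numbers for illustration only.
sample = [
    Conference("Conf A", 500, 6000, 250),
    Conference("Conf B", 200, 3000, 90),
    Conference("Conf C", 80, 2000, 40),   # dropped: fewer than 100 papers
    Conference("Conf D", 150, 900, 0),    # dropped: no papers in the past five years
]

ranking = [(c.name, c.citations_10y / c.papers_10y) for c in rank_conferences(sample)]
for name, avg in ranking:
    print(f"{name}: {avg:.1f} citations per paper")
```

Note that every citation counts equally in this average, which is exactly the limitation mentioned above.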
“This is free social network and meeting community open to industry, government and academia. The goal of the organizers is to create a vendor neutral environment for open discussion and provide the membership with a valuable resource of information on industry trends and ongoing research.”
All are welcome. If you want to attend, please join the Central MD Semantic Web Meetup group and RSVP. The meeting will start with a pizza social from 6:00pm to 6:45pm and then continue with a series of short presentations of current Semantic Web research being done in our lab.
The Semantic Web provides the technology and knowledge constructs to create a rich notion of context that goes beyond current networking applications focusing mostly on location. The context model includes location and surroundings, the presence of people and devices, inferred activities and the roles people fill in them.
Evidence for a table’s meaning can be found in its metadata but currently requires human interpretation. We describe techniques grounded in graphical models and probabilistic reasoning to infer meaning associated with a table. Using background knowledge from the Linked Open Data cloud, we automatically infer the semantics of column headers, table cell values (e.g., strings and numbers) and relations between columns and represent the inferred meaning as a graph of RDF triples.
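To make the target representation concrete, here is a toy Python sketch of the last step only, turning already-inferred column classes and a column relation into triples. The table, the `column_class` and `column_relation` values, and the `entity` helper are all invented for illustration; the probabilistic inference that would actually produce them is not shown.

```python
# A toy table; header and rows are invented for illustration.
header = ["City", "State"]
rows = [["Baltimore", "Maryland"], ["Philadelphia", "Pennsylvania"]]

# Hypothetical output of the inference step: a class for each column
# and a relation between the two columns (assumed DBpedia-style terms).
column_class = {"City": "dbo:City", "State": "dbo:AdministrativeRegion"}
column_relation = ("City", "dbo:isPartOf", "State")


def entity(cell):
    """Map a cell string to an assumed DBpedia-style resource name."""
    return "dbr:" + cell.replace(" ", "_")


triples = []
for row in rows:
    cells = dict(zip(header, row))
    # One type assertion per cell, based on its column's inferred class.
    for col, value in cells.items():
        triples.append((entity(value), "rdf:type", column_class[col]))
    # One relation assertion per row, linking the two columns.
    subj_col, prop, obj_col = column_relation
    triples.append((entity(cells[subj_col]), prop, entity(cells[obj_col])))

for t in triples:
    print(" ".join(t), ".")
```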
Users need better ways to explore linked open data collections and obtain information from them. Using SPARQL requires not only mastering its syntax and semantics but also understanding the RDF data model, the ontology used by DBpedia, and URIs for entities of interest. Natural language question answering systems solve the problem, but these are still subjects of research. We are developing a compromise approach in which non-experts specify a graphical “skeleton” for a query and annotate it with freely chosen words, phrases and entity names. The combination reduces ambiguity and allows us to reliably produce an interpretation that can be translated into SPARQL.
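As a rough illustration of what the direct SPARQL route demands, the Python sketch below assembles a simple query for books by a given author. The `build_query` helper is hypothetical, and the prefixes and terms (`dbo:Book`, `dbo:author`, `dbr:Mark_Twain`) are assumptions about DBpedia's vocabulary that a user would have to know in advance. It does not show how the skeleton approach resolves free-text annotations to such terms.

```python
# Prefixes and terms are assumptions about DBpedia's vocabulary, shown for illustration.
PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "dbr": "http://dbpedia.org/resource/",
}


def build_query(var, cls, prop, entity):
    """Assemble a one-pattern SPARQL SELECT query from already-resolved terms."""
    header = "\n".join(f"PREFIX {p}: <{uri}>" for p, uri in PREFIXES.items())
    body = f"SELECT ?{var} WHERE {{\n  ?{var} a {cls} ;\n    {prop} {entity} .\n}}"
    return header + "\n" + body


query = build_query("book", "dbo:Book", "dbo:author", "dbr:Mark_Twain")
print(query)
```

Even for this one-pattern query the user needs the right class, property and entity URI, which is the burden the skeleton-plus-annotations approach is meant to remove.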
We propose a semantically rich, policy-based framework to automate the lifecycle of cloud services. We have divided the IT service lifecycle into the five phases of requirements, discovery, negotiation, composition, and consumption. We detail each phase and describe the high level ontologies that we have developed to describe them. Our research complements previous work on ontologies for service descriptions in that it goes beyond simple matchmaking and is focused on supporting negotiation for the particulars of IT services.
See this map for the building location and information on visitor parking. The recommended lot is just across from the entrance to UMBC’s campus from I-95. To access it, turn right and then turn left at the first stop sign onto Administration Drive. You can park on the lower level after 3:30pm by putting two quarters into the box at the gate. The upper level has parking meters that take quarters ($1/hr) and a change machine is located near the entrance.
If you are in the DC area this weekend and are interested in using Semantic Web technologies, you should come to the AAAI 2011 Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges. It runs from Friday to Sunday midday at the Westin Arlington Gateway in Arlington, Virginia.
Join us to meet the governmental and business thought leaders in US open government data activities and discuss the challenges. The symposium features Friday (Nov 4) as governmental day, with speakers on Data.gov, openEi.org, and open gov data activities in NIH/NCI and NASA, and Saturday (Nov 5) as R&D day, with speakers from industry, including Google and Microsoft, as well as international researchers.
This symposium will explore how AI technologies such as the Semantic Web, information extraction, statistical analysis and machine learning, can be used to make the valuable knowledge embedded in open government data more explicit, accessible and reusable.
Here’s a word cloud that visualizes the 200 most significant words extracted from over 400 papers from our research group over the past ten years. Significance was estimated by tf-idf where the idf data is from a collection of newswire articles (thanks Paul!). The word cloud was created with Wordle.
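For readers curious about the scoring, here is a minimal Python sketch of tf-idf with the idf taken from a separate background collection. The `tfidf_scores` function, the smoothing formula and the tiny example texts are assumptions made for illustration and are not the code or data used for the word cloud.

```python
import math
from collections import Counter


def tfidf_scores(group_docs, background_docs):
    """Score words by term frequency in group_docs times idf from background_docs."""
    tf = Counter(word for doc in group_docs for word in doc.lower().split())
    n = len(background_docs)
    df = Counter()
    for doc in background_docs:
        df.update(set(doc.lower().split()))  # count each word once per document
    # Smoothed idf so words absent from the background collection stay finite.
    return {w: count * math.log((1 + n) / (1 + df[w])) for w, count in tf.items()}


# Invented stand-ins for the group's papers and the newswire collection.
papers = ["the semantic web uses rdf", "the ontology describes the semantic web"]
newswire = ["the market fell today", "the web site crashed", "the election results"]

scores = tfidf_scores(papers, newswire)
top_words = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_words)
```

Words that are common in the background collection, such as "the", get a low idf and drop out, while words frequent in the papers but rare in newswire rise to the top.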
“Today, hospitals and doctors use a system of about 18,000 codes to describe medical services in bills they send to insurers. Apparently, that doesn’t allow for quite enough nuance. A new federally mandated version will expand the number to around 140,000—adding codes that describe precisely what bone was broken, or which artery is receiving a stent. It will also have a code for recording that a patient’s injury occurred in a chicken coop.”
We want to see the search engine companies develop and support a Microdata vocabulary for ICD-10. An ICD-10 OWL DL ontology has already been done, but a Microdata version might add a lot of value. We could use it on our blogs and Facebook posts to catalog those annoying problems we encounter each day, like W59.22XA (Struck by turtle, initial encounter), or Y07.53 (Teacher or instructor, perpetrator of maltreatment and neglect).
Humor aside, a description logic representation (e.g., in OWL) makes the coding system seem less ridiculous. Instead of appearing as a catalog of 140K ground tags, it would emphasize that it is a collection of a much smaller number of classes that can be combined in productive ways to produce them or used to create general descriptions (e.g., bitten by an animal).
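The combinatorial point can be illustrated with a short Python sketch. The class members below are invented for the example and are not actual ICD-10 definitions, and `matches` is a hypothetical helper standing in for what a description logic reasoner would do with class subsumption.

```python
from itertools import product

# Hypothetical class fragments; illustrative only, not actual ICD-10 definitions.
animals = ["turtle", "duck", "parrot"]
interactions = ["bitten by", "struck by"]
encounters = ["initial encounter", "subsequent encounter", "sequela"]

# Every specific tag is a combination of one member from each small class.
ground_tags = [
    f"{i.capitalize()} {a}, {e}"
    for a, i, e in product(animals, interactions, encounters)
]


def matches(tag, interaction=None, animal=None):
    """Return True if a ground tag falls under a more general description."""
    ok_interaction = interaction is None or tag.lower().startswith(interaction)
    ok_animal = animal is None or f" {animal}," in tag
    return ok_interaction and ok_animal


# A general description such as "bitten by an animal" leaves the animal unspecified.
bitten = [t for t in ground_tags if matches(t, interaction="bitten by")]

n_members = len(animals) + len(interactions) + len(encounters)
print(n_members, "class members yield", len(ground_tags), "ground tags")
print(len(bitten), "of them fall under 'bitten by an animal'")
```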
Many Google+ users have been reporting frequent notices about new followers whom they don’t know and who appear to be attractive young women. The suspicious followers have minimal profiles and no posts. These are obviously false accounts being created for some yet unknown purpose, but how can one prove it?
I just got a notice, for example, that Janet Smith of Philadelphia is following me. Now Janet Smith is a common name and Philadelphia is a big place — there are probably hundreds of people who live in the Philadelphia area with that name. The 990 other people she’s following seem like a pretty random bunch, though I do know many and have more than a few in my own circles. Most seem to have a fair number of followers.
So there is not much to go on other than her profile image. This is a great use for Google’s new image search. I dragged the picture into the image search query field and Google identified its best guess for the image as Indian actress Koyel Mullick. Sure enough, if you search for images with her name, the precise Janet Smith image is result number 15.
Of course, there are still some subtle issues. This is just one kind of false profile — one created for one identity but using an image from a different one. It’s common on most social media systems, including G+, for some people to use a picture of someone or something other than themselves. But it’s obvious to a human viewer that using a picture of a rabbit, Marilyn Monroe or the mighty Thor on your profile is not meant to deceive. It will be challenging to automate the process of discriminating the intent to deceive from modesty, homage or an ironic gesture.
Special Issue on Evaluation of Semantic Technologies
Journal of Web Semantics
Semantic technologies have become a well-established field of computer science. However, the field is continuously evolving: the number of semantic technologies is constantly increasing, standards evolve and new ones are defined; and, in this scenario, the problem of how to compare and evaluate the various approaches becomes crucial. The consistent evaluation of semantic technologies is critical not only for future scientific progress, by identifying research goals and allowing a rigorous examination of research results, but also for their industrial adoption, by allowing objective measurement and comparison of these technologies and enabling their certification.
Semantic technology evaluation must, on the one hand, be supported by strong methodological approaches and relevant test data and, on the other hand, satisfy the differing needs of developers, researchers and adopters by addressing those quality characteristics that are relevant to each target group. Nevertheless, numerous issues must be faced when evaluating semantic technologies.
On the one hand, because of the fast evolution of the semantic field, previous evaluation methods and techniques need to be adapted and extended and new ones have to be developed. On the other hand, the cost of defining new evaluation methods or reusing existing ones can be prohibitive, so facilitating the understanding of such methods or their automated processing becomes highly significant.
The goal of this special issue is to present current advances and trends in semantic technology evaluation (theories and models, methods and techniques, evaluation campaigns, technology comparison, etc.). Therefore we solicit papers that improve evaluation paradigms of semantic technologies. At the same time papers that evaluate a particular method, technology or system without investigating the evaluation regime itself will be considered out of scope and will be returned to the authors with no review.
Topics of interest
Relevant topics for the special issue include, but are not limited to, the following.
Semantic technology evaluation methods
Test data for semantic technology evaluation
Automation of semantic technology evaluation
Evaluation of semantic technologies in real world scenarios
Evaluation of linked data technologies
Quality requirements for semantic technologies
Semantic technology certification
Maturity models for semantic technologies
Semantic technology selection
Semantic technology quality estimation
Interoperability and conformance of semantic technologies
Semantic technology efficiency and scalability
Usability of semantic technologies
Important dates
We will aim at an efficient publication cycle in order to guarantee prompt availability of the published results. To this end, we encourage submissions well before the submission deadline.
The special issue on semantic sensing will be edited by Harith Alani, Oscar Corcho and Manfred Hauswirth. Papers will be reviewed on a rolling basis and authors are encouraged to submit before the final deadline of 20 December 2011.
The issue on the semantic and social web will be edited by John Breslin and Meena Nagarajan. Papers will be reviewed on a rolling basis and authors are encouraged to submit before the final deadline of 21 January 2012.
The Mid-Atlantic Student Colloquium on Speech, Language and Learning is a free, one-day event bringing together faculty, researchers and students from universities in the Mid-Atlantic area working in Speech/Language/ML. The colloquium is an opportunity for students to present preliminary or completed work and to network with other students, faculty and researchers working in related fields. The event will be held in Baltimore MD at the Johns Hopkins University on Friday 23 September 2011.
Students are encouraged to submit one-page abstracts by Monday, August 15 describing ongoing, planned, or completed research projects, including previously published results and negative results. Student research in any field applying computational methods to any aspect of human language, including speech and learning, from all areas of computer science, linguistics, engineering, neuroscience, information science, and related fields, is welcome. Submissions and presentations must be made by students or postdocs. See the call for papers for more information.
Accepted submissions will be presented as posters and each will also be given a one-minute presentation during a poster spotlight session. A small number of submissions will be selected to be presented as talks, on the basis of diversity and general interest.
Student-led breakout sessions of one hour will also be held to discuss papers on topics of interest and stimulate interaction and discussion. Topics and suggested papers for breakout sessions should be submitted by students alongside abstracts.
Yahoo!’s Peter Mika is still an RDFa fan, but also has a pragmatic appreciation for the agreement of the big three search companies on a standard for semantic data.
“Given the above history, I’m extremely glad that cooperation prevailed in the end and hopefully schema.org will become a central point for vocabularies for the Semantic Web for a long time to come. Note that it will almost certainly not be the only one. schema.org covers the core interests of search providers, i.e. the stuff that people search for the most (hence the somewhat awkward term ‘search vocabularies’). As the simple needs are the most common in search logs, this includes things like addresses of businesses, reviews and recipes. schema.org will hopefully evolve with extensions over time but it may never cover complex domains such as biotechnology, e-government or others where people have been using Semantic Web technology with success.”