UMBC ebiquity
Semantic Web

Archive for the 'Semantic Web' Category

AAAI Symposium on Open Government Knowledge deadline extended

June 3rd, 2011, by Tim Finin, posted in Semantic Web

The submission deadline for OGK2011 has been extended to 17 June 2011.

AAAI 2011 Fall Symposium
Open Government Knowledge: AI Opportunities and Challenges

4-6 November 2011 • Arlington, Virginia USA
http://tw.rpi.edu/ogk2011

The 2011 AAAI Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges seeks papers on all aspects of publishing public government data as reusable knowledge on the Web. Both long papers presenting research results and shorter papers describing late breaking work, outlining implemented systems, identifying new research challenges, or articulating a position are invited. Submissions are due by June 17, notifications will be sent by July 15, and the final camera-ready copy must be provided by September 9, 2011.

Microdata chosen over RDFa for semantics by Google, Bing and Yahoo!

June 2nd, 2011, by Tim Finin, posted in RDF, Search, Semantic Web, Web

Google, Bing and Yahoo! are cooperating on an approach to representing structured data in Web pages via the launch of schema.org. The approach is microdata and the schema.org site documents the schemas that are supported today.

“This site provides a collection of schemas, i.e., html tags, that webmasters can use to markup their pages in ways recognized by major search providers. Search engines including Bing, Google and Yahoo! rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, Bing, Google and Yahoo! have come together to provide a shared collection of schemas that webmasters can use.”

That’s the good news. The bad news, or at least the less good news, is that it is based on microdata and not RDFa. Microdata is a relatively new way to embed semantic information in HTML and is designed to be part of the HTML5 suite. It is less expressive than RDFa but also simpler. Its main advantage over microformats is that it is extensible: you can define new semantic vocabulary terms. Here is how the three companies described the choice.

Google: “Historically, we’ve supported three different standards for structured data markup: microdata, microformats, and RDFa. We’ve decided to focus on just one format for schema.org to create a simpler story for webmasters and to improve consistency across search engines relying on the data.”

Yahoo!: “Today’s announcement offers tremendous opportunity for growth. In addition to consolidating the schemas for the vocabularies we already support, there are schemas for more than a hundred newly created categories including movies, music, organizations, TV shows, products, places and more. We will continue to expand these categories by listening to feedback from the community and will continue publishing new schemas on a regular basis. Don’t worry if your site has already added RDFa or microformats currently supported by our Enhanced Displays program, that site will still appear with an Enhanced Display on Yahoo! – no changes required.”

Bing: “At Bing we understand the significant investment required to implement markup, and feel strongly that by partnering with Google and Yahoo! on standard schemas webmasters can be more efficient with the time they invest… Bing accepts a wide variety of markup formats today (Open Graph, microformat, etc.) for features like Tiles and will continue to do so, but by standardizing on schema.org we are looking to simplify the markup choices for webmasters and amplify the value they receive in return.”

The schema.org site has a FAQ that includes the question “Q: Why microdata? Why not RDFa or microformats?” which is answered thusly:

“Focusing on microdata was a pragmatic decision. Supporting multiple syntaxes makes documentation for webmasters more complex and introduces more overhead in terms of defining new formats. Microformats are concise and easy to understand, but they don’t offer an open extensibility mechanism and the reuse of the class tag can cause conflicts with website CSS. RDFa is extensible and very expressive, but the substantial complexity of the language has contributed to slower adoption. Microdata is the most recent well-known standard, created along with HTML5. It strikes a balance between extensibility and simplicity, and is most suitable for building the schema.org. Google and Yahoo! have in the past supported both microformats and RDFa for certain schemas and will continue to support these syntaxes for those schemas. We will also be monitoring the web for RDFa and microformats adoption and if they pick up, we will look into supporting these syntaxes. Also read the section on the data model for more on RDFa.”

Guha has a generous comment in his post on the official Google blog:

“While this collaborative initiative is new, we draw heavily from the decades of work in the database and knowledge representation communities, from projects such as Jim Gray’s SDSS Skyserver, Cyc and from ongoing efforts such as dbpedia.org and linked data. We feel privileged to build upon this great work. We look forward to seeing structured markup continue to grow on the web, powering richer search results and new kinds of applications.”

I’ve not studied microdata yet, so I don’t know how I feel about the expressiveness/simplicity tradeoffs it has made. I wonder if it is possible to add an OWL-like layer on top of microdata, for example.
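To give a rough feel for what microdata looks like to a program consuming it, here is a minimal Python sketch that pulls the item type and its properties out of a small HTML fragment. The fragment is a made-up example in the style of the schema.org documentation, and the class and function names (MicrodataSketch, extract_item) are hypothetical. It handles only a single, non-nested item and is not a conforming microdata parser.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, used only for illustration.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <span itemprop="genre">Science fiction</span>
  <a itemprop="trailer" href="/trailers/avatar.html">Trailer</a>
</div>
"""


class MicrodataSketch(HTMLParser):
    """Collects itemprop values from one flat item (no nested items)."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._pending = None  # itemprop whose text content is still to come

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a boolean attribute, so only its presence matters.
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "href" in attrs:
            # For links, the property value is the URL, not the link text.
            self.properties[prop] = attrs["href"]
            self._pending = None
        else:
            self._pending = prop

    def handle_data(self, data):
        # Use the first non-blank text after an itemprop tag as its value.
        if self._pending and data.strip():
            self.properties[self._pending] = data.strip()
            self._pending = None


def extract_item(html):
    """Return (item type, {property: value}) for the fragment."""
    parser = MicrodataSketch()
    parser.feed(html)
    return parser.item_type, parser.properties


if __name__ == "__main__":
    print(extract_item(SAMPLE))
```

The point of the sketch is that an item is essentially a typed bag of name/value pairs, which is where the simplicity comes from. Anything like class hierarchies or property constraints would have to be layered on separately.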

AAAI Symposium on Open Government Knowledge

May 15th, 2011, by Tim Finin, posted in AI, Semantic Web

The 2011 AAAI Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges (OGK2011) seeks papers on all aspects of publishing public government data as reusable knowledge on the Web. Both long papers presenting research results and shorter papers describing late breaking work, outlining implemented systems, identifying new research challenges, or articulating a position are invited. Submissions are due by June 3, notifications will be sent by July 15, and the final camera-ready copy must be provided by September 9 for the November 4-6 workshop.

Relevant topics include the automatic and semi-automatic creation of linked data resources, ontologies for government data, entity linking and co-reference detection between linked data resources, adding temporal qualifications to government data, creating mash-ups with open government data, linked open government data analysis, metadata for provenance, certainty and trust, policies for information sharing, privacy and use, social networks and government data, machine learning applied to government data, data visualization techniques, and applications. The symposium organizers are Li Ding (RPI), Tim Finin (UMBC), Lalana Kagal (MIT) and Deborah McGuinness (RPI). Program committee members and additional information are listed on the OGK2011 symposium site.

New Journal of Web Semantics preprint server

April 12th, 2011, by Tim Finin, posted in AI, Semantic Web

The new Journal of Web Semantics preprint server is now online. Final drafts of accepted papers will be added to the preprint server as papers are accepted for publication, making a preprint available as soon as possible.

We are loading papers from back issues into the preprint server as time permits. The preprint server is based on the Open Journal Systems software and hosted by Gesis, the Leibniz Institute for the Social Sciences.

After drafts are on the preprint server, they enter Elsevier’s production pipeline in which they are professionally copy edited, formatted for the journal, and proofed by the authors. The result is assigned a DOI and put online as a JWS article in press available to individual and institutional subscribers. When the article is assigned to an issue and printed, the final copy will be available online to subscribers in Elsevier’s ScienceDirect system.

We would like to thank the people who helped stand up the new preprint server, including Ute Koch of Gesis, Kaixuan Wang of the University of Manchester, and Silke Werger of the University of Koblenz and Landau.

Open Government Knowledge: AI Opportunities and Challenges (OGK2011)

March 29th, 2011, by Tim Finin, posted in AI, Semantic Web, Web

The 2011 AAAI Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges (OGK2011) seeks papers on all aspects of publishing public government data as reusable knowledge on the Web. Both long papers presenting research results and shorter papers describing late breaking work, outlining implemented systems, identifying new research challenges, or articulating a position are invited. Submissions are due by June 3, notifications will be sent by July 15, and the final camera-ready copy must be provided by September 9.

Websites like data.gov, research.gov and USASpending.gov aim to improve government transparency, increase accountability, and encourage public participation by publishing public government data online. Although this data has been used for some intriguing applications, it is difficult for citizens to understand and use. This symposium will explore how AI technologies such as the Semantic Web, information extraction, statistical analysis and machine learning can be used to make the knowledge embedded in the data more explicit, accessible and reusable. The symposium’s location in the Washington, DC area will facilitate the participation of U.S. federal government agency members and enable interchange between researchers and practitioners. We also expect international open government data players to attend, e.g., from the UK and Australia.

Relevant topics include the automatic and semi-automatic creation of linked data resources, ontologies for government data, entity linking and co-reference detection between linked data resources, adding temporal qualifications to government data, creating mash-ups with open government data, linked open government data analysis, metadata for provenance, certainty and trust, policies for information sharing, privacy and use, social networks and government data, machine learning applied to government data, data visualization techniques, and applications.

This symposium will include a mix of invited talks, paper presentations, panels, system demonstrations, a poster session, and discussions. We plan to have several invited speakers drawn from government, academia and industry. We will run panels on the emerging challenges and best practices, including (i) how to enhance transparency and interoperability within an agency and across different agencies/countries, and (ii) how to promote a nationwide health information network that effectively integrates government-curated public records and citizens’ personal health data.

The symposium organizers are Li Ding (RPI), Tim Finin (UMBC), Lalana Kagal (MIT) and Deborah McGuinness (RPI). Program committee members and additional information are listed on the OGK2011 symposium site. For more information about the symposium, send email inquiries to ogk11-info@googlegroups.com.

Important Dates

  • Workshop: 4-6 November 2011 in Arlington, Virginia USA
  • Submissions due: 3 June 2011
  • Decisions by: 15 July 2011
  • Camera ready by: 9 September 2011

AAAI-11 Workshop on Activity Context Representation: Techniques and Languages

March 14th, 2011, by Tim Finin, posted in Agents, AI, KR, Mobile Computing, Pervasive Computing, Semantic Web

Mobile devices can provide better services if they can model, recognize and adapt to their users' context.

Pervasive, context-aware computing technologies can significantly enhance and improve the coming generation of devices and applications for consumer electronics as well as devices for work places, schools and hospitals. Context-aware cognitive support requires activity and context information to be captured, reasoned with and shared across devices — efficiently, securely, adhering to privacy policies, and with multidevice interoperability.

The AAAI-11 conference will host a two-day workshop on Activity Context Representation: Techniques and Languages focused on techniques and systems to allow mobile devices to model and recognize the activities and context of people and groups and then exploit those models to provide better services. The workshop will be held on August 7th and 8th in San Francisco as part of AAAI-11, the Twenty-Fifth Conference on Artificial Intelligence. Submissions of research papers and position statements are due by 22 April 2011.

The workshop intends to lay the groundwork for techniques to represent context within activity models using a synthesis of HCI/CSCW and AI approaches to reduce demands on people, such as the cognitive load inherent in activity/context switching, and to enhance human and device performance. It will explore activity and context modeling issues of capture, representation, standardization and interoperability for creating context-aware and activity-based assistive cognition tools with topics including, but not limited to, the following:

  • Activity modeling, representation, detection
  • Context representation within activities
  • Semantic activity reasoning, search
  • Security and privacy
  • Information integration from multiple sources, ontologies
  • Context capture

There are three intended end results of the workshop: (1) Develop two or three key themes for research with specific opportunities for collaborative work. (2) Create a core research group forming an international academic and industrial consortium to significantly augment existing standards/drafts/proposals and create fresh initiatives to enable capture, transfer, and recall of activity context across multiple devices and platforms used by people individually and collectively. (3) Review and revise an initial draft of the structure of an activity context exchange language (ACEL), including identification of use cases, domain-specific instantiations needed, and drafts of initial reasoning schemes and algorithms.

For more information, see the workshop call for papers.

Journal of Web Semantics special issues on context and mobility

March 6th, 2011, by Tim Finin, posted in Mobile Computing, Semantic Web

The Journal of Web Semantics has announced two new special issues to be published in 2012.

An issue on Reasoning with context in the Semantic Web seeks papers by June 15, 2011 and will be published in the Spring of 2012. The special issue will be edited by Alan Bundy and Jos Lehmann of the University of Edinburgh and Ivan Varzinczak of the Meraka Institute.

An issue on The Semantic Web in a Mobile World will accept submissions until October 1, 2011 and will be published in September 2012. The special issue will be edited by Ansgar Scherp of the University of Koblenz-Landau and Anupam Joshi of the University of Maryland, Baltimore County.

Free linked data book by Tom Heath and Chris Bizer

March 2nd, 2011, by Tim Finin, posted in AI, Semantic Web, Web

Congratulations to Tom Heath and Christian Bizer on the publication of their new book, Linked Data: Evolving the Web into a Global Data Space. It’s published by Morgan & Claypool in the series Synthesis Lectures on the Semantic Web: Theory and Technology edited by Jim Hendler and Frank van Harmelen.

“This book provides a conceptual and technical introduction to the field of Linked Data. It is intended for anyone who cares about data – using it, managing it, sharing it, interacting with it – and is passionate about the Web. We think this will include data geeks, managers and owners of data sets, system implementors and Web developers. We hope that students and teachers of information management and computer science will find the book a suitable reference point for courses that explore topics in Web development and data management. Established practitioners of Linked Data will find in this book a distillation of much of their knowledge and experience, and a reference work that can bring this to all those who follow in their footsteps.”

More importantly, we should all thank them and Morgan & Claypool for making a free HTML version available on the Web.

Google recipe search exploits semantic web data in RDFa

February 26th, 2011, by Tim Finin, posted in AI, Google, Semantic Web

Many people now use the Web to find recipes rather than their own collection of cookbooks, and it is estimated that about one percent of all Google searches are for recipes. This past Thursday, Google released Recipe View in the US, letting you limit results to pages that are recipes and further narrow your search by ingredients, cooking time and calories. This feature is powered by semantic metadata encoded in RDFa and other formats.

Google describes the new recipe search in a post on the Official Google Blog:

“Recipe View lets you narrow your search results to show only recipes, and helps you choose the right recipe amongst the search results by showing clearly marked ratings, ingredients and pictures. To get to Recipe View, click on the “Recipes” link in the left-hand panel when searching for a recipe. You can search for specific recipes like [chocolate chip cookies], or more open-ended topics—like [strawberry] to find recipes that feature strawberries, or even a holiday or event, like [cinco de mayo]. In fact, you can try searching for all kinds of things and still find interesting results: a favorite chef like [ina garten], something very specific like [spicy vegetarian curry with coconut and tofu] or even something obscure like [strange salad].”

Recipe View extracts data embedded in Web pages that is encoded in Google’s rich snippets format. This includes both the W3C Semantic Web standard RDFa as well as microformats. Google recognizes a simple recipe vocabulary with fourteen properties.
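As a rough sketch of how this kind of narrowing can work once recipe data is embedded in a page, the Python below reads RDFa-style typeof and property attributes and filters recipes by ingredient and cooking time. The ex: vocabulary, the sample page and the names RecipeSketch and find_recipes are all invented for illustration. This is not Google's recipe vocabulary or implementation, and the code is far from a full RDFa processor.

```python
from html.parser import HTMLParser

# Made-up page using a hypothetical "ex:" vocabulary, for illustration only.
PAGE = """
<div xmlns:ex="http://example.org/recipe#" typeof="ex:Recipe">
  <h2 property="ex:name">Strawberry salad</h2>
  <span property="ex:ingredient">strawberries</span>
  <span property="ex:ingredient">spinach</span>
  <span property="ex:cookMinutes" content="10">10 min</span>
</div>
<div xmlns:ex="http://example.org/recipe#" typeof="ex:Recipe">
  <h2 property="ex:name">Chocolate chip cookies</h2>
  <span property="ex:ingredient">flour</span>
  <span property="ex:ingredient">chocolate chips</span>
  <span property="ex:cookMinutes" content="25">25 min</span>
</div>
"""


class RecipeSketch(HTMLParser):
    """Groups property values under the most recent ex:Recipe element."""

    def __init__(self):
        super().__init__()
        self.recipes = []      # one {property: [values]} dict per recipe
        self._pending = None   # property whose text content is still to come

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("typeof") == "ex:Recipe":
            self.recipes.append({})
        prop = attrs.get("property")
        if prop is None or not self.recipes:
            return
        if "content" in attrs:
            # A content attribute overrides the human-readable text.
            self._add(prop, attrs["content"])
            self._pending = None
        else:
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self._add(self._pending, data.strip())
            self._pending = None

    def _add(self, prop, value):
        self.recipes[-1].setdefault(prop, []).append(value)


def find_recipes(html, ingredient, max_minutes):
    """Names of recipes using the ingredient within the time limit.

    Assumes every recipe in the page states ex:name and ex:cookMinutes.
    """
    parser = RecipeSketch()
    parser.feed(html)
    return [
        recipe["ex:name"][0]
        for recipe in parser.recipes
        if ingredient in recipe.get("ex:ingredient", [])
        and int(recipe["ex:cookMinutes"][0]) <= max_minutes
    ]


if __name__ == "__main__":
    print(find_recipes(PAGE, "strawberries", 15))
```

Because the ingredients and times are explicit in the markup, the filter is a simple comparison rather than a guess based on the page text.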

This is a great example of the potential of semantic web technology that can be understood and appreciated by anyone with an interest in cooking. Or eating.

ICWSM 2011 Data Challenge with 3TB of social media data

February 23rd, 2011, by Tim Finin, posted in Datamining, NLP, Semantic Web, Social media

The Fifth International AAAI Conference on Weblogs and Social Media is holding a new data challenge using a new dataset that includes about three TB of social media data collected by Spinn3r between January 13 and February 14, 2011.

The dataset consists of over 386M blog posts, news articles, classifieds, forum posts and social media content collected over a month that included events such as the Tunisian revolution and the Egyptian protests. The content includes the syndicated text, its original HTML as found on the web, annotations and metadata (e.g., author information, time of publication and source URL), and boilerplate/chrome extracted content. The data is formatted as Spinn3r’s protostreams, an extension to Google protobuffers. It is also broken down by date, content type and language, making it easy to work with selected data.

See the ICWSM Data Challenge pages for more information on the challenge task, its associated ICWSM workshop and procedures for data access.

Did Watson enjoy a head start on Jeopardy?

February 22nd, 2011, by Tim Finin, posted in AI, Machine Learning, Semantic Web

The performance of IBM’s Watson in last week’s Jeopardy Challenge was an amazing accomplishment and a demonstration of how our computer systems are becoming more intelligent and capable of solving difficult tasks.

But I wonder if the way that questions were given to the human players and Watson doesn’t give Watson a short, but significant head start. According to the New York Times:

“During the sparring matches, Watson received the questions as electronic texts at the same moment they were made visible to the human players;”

Once Watson received a query, it could process it immediately. While the human contestants got to see the query as written text at the same time, Alex Trebek also started reading the question aloud. When I was watching Jeopardy, I found it almost impossible to read and understand the question more quickly than it was being spoken, and I suspect that Ken Jennings and Brad Rutter might have as well. It’s often observed that people find it very difficult to simultaneously process two language streams. While it took Trebek only a second or two to read the short Jeopardy queries, that could have given Watson a significant head start, enabling it to determine that it had a good answer and press its buzzer before the competition.

If this is the case, I am not sure if it is an unfair advantage. People and computers each have native advantages and disadvantages. If Jennings and Rutter got the questions as text without them being simultaneously read aloud, Watson might still have had the advantage of a quicker start.

Computer Science publication culture

February 14th, 2011, by Tim Finin, posted in Computing Research, CS, Semantic Web

There has been an ongoing discussion on the publication culture within the computer science research community in CACM, carried out through a series of editorials, opinion pieces, articles and letters. It covers the usual topics: the best role of workshops, conferences and journals, reviewer responsibility, the effect of deadlines on publications, etc. All important issues.

Jonathan Grudin has an opinion piece in the current (Feb) CACM:

Technology, conferences, and community. J. Grudin, 2011. Comm. of the ACM, 54, 2, 41-43.

He has also made available a list of the 16 recent CACM articles (with links) on the topic. It’s a list of papers worth reading.
