Robot discovers ancient graffiti

June 12th, 2011

Back in May, it was reported that a robot explorer sent through the Great Pyramid of Giza had discovered hieroglyphs behind one of the mysterious doors in the 4,500-year-old mausoleum. The images transmitted by the robot showed hieroglyphs written in red paint that had not been seen by human eyes since the construction of the pyramid.

This week, the reports are that the three red ochre figures painted on the floor of a hidden chamber at the end of a tunnel deep inside the pyramid are just numbers. The builders of the pyramid simply recorded the total length of the southern shaft from the Queen’s Chamber: 121 cubits.

While not exactly graffiti, it reminds me that when I’ve worked on an older house, I’ve often found notes left by the original workers who built it, like sketches with dimensions on the plaster covered up by wallpaper.

Microdata and RDFa

June 3rd, 2011

The Semantic Web community is still unsure what to think of the schema.org microdata.

The schema.rdfs.org site provides static RDFS documents of the schema.org terms, serialized in Turtle, RDF/XML and N-Triples as well as in JSON.

Mike Bergman argues that the microdata effort will also boost RDF.

Yahoo!’s Peter Mika is still an RDFa fan, but also has a pragmatic appreciation for the agreement of the big three search companies on a standard for semantic data.

“Given the above history, I’m extremely glad that cooperation prevailed in the end and hopefully schema.org will become a central point for vocabularies for the Semantic Web for a long time to come. Note that it will almost certainly not be the only one. schema.org covers the core interests of search providers, i.e. the stuff that people search for the most (hence the somewhat awkward term ‘search vocabularies’). As the simple needs are the most common in search logs, this includes things like addresses of businesses, reviews and recipes. schema.org will hopefully evolve with extensions over time but it may never cover complex domains such as biotechnology, e-government or others where people have been using Semantic Web technology with success.”

AAAI Symposium on Open Government Knowledge deadline extended

June 3rd, 2011

The submission deadline for OGK2011 has been extended to 17 June 2011.

AAAI 2011 Fall Symposium
Open Government Knowledge: AI Opportunities and Challenges

4-6 November 2011 • Arlington, Virginia USA

The 2011 AAAI Fall Symposium on Open Government Knowledge: AI Opportunities and Challenges seeks papers on all aspects of publishing public government data as reusable knowledge on the Web. Both long papers presenting research results and shorter papers describing late breaking work, outlining implemented systems, identifying new research challenges, or articulating a position are invited. Submissions are due by June 17, notifications will be sent by July 15, and the final camera-ready copy must be provided by September 9, 2011.

Microdata chosen over RDFa for semantics by Google, Bing and Yahoo!

June 2nd, 2011

Google, Bing and Yahoo! are cooperating on an approach to representing structured data in Web pages via the launch of schema.org. The approach is microdata, and the site documents the schemas that are supported today.

“This site provides a collection of schemas, i.e., html tags, that webmasters can use to markup their pages in ways recognized by major search providers. Search engines including Bing, Google and Yahoo! rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, Bing, Google and Yahoo! have come together to provide a shared collection of schemas that webmasters can use.”

That’s the good news. The bad news, or at least the less good news, is that it is based on microdata and not RDFa. Microdata is a relatively new way to embed semantic information in HTML and is designed to be part of the HTML5 suite. It is less expressive than RDFa but also simpler. Its main advantage over microformats is that it is extensible: you can define new semantic vocabulary terms. Here is how the three companies described the choice.

Google: “Historically, we’ve supported three different standards for structured data markup: microdata, microformats, and RDFa. We’ve decided to focus on just one format for schema.org to create a simpler story for webmasters and to improve consistency across search engines relying on the data.”

Yahoo!: “Today’s announcement offers tremendous opportunity for growth. In addition to consolidating the schemas for the vocabularies we already support, there are schemas for more than a hundred newly created categories including movies, music, organizations, TV shows, products, places and more. We will continue to expand these categories by listening to feedback from the community and will continue publishing new schemas on a regular basis. Don’t worry if your site has already added RDFa or microformats currently supported by our Enhanced Displays program, that site will still appear with an Enhanced Display on Yahoo! – no changes required.”

Bing: “At Bing we understand the significant investment required to implement markup, and feel strongly that by partnering with Google and Yahoo! on standard schemas webmasters can be more efficient with the time they invest… Bing accepts a wide variety of markup formats today (Open Graph, microformat, etc.) for features like Tiles and will continue to do so, but by standardizing on schema.org we are looking to simplify the markup choices for webmasters and amplify the value they receive in return.”

The site has a FAQ that includes the question “Q: Why microdata? Why not RDFa or microformats?” which is answered thusly:

“Focusing on microdata was a pragmatic decision. Supporting multiple syntaxes makes documentation for webmasters more complex and introduces more overhead in terms of defining new formats. Microformats are concise and easy to understand, but they don’t offer an open extensibility mechanism and the reuse of the class tag can cause conflicts with website CSS. RDFa is extensible and very expressive, but the substantial complexity of the language has contributed to slower adoption. Microdata is the most recent well-known standard, created along with HTML5. It strikes a balance between extensibility and simplicity, and is most suitable for building the schema.org. Google and Yahoo! have in the past supported both microformats and RDFa for certain schemas and will continue to support these syntaxes for those schemas. We will also be monitoring the web for RDFa and microformats adoption and if they pick up, we will look into supporting these syntaxes. Also read the section on the data model for more on RDFa.”
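
To give a rough sense of what the microdata approach looks like in practice, here is a minimal sketch in Python. The HTML fragment is a made-up sample that uses the standard microdata attributes (itemscope, itemtype, itemprop), and the MicrodataSketch class and extract_items function are hypothetical names introduced only for this illustration. The sketch handles flat, non-nested items whose property values are plain element text, so it is far from a complete microdata parser.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with microdata attributes.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <span itemprop="genre">Science fiction</span>
</div>
"""


class MicrodataSketch(HTMLParser):
    """Collects itemprop text for flat (non-nested) items only."""

    def __init__(self):
        super().__init__()
        self.items = []     # one dict per itemscope element seen
        self._prop = None   # itemprop name whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            # A new item starts; itemtype names its vocabulary term.
            self.items.append({"type": attrs.get("itemtype"), "properties": {}})
        elif "itemprop" in attrs and self.items:
            self._prop = attrs["itemprop"]

    def handle_data(self, data):
        # Store the first non-blank text found inside an itemprop element.
        if self._prop and data.strip():
            self.items[-1]["properties"][self._prop] = data.strip()
            self._prop = None

    def handle_endtag(self, tag):
        self._prop = None


def extract_items(html):
    """Return the list of items found in an HTML string."""
    parser = MicrodataSketch()
    parser.feed(html)
    return parser.items


if __name__ == "__main__":
    for item in extract_items(SAMPLE):
        print(item["type"], item["properties"])
```

The point of the sketch is the extensibility mentioned above: the itemtype value is just a URL, so a site could point it at a vocabulary term of its own and the same extraction logic would still apply.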

Guha has a generous comment in his post on the official Google blog:

“While this collaborative initiative is new, we draw heavily from the decades of work in the database and knowledge representation communities, from projects such as Jim Gray’s SDSS Skyserver, Cyc and from ongoing efforts such as dbpedia.org and linked data. We feel privileged to build upon this great work. We look forward to seeing structured markup continue to grow on the web, powering richer search results and new kinds of applications.”

I’ve not studied microdata yet, so I don’t know how I feel about the expressiveness/simplicity tradeoffs it has made. I wonder if it is possible to add an OWL-like layer on top of microdata, for example.