Google and the Semantic Web

July 23rd, 2006

Tim Berners-Lee’s keynote talk at AAAI last Tuesday generated a lot of interest in the Semantic Web. The next day, many people visited our demonstrations of several Semantic Web-related projects, including Spire, Swoogle and Semnews, to find out more. Based on my conversations with people at the demonstrations and more generally at the conference, I am surprised at how many of the AAAI attendees knew relatively little about the Semantic Web.

Of course, many articles seized on the questions that Peter Norvig raised at the end of Tim’s talk. Editors, not reporters, write headlines, and many tried to frame the story as “Google challenges Web inventor”. The funniest post I saw referred to Peter as a “Google suit”. While he is a senior executive at Google, has anyone ever seen him wearing a suit? His normal dress is a Hawaiian shirt, which is what he wore at AAAI.

Peter’s questions to Tim were reasonable ones from the perspective of a company with an established Web business. They are also easily answered. Unfortunately, there wasn’t enough time after the keynote talk for any discussion. My own answers to Peter’s three questions would have been along these lines.

  1. Yes, the technologies needed to support the Semantic Web are complex and new to most of us. Some aspects (e.g., parts of OWL) may be too much for the near term. However, most of the technology is no more complex than that which supports the current Web, e.g., relational databases, Web servers with PHP or servlets, and Web clients with applets and JavaScript. As the software matures and people become more familiar with the systems, it will become easy to manage.
  2. There is always a struggle to overcome proprietary resistance and get new standards adopted. I rather liked Tim’s answer to this: that the resistance will erode bit by bit as one competitor after another gives up pieces of its own proprietary stance. He gave some good examples from the evolution of data sharing on the Web in the 90s.
  3. Google already uses largely automated techniques to identify and deal with Web spam, email spam in Gmail, click fraud, etc. We won’t begin by using completely automated techniques to process and make decisions based on data found on the Semantic Web, and we will be able to develop partly automated systems to decide what data can and should be trusted and by how much.

Each of these areas requires research and exploration, and that work is going on in the Semantic Web community and, I suspect, within Google, in one form or another.