As a member of the Powerlabs community (sign up here if you haven’t already), I had the privilege last night of being part of a select audience that was shown an exclusive preview of Powerset’s natural language search engine.
What they have achieved is truly amazing! With some of the parsing and Natural Language Processing (NLP) technology licensed from PARC, Powerset can semantically process not only queries but also entire documents (and, in fact, the whole Web). What this means is that, unlike purely statistical approaches to building search indices, Powerset believes computational linguistics and NLP can add richer semantics to the text. By treating words purely as literals, most search engines cannot “understand” their “meaning”. In contrast, Powerset’s approach is to identify parts of speech, named entities and relations, and to add “facts” to a repository of knowledge by mapping words to their meanings in an ontology (using Freebase, WordNet and other knowledge resources). Consider the following queries that were demonstrated:
– which companies were acquired by Peoplesoft
– Which company acquired Peoplesoft?
– Acquisitions in 2001
Note that queries don’t have to be phrased as questions: this is a confusion that most people seem to have about NLP-based search vs. Question Answering. Traditional search engines have a hard time disambiguating the semantics of such queries, since they ignore terms like “in”, “by”, etc. They also cannot map phrases like “bought over”, “taken over”, etc. to their shared conceptual meaning of “being acquired”, not to mention the difference between “being acquired” and “acquiring”. Based on our group’s experience with language technology research and SemNews, we can really appreciate the complexity and significance of these tasks. It’s a really hard problem, and Powerset is counting on Moore’s law to make it feasible to semantically annotate and index large collections.
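To make the “acquired” vs. “acquired by” distinction concrete, here is a toy sketch in Python. It is emphatically not how Powerset works (I have no visibility into their pipeline); the phrase table, the regular expressions and the function names are all my own assumptions for illustration. The idea is simply that different surface phrases get normalized to one relation, and that passive voice flips who did the acquiring.

```python
import re

# Hypothetical phrase table: several surface phrases, one canonical relation.
ACQUIRE_PHRASES = ["acquired", "bought over", "taken over", "took over", "bought"]

# Longest phrases first so "bought over" wins over "bought".
_VERBS = "|".join(
    re.escape(p) for p in sorted(ACQUIRE_PHRASES, key=len, reverse=True)
)
_PASSIVE = re.compile(
    rf"^(?P<target>.+?) (?:was|were|got) (?:{_VERBS}) by (?P<acquirer>.+?)\.?$"
)
_ACTIVE = re.compile(rf"^(?P<acquirer>.+?) (?:{_VERBS}) (?P<target>.+?)\.?$")


def extract_fact(sentence):
    """Return an (acquirer, "acquire", target) triple, or None if nothing matches."""
    # Check the passive pattern first: "X was acquired by Y" means Y acquired X.
    m = _PASSIVE.match(sentence)
    if m is None:
        m = _ACTIVE.match(sentence)
    if m is None:
        return None
    return (m["acquirer"], "acquire", m["target"])


def companies_acquired_by(facts, company):
    """Answer 'which companies were acquired by <company>'."""
    return [target for acquirer, _, target in facts if acquirer == company]


def acquirers_of(facts, company):
    """Answer 'which company acquired <company>'."""
    return [acquirer for acquirer, _, target in facts if target == company]


sentences = [
    "Peoplesoft acquired JD Edwards.",
    "Peoplesoft was taken over by Oracle.",
]
facts = [f for f in map(extract_fact, sentences) if f is not None]
```

With these two sentences, `companies_acquired_by(facts, "Peoplesoft")` and `acquirers_of(facts, "Peoplesoft")` return different companies, which is exactly the distinction a bag-of-words index throws away when it drops “by”. A real system would of course need a proper parser and an ontology rather than a handful of regexes.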
Powerset is also introducing Powerlabs, where its community of users can provide feedback and ideas and actually participate (Digg style) in the product’s development (and earn points to become experts on a topic). This would be a fresh change from the typical launch strategies of most startups these days. Surely there will also be APIs, widgets and new mashups coming out of this.
Overall, I was pretty impressed and think that if Powerset gets it right, this would be a huge leap for search. However, one challenge will be retraining users in how they think about queries. Powerset is trying to overcome this with new search interfaces and, interestingly, will let the user community on Powerlabs decide what they like and dislike: a social media approach to product launch!