Entity Disambiguation in Google Auto-complete

September 23rd, 2012

Google has added an “entity disambiguation” feature along with auto-complete when you type in your search query. For example, when I search for George Bush, I get the following additional information in auto-complete.

As you can see, Google is able to identify that there are two George Bushes’ — the 41st and the 43rd President and accordingly makes a suggestion to the user to select the appropriate president. Similarly, if you search for Johns Hopkins, you get suggestions for John Hopkins – the University, the Entrepreneur and the Hospital.  In the case of the Hopkins query, its the same entity name but with different types and thus Google appends different entity types along with the entity name.

However, searching for Michael Jordan produces no entity disambiguation. If you are looking for Michael Jordan, the UC Berkeley professor, you will have to search for “Michael I Jordan“. Other examples that Google is not handling right now include queries such as apple — {fruit, company}, jaguar {animal, car}.  It seems to me that Google is only including disambiguation between popular entities in its auto-complete. While there are six different George Bushes’ and ten different Michael Jordans‘ on Wikipedia, Google includes only two and none respectively when it disambiguates George Bush and Michael Jordan.

Google talked about using its knowledge graph to produce this information.  One can envision the knowledge graph maintaining, a unique identity for each entity in its collection, which will allow it to disambiguate entities with similar names (in the Semantic Web world, we call it as assigning a unique uri to each unique thing or entity). With the Hopkins query, we can also see that the knowledge graph is maintaining entity type information along with each entity (e.g. Person, City, University, Sports Team etc).  While folks at Google have tried to steer clear of the Semantic Web, one can draw parallels between the underlying principles on the Semantic Web and the ones used in constructing the Google knowledge graph.

SMOOTH: an efficient method for probabilistic knowledge integration

October 12th, 2008

In this week’s ebiquity meeting (10:30am Tue Oct 14), PhD student Shenyong Zhang will present his recent work with Yun Peng on SMOOTY, a new efficient method for modifying a joint probability distribution to satisfy a set of inconsistent constraints. It extends the well-known “iterative proportional fitting procedure” (IPFP) which only works with consistent constraints. Compared to existing methods, SMOOTH is computationally more efficient and insensitive to data. Moreover, SMOOTH can be easily integrated with Bayesian networks for Bayesian reasoning with inconsistent constraints. A paper on this work, An Efficient Method for Probabilistic Knowledge Integration will apear in the proceedings of The 20th IEEE International Conference on Tools with Artificial Intelligence next month.