AskWiki uses Wikipedia for semantic search

Tim Finin, 2:36pm 3 November 2007

AskWiki is a service that answers simple questions with passages from Wikipedia. It is the product of a collaboration between the Wikimedia Foundation and AskMeNow, a company that uses NLP technology to answer simple questions sent via mobile phones. AskWiki classifies what it is doing as a kind of ‘semantic search’, which it defines (citing Wikipedia) as:

“Semantic Search attempts to augment and improve traditional Research Searches by leveraging XML data from semantic networks to disambiguate semantic search queries and web text in order to increase relevancy of results.” (link)

After playing with AskWiki a bit, I was impressed with many of its answers and disappointed with others, which is about par for the course with this kind of system. But probing AskWiki’s performance and trying to reverse engineer what it is doing is a lot of fun. Mostly the system appears to work by selecting the one sentence-level passage from Wikipedia that best matches the question. The question and sentences are probably pre-processed in various ways to improve the match, which is probably scored by relevance as determined by the underlying IR engine.
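
To make that guess concrete, here is a minimal Python sketch of sentence-level matching. It is not AskWiki's actual method, which is not documented here. The tokenizer, the smoothed TF-IDF weighting and the cosine score are all assumptions standing in for whatever the real IR engine does, and the function names are hypothetical. The two candidate sentences are the Wikipedia sentences quoted in the boiled egg example later in this post.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def best_sentence(question, sentences):
    """Return the sentence with the highest TF-IDF cosine similarity to the question."""
    counts = [Counter(tokenize(s)) for s in sentences]
    n = len(counts)
    # Document frequency: the number of sentences that contain each term.
    df = Counter(term for c in counts for term in c)

    def vector(term_counts):
        # Smoothed IDF (an assumed weighting) so unseen terms get a finite weight.
        return {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0)
                for t, tf in term_counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm else 0.0

    q = vector(Counter(tokenize(question)))
    return max(sentences, key=lambda s: cosine(q, vector(Counter(tokenize(s)))))


# Candidate sentences taken from the transcript in this post.
sentences = [
    "Studies done at the American Egg Board substantiate that unopened "
    "containers of brined or pickled eggs (marinated, hard-cooked eggs) "
    "keep for several months on the shelf.",
    "Boiled eggs are typically boiled for 10-11 minutes or until they float, "
    "depending on the size of the egg.",
]
answer = best_sentence("How long should I cook a hard boiled egg?", sentences)
print(answer)
```

On this two-sentence toy corpus the sketch prefers the cooking-time sentence, but that says little about how a real engine ranks candidates across all of Wikipedia.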

What makes AskWiki perform so well, I’d guess, is the breadth, style and quality of Wikipedia’s content.

For more information, see AskWiki: Wikipedia Knows What You’re Thinking or, better yet, sit down with AskWiki and talk. Here’s a transcript of my session with it along with some comments.

Q: how high is the empire state building?
A: Antenna spire: 449 m (1,472 ft)
From: Empire State Building
Comment: This answer was extracted from the Infobox.

Q: empire state building height
A: The Empire State Building rises to 381 m (1,250 feet) at the 102nd floor, and including the 203ft pinnacle its full height reaches 443 m (1,453 ft and 8 9/16th in).
From: Empire State Building
Comment: This sentence is from the article’s text, and seems better than the first one.

Q: How heavy is a bowling ball?
A: Most people can easily buy a bowling ball that is at least 1 to 2 pounds heavier than they used as a house ball.
From: Bowling ball
Comment: Nice try, but it’s a spare.

Q: How long should I cook a hard boiled egg?
A: Studies done at the American Egg Board substantiate that unopened containers of brined or pickled eggs (marinated, hard-cooked eggs) keep for several months on the shelf.
From: Boiled egg
Comment: If the system could suggest three possible answers it might do better. For example, the boiled egg article does have this sentence, which would have been a nice answer: Boiled eggs are typically boiled for 10-11 minutes or until they float, depending on the size of the egg.
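
Returning the top three candidates instead of one is a small change to any ranking scheme. The sketch below assumes a deliberately crude score, the number of distinct words shared with the question, and uses Python's `heapq.nlargest` to keep the best `k`. The function names and the candidate list are hypothetical; the candidate strings are answers quoted elsewhere in this transcript.

```python
import heapq
import re


def shared_terms(question, sentence):
    """Count the distinct lowercase words that appear in both strings."""
    def words(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(words(question) & words(sentence))


def top_answers(question, sentences, k=3):
    """Return the k sentences that share the most words with the question."""
    return heapq.nlargest(k, sentences, key=lambda s: shared_terms(question, s))


candidates = [
    "Studies done at the American Egg Board substantiate that unopened "
    "containers of brined or pickled eggs (marinated, hard-cooked eggs) "
    "keep for several months on the shelf.",
    "Boiled eggs are typically boiled for 10-11 minutes or until they float, "
    "depending on the size of the egg.",
    "Brown University is a private university located in Providence, Rhode Island.",
    "Capital: Brussels",
]
suggestions = top_answers("How long should I cook a hard boiled egg?", candidates)
for s in suggestions:
    print("-", s)
```

With three suggestions the cooking-time sentence makes the list even when it does not rank first.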

Q: What is the longest river in the world?
A: This is a list of the longest rivers on Earth. It includes river systems over 1,000 kilometers.
Comment: This might have been a good answer if it had not been truncated.

Q: What is the longest river in Africa
A: Nile.
From: Wikipedia:Nile
Comment: Good answer.

Q: What is the longest river in Africa ?
A: The length of a river between source and mouth may be hard to determine because of a lack of precise maps. In these cases, the measured length of a river will depend on the scale of the map on which the measurement is based; in general, due to the fractal quality of a river, the larger the scale, the longer the resulting length measurement.
From: List of rivers by length
Comment: adding the question mark changed the answer. I have no idea why.

Q: what is the longest river in Africa?
A: For example, although the Congo River appears to be a natural geographic boundary, there were groups that otherwise shared a language, culture or other similarity living on both sides. The division of the land between Belgium and France along the river isolated these groups from each other.
Comment: Yet another answer. Maybe it selects a random one from the top results.
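
The three variants of the Africa question above differ only in capitalization, spacing and the trailing question mark. A pre-processing step like the one sketched here would map all three to the same string, which suggests that AskWiki either does not normalize this way or that something else varies between requests. The `normalize` function is a hypothetical illustration, not AskWiki's code.

```python
import re


def normalize(question):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = question.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return " ".join(text.split())


variants = [
    "What is the longest river in Africa",
    "What is the longest river in Africa ?",
    "what is the longest river in Africa?",
]
normalized = {normalize(v) for v in variants}
print(normalized)
```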

Q: does P equal NP
A: A nurse practitioner is a registered nurse who has completed specific advanced nursing education (generally a master’s degree) and training in the diagnosis and management of common medical conditions. Nurse practitioners provide a broad range of health care services.
Comment: Maybe it’s a dodge.

Q: Who directed “They live by Night”?
A: Director: Nicholas Ray
Comment: That’s the right answer for a somewhat obscure film, which was Ray’s directing debut.

Q: Will there be peace in our lifetime?
A: No answer found for the question “Will there be peace in our lifetime?”
Comment: No comment.

Q: What is the value of pi?
A: Pi or π is the ratio of a circle’s circumference to its diameter in Euclidean geometry, approximately 3.14159.
Comment: correct, but an easy question.

Q: What is the square root of 16?
A: Year 16 was a leap year starting on Wednesday (link will display the full calendar) of the Julian calendar.
From: 16
Comment: this is a nice example that shows a fundamental weakness. AskWiki can only look for an answer that seems appropriate from among its several billion sentences. Even though that’s a large number, the number of ‘facts’ that people know is infinite.

Q: What is an emu
A: The Emu (pronunciation:), Dromaius novaehollandiae, is the largest bird native to Australia and the only extant member of the genus Dromaius. It is also the second-largest extant bird in the world by height, after its ratite relative, the ostrich. The soft-feathered, brown, flightless birds reach up to 2 m (6 ft) in height. The Emu is common over most of mainland Australia, although it avoids heavily populated areas, dense forest and arid areas. Emus can travel great distances at a fast, economical trot and, if necessary, can sprint at 50 km/h (30 mph) for some distance at a time.
From: Emu

Q: Can an emu fly?
A: The pelvic limb muscles of Emus have a similar contribution to total body mass as the flight muscles of flying birds.
From: Emu
Comment: While not a good answer, the sentence does at least imply that emus cannot fly.

Q: How many popes have there been
A: Where there are sins, there are also divisions, schisms, heresies and disputes. Where there is virtue, however, there also are harmony and unity, from which arise the one heart and one soul of all believers (Cf CIC, can751.).
From: Pope
Comment: Inscrutable answer.

Q: Is there a god
A: Many historians of religion hold that monotheism may be of relatively recent historical origins – although comparison is difficult as many religions claim to be ancient.
Comment: Hmmmm…

Q: Where is Tanzania
A: Tanzania officially the United Republic of Tanzania (Swahili: Jamhuri ya Muungano wa Tanzania), is a country in East Africa bordered by Kenya and Uganda on the north, Rwanda, Burundi and the Democratic Republic of the Congo on the west, and Zambia, Malawi and Mozambique on the south. To the east it borders the Indian Ocean.
Comment: good answer

Q: Where is Brown University
A: Brown University is a private university located in Providence, Rhode Island. Founded in 1764 as the College of Rhode Island, it is the third-oldest institution of higher education in New England and the seventh-oldest in the United States. It is a member of the Ivy League. Pembroke College, Brown University’s all women’s college, merged with The College, in 1971.
From: Brown University
Comment: Also good.

Q: Why is the sky blue?
A: The light from the sky is a result of diffuse sky radiation which results in a light blue color being perceived by the human eye. On a sunny day the earth sky usually looks as a blue gradient — dark in the zenith, light near the horizon (due to Rayleigh scattering). It can turn a multitude of colors such as purple (especially near sunset and sunrise) and black at night. Scattering effects also partially polarize light from the sky.
From: Wikipedia:Sky
Comment: Not bad for this old chestnut.

Q: When was Charles Darwin born
A: Birth date: February 12, 1809
From: Charles Darwin
Comment: good!

Q: What is the capital of Belgium?
A: Capital: Brussels
From: Belgium
Comment: Right!

Q: What is the population of Brussels?
A: Population metro: 1,975,000
From: Brussels
Comment: Right!!

Q: what is the population of the capital of Belgium?
A: Population metro: 1,975,000
From: Brussels
Comment: I expected this one to fail!

Q: Is no your answer to this question?
A: This depends on the context, as do several other possibilities: Sometimes “I don’t know” is an acceptable answer, sometimes even a right answer. The same is true of “None of the above” and “There is no answer”. An answer is the, or a, right answer, if it presents true information which falls within the determined range of alternatives. Questions of this simplest sort usually begin with Who, what, which, where, when, does/do, is/are.
From: Question
Comment: Wow!

Q: is waterboarding torture?
A: Recent torture methods include waterboarding, sexual humiliation and sexual abuse, and the use of dogs against prisoners.
From: Torture
Comment: No moral confusion here!

Q: What is the airspeed velocity of an unladen swallow?
A: Monty Python and the Holy Grail is a 1975 film written and performed by the comedy group Monty Python (Graham Chapman, John Cleese, Terry Gilliam, Eric Idle, Terry Jones, and Michael Palin), and directed by Gilliam and Jones. It was made during a gap between the third and fourth seasons of their popular BBC television series Monty Python’s Flying Circus.
From: Wikipedia:Monty Python and the Holy Grail
Comment: Maybe the selection process is done in two steps: find a relevant article and then find a good answer within it.
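
A two-step process like the one guessed at above could look roughly like the following sketch. Everything in it is an assumption: the word-overlap score, the function names, and the toy `articles` dictionary, whose second and third sentences are invented for illustration rather than quoted from Wikipedia.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query_terms, text):
    """Count the distinct query terms that also occur in the text."""
    return len(query_terms & tokenize(text))


def two_step_answer(question, articles):
    """Pick the best article first, then the best sentence within it.

    `articles` maps a title to a list of sentences (a hypothetical toy structure).
    """
    q = tokenize(question)
    # Step 1: choose the article whose title and text share the most terms.
    title = max(articles,
                key=lambda t: overlap(q, t + " " + " ".join(articles[t])))
    # Step 2: choose the best-matching sentence from that article only.
    sentence = max(articles[title], key=lambda s: overlap(q, s))
    return title, sentence


articles = {
    "Monty Python and the Holy Grail": [
        "Monty Python and the Holy Grail is a 1975 film written and "
        "performed by the comedy group Monty Python.",
        # Invented sentence, for illustration only.
        "A scene in the film debates the airspeed velocity of an unladen swallow.",
    ],
    # Invented sentence, for illustration only.
    "Swallow": ["Swallows are small birds that catch insects in flight."],
}
title, sentence = two_step_answer(
    "What is the airspeed velocity of an unladen swallow?", articles)
print(title)
print(sentence)
```

Restricting the second step to a single article would explain why AskWiki returned a sentence from the film's article rather than from an article about birds.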

Q: Can a computer think?
A: However, computers cannot “think” for themselves in the sense that they only solve problems in exactly the way they are programmed to.
Comment: This answer must have been painful for poor old AskWiki to give.

Q: who is jimbo wales?
A: Jimmy Donal “Jimbo” Wales (born 7 August 1966 in Huntsville, Alabama) is an American Internet entrepreneur known for his role in founding Wikipedia and other wiki-related
Comment: Know your master.

Q: Is jimbo wales a genius or what?
A: No answer found for the question “Is jimbo wales a genius or what?”
Comment: Ahhh, AskWiki may be showing some tact.

