DARPA speech-to-speech research

November 1st, 2006

Today’s Washington Post has an article, “First Ears, Then Hearts and Minds,” on DARPA’s continuing efforts to develop automatic, real-time spoken language translation. This is part of a 50+ year investment by the DoD in developing human language technology, including speech. A current driver, of course, is the extreme shortage of Arabic linguists, translators and interpreters in the military.

One recently deployed result is the Phraselator, a PDA-like translation device developed by VoxTec, an Annapolis-based company. Early versions of the device were used in Afghanistan in 2001 and more recent ones are in use in Iraq. VoxTec’s Phraselator is a one-way device that recognizes a set of pre-defined phrases and plays a corresponding recorded translation. Since the speech, language and domain models are in software, it can be easily ported to new languages or domains.

The DoD is also using a similar translation device developed by Integrated Wave Technologies that lets one enter key phrases that are then turned into appropriate Arabic sentences.

“You say ‘house search’ and then it will say in Arabic: ‘We’re here to search your house. Please stay in this room. Do you have any weapons?’” said Tim McCune, the company’s president.

The limitation of both devices is that they are one-way — they do not allow a two-way conversation.
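Both one-way devices boil down to a lookup from a recognized key phrase to canned output. Here is a minimal Python sketch of that idea, not either vendor's actual design. The table, the function names and the text normalization are all hypothetical, and the English glosses from the quote above stand in for the recorded Arabic audio a real device would play.

```python
from typing import List, Optional

# Hypothetical phrase table: key phrase -> English glosses of the prerecorded
# target-language sentences. A real device would store audio clips instead.
PHRASE_TABLE = {
    "house search": [
        "We're here to search your house.",
        "Please stay in this room.",
        "Do you have any weapons?",
    ],
}


def normalize(utterance: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore casing and spacing."""
    return " ".join(utterance.lower().split())


def translate_one_way(utterance: str) -> Optional[List[str]]:
    """Return the canned output for a known key phrase, or None if unrecognized."""
    return PHRASE_TABLE.get(normalize(utterance))
```

Anything outside the table simply returns None, and nothing here handles what the other person says back, which is exactly the gap a two-way system has to fill.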

“In years past, there wasn’t a great need for the individual soldier to speak a foreign language to do his mission,” said Wayne Richards, branch chief for technology implementation at U.S. Joint Forces Command. But in Iraq and Afghanistan, soldiers are increasingly interacting with Iraqi civilians, giving advice at checkpoints or guidance during home searches, he said. “During those door-to-door searches, the soldiers need to be able to calm them down and reassure them,” Richards said. “We’re fighting for hearts and minds. But if I can’t tell her, ‘Ma’am, please calm down,’ . . . that wouldn’t be assuring.”

DARPA has an ongoing research program, Translation System for Tactical Use (TRANSTAC), in which IBM, SRI and CMU were recently funded to develop the next generation of portable speech-to-speech translation systems. SRI’s IraqComm, for example, “performs bidirectional, speech-to-speech machine translation between English and colloquial Iraqi Arabic.” IBM’s MASTOR is a software-only solution that can run on a PDA or laptop computer and is designed as a “two-way, free form speech translator that assists human communication using natural spoken language for people who do not share a common language.” CMU’s project is developing a “two-way translation between English and Arabic Iraqi” and “investigating issues surrounding the rapid deployment of new languages, especially, low-resource languages and colloquial dialects.”

While progress in speech-to-speech translation is steady, it is also slow. It will be many years before we have the Universal Translator seen in Star Trek. Not only could that device handle virtually all alien languages, it could even communicate with non-biological life forms. It could not, however, talk to lawyers.