UMBC ebiquity

Machine aided human translation - An automated system for transcribing dictated document translations

Speaker: Richard Rose

Start: Friday, October 08, 2010, 11:00AM

End: Friday, October 08, 2010, 12:00PM

Location: ITE 325

Abstract: A model is presented for machine aided human translation (MAHT) that integrates source language text and target language acoustic information to produce the text translation of a source language document. It is evaluated on a scenario where a human translator dictates a first draft target language translation of a source language document. Information obtained from the source language document, including translation probabilities derived from statistical machine translation (SMT) and named entity tags derived from named entity recognition (NER), is combined with acoustic phonetic information obtained from an automatic speech recognition (ASR) system. One advantage of this system combination is that words not included in the ASR vocabulary can still be correctly decoded by the combined system. The MAHT model and system implementation will be described. The combined system is shown to reduce word error rate by 29% relative to the baseline ASR performance on a French-to-English document translation task in the Canadian Hansard domain. An attempt is also made to demonstrate the effect of the system combination techniques on machine translation performance as measured by standard translation error metrics.
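
For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate (WER) and a relative WER reduction are conventionally computed. It is not taken from the talk: the function names and the example sentences are hypothetical, and the resulting numbers are illustrative only, not the reported results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error rate removed by the improved system."""
    return (baseline - improved) / baseline


# Hypothetical transcripts, invented for illustration (not from the talk).
reference = "the member for ottawa centre has the floor"
asr_only = "the member for auto centre has a floor"
combined = "the member for ottawa centre has a floor"

baseline_wer = word_error_rate(reference, asr_only)
combined_wer = word_error_rate(reference, combined)
print(baseline_wer, combined_wer, relative_reduction(baseline_wer, combined_wer))
```

A relative reduction is measured against the baseline error rate rather than in absolute percentage points, so a 29% relative decrease means the combined system's WER is 71% of the baseline ASR system's WER.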

Web Site: http://www.cs.umbc.edu/CSEE/events/0910/rose.10.8.10.html

Tags: spoken language, speech, machine translation, natural language processing, asr, automatic speech recognition
