One of the biggest problems facing the European Union today is the fact that it has 23 official languages within its borders. This means that all the important documents have to be translated by a whole army of translators, which costs the taxpayer more than 1 billion euros a year, and companies trading within the EU spend millions more. The EU-funded TC-STAR project aims to tackle this issue with technology: a system that eats speech in one language, and outputs that same speech in another.
Speech-to-speech translation is one of the most difficult language-related activities you can engage in. I study English in Amsterdam, and part of that study is of course spent on translating Dutch material into English; even though we focus on text-to-text translation, we've been given glimpses of speech-to-speech translation as well, and this has made me aware of the incredible knowledge and mental agility it requires. You need to know both the source and the target language inside out, and especially when the translating is done 'live', you need to be able to keep up with the speakers. Despite all this, translating can be a very soothing activity; I find almost nothing as comfortable as sitting behind my computer, with several paper dictionaries scattered around my keyboard, stumbling from fixed expression to fixed expression, from saying to saying.
According to ICT Results, the TC-STAR project, “the first project in the world addressing unrestricted speech-to-speech translation”, needs to perfect three key technologies in order to operate properly. Automatic Speech Recognition (ASR) transcribes the spoken words into text, Spoken Language Translation (SLT) translates that text from the source language into the target language, and Text to Speech (TTS) finalises the process by turning the translated text back into speech.
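To make the chain of three technologies concrete, here is a minimal Python sketch of how ASR, SLT and TTS might be wired together. It is not TC-STAR's code: the class `SpeechToSpeechPipeline`, the toy functions and the two-word lexicon are all invented for illustration, and "audio" is simply UTF-8 text standing in for a real recording.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is a plain function; a real system would wrap trained models here.
Recognizer = Callable[[bytes], str]    # ASR: audio -> source-language text
Translator = Callable[[str], str]      # SLT: source text -> target text
Synthesizer = Callable[[str], bytes]   # TTS: target text -> audio


@dataclass
class SpeechToSpeechPipeline:
    """Hypothetical wrapper that runs the three stages in order."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def run(self, audio: bytes) -> bytes:
        source_text = self.recognize(audio)        # step 1: ASR
        target_text = self.translate(source_text)  # step 2: SLT
        return self.synthesize(target_text)        # step 3: TTS


# Toy stand-ins (assumptions): "audio" is just UTF-8 encoded text.
def toy_asr(audio: bytes) -> str:
    return audio.decode("utf-8").lower()


TOY_LEXICON = {"hola": "hello", "mundo": "world"}


def toy_slt(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechToSpeechPipeline(toy_asr, toy_slt, toy_tts)
print(pipeline.run(b"Hola mundo"))  # b'hello world'
```

Because every stage only sees the output of the one before it, an error made by the recogniser is passed straight on to the translator, which is one reason each of the three components needs to be as accurate as possible.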
While none of these technologies are new, none of them are anywhere near perfect. In order to optimise the output of each of the three technologies, the TC-STAR project combined several ASR and SLT systems, which made the output considerably more accurate. The system translated speech between Spanish and English, as well as radio broadcasts from Chinese to English (which is probably all the more impressive).
Measured with BLEU (Bilingual Evaluation Understudy), a method that scores machine translations by comparing them with human translations, translation quality improved by between 40% and 60% over the course of the project. Up to 70% of words were translated correctly, even if they were not always placed in the right position in the sentence.
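For readers curious how such a comparison works, the sketch below shows a simplified, single-sentence version of BLEU in Python. It is an illustration only, not the evaluation setup TC-STAR used: real BLEU is normally computed over a whole test set, often with several reference translations, and the function names here are my own. It also shows why a word can count as "correct" without being in the right place: the one-word (unigram) score ignores position, while the scores for longer word sequences reward correct order.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the reference.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference."""
    precisions = [clipped_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: punish candidates shorter than the reference.
    if len(candidate) > len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(candidate))
    return penalty * math.exp(log_mean)


reference = "the cat sat on the mat".split()
scrambled = "on the mat the cat sat".split()

# Every word is right, so the position-independent score is perfect...
print(clipped_precision(scrambled, reference, 1))  # 1.0
# ...but only 4 of the 5 word pairs match, so word order costs points.
print(clipped_precision(scrambled, reference, 2))  # 0.8
print(bleu(scrambled, reference, max_n=2))
```

The scrambled sentence gets full marks on individual words and loses points only once word pairs are taken into account, which is the same distinction the 70% figure draws.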
The system obviously still cannot match human translators, but the TC-STAR project states that within a few years, a commercially viable automatic speech-to-speech translator might become available. Until then, the components of the project have been released under an open-source license.