Microsoft demonstrates live speech translation

Published : Nov 12, 2012 - 16:45 Updated : Nov 12, 2012 - 16:50

Microsoft has unveiled speech translation technology that turns spoken English into Chinese almost instantly.

The software preserves intonation and cadence so the translated speech still sounds like the original speaker.

Microsoft chief research officer Rick Rashid shared details of the technology in a blog post following a presentation he gave in Tianjin, China, in late October.

He said the demo is the result of 60 years of work and several breakthroughs in the last two years. Those breakthroughs, he said, have helped move the software beyond the error rates of 20-25 percent that it previously suffered.

“Just over two years ago, researchers at Microsoft Research and the University of Toronto made another breakthrough. By using a technique called Deep Neural Networks, which is patterned after human brain behavior, researchers were able to train more discriminative and better speech recognizers than previous methods,” he wrote on his blog.

The speech-to-speech technology used by Rashid during his presentation consisted of a two-step translation system. In the first stage, the audio of his speech was transcribed into English text, which was then translated into Chinese, with the words reordered so they made sense. Then, the Chinese characters were piped through a text-to-speech system to emerge sounding like Rashid.
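The flow described above can be pictured as a chain of stages, each feeding the next. The Python sketch below is only an illustration of that chaining and is not Microsoft's software: the names `SpeechToSpeechPipeline`, `toy_recognize`, `toy_translate`, `toy_reorder` and `toy_synthesize`, the four-word lexicon, and the single reordering rule are all hypothetical stand-ins, and plain bytes are used in place of real audio.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical four-word lexicon, for illustration only.
TOY_LEXICON = {"i": "我", "eat": "吃", "at": "在", "home": "家"}


def toy_recognize(audio: bytes) -> str:
    """Stand-in for speech recognition: treats the 'audio' as UTF-8 text."""
    return audio.decode("utf-8")


def toy_reorder(words: List[str]) -> List[str]:
    """Toy rule: move a location phrase ("在" + place) ahead of the verb."""
    if "在" in words:
        i = words.index("在")
        phrase = words[i:i + 2]
        rest = words[:i] + words[i + 2:]
        return rest[:1] + phrase + rest[1:]
    return words


def toy_translate(english_text: str) -> List[str]:
    """Word-for-word lookup followed by reordering; unknown words pass through."""
    looked_up = [TOY_LEXICON.get(w, w) for w in english_text.lower().split()]
    return toy_reorder(looked_up)


def toy_synthesize(chinese_words: List[str]) -> bytes:
    """Stand-in for text-to-speech: returns the characters as bytes."""
    return "".join(chinese_words).encode("utf-8")


@dataclass
class SpeechToSpeechPipeline:
    recognize: Callable[[bytes], str]          # audio -> English text
    translate: Callable[[str], List[str]]      # English text -> reordered Chinese
    synthesize: Callable[[List[str]], bytes]   # Chinese text -> audio

    def run(self, audio: bytes) -> bytes:
        english_text = self.recognize(audio)
        chinese_words = self.translate(english_text)
        return self.synthesize(chinese_words)


pipeline = SpeechToSpeechPipeline(toy_recognize, toy_translate, toy_synthesize)
result = pipeline.run(b"I eat at home").decode("utf-8")
print(result)
```

Because each stage only consumes the previous stage's output, a mistake in the recognized English text carries straight through to the Chinese translation and the synthesized speech.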

“Of course, there are still likely to be errors in both the English text and the translation into Chinese, and the results can sometimes be humorous,” said Rashid in the blog post. “Still, the technology has developed to be quite useful.”

From news report
(khnews@heraldcorp.com)

<Related Korean article>

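MS unveils groundbreaking interpretation system that keeps 'even the intonation intact'

Microsoft (MS) has unveiled software that interprets spoken English into Chinese almost instantly, the BBC reported on the 11th.

Microsoft said its research had minimized the errors of the instant interpretation system and that, to improve accuracy, it built the system to work the way the brain does.

The software also preserves the speaker's intonation and pitch, so even the translated speech sounds like the voice of the original speaker.

Rick Rashid, Microsoft's head of research, who demonstrated the software in person in Tianjin, China, in late October, disclosed details of the instant interpretation software on his blog.

At the end of the demonstration, he spoke in English through MS's interpretation system and showed his words being interpreted into Chinese almost instantly, with his intonation and tone preserved.

He said that since 2010 MS researchers, seeking to understand how the human brain recognizes sound, had been working with scientists at the University of Toronto in Canada on an information processing system modeled on the brain's nervous system, and that this technology had sharply cut interpretation errors from the previous 20-25% to 15%, with further reductions expected.

The interpretation software first transcribes spoken English into English text, converts that into Chinese text, and then speaks it in Chinese with the speaker's tone and intonation.

Meanwhile, Google and AT&T, among others, are conducting similar research, and Japan's NTT Docomo recently unveiled a smartphone application that interprets for Japanese users when they speak Japanese on calls with foreigners.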