Library - Tampere University of Technology

Contributions to Multilingual Low-Footprint TTS System for Hand-Held Devices

Title: Contributions to Multilingual Low-Footprint TTS System for Hand-Held Devices
Author: Moberg, Marko
Abstract: Speech technology in the form of automatic speech recognition (ASR) and speech synthesis (text-to-speech or TTS) has become common in everyday use. Applications such as in-car navigation, hands-free control of devices, aids for visually impaired people, telephone-based schedule and reservation services, military applications and even some dictation applications can be found on the market today. The advancement of technology has made it possible to provide voice-based applications also on smaller, hand-held devices.

This thesis focuses on describing the challenges and solutions in optimizing a multilingual text-to-speech (TTS) system for hand-held devices. The challenges in development arise from the mismatch between the application requirements and the available resources. The requirements include application features, speech quality, language support and portability. The main resources are memory size, performance, development time and cost. The trade-off between requirements and resources is especially challenging in cost-optimized embedded devices targeted at the consumer market.

In this thesis, a multilingual TTS framework is designed and optimized to meet the application requirements according to the availability of various resources. The TTS system uses a Klatt88 formant synthesizer or unit selection synthesis depending on the configuration. Novel approaches and improvements are applied to text normalization, text-to-phoneme conversion, control of synthesis parameters, system framework, and development tools.

It is shown that commercially viable multilingual TTS-based applications can be created by the following four main methods. First, the limitations of the TTS technology can be hidden or alleviated by limiting the scope of the application. Second, optimization of memory consumption and performance makes the TTS technology more attractive for hand-held devices. Third, multilingualism and rapid development of new synthesis languages are enabled through system design and proper development methods and tools. Finally, separating the TTS engine software from the language-dependent data makes it possible to hide the software engineering details by providing language developers with an interface at a higher level of abstraction.

Issue date: 2007-08-17
ISBN: 978-952-15-1798-3
ISBN (PDF): 978-952-15-1813-3
ISSN: 1459-2045
Belongs to: Tampereen teknillinen yliopisto. Julkaisu - Tampere University of Technology. Publication; 671
URN: http://URN.fi/URN:NBN:fi:tty-200810021118
Publication type: Doctoral dissertation
Language: en
University: Tampere University of Technology
Faculty: Department of Information Technology
Department: Institute of Signal Processing
Copyright: This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

Files in this item

File: moberg.pdf
Size: 944.8 Kb
Format: PDF
