Library - Tampere University of Technology (Tampereen teknillinen yliopisto)

Voice activity detection in noise robust speech recognition

URN: http://URN.fi/URN:NBN:fi:tty-200907106636
Title: Voice activity detection in noise robust speech recognition
Author: Pasanen, Antti
Publication type: Master of Science thesis (Diplomityö)
Issue date: 2002-06-05
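University: Tampere University of Technology (Tampereen teknillinen korkeakoulu)
Faculty: Department of Information Technology (Tietotekniikan osasto)
Department: Institute of Signal Processing (Signaalinkäsittelyn laitos)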
Abstract: In this thesis, Voice Activity Detection (VAD) algorithms are integrated into an automatic speech recognition (ASR) system. The VAD is assumed to give the ASR system additional information about the presence of speech, thus increasing its robustness. Two standard VAD algorithms (G.729b and GSM) are described, and a statistical VAD based on a Gaussian Mixture Model (GMM) is introduced. For the GMM based VAD, different adaptation techniques are employed to track the changing background noise statistics. The VAD algorithms are integrated with the ASR system using explicit and implicit approaches. In the explicit approach, the VAD is a separate module in the front end of the ASR system; in the implicit approach, the VAD decision is included in the decoding stage of the speech recognition unit. The performance of the VAD algorithms is compared directly, using frame classification rates, and indirectly, using recognition rates. Recognition is performed as a small vocabulary isolated word recognition task with a Hidden Markov Model (HMM) based ASR system using normalized Mel-frequency cepstral coefficients. According to our simulations, ideal information about the word boundaries significantly increases the recognition accuracy of the ASR system. However, the described VAD algorithms were not able to increase the recognition accuracy significantly.

