Title page for 965303011



Student Number 965303011
Author Yu-sheng Pai(白育昇)
Author's Email Address baiyusheng@yahoo.com.tw
Statistics This thesis has been viewed 1460 times and downloaded 11 times.
Department Executive Master of Communication Engineering
Year 2008
Semester 2
Degree Master
Type of Document Master's Thesis
Language zh-TW.Big5 Chinese
Title A system for Keyword Spotting
Date of Defense 2009-05-26
Page Count 54
Keyword
  • HMM
  • HTK
  • keyword
  • speech
  • spotting
    Abstract This thesis studies speech recognition techniques and develops a speech keyword spotting system that is portable across operating systems and easy to use. The system consists of three parts. The speech data reading program and the keyword spotting program run on Microsoft Windows XP SP and were developed with Borland C++ Builder 5. The speech keyword recognition program was developed with HTK 3.3 and runs on Linux Fedora 5.
      We use HTK to train hidden Markov models (HMMs) as the acoustic model. The model covers 411 syllables, composed from 21 initials and 36 finals, and uses 6 HMM states and 17 Gaussian mixtures. On the training speech, the detection rate reaches 92% with a false alarm rate below 13%. In tests with practical speech input to the keyword model, the detection rate and the false alarm rate each stay within 3 percentage points of the training results: the detection rate reaches 89% and the false alarm rate stays below 16%.
      Finally, we use this model to build the speech keyword spotting system and design a user interface program so that operators can use the system easily.
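      The two figures of merit quoted in the abstract can be illustrated with a minimal sketch. The abstract does not state how the thesis defines the false alarm rate, so the denominator used here (the number of non-keyword trials) is an assumption, and the class name, function names, and counts are hypothetical values chosen only to mirror the percentages above.

```python
from dataclasses import dataclass


@dataclass
class SpottingCounts:
    keywords_present: int    # keyword occurrences in the reference labels
    hits: int                # occurrences the spotter reported correctly
    false_alarms: int        # reported keywords with no matching reference
    non_keyword_trials: int  # ASSUMED denominator for the false alarm rate


def detection_rate(c: SpottingCounts) -> float:
    """Fraction of reference keywords that were correctly detected."""
    return c.hits / c.keywords_present


def false_alarm_rate(c: SpottingCounts) -> float:
    """Fraction of non-keyword trials wrongly reported as keywords (assumed definition)."""
    return c.false_alarms / c.non_keyword_trials


def within_margin(train: SpottingCounts, test: SpottingCounts, margin: float = 0.03) -> bool:
    """Check that both rates on test data stay within `margin` of the training rates."""
    eps = 1e-9  # tolerance for floating-point rounding at the boundary
    dr_gap = abs(detection_rate(train) - detection_rate(test))
    fa_gap = abs(false_alarm_rate(train) - false_alarm_rate(test))
    return dr_gap <= margin + eps and fa_gap <= margin + eps


# Hypothetical counts, not data from the thesis.
train = SpottingCounts(keywords_present=100, hits=92, false_alarms=13, non_keyword_trials=100)
test = SpottingCounts(keywords_present=100, hits=89, false_alarms=16, non_keyword_trials=100)

print(f"train: detection={detection_rate(train):.2f}, false alarm={false_alarm_rate(train):.2f}")
print(f"test:  detection={detection_rate(test):.2f}, false alarm={false_alarm_rate(test):.2f}")
print("within 3 points:", within_margin(train, test))
```

      Counting hits and false alarms separately keeps the two rates independent, which is why a spotter can trade one against the other by changing its acceptance threshold.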
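    Table of Contents Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Research Motivation
    1.2 Research Objectives
    1.3 Thesis Outline
    Chapter 2 Fundamentals of Speech Recognition
    2.1 Introduction to the HTK Toolkit
    2.1.1 Overview of HTK Operation
    2.1.2 Dictionary Editing and Grammar Rules
    2.2 Feature Parameter Extraction
    2.2.1 Steps of Speech Feature Extraction
    2.3 Hidden Markov Models
    2.4 Acoustic Models
    2.5 HMM Training Procedure and Algorithms
    2.5.1 Training Procedure
    2.5.2 Viterbi Search Algorithm
    Chapter 3 Building the Speech Keyword Spotting System
    3.1 System Development Environment
    3.2 System Architecture
    3.3 Introduction to the Speech Corpus
    3.4 Feature Extraction Coefficients
    3.5 Building the Speech Recognition Models
    3.5.1 Building the Acoustic Model
    3.5.2 Building the Keyword Model
    3.5.3 Building the Non-Keyword Model
    3.5 Keyword Spotting Architecture
    3.6 Training the Speech Recognizer with HTK Tools
    3.7 Recognition with HTK Tools
    Chapter 4 Experiments and Results
    4.1 Experimental Environment
    4.1.1 Experimental Equipment
    4.1.2 Experimental Speech Data
    4.2 Detection Rate and False Alarm Rate
    4.3 Keyword Spotting Experiments
    4.3.1 Combinations of HMM State Count and Gaussian Mixture Count on Training Data
    4.3.2 Tests on Non-Training Data
    4.4 Comparison of Experimental Methods and Results
    4.4.1 Corpus Comparison
    4.4.2 Comparison of Methods and Related Parameters
    4.4.3 Comparison of Experimental Results
    4.4.4 Comparison of System Operation
    4.5 System Implementation
    Chapter 5 Conclusions and Future Work
    5.1 Conclusions
    5.2 Future Work
    References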
    Advisor
  • none(林嘉慶)
  • none(蔡木金)
    Files
  • 965303011.pdf
  • disapprove authorization
    Date of Submission 2009-07-08
