12th IASTED International Conference on Signal and Image Processing, SIP 2010, Maui, HI, United States, 23 - 25 August 2010, pp. 13-17
This paper introduces a system designed to transcribe speech with reasonable accuracy, specifically a continuous speech recognition system for Turkish based on triphone models. Turkish differs from Indo-European languages (English, Spanish, French, German, etc.) in its agglutinative, suffixing morphology. As a result, the vocabulary growth rate is very high, and building a continuous speech recognition system for Turkish on whole-word units is not feasible. For this reason, the acoustic models in this paper are based on triphones, each modelled as a five-state Hidden Markov Model (HMM). Mel-Frequency Cepstral Coefficients (MFCC) are used as the feature vectors, and training is carried out by embedded training with Baum-Welch re-estimation. Recognition is performed on a search network that can ultimately be viewed as HMM states connected by transitions; the Viterbi token passing algorithm runs on this network to find the most likely state sequence for the utterance. A bigram language model is also constructed to improve recognition accuracy.
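The decoding step can be illustrated with a minimal sketch of Viterbi search written in a token-passing style. It is not the system described in the paper: the three-state left-to-right model, the probabilities, and the names `viterbi`, `safe_log`, `init`, `trans`, and `frames` are all assumptions made for illustration. A real recogniser would compute the per-frame emission scores from MFCC vectors under each triphone state's output distribution and would add bigram language model scores at word boundaries.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi(log_init, log_trans, log_emit):
    """Return (best_log_score, best_state_path).

    log_init[s]     : log P(start in state s)
    log_trans[p][s] : log P(state s at t+1 | state p at t)
    log_emit[t][s]  : log P(observation at frame t | state s)
    """
    n_states = len(log_init)
    # One token per state: (accumulated log score, state history so far).
    tokens = [(log_init[s] + log_emit[0][s], [s]) for s in range(n_states)]
    for frame in log_emit[1:]:
        new_tokens = []
        for s in range(n_states):
            # Of all tokens that could move into state s, keep only the best.
            best_prev = max(
                range(n_states),
                key=lambda p: tokens[p][0] + log_trans[p][s],
            )
            score = tokens[best_prev][0] + log_trans[best_prev][s] + frame[s]
            new_tokens.append((score, tokens[best_prev][1] + [s]))
        tokens = new_tokens
    # The surviving token with the highest score gives the decoded path.
    return max(tokens, key=lambda tok: tok[0])


# Toy left-to-right model (illustrative numbers only).
init = [1.0, 0.0, 0.0]
trans = [
    [0.6, 0.4, 0.0],
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 1.0],
]
# Per-frame emission likelihoods for each state, four frames in total.
frames = [
    [0.9, 0.1, 0.1],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.1, 0.2, 0.9],
]

best_score, best_path = viterbi(
    [safe_log(p) for p in init],
    [[safe_log(p) for p in row] for row in trans],
    [[safe_log(p) for p in row] for row in frames],
)
print(best_path, math.exp(best_score))
```

Working in the log domain avoids numerical underflow on long utterances, and keeping a single best token per state at each frame is what makes the search linear in the number of frames.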