Comparative study on Malay children vowel recognition using multi-layer perceptron and recurrent neural networks / Afshan Kordi

Afshan, Kordi (2012) Comparative study on Malay children vowel recognition using multi-layer perceptron and recurrent neural networks / Afshan Kordi. Masters thesis, University of Malaya.


    Speech recognition has become popular in recent decades because of its wide range of applications, such as telephone systems, health care, data entry, speech-to-text processing, biometric systems, and air traffic controller training. Among the technologies investigated for acoustic modeling of speech, Artificial Neural Networks (ANN) have attracted interest from many researchers because they have shown good results in pattern recognition, especially in classification. Despite noteworthy progress in speech classification using neural networks, some issues in applying them remain unresolved. In particular, less effort has been devoted to children's speech, which is more dynamic. Of the numerous neural network architectures that have been introduced, the most common ones suited to speech recognition include the Multi-Layer Perceptron (MLP) and the Recurrent Neural Network (RNN). The purpose of this study is to compare the performance and recognition rate of these two types of neural network with respect to signal length and number of hidden neurons for sustained Malay vowels spoken by Malay children. Linear Predictive Coding (LPC) is used as a feature extractor to convert the speech signal into parametric coefficients. The Neural Network Toolbox™ (nntool) in Matlab® is used to classify the six Malay vowels (/a/, /e/, /ә/, /i/, /o/ and /u/) using 3-fold cross-validation at different signal lengths and with different numbers of hidden neurons. Experiments were also carried out to compare the performance of the neural networks using single-frame and multiple-frame approaches. The results show that longer signal lengths yield better recognition than shorter ones. The findings indicate that MLP and RNN reached recognition rates of 83.79% and 83.10%, respectively. Vowel /i/ achieved the highest recognition rate with both methods.
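
    The feature-extraction and evaluation steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which used Matlab and its Neural Network Toolbox; it is a NumPy approximation under stated assumptions. The function names lpc_coefficients and k_fold_indices, the Hamming window, and the random shuffling of samples are all hypothetical choices made for illustration. The sketch estimates LPC coefficients for one speech frame with the autocorrelation method (Levinson-Durbin recursion) and generates index splits for 3-fold cross-validation.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate LPC coefficients for one frame (autocorrelation method).

    Returns a = [1, a_1, ..., a_order], the prediction-error filter
    A(z) = 1 + a_1 z^-1 + ... + a_order z^-order.
    The Hamming window is an assumption, not taken from the thesis.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)

    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(windowed[: n - k], windowed[k:]) for k in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy

    # Levinson-Durbin recursion.
    for i in range(1, order + 1):
        if err <= 0:
            break  # silent or degenerate frame; leave remaining coefficients at zero
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def k_fold_indices(n_samples, n_folds=3, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Shuffling with a fixed seed is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    folds = np.array_split(shuffled, n_folds)
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train_idx, test_idx
```

    In such a setup, the coefficients a[1:] from each frame (or from several concatenated frames) would form the input vector of a classifier, and each of the three folds would serve once as the held-out test set. The LPC order and frame length are left as parameters because the abstract does not state them.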

    Item Type: Thesis (Masters)
    Additional Information: Research Project (M.Eng.) - Faculty of Engineering, University of Malaya, 2012.
    Uncontrolled Keywords: Speech perception; Neural networks; Perceptrons; Signal processing; Biometric systems
    Subjects: T Technology > T Technology (General)
    T Technology > TA Engineering (General). Civil engineering (General)
    Divisions: Faculty of Engineering
    Depositing User: Mr Prabhakaran Balachandran
    Date Deposited: 18 Jul 2019 08:55
    Last Modified: 18 Jul 2019 08:55
