Discriminative feature representation for Malay children’s speech recognition / Seyedmostafa Mirhassani

Mirhassani, Seyedmostafa (2015) Discriminative feature representation for Malay children’s speech recognition / Seyedmostafa Mirhassani. PhD thesis, University of Malaya.

PDF (Full Text)
Restricted to Repository staff only until 01 January 2018.

    Abstract

    This thesis examines methods for representing discriminative cepstral speech features for automatic recognition of children’s speech. Automatic recognition of children’s speech is essential in computer-based speech therapy systems. Current automatic speech recognition (ASR) systems adapted to adult speech are inefficient at recognizing children’s speech because they lack discriminative speech features. Most ASR approaches have several preprocessing stages, known as feature extraction, that perform homogeneous measurements for acoustic-phonetic modeling. Most feature extraction methods employ a filterbank to measure the spectral information of a speech sample. This thesis addresses the problem of children’s speech recognition by providing a discriminative feature representation, for which three methods are proposed. The first method is fuzzy feature selection, which selects discriminative features based on a fuzzy codification of the features’ goodness and measures the linear independence of the features. The second method optimizes the filterbanks used in cepstral feature extraction with evolutionary algorithms, which increases the discriminative power of the features in representing the phoneme classes. Optimization is performed to provide either a single filterbank or multiple complementary filterbanks. In the case of multiple filterbanks, the cepstral features are fed to different experts, each of which performs classification based on a different representation of the speech. Fuzzy fusion of the decisions is then carried out to obtain an overall decision on the class of the speech being processed. This method is capable of aggregating information obtained with different filterbanks for speech recognition in noisy environments. The third method combines the first and second methods in a hierarchical classification scheme.
In this method, Malay phonemes are divided into 5 groups, and both the selection of discriminative features and the optimization of filterbanks are carried out for each group of phonemes. Another filterbank is then optimized to discriminate between the groups. Next, a hierarchical phoneme classification is performed on the cepstral features provided by the filterbanks. Systems using the resulting features were evaluated on phoneme recognition and classification tasks. Three speech databases were used in the experiments: prolonged Malay vowels and a Malay continuous speech database based on children’s speech, and the TIMIT database based on adult speech. For comparison, a baseline system based on standard MFCC features and HMM-based phoneme recognition/classification was prepared. The proposed methods improved the recognition rate of Malay phonemes over standard MFCC. For Malay vowels, the improvement in phoneme recognition from the proposed fuzzy feature selection, single-filterbank optimization, and multiple-filterbank optimization methods was 3.59%, 4.86% and 5.28%, respectively. Phonetic classification based on the proposed hierarchical method improved by 8.51% over standard MFCC. In the adult speech recognition experiments conducted on the TIMIT corpus, the improvements in phonetic recognition from the proposed fuzzy feature selection, single-filterbank optimization, multiple-filterbank optimization, and hierarchical phonetic classification methods were 1.16%, 1.64%, 2.11% and 2.58%, respectively. With the multiple-filterbank optimization method, the classification of 5 selected phonemes from TIMIT improved by up to 4.69% under noisy conditions.
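
For readers unfamiliar with filterbank-based cepstral features, the following is a minimal Python sketch of the standard computation the abstract builds on: triangular filters applied to a power spectrum, log band energies, then a DCT-II. It is not the thesis's implementation. The function names, the mel-spaced starting point, and the parameter values are illustrative assumptions. The list of edge and centre frequencies passed to `triangular_filterbank` is the kind of parameter a filterbank optimization could adjust; the evolutionary search itself is not shown.

```python
import math


def mel(f_hz):
    """Convert Hz to mel (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def inv_mel(m):
    """Convert mel back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_centres(n_filters, sample_rate):
    """Mel-spaced lower edge, n_filters centres, and upper edge (in Hz)."""
    top = mel(sample_rate / 2.0)
    return [inv_mel(top * i / (n_filters + 1)) for i in range(n_filters + 2)]


def triangular_filterbank(centres_hz, n_bins, sample_rate):
    """Build one triangular filter per interior entry of centres_hz.

    centres_hz must be strictly increasing and hold K + 2 values:
    lower edge, K centres, upper edge. Returns K lists of n_bins weights
    covering the range 0 .. sample_rate / 2.
    """
    nyquist = sample_rate / 2.0
    bin_hz = [i * nyquist / (n_bins - 1) for i in range(n_bins)]
    filters = []
    for k in range(1, len(centres_hz) - 1):
        lo, mid, hi = centres_hz[k - 1], centres_hz[k], centres_hz[k + 1]
        weights = []
        for f in bin_hz:
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))  # rising slope
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))  # falling slope
            else:
                weights.append(0.0)
        filters.append(weights)
    return filters


def cepstral_features(power_spectrum, filters, n_ceps):
    """Log filterbank energies followed by an unnormalized DCT-II."""
    floor = 1e-10  # avoids log(0) when a band has no energy
    log_e = [
        math.log(max(sum(w * p for w, p in zip(filt, power_spectrum)), floor))
        for filt in filters
    ]
    k = len(log_e)
    return [
        sum(log_e[m] * math.cos(math.pi * n * (m + 0.5) / k) for m in range(k))
        for n in range(n_ceps)
    ]


# Illustrative use: 10 mel-spaced filters over a 257-bin spectrum at 16 kHz.
centres = mel_centres(10, 16000)
bank = triangular_filterbank(centres, 257, 16000)
features = cepstral_features([1.0] * 257, bank, 5)
```

Replacing `centres` with a different strictly increasing list changes the filter shapes and therefore the features, which is what makes the filterbank a candidate for optimization against a classification objective.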

    Item Type: Thesis (PhD)
    Additional Information: Thesis (Ph.D.) -- Faculty of Engineering, University of Malaya, 2015
    Uncontrolled Keywords: Discriminative feature; Representation; Malay; children’s speech; Recognition
    Subjects: R Medicine > R Medicine (General)
    Divisions: Faculty of Engineering
    Depositing User: Miss Dashini Harikrishnan
    Date Deposited: 19 Oct 2015 15:56
    Last Modified: 19 Oct 2015 15:56
    URI: http://studentsrepo.um.edu.my/id/eprint/5897
