Bassam Ali Qasem, Al-Qatab (2020) An intra-severity classification and adaptation technique to improve dysarthric speech recognition accuracy / Bassam Ali Qasem Al-Qatab. PhD thesis, Universiti Malaya.
Abstract
Dysarthria is a motor speech impairment at the neurological and/or muscular level that causes difficulty in pronouncing words clearly. Automatic speech recognition (ASR) systems are increasingly applied as assistive technology to aid individuals with physical disabilities, particularly the speech-impaired community such as dysarthric speakers. However, the development of an effective ASR system is hindered by data sparsity, either in the coverage of the language or in the size of the existing speech databases. Speaker adaptation (SA) is one of the solutions to the data sparsity issue in ASR for dysarthric speakers. Our proposed method introduces intra-severity classification and adaptation techniques, which are applied sequentially in two stages of system development. First, intra-severity classification identifies the severity level of the dysarthric speaker. Second, the severity level identified in the first stage is used to select the corresponding intra-severity adaptation of dysarthric speech. For the classification part, six algorithms are used to classify the intra-severity of dysarthric speakers: Linear Discriminant Analysis (LDA), Artificial Neural Network (ANN), Support Vector Machine (SVM), Naive Bayes (NB), Classification And Regression Tree (CART), and Random Forest (RF). The RF algorithm is proposed as the classifier for intra-severity classification because it has the lowest average ranking score compared with the other benchmark classifiers. The intra-severity adaptation of the ASR system was developed using two well-known adaptation techniques, Maximum Likelihood Linear Regression (MLLR) and Maximum A Posteriori (MAP), as well as a combination of the two.
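As a rough illustration of the sequential MLLR then MAP combination named above, the sketch below applies the two textbook mean updates to a single Gaussian mean. It is not the thesis implementation: the function names, the prior weight `tau`, and the assumption that the MLLR transform `(A, b)` and the frame occupancies `gammas` are already available are all hypothetical choices made for this example.

```python
# Minimal sketch (assumptions: a precomputed MLLR transform and known
# per-frame occupancies; real systems estimate both from adaptation data).

def mllr_transform(mean, A, b):
    """MLLR mean update: shared affine transform mu' = A @ mu + b."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
            for row, b_i in zip(A, b)]


def map_update(prior_mean, frames, gammas, tau=10.0):
    """MAP mean update: interpolate the prior mean with the
    occupancy-weighted adaptation frames.

    mu_hat = (tau * mu_prior + sum_t gamma_t * o_t) / (tau + sum_t gamma_t)
    """
    occupancy = sum(gammas)
    dim = len(prior_mean)
    weighted = [sum(g * f[d] for g, f in zip(gammas, frames))
                for d in range(dim)]
    return [(tau * prior_mean[d] + weighted[d]) / (tau + occupancy)
            for d in range(dim)]


def mllr_then_map(mean, A, b, frames, gammas, tau=10.0):
    """Sequential combination: the MLLR-adapted mean is the MAP prior."""
    return map_update(mllr_transform(mean, A, b), frames, gammas, tau)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    shift = [1.0, 1.0]
    frames = [[3.0, 3.0]] * 10      # hypothetical adaptation frames
    gammas = [1.0] * 10             # hypothetical occupancies
    print(mllr_then_map([0.0, 0.0], identity, shift, frames, gammas))
```

With no adaptation frames the MAP step returns the MLLR-adapted mean unchanged, and with more frames the estimate moves toward the data. This is one common way to read the two techniques as complementary with respect to adaptation data size.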
The results showed that the combined MLLR+MAP adaptation outperforms the other adaptation techniques, reducing the overall Word Error Rate (WER) from 39.84% to 18.48%, a 53.61% relative improvement over the baseline WER. By severity level, the WER improvements for the hybrid MLLR+MAP adaptation technique were 66.32%, 52.35%, and 45.20% for the mild, moderate, and severe levels, respectively. Combining the adaptation techniques in sequential order takes advantage of the strengths of each technique and avoids its weaknesses with respect to adaptation data size.
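The overall figures quoted above are consistent with a relative reduction computed against the baseline WER. The short check below uses only the two WER values from the abstract; the function name is a hypothetical helper introduced for this example.

```python
def relative_wer_reduction(baseline_wer, adapted_wer):
    """Relative WER improvement, as a percentage of the baseline WER."""
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    baseline, adapted = 39.84, 18.48   # overall WER values from the abstract
    print(f"absolute reduction: {baseline - adapted:.2f} points")
    print(f"relative reduction: {relative_wer_reduction(baseline, adapted):.2f}%")
```

The absolute drop is 21.36 percentage points, and 21.36 / 39.84 gives the reported 53.61%.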
Item Type: | Thesis (PhD) |
---|---|
Additional Information: | Thesis (PhD) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2020. |
Uncontrolled Keywords: | Dysarthria; Automatic dysarthric speech recognition system; Classification algorithms; Adaptation techniques; Feature selection methods |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Computer Science & Information Technology |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 22 Jun 2023 08:05 |
Last Modified: | 22 Jun 2023 08:05 |
URI: | http://studentsrepo.um.edu.my/id/eprint/14484 |