Leyla, Roohisefat (2014) Neural response based speaker identification under noisy condition / Leyla Roohisefat. Masters thesis, University of Malaya.
Abstract
Speaker identification is the process of determining which person, among a set of speakers, produced an utterance, in order to certify whether that person is who he or she claims to be. Available speaker identification systems are mostly based on the acoustic signal itself. The problem is that they are very sensitive to noise and perform reliably only at very high signal-to-noise ratios (SNR). Neural responses, however, are very robust against background noise. In this study, a well-known model of the auditory periphery by Zilany and colleagues (J. Acoust. Soc. Am., 2009) is employed to simulate the neural responses, known as the neurogram, for speaker identification, and the average discharge rate or envelope (ENV) and the temporal fine structure (TFS) are then computed from the neurogram. The resulting feature vectors are used to train the system with two types of classifier, the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM). The database consists of text-dependent speech samples from 39 speakers, with 10 speech samples recorded for each speaker in a quiet room. The performance of the proposed method is compared with that of the traditional acoustic-feature-based speaker identification method (mel-frequency cepstral coefficients, MFCC) under both quiet and noisy conditions. Because the neural responses are robust to noise, the proposed neural-response-based system using TFS responses performs better than the MFCC-based method, especially under noisy conditions. In general, GMM yields better accuracy than HMM as the classifier for the proposed method.
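The abstract compares the systems under quiet and noisy conditions in terms of SNR but does not say how the noisy samples were produced. As an illustration only, the sketch below shows one common way to create a test signal at a chosen SNR: scale an additive noise signal so that the ratio of average signal power to average noise power matches the target, using SNR(dB) = 10 log10(P_signal / P_noise). The function names, the sine-tone stand-in for speech, the white Gaussian noise, and the SNR values are all assumptions made for this example and are not taken from the thesis.

```python
import math
import random


def mean_power(samples):
    """Average power of a signal: the mean of its squared samples."""
    return sum(x * x for x in samples) / len(samples)


def scale_noise_to_snr(signal, noise, snr_db):
    """Return a copy of `noise` scaled so that the signal-to-noise
    power ratio against `signal` equals `snr_db` decibels."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise must not be all zeros")
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    target_noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [gain * n for n in noise]


def mix_at_snr(signal, noise, snr_db):
    """Add noise to the signal at the requested SNR (additive model)."""
    scaled = scale_noise_to_snr(signal, noise, snr_db)
    return [s + n for s, n in zip(signal, scaled)]


def measured_snr_db(signal, noise):
    """SNR in dB computed from the average powers of signal and noise."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-ins: a 440 Hz tone at 8 kHz instead of speech,
    # and white Gaussian noise instead of a recorded noise source.
    clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    white = [rng.gauss(0.0, 1.0) for _ in range(8000)]
    for snr in (20, 10, 0, -5):  # illustrative SNR values only
        scaled = scale_noise_to_snr(clean, white, snr)
        noisy = mix_at_snr(clean, white, snr)
        print(snr, round(measured_snr_db(clean, scaled), 6), len(noisy))
```

In an evaluation of this kind, the mixed signal would be passed through the same feature extraction (neurogram-based or MFCC) as the clean training data, so that accuracy can be compared across SNR levels.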