Adding emotions to synthesized Malay speech using diphone-based templates / Syaheerah Lebai Lutfi

Syaheerah, Lebai Lutfi (2007) Adding emotions to synthesized Malay speech using diphone-based templates / Syaheerah Lebai Lutfi. Masters thesis, University of Malaya.

    Abstract

    This study describes the addition of an affective component to a Malay TTS system in order to produce a system that is more expressive in nature. It introduces a new method for generating expressive speech by embedding an ‘emotion layer’ called eXpressive Text Reader Automation Layer, abbreviated as eXTRA. The emotion generation method is template-driven. The templates are diphone-based and each template carries unique affective data. The two types of emotions created for the system are anger and sadness. To ensure naturalness, the input sentence from the user is matched with the template that consists of a sentence with the same syllable structure as the input sentence, allowing the emotion parameters from the template to be applied to the input at the level of phonemes. This syllable-sensitive matching process requires analysis of each syllable's consonant and vowel pattern. The module is an independent component that can serve as an extension to any Malay TTS system that uses the Multiband Resynthesis Overlap Add (MBROLA) engine for diphone concatenation. In a pilot project, the prototype is used with Fasih, the first Malay Text-to-Speech system developed by MIMOS Berhad, which can read unrestricted Malay text. eXTRA is evaluated through perception tests. The results show a recognition rate of more than sixty percent, which confirms the satisfactory performance of this approach. The solution should improve the output of Malay TTS systems.
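
    The syllable-sensitive matching described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the hyphen-separated syllable input, the digraph list, and the example sentences are all assumptions made for illustration. Each syllable is reduced to a consonant/vowel (C/V) pattern, and a template is chosen when its per-word, per-syllable patterns equal those of the input.

```python
# Illustrative sketch only; all names and conventions here are hypothetical.
VOWELS = set("aeiou")
# Assumption: these Malay digraphs are treated as a single consonant.
DIGRAPHS = ("ng", "ny", "sy", "kh", "gh")


def cv_pattern(syllable):
    """Reduce one syllable to a string of 'C' and 'V' symbols."""
    s = syllable.lower()
    pattern = []
    i = 0
    while i < len(s):
        if s[i:i + 2] in DIGRAPHS:
            pattern.append("C")
            i += 2
        elif s[i] in VOWELS:
            pattern.append("V")
            i += 1
        elif s[i].isalpha():
            pattern.append("C")
            i += 1
        else:
            i += 1  # skip punctuation and other non-letters
    return "".join(pattern)


def sentence_structure(sentence):
    """Assumes words are space-separated and syllables hyphen-separated."""
    return [[cv_pattern(syl) for syl in word.split("-")]
            for word in sentence.split()]


def find_template(sentence, templates):
    """Return the first template with the same syllable structure, else None."""
    target = sentence_structure(sentence)
    for template in templates:
        if sentence_structure(template) == target:
            return template
    return None


if __name__ == "__main__":
    templates = ["sa-ya ma-kan", "di-a per-gi"]
    print(sentence_structure("ka-mu ti-dur"))
    print(find_template("ka-mu ti-dur", templates))
```

    In a full system, the matched template would supply the per-phoneme emotion parameters that are then applied to the input; that step, and automatic syllabification, are outside the scope of this sketch.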

    Item Type: Thesis (Masters)
    Additional Information: Dissertation (M.A.) – Faculty of Computer Science & Information Technology, University of Malaya, 2007.
    Uncontrolled Keywords: Diphone-based templates; Malay Text-to-Speech system; MIMOS Berhad; Pitch pattern; Human speech tones
    Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
    Q Science > QA Mathematics > QA76 Computer software
    Divisions: Faculty of Computer Science & Information Technology
    Depositing User: Mr Mohd Safri Tahir
    Date Deposited: 26 Aug 2020 03:00
    Last Modified: 26 Aug 2020 03:00
    URI: http://studentsrepo.um.edu.my/id/eprint/11579
