Mohamed Elhag, Mohamed Abo (2022) A translation model using hybrid approaches to improve Arabizi dialect sentiment analysis in business tweet’s / Mohamed Elhag Mohamed Abo. PhD thesis, Universiti Malaya.
Abstract
Microblogs and social media platforms are the most popular electronic communication media. On platforms such as Twitter, important content expressing users' thoughts and viewpoints is produced and exchanged daily. Recently, sentiment analysis has brought great opportunities to businesses and service providers interested in tracking and monitoring the reputation of their brands, and to policymakers who need support in assessing public opinion about brands, services, and policy issues. Various sentiment analysis tools for Twitter and similar microblogging networks have been developed lately. Most of these models rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment. However, these models have weaknesses: they do not handle Arabizi text, emojis, or the ambiguity of Arabic accurately, and they ignore emotional words of fewer than three letters when detecting the sentiment of a text. This research investigates Arabic dialect text in the sentiment analysis of microblogs, with the aim of addressing the problem in service domains such as hotels, restaurants, and transportation. Twitter data is used as the study dataset for microblogging platforms to investigate whether capturing the sentiment of Arabizi words, emojis, and emotional words of fewer than three letters has any effect, and whether disambiguation leads to more accurate sentiment analysis models on Twitter. A hybrid translation model is proposed to address these issues by extracting, detecting, annotating, and translating the language types in a sentence for sentiment analysis. The experiments were conducted in two stages. The first stage covered language detection and Arabizi and emoji translation, evaluating their impact on popular sentiment analysis tasks on Twitter. The second stage covered the ambiguity-sensitive adaptation of a sentiment dictionary.
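The language-type detection and annotation step mentioned in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the thesis's model: the names `tag_token` and `annotate`, the tiny emoji table, and the rule "Latin letters plus one of the digits 2, 3, 5, 7 means Arabizi" are all hypothetical simplifications.

```python
import re

# Hypothetical lookup data for illustration only; not taken from the thesis.
ARABIC_LETTERS = re.compile(r"[\u0600-\u06FF]")   # basic Arabic Unicode block
LATIN_LETTERS = re.compile(r"[A-Za-z]")
ARABIZI_DIGITS = set("2357")                      # digits often used in place of Arabic letters
EMOJI_GLOSSES = {"\U0001F60A": "happy", "\U0001F621": "angry"}


def tag_token(token):
    """Assign a coarse language-type label to a single whitespace-separated token."""
    if token in EMOJI_GLOSSES:
        return "emoji"
    if ARABIC_LETTERS.search(token):
        return "arabic"
    if LATIN_LETTERS.search(token):
        # Simplifying assumption: a Latin-script token containing one of the
        # digits above is treated as Arabizi. Real Arabizi often has no digits,
        # so a realistic detector would need a lexicon or a trained classifier.
        if any(ch in ARABIZI_DIGITS for ch in token):
            return "arabizi"
        return "latin"
    return "other"


def annotate(tweet):
    """Return (token, label) pairs for every token in the tweet."""
    return [(token, tag_token(token)) for token in tweet.split()]
```

Each label could then route the token to a matching translation step (for example, replacing an emoji with its gloss) before sentiment scoring; that routing is likewise an assumption about how such a pipeline might be organised.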
Three sentiment dictionaries and up to five Twitter datasets with various features were evaluated under each sentiment analysis task, and the results were compared with numerous state-of-the-art sentiment analysis approaches using machine-learning classifiers that have been extensively discussed in the literature. The findings demonstrate the importance of language detection and of translating Arabizi, emojis, and emotional words of fewer than three letters for sentiment analysis of business tweets. The proposed model focuses on the Saudi Arabic dialect for sentiment analysis at the sentence level. It surpasses other Arabic dialect and Modern Standard Arabic models on most datasets. The evaluation also showed that the proposed framework produced better results than the five-dataset results obtained in the same environment. Furthermore, the Naïve Bayes classifier achieved the best results, with 94%, 94%, 89%, and 91% for accuracy, precision, recall, and F-measure, respectively.
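As a quick consistency check on the reported figures, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for Naive Bayes in the abstract.
score = f_measure(0.94, 0.89)
print(f"{score:.2f}")  # prints 0.91
```

Under that assumption, the reported 94% precision and 89% recall are consistent with the reported 91% F-measure.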
Item Type: | Thesis (PhD) |
---|---|
Additional Information: | Thesis (PhD) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2022. |
Uncontrolled Keywords: | Arabizi; Arabic dialect; Machine learning; Tweet; SaudiZi |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Computer Science & Information Technology |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 28 May 2023 07:12 |
Last Modified: | 28 May 2023 07:12 |
URI: | http://studentsrepo.um.edu.my/id/eprint/14426 |