Wandeep Kaur, Ratan Singh (2020) Multi-tier classification based on sentiment, type, emotion and purpose for online diabetes community / Wandeep Kaur Ratan Singh. PhD thesis, Universiti Malaya.
Abstract
The evolution of social media platforms has created a niche in which users increasingly turn to such sites to share and exchange health-related information. Facebook, as one of the largest social networking sites, has encouraged this exchange, producing a large volume of data hidden within unstructured text. The aim of this research is to propose a multi-tier classification framework based on sentiment, type, emotion and purpose (STEP) to classify data collected from the diabetes community on Facebook. The proposed STEP framework has three tiers: type, purpose, and sentiment (with emotion handled in the same tier as sentiment). The first tier classifies the type of diabetes. Here, a manually built type lexicon covering all three forms of diabetes (type 1, type 2 and gestational diabetes) was created. Naïve Bayes with n-gram features was used for classification, and the proposed STEP framework achieved an F1-score of 77% against benchmark models. Posts that could not be classified into any one type were grouped under Other, while the correctly classified posts moved down to the next tier for purpose classification. In the second tier, posts were classified according to symptoms, lifestyle and treatment. A weighted information gain feature selection technique was adopted, in which weights were redistributed for features that had been wrongly classified during the training phase. Co-training multinomial Naïve Bayes was used, with the two base classifiers applied to both label and feature classification. The novelty lies in a dimensionality reduction technique that converts numeric vectors to string vectors using Word2Vec, which improved the F1-score to 61% compared with only 48%. The last tier of the proposed STEP framework addressed sentiment and emotion classification.
Here, a mathematical equation was proposed to calculate sentiment intensity from the Facebook behaviors of like, comment, share and reaction. Past studies have analyzed these behaviors and how they affect sales; this research instead converts those counts into an intensity value that can be used to better classify sentiment. Results show that the proposed sentiment classifier achieved a better F1-score of 84%. Emotion classification was conducted within the same tier, where the Word2Vec continuous bag-of-words (CBOW) model was adopted using a bootstrapping methodology. A similarity check between the annotated corpus and EmoLex determined the dominant emotion, and each post was classified accordingly. This changed the classification process from detecting multiple emotions per post to assigning the single most dominant emotion extracted from the post. The proposed framework improved overall classification accuracy within each of its tiers. By using a multi-tier design, it removed posts in the upper tiers that do not contribute to classification, yielding a more refined dataset for classification in the lower tiers.
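
The first tier described in the abstract (n-gram Naïve Bayes with an Other fallback) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not give the actual lexicon, feature set or training data, so the class name `NgramNaiveBayes`, the helper `ngrams`, the toy posts, and the rule that a post with no n-gram seen in training falls into Other are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return lowercase word n-grams of length 1..n_max (hypothetical helper)."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over word n-grams with Laplace smoothing."""

    def fit(self, posts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        for post, label in zip(posts, labels):
            self.feature_counts[label].update(ngrams(post))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, post):
        # Keep only n-grams that were seen during training.
        feats = [f for f in ngrams(post) if f in self.vocab]
        if not feats:
            # Assumed rule: nothing recognisable in the post, so route it to Other.
            return "Other"
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)    # log prior
            for f in feats:
                score += math.log((counts[f] + 1) / denom)  # smoothed log likelihood
            scores[label] = score
        return max(scores, key=scores.get)


# Toy, made-up posts; not data from the thesis.
train_posts = [
    "insulin pump since childhood",
    "daily insulin injections autoimmune",
    "metformin and diet changes",
    "metformin weight loss exercise",
    "pregnant glucose tolerance test",
    "pregnant third trimester glucose",
]
train_labels = ["type 1", "type 1", "type 2", "type 2", "gestational", "gestational"]

model = NgramNaiveBayes().fit(train_posts, train_labels)
print(model.predict("insulin pump"))      # type 1
print(model.predict("hello everyone"))    # Other
```

In a multi-tier setup like the one the abstract describes, only posts that receive a diabetes type here would be passed on to the purpose tier; posts labelled Other would be dropped from further classification.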
Item Type: | Thesis (PhD) |
---|---|
Additional Information: | Thesis (PhD) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2020. |
Uncontrolled Keywords: | Multi-tier; Sentiment; Emotion; Annotated corpus; Facebook |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources |
Divisions: | Faculty of Computer Science & Information Technology > Dept of Information System |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 19 Feb 2024 07:28 |
Last Modified: | 19 Feb 2024 07:28 |
URI: | http://studentsrepo.um.edu.my/id/eprint/14824 |