Mwenge, Mulenga (2022) Deep learning-based colorectal cancer classification using augmented and normalised gut microbiome data / Mwenge Mulenga. PhD thesis, Universiti Malaya.
Abstract
Colorectal cancer is the third most deadly cancer worldwide. The use of the gut microbiome in early detection of the disease has attracted much attention from the research community due to its non-invasive nature. Recent advances in next-generation sequencing technology have increased the availability of sequence data and created an enabling environment for the growth of gut microbiome research. At the same time, there has been growing interest in machine learning-based detection of diseases using sequence-based gut microbiome data. Detecting colorectal cancer with this approach offers a non-invasive alternative, since the data can be obtained from stool samples. Considering the limitations of existing methods for colorectal cancer detection, such as colonoscopy and the faecal occult blood test, the medical research community has adopted the use of sequence data to identify the disease. While the complex relations that exist between the microbiome and host phenotypes make machine learning algorithms suitable for analysing microbiome data, deep learning methods are becoming more popular due to their outstanding performance in related fields. However, the performance of deep learning methods is also affected by limitations inherent in microbiome data, such as dimensionality, sparsity, and feature dominance.
To address these limitations in deep learning classification of colorectal cancer based on gut microbiome data, three objectives were formulated. First, to investigate the methods used to address limitations associated with microbiome-based datasets in colorectal cancer identification using deep neural network algorithms. Second, to develop novel techniques that combine the strengths of normalisation, feature engineering and data augmentation to address the problems of dimensionality, feature dominance and sparsity in colorectal cancer identification based on gut microbiome data, using deep neural network algorithms. Third, to evaluate the proposed techniques using the benchmark datasets and compare the results with the baseline methods. Consequently, techniques for combining existing normalisation methods, namely feature extension, chaining and stacking, were proposed in the research. Based on the results, the proposed techniques significantly outperform the baseline methods. The research shows that a model that addresses dimensionality, feature dominance and sparsity produces outstanding prediction results in colorectal cancer identification using sequence-based gut microbiome data. The improved results due to the proposed techniques could aid the growth of the research field and beyond.
Item Type: | Thesis (PhD) |
---|---|
Additional Information: | Thesis (PhD) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2022. |
Uncontrolled Keywords: | Deep learning; Colorectal cancer; Microbiome; Normalisation; Augment; Stacking; Chaining; Feature extension |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Computer Science & Information Technology |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 15 May 2023 04:13 |
Last Modified: | 15 May 2023 04:13 |
URI: | http://studentsrepo.um.edu.my/id/eprint/14415 |