Loo, Wei Kit (2024) Hospital readmission risk prediction of COVID-19 patients using machine learning / Loo Wei Kit. PhD thesis, Universiti Malaya.
PDF (The Candidate's Agreement) Restricted to Repository staff only (124Kb)
PDF (Thesis PhD) Restricted to Repository staff only until 31 December 2025 (1783Kb)
Abstract
Coronavirus disease (COVID-19) has evolved rapidly and caused a rise in hospital readmissions. To help reduce the readmission rate, a retrospective study was carried out on 1578 COVID-19 patients admitted to Universiti Malaya Medical Centre (UMMC) from May 2020 to January 2022. This study aimed to apply machine learning and deep learning to the prediction of readmission risk, with three main objectives: to identify potential clinical risk factors leading to COVID-19 readmission, to build a predictive model to prognosticate unplanned hospital readmission, and to analyse the characteristics, duration of treatment and recovery rate of readmitted COVID-19 patients in Malaysia. The study consisted of three phases. In the preliminary phase, medical ethics approval was obtained for data collection at UMMC. Following data acquisition, cleaning, and preprocessing, unstructured data underwent Bag of Words analysis through Natural Language Processing (NLP), while statistical analyses and correlation tests were executed on the refined patient data. Feature selection using the Recursive Feature Elimination (RFE) technique preceded the construction and training of three machine learning models: Logistic Regression, Decision Tree Classifier and Support Vector Machine. Logistic Regression performed best (0.919 accuracy, 0.636 area under the curve (AUC)). In the progressing phase, the dataset was expanded from 443 to 1578 records, with a COVID-19 readmission rate of 8.68%. The expansion prompted the re-computation of the statistical analyses, feature selection, and machine learning processes. Six machine learning models were developed and trained: Logistic Regression, Decision Tree Classifier, Support Vector Machine, Random Forest, eXtreme Gradient Boosting and Category Boosting.
Concurrently, six deep learning models were developed and trained after data balancing: Multilayer Perceptron, TabNet, Value Imputation and Mask Estimation, TabTransformer, Deep Factorial Machine, and Regularization Learning Model. Machine learning performed better than deep learning overall, and Logistic Regression stood out among the models (0.946 accuracy, 0.639 AUC). Among readmitted patients, most had a length of stay (LOS) of 7 days or less (76.64%), and the majority returned to hospital within a 90-day interval (70.8%), indicating a good recovery rate for COVID-19 in the observed population. In the finalizing phase, various feature selection techniques were employed to discern the risk factors for COVID-19 readmission. Seven clinical risk factors were finalized: heart rate, cough, age, LOS, diabetes mellitus, hyperparathyroidism, and asthma. Ultimately, a novel hybrid predictive model integrating the Slime Mold Algorithm (SMA) was developed. By integrating SMA into a Support Vector Machine (SVM), the predictive model achieved an accuracy of 0.946 and an AUC of 0.734.
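The abstract reports accuracy and AUC side by side on a dataset where only 8.68% of patients were readmitted. The following is a minimal illustrative sketch, not taken from the thesis, of why the two metrics can diverge on such imbalanced data. It assumes a hypothetical baseline that predicts "not readmitted" for every patient, and it infers the number of readmitted patients by rounding 1578 × 0.0868, a count the abstract does not state. The thesis figures were presumably computed on held-out data, so the comparison is only indicative.

```python
# Figures quoted in the abstract.
N_PATIENTS = 1578
READMISSION_RATE = 0.0868

# Assumption: the readmitted count is inferred by rounding; it is not stated in the abstract.
n_readmitted = round(N_PATIENTS * READMISSION_RATE)
n_not_readmitted = N_PATIENTS - n_readmitted


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 1 = readmitted, 0 = not readmitted.
y_true = [1] * n_readmitted + [0] * n_not_readmitted

# Hypothetical baseline: everyone is predicted "not readmitted" with the same score.
baseline_pred = [0] * N_PATIENTS
baseline_scores = [0.0] * N_PATIENTS

baseline_accuracy = accuracy(y_true, baseline_pred)
baseline_auc = auc(y_true, baseline_scores)

print(f"readmitted patients (inferred): {n_readmitted}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"baseline AUC: {baseline_auc:.3f}")
```

Under these assumptions the uninformative baseline already reaches an accuracy of about 0.913 while its AUC stays at 0.5, which suggests that AUC is the more discriminating of the two figures reported for each model above.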
Item Type: | Thesis (PhD) |
---|---|
Additional Information: | Thesis (PhD) - Faculty of Engineering, Universiti Malaya, 2024. |
Uncontrolled Keywords: | COVID-19; Readmission; Prediction; Machine learning; Support vector machine (SVM) |
Subjects: | R Medicine > RA Public aspects of medicine; T Technology > TA Engineering (General). Civil engineering (General) |
Divisions: | Faculty of Engineering |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 12 Sep 2024 07:41 |
Last Modified: | 12 Sep 2024 07:41 |
URI: | http://studentsrepo.um.edu.my/id/eprint/15359 |