Enhanced computational methods for detection and interpretation of heart disease based on ensemble learning and autoencoder framework / Abdallah Osama Hamdan Abdellatif

Abdallah Osama, Hamdan Abdellatif (2024) Enhanced computational methods for detection and interpretation of heart disease based on ensemble learning and autoencoder framework / Abdallah Osama Hamdan Abdellatif. PhD thesis, Universiti Malaya.


      Abstract

      Heart disease remains the leading cause of mortality globally, and early detection is critical for reducing mortality rates. However, class imbalance and high dimensionality in clinical data significantly impede the efficacy of Machine Learning (ML) models in this domain. This thesis presents two methods that address these challenges at the algorithmic and data levels to enhance heart disease detection. The first method introduces an Improved Weighted Random Forest (IWRF) approach, which tackles the imbalance problem at the algorithmic level. It employs supervised infinite feature selection (Inf-FSs) to identify significant features and Bayesian optimization to fine-tune hyperparameters. Validated on the Statlog and heart disease clinical records datasets, this method improves prediction accuracy and F-measure over existing models, with accuracy gains of 2.4% and 4.6% on these datasets, respectively. The second method addresses imbalance at the data level through a novel framework named Conditional Autoencoder with Stack Predictor for Heart Disease (CAVE-SPFHD). This approach integrates a conditional variational autoencoder (CVAE) to balance the dataset and a stack predictor (SPFHD) that uses tree-based ensemble learning algorithms. The base models' predictions are combined using a support vector machine, significantly enhancing detection accuracy. Tested across four datasets, CAVE-SPFHD surpasses state-of-the-art methods in F1-score, providing not only improved predictive performance but also critical interpretive insights through the SHapley Additive exPlanations (SHAP) algorithm. Together, these two methods represent a comprehensive approach to heart disease detection in ML, addressing the dual challenges of class imbalance and high dimensionality. By tackling these issues at both the algorithm and data levels, this thesis offers robust, accurate, and interpretable ML solutions for early heart disease detection, which is crucial for proactive healthcare interventions.
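
      The data-level pipeline described in the abstract (balance the training data, train tree-based base models, combine their predictions with a support vector machine) can be illustrated with a minimal scikit-learn sketch. This is not the thesis implementation. The synthetic dataset, the three base learners, all hyperparameters, and the hypothetical helper `oversample_minority` are assumptions made for illustration. In particular, simple random oversampling stands in for the CVAE-based balancing, which is not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic imbalanced data (roughly 80% negative, 20% positive).
# This is an assumption for illustration, not a clinical dataset.
X, y = make_classification(n_samples=600, n_features=13, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)


def oversample_minority(X, y, seed=0):
    """Hypothetical stand-in for CVAE balancing: resample minority rows
    with replacement until every class matches the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        if count < target:
            idx = rng.choice(np.flatnonzero(y == cls),
                             size=target - count, replace=True)
            X_parts.append(X[idx])
            y_parts.append(y[idx])
    return np.vstack(X_parts), np.concatenate(y_parts)


# Balance the training split only; the test split keeps its natural imbalance.
X_bal, y_bal = oversample_minority(X_train, y_train)

# Tree-based base models whose out-of-fold predictions feed an SVM meta-learner.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=SVC(),
    cv=5,
)
stack.fit(X_bal, y_bal)

# F1-score on the positive (minority) class of the untouched test split.
test_f1 = f1_score(y_test, stack.predict(X_test))
print(f"Test F1-score: {test_f1:.3f}")
```

      Because oversampling duplicates rows, copies of the same sample can land in different cross-validation folds inside the stack. A generative balancer such as a CVAE synthesizes new samples instead of repeating existing ones.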

      Item Type: Thesis (PhD)
      Additional Information: Thesis (PhD) - Faculty of Engineering, Universiti Malaya, 2024.
      Uncontrolled Keywords: Heart disease; Conditional variational auto-encoder; Stacking ensemble learning; SHAP; Tree ensemble; Hyperparameter optimization; Feature selection; Imbalance
      Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
      Divisions: Faculty of Engineering
      Depositing User: Mr Mohd Safri Tahir
      Date Deposited: 06 Nov 2024 05:55
      Last Modified: 06 Nov 2024 05:55
      URI: http://studentsrepo.um.edu.my/id/eprint/15481
