Adaptive Mapreduce task scheduler in heterogeneous environment using dynamic calibration / Lu Xinzhu

Lu, Xinzhu (2017) Adaptive Mapreduce task scheduler in heterogeneous environment using dynamic calibration / Lu Xinzhu. Masters thesis, Universiti Malaya.



      MapReduce is a popular programming model for processing large-scale datasets in a distributed environment. Current MapReduce implementations assume that every compute node has the same capacity. In a heterogeneous environment, where compute nodes vary in capacity, this assumption can hinder MapReduce performance. Existing work has shown that makespan can be reduced when workloads are assigned in proportion to the capacity of each heterogeneous compute node. However, these approaches are static: the workload is assigned to each compute node based on historical data. This research proposes an adaptive MapReduce task scheduler, the Adaptive MapReduce Task Scheduler Using Dynamic Calibration (AMTS-DC), to address the unbalanced node capacity problem. The AMTS-DC algorithm uses heartbeats and data locality to dynamically adapt and balance the tasks assigned to each compute node. From the heartbeats received during the early stage of a job, AMTS-DC estimates the capacity of each compute node. The uncomputed local blocks at each compute node are then reassigned so that compute nodes with greater capacity reserve more local blocks. Experimental results show that AMTS-DC performs better than Hadoop FIFO and the Dynamic Data Placement Strategy (DDP) in a dynamic heterogeneous environment. AMTS-DC has been further enhanced with historical data; the enhanced version is named Enhanced Adaptive MapReduce Task Scheduler Using Dynamic Calibration (EAMTS-DC). Experimental results show that EAMTS-DC performs better than AMTS-DC.
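
The calibration idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the heartbeat record format, the function names `count_completed` and `proportional_quota`, and the largest-remainder rounding are all assumptions made for illustration, and data locality is left out entirely. The sketch counts the tasks each node reports as completed during an early calibration window, treats that count as the node's relative capacity, and splits the remaining blocks in proportion to it.

```python
from collections import Counter


def count_completed(heartbeats):
    """Tally completed tasks per node from calibration-window heartbeats.

    `heartbeats` is a hypothetical list of (node_name, tasks_completed)
    tuples; real Hadoop heartbeats carry far more information.
    """
    completed = Counter()
    for node, tasks_completed in heartbeats:
        completed[node] += tasks_completed
    return dict(completed)


def proportional_quota(weights, total_blocks):
    """Split `total_blocks` across nodes in proportion to integer weights.

    Uses integer arithmetic plus largest-remainder rounding (an assumed
    choice) so that the quotas always sum to `total_blocks`.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one node must have completed a task")

    # Floor share and leftover remainder for each node.
    quota = {n: total_blocks * w // total_weight for n, w in weights.items()}
    remainder = {n: total_blocks * w % total_weight for n, w in weights.items()}

    # Hand any unassigned blocks to the nodes with the largest remainders.
    leftover = total_blocks - sum(quota.values())
    by_remainder = sorted(weights, key=lambda n: remainder[n], reverse=True)
    for node in by_remainder[:leftover]:
        quota[node] += 1
    return quota


if __name__ == "__main__":
    # Hypothetical heartbeats gathered during the early stage of a job.
    beats = [("fast", 3), ("medium", 2), ("fast", 3), ("slow", 1), ("medium", 1)]
    weights = count_completed(beats)
    print(proportional_quota(weights, 20))
```

With these made-up heartbeats the nodes complete 6, 3, and 1 tasks, so 20 remaining blocks are split 12, 6, and 2. A scheduler following the abstract would additionally prefer blocks stored locally on each node when filling these quotas.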

      Item Type: Thesis (Masters)
      Additional Information: Dissertation (M.A.) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2017.
      Uncontrolled Keywords: MapReduce; heterogeneous environment; AMTS-DC; Hadoop FIFO; Historical data
      Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
      Divisions: Faculty of Computer Science & Information Technology
      Depositing User: Mr Mohd Safri Tahir
      Date Deposited: 12 Apr 2023 04:21
      Last Modified: 12 Apr 2023 04:21
