Energy Efficiency Aware Task Assignment with DVFS in Heterogeneous Hadoop Clusters

Dazhao Cheng, Xiaobo Zhou, Palden Lama, M. Ji, Changjun Jiang

Research output: Contribution to journal › Article › peer-review

Abstract

While Hadoop ecosystems have become increasingly important for practitioners of large-scale data analysis, they also incur tremendous energy costs. This trend is driving the need to design energy-efficient Hadoop clusters in order to reduce the operational costs and the carbon emissions associated with their energy consumption. However, despite extensive studies of the problem, existing approaches for energy efficiency have not fully considered the heterogeneity of both workload and machine hardware found in production environments. In this paper, we find that heterogeneity-oblivious task assignment approaches are detrimental to both the performance and the energy efficiency of Hadoop clusters. Our observations show that even heterogeneity-aware techniques that aim to reduce job completion time do not guarantee a reduction in the energy consumption of heterogeneous machines. We propose a heterogeneity-aware task assignment approach, E-Ant, that aims to reduce the overall energy consumption of a heterogeneous Hadoop cluster without sacrificing job performance. It adaptively schedules heterogeneous workloads on energy-efficient machines, without a priori knowledge of the workload properties. E-Ant employs an ant colony optimization approach that generates task assignment solutions in an agile way, based on feedback about each task's energy consumption reported by Hadoop TaskTrackers. Furthermore, we integrate the dynamic voltage and frequency scaling (DVFS) technique with E-Ant to further improve the energy efficiency of heterogeneous Hadoop clusters. The integrated design relies on a DVFS controller to dynamically scale the CPU frequency of each slave machine in response to time-varying resource demands. Experimental results on a heterogeneous cluster with varying hardware capabilities show that E-Ant with DVFS improves the overall energy savings for a synthetic workload from Microsoft by 23 and 17 percent compared to Fair Scheduler and Tarazu, respectively.
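
The feedback loop described in the abstract can be illustrated with a generic ant colony optimization sketch. This is not E-Ant's actual algorithm, whose formulas are not given here; it only assumes a textbook pheromone scheme in which trails evaporate each round and machines that report lower per-task energy receive a larger deposit. The function names, the `evaporation` parameter, the `1 / energy` deposit rule, and the machine labels are all hypothetical.

```python
import random


def choose_machine(pheromone, job, machines, rng):
    """Pick a machine with probability proportional to its pheromone for this job."""
    weights = [pheromone[(job, m)] for m in machines]
    return rng.choices(machines, weights=weights, k=1)[0]


def update_pheromone(pheromone, job, machines, energy_report, evaporation=0.5):
    """Evaporate every trail for the job, then deposit on machines that reported.

    energy_report maps machine -> measured energy per task (assumed > 0).
    The deposit 1 / energy is an illustrative choice: lower energy, larger deposit.
    """
    for m in machines:
        pheromone[(job, m)] *= 1.0 - evaporation
        if m in energy_report:
            pheromone[(job, m)] += 1.0 / energy_report[m]


if __name__ == "__main__":
    rng = random.Random(0)
    machines = ["efficient", "inefficient"]      # hypothetical machine classes
    true_energy = {"efficient": 2.0, "inefficient": 8.0}  # made-up energy per task
    pheromone = {("job", m): 1.0 for m in machines}

    for _ in range(10):
        # Assign a batch of tasks, then feed back energy only from machines that ran one.
        used = {choose_machine(pheromone, "job", machines, rng) for _ in range(20)}
        report = {m: true_energy[m] for m in used}
        update_pheromone(pheromone, "job", machines, report)

    print(pheromone)
```

Under these assumptions, the trail for the lower-energy machine settles at a higher value, so later tasks of the same job are more likely to be assigned to it, while the random choice still leaves some tasks exploring the other machine.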
Original language: English
Pages (from-to): 70-82
Number of pages: 13
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 29
Issue number: 1
Early online date: 29 Aug 2017
Publication status: Published - 1 Jan 2018

Keywords

  • ant colony optimisation
  • cost reduction
  • data analysis
  • data handling
  • energy conservation
  • energy consumption
  • parallel processing
  • power aware computing
  • scheduling
  • machine hardware
  • heterogeneity-oblivious task assignment approaches
  • heterogeneity-aware techniques
  • heterogeneous machines
  • heterogeneity-aware task assignment approach
  • heterogeneous Hadoop cluster
  • energy-efficient machines
  • ant colony optimization approach
  • task assignment solutions
  • Hadoop TaskTrackers
  • heterogeneous cluster
  • energy savings
  • energy efficiency aware task assignment
  • Hadoop ecosystems
  • energy-efficient Hadoop clusters
  • energy cost
  • operational cost reduction
  • carbon emission reduction
  • large-scale data analysis
  • production environments
  • job completion time reduction
  • E-Ant approach
  • job performance
  • adaptive heterogeneous workload scheduling
  • DVFS technique
  • time-varying resource demands
  • hardware capabilities
  • Microsoft
  • Fair Scheduler
  • Tarazu
  • Energy consumption
  • Hardware
  • Servers
  • Benchmark testing
  • Power demand
  • Electronic mail
  • Ant colony optimization
  • Energy efficiency
  • task assignment
  • DVFS
  • fuzzy control
  • heterogeneity
  • Hadoop
