Available online at www.sciencedirect.com
ScienceDirect
IERI Procedia 10 (2014) 70 – 75

2014 International Conference on Future Information Engineering

An Optimal Task Selection Scheme for Hadoop Scheduling

S. Suresh*, N.P. Gopalan

Department of Computer Applications, National Institute of Technology, Tiruchirappalli 620 015, India
Abstract

MapReduce is a popular parallel programming model used to solve a wide range of BigData applications in cloud computing environments. Hadoop is an open source implementation of MapReduce widely used by a vast number of users. It provides an abstracted environment for running large scale data intensive applications in a scalable and fault tolerant manner. Several Hadoop scheduling algorithms with various performance goals have been proposed in the literature. In this paper, a new optimal task selection scheme is introduced to assist the scheduler when multiple local tasks are available for a node. To improve the probability of local task launches for a job in the future, the task whose input has the fewest replicas, whose input disk is least loaded and whose expected waiting time for the next local node is longest is launched among the available local tasks for a node. The proposed method was evaluated by extensive experiments, and it has been observed that the method improves performance significantly: around 20% improvement is achieved in terms of locality and fairness.
© 2014 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
Selection and peer review under responsibility of Information Engineering Research Institute
Keywords: Hadoop, MapReduce, task selection, job scheduling.
1. Introduction
Recent rapid advancements in computing systems have enabled their application to very complex problems in
* Corresponding author. Tel.: +91-9941506562.
E-mail address: sureshtvmalai85@gmail.com
2212-6678
doi:10.1016/j.ieri.2014.09.093
diverse fields. The data generated and processed by such applications is extremely large. Several computing paradigms, such as grid and cloud computing, have emerged to handle large scale data intensive applications efficiently. However, traditional programming models, load balancing, debugging and scheduling solutions are not efficient when dealing with BigData applications. Redefining the solutions to these problems has attracted considerable research attention, and several frameworks have been proposed in the literature to handle BigData applications efficiently [1-4].
MapReduce [1] is a popular programming model for data intensive applications in cloud environments. It hides the difficulties of parallel programming from users and provides an abstracted environment. Hadoop [5] is a widely used open source implementation of MapReduce. Its ability to provide fault tolerant, load balanced and scalable service over commodity hardware has made Hadoop a standard for BigData processing. The performance of Hadoop depends mainly on its scheduler, and several scheduling algorithms [6-10] have been proposed for different kinds of workloads, environments and applications.
In this paper, a new optimal task selection scheme is introduced to assist the scheduler when multiple local tasks are available for a node. To improve the probability of local task launches for a job in the future, the task whose input has the fewest replicas, whose input disk is least loaded and whose expected waiting time for the next local node is longest is launched among the available local tasks for a node. The proposed method was evaluated by extensive experiments, and it was observed that the method improves performance significantly: around 20% improvement is achieved in terms of locality and fairness. The remainder of the paper is organized as follows. Section 2 explores related work on Hadoop scheduling. The proposed method is presented in Section 3. The performance of the proposed method is evaluated in Section 4, and Section 5 concludes the paper.
2. Related Works
Hadoop uses a master-slave architecture for running applications concurrently. The master and slave nodes communicate by sending heartbeat messages. Whenever a heartbeat is received from a node with a free slot, the Hadoop scheduler launches a task on that node. Hadoop uses a FIFO scheduler by default for job scheduling. FIFO performs poorly in terms of response time for small jobs and fair sharing among users. Several scheduling algorithms [6-10] have been proposed for different kinds of workloads, environments and applications. Yahoo proposed the Capacity Scheduler [6] to share a Hadoop cluster among multiple organizations: a set of queues with minimum guaranteed capacities is created, and the cluster is shared accordingly.
Facebook proposed the Hadoop Fair Scheduler (HFS) [7] to share the cluster fairly among users/applications. A set of pools is created and a minimum share is assigned to each. Zaharia et al. proposed LATE [8] and delay scheduling [9] for efficiently launching speculative tasks and for achieving improved locality by slightly relaxing fairness, respectively. The deadline constraints of jobs are considered in [10]. Readers interested in further Hadoop scheduling techniques can refer to a detailed survey in [11]. Although several algorithms have been proposed in the literature, none of them considers the problem of selecting an optimal task from the set of available local tasks for a node.
3. Proposed Method
Whenever a node becomes free, the Hadoop scheduler chooses a job to assign to the slot according to the scheduling policy. After deciding the job, the scheduler searches the job's list of unscheduled tasks for a local task to launch on the free node. If a local task is found, the scheduler launches it immediately. If no local task is available, the scheduler launches a non local task, or the chance is given to the next job in the queue, depending on the scheduling policy. By default, Hadoop launches the first local task found among the available local tasks of the job. When many local tasks are available for the job, the benefit of choosing one of them over another is not considered. Carefully choosing a task from the set of available local tasks based on various factors may improve the locality of other tasks in the future and may achieve improved load balancing.
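The default first-match behavior described above can be sketched as follows. This is a simplified illustration, not actual Hadoop source code; the Task type, its fields and the function name are hypothetical:

```python
from collections import namedtuple

# Illustrative stand-in for an unscheduled task: `input_nodes` is the set of
# nodes holding a replica of the task's input block (hypothetical model).
Task = namedtuple("Task", ["name", "input_nodes"])

def pick_default(unscheduled_tasks, node):
    """Default behavior: launch the first data-local task found, ignoring
    any benefit of choosing among several local candidates."""
    for task in unscheduled_tasks:
        if node in task.input_nodes:   # data-local candidate
            return task                # first match wins; no comparison made
    return None                        # caller falls back to a non local task
```

The proposed scheme replaces exactly this first-match step with a scored choice among all local candidates.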
By default, Hadoop replicates each input block onto 3 different nodes. Several replication algorithms create and delete files/blocks dynamically according to workload characteristics and balance the load among the nodes. Hadoop uses the Hadoop Distributed File System (HDFS) for storing input and output data, which divides files into fixed sized blocks. This results in a situation where the input of an application (which may consist of many files) is stored with different replica counts. Since the number of nodes is usually less than the number of blocks, and each block is replicated in many places (either statically or dynamically), many input blocks of a job are stored on the same node. In addition, many files may be supplied as the input of an application, which further increases the chance that a node holds several local tasks of the job.
To choose an optimal task with the goal of the highest possible locality of the remaining tasks, knowledge of the future arrival of the preferred nodes' free slots is needed. In addition, launching the task whose input is stored on the least loaded disk among the disks attached to the node achieves a high transfer rate and yields better load balancing at the disk level. The nodes' future arrival times are unknown before their arrival; to predict them, the progress rates of the nodes are used. The following factors are considered while selecting a task: the replica count of the input file/block, the predicted arrival time of the task's next free node and the current load of the disk where the task's input is stored.
The future arrival time of a node with data locality for a task is estimated and used for selecting a task, similar to the optimal page replacement algorithm in operating systems. Let T = {t1, t2, ..., tn} be the set of unscheduled tasks of job Ji that have their input on node ni (the set of local tasks of node ni). Li is the average length of the tasks running on the slots where task ti's input is available, Rt is the replication factor of ti's input and Sik is the number of slots available in the kth local node of ti. The task's expected time for the next free slot is calculated as follows:
ET(ti) = Li / (Σk=1..Rt Sik)    (1)

The load of a disk attached to a node is equal to the sum of the I/O requests from the tasks running on the node
and some of the I/O requests from non local tasks running on other nodes that access the disk through the network. To distribute the load across the disks attached to a node, it is desirable to choose the task whose input is stored on the disk holding the inputs of the fewest running tasks. For simplicity, non local tasks are not considered when calculating disk load. The load of the disk on which a task's input is stored is calculated as follows:
DL(ti) = 1 − Ni / Ns    (2)
where Ns is the number of slots in the node and Ni is the number of running tasks that read their input from the disk where the input of task ti is stored.
Similarly, the replication weightage based on replication count is calculated as follows:
RW(ti) = 1 − (Rt − min(Rt)) / (max(Rt) − min(Rt))    (3)
where max(Rt) and min(Rt) are the maximum and minimum numbers of replicas of the tasks' input blocks, respectively. A higher replica count gives a smaller value of RW(ti), and a lower replica count gives a higher value.
The values of DL(ti) and RW(ti) range from zero to one, and the value of ET(ti) is normalized between zero and one. Finally, for task selection, a multi objective function is formalized as the linear combination of the three objective functions above:
TS(T) = max{α × ET(ti) + β × DL(ti) + γ × RW(ti)}, 1 ≤ i ≤ n    (4)

TS(T) = max{α × (Li / (Σk=1..Rt Sik)) + β × (1 − Ni / Ns) + γ × [1 − (Rt − min(Rt)) / (max(Rt) − min(Rt))]}, 1 ≤ i ≤ n    (5)
The symbols α, β and γ are proportion variables for the corresponding objective functions, and n is the number of local tasks available for the node from the job that could be scheduled next according to the scheduling policy. The performance overhead of the proposed method is trivial: the number of tasks of a job that are local to a node is small even when the cluster size is large, and calculating the value of TS(T) for such a small n takes a negligible amount of time. In addition, the proposed method can be used with any Hadoop scheduling algorithm, and with any system beyond Hadoop that has several scheduling options. The objective function is defined as a linear combination of several objectives, which gives the flexibility to add or remove objectives easily based on requirements. The values of the proportion variables α, β and γ are fixed according to need, and their sum should be one; a higher value means more weight for that objective function.
4. Performance Evaluation
The performance of the proposed method was evaluated by extensive experiments. The simulation experiments were run on a 20 node cluster. Each node has a 2.4GHz quad core processor and 4GB of RAM. All nodes have three 500GB 7200RPM SATA hard disk drives, and the nodes are connected with 1 Gbps Ethernet. The WordCount job is used for the experiments, and the input data was generated using the RandomTextWriter program in Hadoop distribution version 1.0.4. 100 jobs were generated with random input sizes and an inter-arrival time of 15 seconds. The values of α, β and γ were fixed at 0.4, 0.3 and 0.3, respectively, throughout the experiments. All experiments were repeated 5 times, and the averages of the results obtained are presented.
The Hadoop Fair Scheduler (HFS) [7] is used as the scheduler in the experiments. The performance of the proposed method is measured by the improvement achieved in data locality. Figure 1 shows the percentage of local tasks launched with HFS and with HFS plus the proposed task selection method for various job sizes. Small jobs (1 to 2 tasks) achieve a lower percentage of locality in HFS, as their input is available on only a few nodes. Using the proposed method with HFS does not improve locality significantly for small jobs, because the method is not applied to single task jobs, and jobs with 2-4 tasks have a very low probability of having the input blocks of more than one task stored on the same node. For large jobs (more than 50 tasks), the proposed method also achieves only negligible improvements over HFS: large jobs usually have their input data on a larger number of nodes, which leads to high locality by default. On the other hand, the proposed method improves locality significantly for medium size jobs (5 to 50 map tasks). Around 20% improvement in launching local tasks is achieved by the proposed method over HFS.
Delay scheduling [9], proposed by Zaharia et al., improves data locality by relaxing the fairness of resource allocation; users can achieve a desired locality level by setting the delay threshold accordingly. The performance of the proposed scheme was also tested using HFS with delay scheduling. The results obtained in terms of locality are not presented here because the performance improvement was negligible. On the other hand, a considerable improvement is achieved in the fairness of resource allocation among jobs (Fig. 2). The proposed method increases the chance of task locality, which reduces the fairness relaxation the delay scheduling technique needs in order to achieve the desired locality. So, the proposed method improves performance in terms of fairness in the case where delay scheduling is used for higher locality. Improvements in locality and fairness also yield better performance in job response times and load balancing. In future work, analyzing the performance of the proposed algorithm with various scheduling algorithms, cluster sizes and workloads is planned. Several other factors that influence system performance will also be considered for task selection.
[Figure 1: grouped bar chart of the percentage of local map tasks launched (0-100%) for HFS and HFS+Proposed, by job size in number of map tasks: 1-4, 5-20, 21-50, 51-150, 151-300, >350.]

Fig. 1. Percentage of local tasks launched for various job sizes

Fig. 2. Fairness improvements of proposed method over delay scheduling
5. Conclusion
In this paper, an optimal task selection method for Hadoop environments is proposed. A multi objective function based on various objectives, namely the replica count of the input file/block, the estimated arrival time of the task's next free node and the current load of the disk, is presented to select a task from the set of available local tasks of a job. The performance of the proposed method was evaluated by extensive experiments, and around 20% improvement in terms of locality over HFS was observed. Further, it also achieves a significant improvement in fairness without affecting locality when used with delay scheduling. In future work, several other factors that influence system performance will be considered for task selection.
References
[1] Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters.
Communications of the ACM, 51(1):107–113, 2008.
[2] Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. ACM SIGOPS Operating Systems Review, 41(3):59–72, 2007.
[3] Jaliya Ekanayake, Hui Li, Bingjing Zhang, Thilina Gunarathne, Seung-Hee Bae, Judy Qiu, and Geoffrey
Fox. Twister: a runtime for iterative MapReduce. In Proceedings of the 19th ACM International Symposium
on High Performance Distributed Computing, 810–818, 2010.
[4] Yunhong Gu and Robert L. Grossman. Sector and Sphere: the design and implementation of a high-performance data cloud. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 367(1897):2429–2445, 2009.
[5] Apache Hadoop. http://hadoop.apache.org/.
[6] Hadoop’s Capacity Scheduler: http://hadoop.apache.org/docs/stable/capacity_scheduler.html
[7] Hadoop’s Fair Scheduler: http://hadoop.apache.org/docs/stable/fair_scheduler.html
[8] M. Zaharia, A. Konwinski, A. D. Joseph, R. H. Katz, and I. Stoica. Improving MapReduce performance in
heterogeneous environments. In Proceedings of the 8th USENIX Symposium on Operating Systems Design
and Implementation (OSDI), 2008.
[9] M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, and I. Stoica. Delay scheduling: a
simple technique for achieving locality and fairness in cluster scheduling. In Proceedings of the 5th European
Conference on Computer systems (EuroSys), 2010.
[10] K. Kc and K. Anyanwu. Scheduling Hadoop jobs to meet deadlines. In Proceedings of CloudCom, 388–392, 2010.
[11] B. Thirumala Rao and L.S.S. Reddy. Survey on improved scheduling in Hadoop MapReduce in cloud environments. International Journal of Computer Applications, 34(9):29–33, 2011.