
A Novel Contention-Cost Scheduling Algorithm in
Internet of Things Environment
Pham Phuoc Hung1, Mohammad Aazam2, Eui-Nam Huh1
1 Department of Computer Engineering, Kyung Hee University, Yongin-si, South Korea
1{hungpham, johnhuh}@khu.ac.kr, [email protected]
Abstract. The Internet of Things (IoT) is becoming ever more popular in the era of computing and networking. Pervasive computing and ubiquitous services have become an important part of daily life. Furthermore, mobile devices and their use have been spreading very fast, all of which results in huge amounts of digital content being generated. Standalone IoT devices will not be able to handle this data and perform efficient task scheduling on their own. Therefore, integrating IoT with cloud computing has become an urgent demand. In this paper, we present a contention-cost task scheduling mechanism for IoT services in the cloud. We implemented our model in Java and evaluated it in CloudSim. Results and discussion show the effective performance of our model.
Keywords: Parallel computing, IoT, big data, cloud, sensor.
1 Introduction
Internet of Things (IoT) is a technological revolution that represents the future of
connectivity and reachability. In IoT, ‘things’ represents any object from the physical
world, whether it is a communicating device or a non-communicating dumb object.
From a smart device to a leaf of a tree or a bottle of beverage, anything can be part of the Internet. The objects become communicating nodes over the Internet, through data
communication means, primarily Radio Frequency Identification (RFID) tags. IoT
includes smart objects as well. Smart objects are digital objects and perform some
kind of tasks for humans and the environment. This is why IoT is not only a hardware and software paradigm but also includes interaction as well as social aspects [3]. IoT works on the basis of Machine-to-Machine (M2M) communication, but is not limited to it. M2M refers to communication between two machines without human intervention. In IoT, even non-connected entities can become part of the network through a data communicating device, such as a bar code or an RFID tag, sensed by a device (possibly a smart phone) that is eventually connected to the Internet. In this way, non-intelligent objects, known as ‘things’, become the communicating nodes.
IoT-based services are gaining importance rapidly. Many healthcare, pervasive computing, and ubiquitous services are being developed. In addition, gadgets such as Google Glass, Smart Watch, and Google Gear, together with the increasing number of mobile devices and smart phones, generate a lot of data, which makes it difficult to satisfy different quality of service (QoS) requirements and to achieve rapid service composition and deployment. This calls for a better mechanism to manage objects in IoT more efficiently under power and bandwidth constraints, without compromising performance [4].
One of the ways to address these shortcomings is to apply the state-of-the-art Cloud Computing (CC) paradigm. CC has recently become a rising paradigm in the information and communications technology industry, drawing a lot of attention from professionals and researchers [18]. The main idea is to move, or offload, heavy data processing and storage to powerful, centralized servers residing in data centers [2, 17]. CC brings this power to the IoT environment and helps to deal with the obstacles related to performance (e.g., battery life, storage, and bandwidth), environment (e.g., heterogeneity, scalability, and availability), and security (e.g., reliability and privacy) discussed in mobile computing [1]. Hence, it can provide a solution package for the media revolution, or big data, if designed accordingly for IoT and integrated with advanced technologies for data processing, transmission, and storage.
Applying cloud computing in an IoT environment has led to a new computing paradigm named Cloud of Things (CoT) [6, 7]. However, one of the main remaining issues is that IoT devices, such as visual sensor networks or CCTV cameras connected to the cloud, produce multimedia data, and multimedia content consumes more processing power, storage space, and scheduling resources. Therefore, it is very important to manage such content effectively and to perform efficient task scheduling in the cloud. Task scheduling has become a hot topic in the cloud arena, especially for the Cloud of Things. Numerous studies [9, 13, 15] attempt to solve task scheduling problems, but none of them is completely efficient because they have not yet considered users' cost and network connectivity. This paper presents a proposal for scheduling applications in CoT to improve Quality of Service (QoS). This approach is substantial and necessary to support services in a big data environment. Additionally, the method aims at reducing the processing time of cloud services while considering network contention and customer payment. Theoretical and simulation results show that the proposed scheme can achieve advantages over others in some cases.
The rest of the paper is organized as follows: recent related work is reviewed in Section 2; Section 3 reveals the motivation behind the work with a detailed scenario; Section 4 presents the system architecture; Section 5 formulates the main problems and their proposed solutions; Section 6 focuses on the implementation and performance evaluation; Section 7 concludes the paper and discusses possible future work.
2 Related Studies
There have been various studies that attempt to solve task scheduling problems in homogeneous and heterogeneous systems, where the sequence of tasks (the workflow) is commonly represented by a directed acyclic graph (DAG), as shown in Fig. 3. In
[12], the authors propose a task scheduling approach for assigning processors to task
graph templates prepared in advance. The limitation of this method is that it does not
consider network contention. Sinnen et al. [13] present an efficient task scheduling method based on network contention; however, the method does not take into account the monetary cost paid by cloud service customers (CSCs) for the use of cloud resources.
Ruben V. den Bossche et al. in [9] introduce a cost-efficient approach to select the
most appropriate system to execute the workflow according to a deadline constraint as
well as cost savings. Nevertheless, this work does not consider the tradeoff between
the time and the cost. Lingfang et al. in [14] propose budget-conscious scheduling algorithms to strictly satisfy the budget constraint. Meanwhile, J. Li et al. in [15] present a scheduling algorithm for large graph processing applications that considers both cost and schedule length. Lately, the authors in [10, 11] suggest an integration of CC and wireless sensor networks; in this approach, patients' vital data collected by sensors attached to medical equipment can be uploaded to and processed on the cloud. However, they do not address how to schedule applications to minimize the processing time as well as the monetary cost paid by cloud service customers. To the best of our knowledge, no existing scheduling approach has considered both network contention and the cloud cost, together with the tradeoff between them, especially in a CoT environment. Therefore, in this paper, we introduce a novel method to address this shortcoming.
3 Motivating Scenario
In this section, we discuss the importance and applicability of our work by presenting a relevant scenario. In the CoT environment described in Fig. 1, many devices can be connected through numerous types of sensors of “things” to monitor and gather various physical or environmental information (e.g., temperature, sound, and pressure for human activity recognition, sensor activity recognition, image processing, and health monitoring) in order to detect and signal changing conditions.
Fig. 1. Motivating scenario.
This sensed data (sensor, video, or audio data) can be used as input and uploaded to the computing environment, which consists of virtual machines (VMs). Every day, a huge amount of data is transferred back to data centers and billions of records are generated, while user demands for higher satisfaction keep increasing. This requires a very sophisticated scheduling methodology that is able to process large data in parallel to guarantee QoS requirements, minimize processing time, reduce cost, and increase user satisfaction. Consequently, our paper deals with the following issue: scheduling tasks to minimize the execution time of the cloud system while considering network contention and the cloud cost.
4 System Architecture
The following section gives an insight into the system architecture we propose to address the issues discussed in the above scenario.
Fig. 2. System architecture.
Our architecture has three layers, as illustrated in Fig. 2: (1) the Cloud Provider layer, which contains Micro Data Center (MDC) machines (VMs) or Mega Data Centers; (2) the Cloud Customer layer, where the “things” of CoT reside and are able to receive requests via sensors; and (3) the Broker layer. In the Broker layer, there is a VM functioning as a centralized management node, called the Task Scheduler. The Task Scheduler receives all computation requests of users; controls the Scheduling Agent to determine whether received data will be added to a service queue or not; asks the Information Collector to manage VM profiles (processing capacity, network bandwidth) as well as computation costs, together with the results of data queries returned from the VMs; and accordingly creates the most reasonable schedule for an input workflow.
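To make the broker layer more concrete, the following Java sketch outlines one possible way to wire the Task Scheduler, Scheduling Agent, and Information Collector together; the interfaces, method names, and the VmProfile type are illustrative assumptions rather than the paper's actual implementation.

```java
import java.util.List;

// Illustrative broker-layer sketch; interface and method names are assumptions.
interface SchedulingAgent {
    // Decides whether newly received sensed data is added to the service queue.
    boolean admitToServiceQueue(byte[] sensedData);
}

interface InformationCollector {
    // Returns VM profiles: processing capacity, bandwidth, and cost rates.
    List<VmProfile> collectVmProfiles();
}

class VmProfile {
    final int vmId;
    final double mips;            // processing capacity
    final double bandwidthMbps;   // network bandwidth
    final double costPerTimeUnit; // processing/waiting price
    final double costPerDataUnit; // outgoing data price

    VmProfile(int vmId, double mips, double bandwidthMbps,
              double costPerTimeUnit, double costPerDataUnit) {
        this.vmId = vmId;
        this.mips = mips;
        this.bandwidthMbps = bandwidthMbps;
        this.costPerTimeUnit = costPerTimeUnit;
        this.costPerDataUnit = costPerDataUnit;
    }
}

class TaskSchedulerNode {
    private final SchedulingAgent agent;
    private final InformationCollector collector;

    TaskSchedulerNode(SchedulingAgent agent, InformationCollector collector) {
        this.agent = agent;
        this.collector = collector;
    }

    // Builds a schedule for incoming requests from the collected VM profiles;
    // the actual CCaS scheduling logic is formulated in Section 5.
    void schedule(List<byte[]> incomingData) {
        for (byte[] data : incomingData) {
            if (!agent.admitToServiceQueue(data)) continue; // skip non-admitted data
            List<VmProfile> profiles = collector.collectVmProfiles();
            // ... prioritize tasks and map each one to the VM that minimizes
            //     the cost/EFT tradeoff described in Section 5
        }
    }
}
```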
In the next section, we formulate the problem and describe our proposed approach.
5 Problem Formulation and Solution
In this section, we first define the terms used and then formulate the problem.
Task scheduling on a target system is defined as the problem of allocating the tasks of
an application to a set of processors in order to minimize total execution time. Thus,
the input of task scheduling includes a task graph and a process graph. The output is a
schedule representing the assignment of a processor to each task node.
Definition 1. A task graph, e.g. as in Fig. 3(a), is represented by a DAG, G =(V, E,
w, c), where the set of vertices V ={v1,v2,...,vk} represents the set of parallel subtasks,
and the directed edge eij = (vi,vj )∈E describes the communication between subtasks vi
and vj, w(vi) associated with task vi ∈ V represents its computation time, and c(eij) represents the communication time between task vi and task vj with the corresponding transferred data d(eij). We presume that a task vi without any predecessors, prec(vi) = ∅, is an entry task ventry, and a task without any successors, succ(vi) = ∅, is an end task vend. Each task has a workload wli, which specifies the amount of work to be processed with the computing resources. Besides, each task is associated with a set of preceding subtasks prec(vi) and a set of successive subtasks succ(vi); ts(vi, Pj) denotes the Start Time and w(vi, Pj) the Execution Time of task vi ∈ V on processor Pj. Hence, the finish time of that task is given by tf(vi, Pj) = ts(vi, Pj) + w(vi, Pj).
Fig. 3. (a) A sample task graph (DAG) and (b) a processor graph.
Suppose that the following conditions are satisfied:
Condition 1. A task cannot begin its execution until all of its inputs have been fully gathered. Each task appears only once in the schedule.
Condition 2. The ready time tready(vi, Pj) is the time at which processor Pj completes its last assigned task and is ready to execute task vi. Therefore,
t_{ready}(v_i, P_j) = \max\Big\{ \max_{v_y \in exec(j)} t_f(v_y, P_j),\ \max_{e_{zi} \in E,\ v_z \in prec(v_i)} t_f(e_{zi}) \Big\},   (1)
where exec(j) is the set of tasks executed at processor Pj, and tf(ezi) = tf(vz) + c(ezi).
Condition 3. Let [tA,tB]∈[0,∞] be an idle time interval on processor Pj in which no
task is executed. A free task vi ∈ V can be scheduled on processor Pj within [tA,tB] if
\max\{t_A,\ t_{ready}(v_i, P_j)\} + w(v_i, P_j) \le t_B.   (2)
Definition 2. A processor graph TG = (N, D), illustrated in Fig. 3(b), is a graph that describes the topology of the network between vertices (processors), which are cloud VMs. In this model, N is the finite set of vertices, and a directed edge dij ∈ D denotes a directed link from vertex Pi to vertex Pj with Pi, Pj ∈ N. Each processor Pi has a processing rate pi and bandwidth bwi on the link connecting it to other processors.
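As a concrete illustration of Definitions 1 and 2, the following Java sketch shows one possible in-memory representation of the task graph G = (V, E, w, c) and the processor graph TG = (N, D); the class and field names are our own assumptions, chosen only to mirror the notation above.

```java
import java.util.ArrayList;
import java.util.List;

// Definition 1 (sketch): a task v_i with workload wl_i and weighted edges e_ij.
class Task {
    final int id;
    final double workload;                                  // wl_i
    final List<Edge> successors = new ArrayList<Edge>();    // succ(v_i)
    final List<Edge> predecessors = new ArrayList<Edge>();  // prec(v_i)

    Task(int id, double workload) { this.id = id; this.workload = workload; }

    boolean isEntry() { return predecessors.isEmpty(); }    // v_entry
    boolean isEnd()   { return successors.isEmpty(); }      // v_end
}

// Edge e_ij carrying d(e_ij) units of data from task v_i to task v_j.
class Edge {
    final Task from, to;
    final double dataSize;                                  // d(e_ij)

    Edge(Task from, Task to, double dataSize) {
        this.from = from; this.to = to; this.dataSize = dataSize;
        from.successors.add(this);
        to.predecessors.add(this);
    }
}

// Definition 2 (sketch): a processor (cloud VM) P_i with rate p_i and bandwidth bw_i.
class Processor {
    final int id;
    final double rate;                                      // p_i
    final double bandwidth;                                 // bw_i

    Processor(int id, double rate, double bandwidth) {
        this.id = id; this.rate = rate; this.bandwidth = bandwidth;
    }
}
```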
Proposed Approach
This section describes the proposed method. Given a task graph G = (V, E, w, c) and a processor graph TG = (N, D), our approach consists of two steps:
• Determining the task priority
In this step, each task is assigned a priority based on its upward rank value in the task graph. The priority of a task vi is estimated by the length of the critical path leaving the task. Recursively, the priority value pr of task vi is defined as:
pr(v_i) = \begin{cases} \overline{w}(v_i) + \max_{v_j \in succ(v_i)} \big[\overline{c}(e_{ij}) + pr(v_j)\big], & v_i \ne v_{end} \\ \overline{w}(v_i), & v_i = v_{end} \end{cases}   (3)
where w(vi) is the average execution time of task vi and c(eij) is the average communication time between tasks vi and vj, respectively:

\overline{w}(v_i) = \dfrac{wl_i}{\big(\sum_{n_k \in N} p_k\big)/n}, \qquad \overline{c}(e_{ij}) = \dfrac{d(e_{ij})}{\big(\sum_{n_k \in N} bw_k\big)/n},   (4)

where n is the number of processors in the cloud environment.
Finally, we sort all tasks in descending order of pr, which represents the length of the remaining schedule; this ordering also provides a topological order of the tasks. In the next part, we propose an approach to select the most appropriate virtual machine to execute these prioritized tasks.
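Under the Task, Edge, and Processor representation sketched earlier, this prioritization step could be implemented roughly as follows; avgExecTime and avgCommTime correspond to the averages of equation (4), and the memoized recursion follows equation (3). The names are illustrative, not taken from the paper's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PriorityCalculator {
    private final double avgRate;        // (sum of p_k) / n
    private final double avgBandwidth;   // (sum of bw_k) / n
    private final Map<Task, Double> memo = new HashMap<Task, Double>();

    PriorityCalculator(List<Processor> processors) {
        double p = 0, bw = 0;
        for (Processor proc : processors) { p += proc.rate; bw += proc.bandwidth; }
        this.avgRate = p / processors.size();
        this.avgBandwidth = bw / processors.size();
    }

    double avgExecTime(Task t) { return t.workload / avgRate; }       // eq. (4)
    double avgCommTime(Edge e) { return e.dataSize / avgBandwidth; }  // eq. (4)

    // Upward rank pr(v_i) of equation (3), computed recursively with memoization.
    double priority(Task t) {
        Double cached = memo.get(t);
        if (cached != null) return cached;
        double pr = avgExecTime(t);
        if (!t.isEnd()) {
            double best = 0;
            for (Edge e : t.successors) {
                best = Math.max(best, avgCommTime(e) + priority(e.to));
            }
            pr += best;
        }
        memo.put(t, pr);
        return pr;
    }

    // Sorting by descending pr yields the topological order used for scheduling.
    List<Task> prioritize(List<Task> tasks) {
        List<Task> sorted = new ArrayList<Task>(tasks);
        Collections.sort(sorted, new Comparator<Task>() {
            public int compare(Task a, Task b) {
                return Double.compare(priority(b), priority(a));
            }
        });
        return sorted;
    }
}
```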
• Choosing the most appropriate processor to execute the above tasks
The start time of a task is determined by when its last preceding task is completed. Hence, to determine that start time, the earliest idle interval [tA, tB] on processor Pj satisfying Condition 2 and Condition 3 has to be found. As a result, the start time ts of task vi on processor Pj is set as:
t_s(v_i, P_j) = \begin{cases} \max\big(t_A,\ t_{ready}(v_i, P_j)\big), & \text{if } v_i \ne v_{entry} \\ 0, & \text{otherwise} \end{cases}   (5)
Thus, the Earliest Start Time (EST) of a task vi executed on a processor Pj is computed as follows:

EST(v_i, P_j) = \max_{v_z \in prec(v_i),\ P_k \in N} \big(t_f(v_z, P_k)\big) + \max_{P_k \in N} \big(c(e_{ikj})\big),   (6)
where c(eikj ) is the communication time between processors Pk and Pj defined as:
c(eikj )  (diik 

vz ( prec ( vi ) exec ( k ))
dozi )*(
1
1

).
bw j bwk
(7)
Here, diik is the amount of input data stored at processor Pk and used for executing task vi, and dozi is the amount of outgoing data produced at Pk and then transferred to Pj.
Therefore, Earliest Finish Time (EFT) of the task vi is calculated as:
EFT(v_i, P_j) = w(v_i, P_j) + EST(v_i, P_j).   (8)
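A rough Java sketch of the timing computations in equations (5)-(8) is given below, reusing the earlier Task/Edge/Processor sketch; PartialSchedule and the diIk parameter (the input data of vi already stored at Pk) are hypothetical helpers introduced only for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping for a partial schedule: where each already-scheduled
// task ran and when it finished.
class PartialSchedule {
    final Map<Task, Processor> placement = new HashMap<Task, Processor>();
    final Map<Task, Double> finishTime = new HashMap<Task, Double>();
}

class TimeEstimator {
    // w(v_i, P_j): execution time of task v_i on processor P_j.
    static double execTime(Task t, Processor p) { return t.workload / p.rate; }

    // c(e_ikj) of eq. (7): time to move v_i's data from P_k to P_j.
    static double commTime(Task t, Processor from, Processor to,
                           PartialSchedule s, double diIk) {
        double data = diIk;                          // input data stored at P_k
        for (Edge e : t.predecessors) {
            if (s.placement.get(e.from) == from) {
                data += e.dataSize;                  // do_zi produced at P_k
            }
        }
        return data * (1.0 / to.bandwidth + 1.0 / from.bandwidth);
    }

    // EST of eq. (6): latest predecessor finish time plus the worst-case transfer;
    // entry tasks start at time 0 as in eq. (5).
    static double est(Task t, Processor target, List<Processor> all,
                      PartialSchedule s, double diIk) {
        if (t.isEntry()) return 0.0;
        double latestFinish = 0.0, worstComm = 0.0;
        for (Edge e : t.predecessors) {
            latestFinish = Math.max(latestFinish, s.finishTime.get(e.from));
        }
        for (Processor k : all) {
            worstComm = Math.max(worstComm, commTime(t, k, target, s, diIk));
        }
        return latestFinish + worstComm;
    }

    // EFT of eq. (8).
    static double eft(Task t, Processor target, List<Processor> all,
                      PartialSchedule s, double diIk) {
        return est(t, target, all, s, diIk) + execTime(t, target);
    }
}
```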
In addition, the algorithm also considers the cost paid by cloud customers to execute the tasks. The cost C(vi,Pj) for task vi executed at a VM Pj is defined by:
C(v_i, P_j) = C_{proc}^{(v_i, P_j)} + C_{wait}^{(v_i, P_j)} + C_{comm}^{(v_i, P_j)}.   (9)
Each cost component of equation (9) is calculated as follows:
Cost of processing is expressed as:
C_{proc}^{(v_i, P_j)} = c_1 \times wl_i / p_j,   (10)
where c1 is the processing cost per time unit on processor Pj with processing rate pj .
Let tmin be the finish time of the task that completes first among the parallel tasks when no further task is available after it, c2 be the waiting cost per time unit, and ti be the finish time of task vi. Then the cost of waiting time is:
C_{wait}^{(v_i, P_j)} = c_2 \times (t_i - t_{min}).   (11)
Suppose that the amount of money per time unit for transferring outgoing data
from processor Pj is c3, then the cost of communication time is defined as follows:
C_{comm}^{(v_i, P_j)} = c_3 \times \Big( di_{ij} + \sum_{v_z \in (prec(v_i) \cap exec(j))} do_{zi} \Big) \Big/ bw_j.   (12)
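The three cost components of equations (10)-(12) could be computed as in the sketch below; c1, c2, c3 are the unit prices defined above, while the PartialSchedule helper and the diIj parameter are the same hypothetical bookkeeping used in the previous sketch.

```java
class CostModel {
    final double c1;  // processing cost per time unit
    final double c2;  // waiting cost per time unit
    final double c3;  // cost per unit of outgoing data

    CostModel(double c1, double c2, double c3) {
        this.c1 = c1; this.c2 = c2; this.c3 = c3;
    }

    // Eq. (10): processing cost of task v_i on processor P_j.
    double processingCost(Task t, Processor p) {
        return c1 * t.workload / p.rate;
    }

    // Eq. (11): waiting cost, with ti the finish time of v_i and tMin the finish
    // time of the first-completed parallel task.
    double waitingCost(double ti, double tMin) {
        return c2 * (ti - tMin);
    }

    // Eq. (12): communication cost for sending v_i's data out of P_j;
    // diIj is the input data of v_i stored at P_j (an assumption of this sketch).
    double communicationCost(Task t, Processor p, PartialSchedule s, double diIj) {
        double data = diIj;
        for (Edge e : t.predecessors) {
            if (s.placement.get(e.from) == p) {
                data += e.dataSize;
            }
        }
        return c3 * data / p.bandwidth;
    }

    // Eq. (9): total cost of executing v_i on P_j.
    double totalCost(Task t, Processor p, PartialSchedule s,
                     double diIj, double ti, double tMin) {
        return processingCost(t, p) + waitingCost(ti, tMin)
                + communicationCost(t, p, s, diIj);
    }
}
```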
Using these costs, we can calculate a utility function that captures the tradeoff U(vi, Pj) between the cost and the EFT, so as to determine the most appropriate processor, namely the one that obtains the minimum value of this tradeoff:
U(v_i, P_j) = \min_{v_i \in E,\ P_k \in N} \left( \dfrac{C(v_i, P_j)}{\mathrm{Max}\big[C(v_i, P_k)\big]} \times \dfrac{EFT(v_i, P_j)}{\mathrm{Max}\big[EFT(v_i, P_k)\big]} \right).   (13)
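Putting the pieces together, the processor-selection step based on the normalized tradeoff of equation (13) could look roughly as follows, reusing the hypothetical TimeEstimator, CostModel, and PartialSchedule helpers from the earlier sketches.

```java
import java.util.List;

class ProcessorSelector {
    // Selects the processor that minimizes the normalized cost x EFT product of eq. (13).
    static Processor select(Task t, List<Processor> processors, PartialSchedule s,
                            CostModel costs, double diIn, double ti, double tMin) {
        // First pass: maxima over all candidate processors, used for normalization.
        double maxCost = 0, maxEft = 0;
        for (Processor p : processors) {
            maxCost = Math.max(maxCost, costs.totalCost(t, p, s, diIn, ti, tMin));
            maxEft = Math.max(maxEft, TimeEstimator.eft(t, p, processors, s, diIn));
        }
        // Second pass: pick the processor with the minimum tradeoff U(v_i, P_j).
        Processor best = null;
        double bestU = Double.MAX_VALUE;
        for (Processor p : processors) {
            double u = (costs.totalCost(t, p, s, diIn, ti, tMin) / maxCost)
                     * (TimeEstimator.eft(t, p, processors, s, diIn) / maxEft);
            if (u < bestU) { bestU = u; best = p; }
        }
        return best;
    }
}
```

In a full scheduler, each task from the priority list would be assigned in turn with such a selection, and the partial schedule (placements and finish times) would be updated after every assignment.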
6 Implementation and Analysis
This section presents our experiments, carried out via numerical simulations, to evaluate the efficiency of our approach, the Contention-Cost aware Scheduling algorithm (CCaS), and to compare its performance with two others: Contention aware Scheduling (CaS) [15], which only takes network contention into account, and Greedy for Cost (GfC), which only considers the monetary cost. The inputs are different task graphs G = (V, E, w, c), whose size increases from 10 to 90, and heterogeneous computing node graphs TG = (N, D) combining up to 30 MDCs with different configurations for the above algorithms, as shown in Table 1. We developed the simulations in Java with JDK-7u7-i586 and NetBeans 7.2 using CloudSim [16], a framework for modeling and simulating cloud computing infrastructures and services. In our simulation, MIPS (Million Instructions per Second) represents the processing capacity of the MDCs.
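For reference, a minimal CloudSim setup resembling this environment could be sketched as follows; the VM and task counts, MIPS values, and cloudlet lengths are illustrative assumptions, the API shown corresponds to CloudSim 3.x as we recall it (details may differ in other versions), and our CCaS logic would replace the broker's default mapping of cloudlets to VMs.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;

import org.cloudbus.cloudsim.*;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;

// Hedged sketch of a CloudSim 3.x setup resembling the experiment described above.
public class CCaSSimulation {

    public static void main(String[] args) throws Exception {
        int numVms = 5, numTasks = 20;          // illustrative, within the ranges of Table 1
        CloudSim.init(1, Calendar.getInstance(), false);
        createDatacenter("MDC_Datacenter", numVms);

        DatacenterBroker broker = new DatacenterBroker("Broker");
        int brokerId = broker.getId();

        // Heterogeneous VMs: processing rate in MIPS drawn from [10, 750].
        Random rnd = new Random(42);
        List<Vm> vms = new ArrayList<Vm>();
        for (int i = 0; i < numVms; i++) {
            int mips = 10 + rnd.nextInt(741);
            vms.add(new Vm(i, brokerId, mips, 1, 512, 1000, 10000,
                           "Xen", new CloudletSchedulerTimeShared()));
        }

        // Workflow tasks as cloudlets; the length plays the role of workload wl_i.
        UtilizationModel full = new UtilizationModelFull();
        List<Cloudlet> cloudlets = new ArrayList<Cloudlet>();
        for (int i = 0; i < numTasks; i++) {
            Cloudlet c = new Cloudlet(i, 40000, 1, 300, 300, full, full, full);
            c.setUserId(brokerId);
            cloudlets.add(c);
        }

        broker.submitVmList(vms);
        broker.submitCloudletList(cloudlets);
        CloudSim.startSimulation();
        CloudSim.stopSimulation();

        List<Cloudlet> finished = broker.getCloudletReceivedList();
        for (Cloudlet c : finished) {
            System.out.println("Cloudlet " + c.getCloudletId()
                    + " finished at " + c.getFinishTime());
        }
    }

    // One host per VM so every VM can be allocated by VmAllocationPolicySimple.
    private static Datacenter createDatacenter(String name, int numHosts) throws Exception {
        List<Host> hosts = new ArrayList<Host>();
        for (int i = 0; i < numHosts; i++) {
            List<Pe> pes = new ArrayList<Pe>();
            pes.add(new Pe(0, new PeProvisionerSimple(1000)));
            hosts.add(new Host(i, new RamProvisionerSimple(2048),
                    new BwProvisionerSimple(10000), 1000000, pes,
                    new VmSchedulerTimeShared(pes)));
        }
        DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
                "x86", "Linux", "Xen", hosts, 10.0, 3.0, 0.05, 0.001, 0.0);
        return new Datacenter(name, characteristics,
                new VmAllocationPolicySimple(hosts), new LinkedList<Storage>(), 0);
    }
}
```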
Table 1. Characteristics of the target system

Parameter                                        Value
Topology model                                   LAN, fully connected
Operating system                                 Windows 7 Professional
Number of processors                             [5, 30]
Number of tasks                                  [20, 90]
Processing rate                                  [10, 750] MIPS
Bandwidth                                        [10, 100, 512, 1024] Mbps
Cost per time unit executed on processor Pj      [0.1, 0.5]
Cost per outgoing data unit from processor Pj    [0.2, 0.6]
The following figures illustrate clear differences between the simulated scenarios. Fig. 4 shows that GfC gives the worst result and CaS obtains the best result in terms of schedule length, while our approach lies in the middle; specifically, our method is 17% better than GfC. However, regarding the monetary cost paid by CSCs (as illustrated in Fig. 5), it has been observed that although CaS provides the best performance, it has the highest cost, while the opposite is true for GfC. Meanwhile, our solution is balanced between schedule length and cloud cost: compared with CaS, our method can save nearly 21% of the cost for CSCs.
Fig. 4. Schedule length comparison
Fig. 5. Cost comparison
Fig. 6. Schedule length with numbers of processors
Fig. 7. Cost with numbers of processors
We next measured the effect of increasing the number of processors on the cloud cost and the schedule length for CCaS only, with a fixed number of tasks. The results in Fig. 6 and Fig. 7 indicate that more processors result in better system performance but higher cost. It is notable that the cost rises from 300,500 G$ to 325,000 G$ as the number of processors increases from 15 to 20.
7 Conclusion
This paper proposes a new architecture that aims at utilizing cloud resources for IoT services on a cloud platform. Furthermore, we presented a novel method that improves task scheduling so as to achieve the desired processing time while balancing network contention and cloud service cost. Besides, we conducted simulations to evaluate our approach. Through the implementation, it can be seen that our solution is more cost effective and achieves better performance than the existing approaches it was compared with. We will extend the proposed model to run in various circumstances to achieve higher reliability and efficiency with maximum satisfaction.
Acknowledgements. This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (2010-0020725). The corresponding author is Professor Eui-Nam Huh.
References
1. Dinh, H. T., Lee, C., Niyato, D., Wang, P.: A survey of mobile cloud computing: architecture, applications, and approaches. Wireless Communications and Mobile Computing (2011). doi:10.1002/wcm.1203, http://onlinelibrary.wiley.com/doi/10.1002/wcm.1203/
2. Pham Phuoc Hung, Tuan-Anh Bui, Mauricio Alejandro Gómez Morales, Mui Van Nguyen, Eui-Nam Huh: Optimal collaboration of thin-thick clients and resource allocation in cloud computing. Personal and Ubiquitous Computing (2013)
3. Gerd Kortuem, Fahim Kawsar, Daniel Fitton, Vasughi Sundramoorthi: Smart Objects and Building Blocks of Internet of Things. IEEE Internet Computing, vol. 14, no. 1, pp. 44-51 (2010)
4. Vana Jelicic, Michele Magno, Davide Brunelli, Vedran Bilas, Luca Benini: Benefits of Wake-Up Radio in Energy-Efficient Multimodal Surveillance Wireless Sensor Network. IEEE Sensors Journal, vol. 14, no. 9 (2014)
5. Gina Martinez, Shufang Li, Chi Zhou: Wastage-Aware Routing in Energy-Harvesting Wireless Sensor Networks. IEEE Sensors Journal, vol. 14, no. 9 (2014)
6. Mohammad Aazam, Pham Phuoc Hung, Eui-Nam Huh: Cloud of Things: Integrating Internet of Things with Cloud Computing and the Issues Involved. In: Proceedings of the 11th IEEE IBCAST, Islamabad, Pakistan, 14-18 January (2014)
7. Mohammad Aazam, Pham Phuoc Hung, Eui-Nam Huh: Smart Gateway Based Communication for Cloud of Things. In: Proceedings of IEEE ISSNIP, Singapore (2014)
8. Mohammad Aazam, Eui-Nam Huh: Fog Computing and Smart Gateway Based Communication for Cloud of Things. In: Proceedings of IEEE Future Internet of Things and Cloud (FiCloud), Barcelona, Spain (2014)
9. Ruben V. den Bossche: Cost-Efficient Scheduling Heuristics for Deadline Constrained Workloads on Hybrid Clouds. In: CloudCom, pp. 320-327 (2011)
10. Le Xuan Hung, Sungyoung Lee, Phan Tran Ho Truc, et al.: Secured WSN-Integrated Cloud Computing for u-Life Care. In: 7th IEEE Consumer Communications and Networking Conference, USA (2010)
11. Carlos Oberdan Rolim, Fernando Luiz Koch, et al.: A Cloud Computing Solution for Patient's Data Collection in Health Care Institutions. In: Second International Conference on eHealth, Telemedicine, and Social Medicine (2010)
12. Joel Wolf: SODA, An Optimizing Scheduler for Large-Scale Stream-Based Distributed Computer Systems. In: International Conference on Middleware, pp. 306-325 (2008)
13. Oliver Sinnen, Leonel A.: Communication Contention in Task Scheduling. IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 6 (2005)
14. L. Zeng: Budget Conscious Scheduling Precedence-Constrained Many-task Workflow Applications in Cloud. In: AINA (2012)
15. J. Li, S. Su: Cost-Conscious Scheduling for Large Graph Processing in the Cloud. In: IEEE International Conference on High Performance Computing and Communications (2011)
16. Micro Data Center, http://www.astmodular.com/solutions/family/micro-data-center_2 (2014)
17. Pham Phuoc Hung, Eui-Nam Huh: A New Approach for Task Scheduling Optimization in Mobile Cloud Computing. In: The FTRA 2014 International Symposium on Frontier and Innovation in Future Computing and Communications (2014)
18. Pham Phuoc Hung, Bui Tuan-Anh, Eui-Nam Huh: A Solution of Thin-Thick Client Collaboration for Data Distribution and Resource Allocation in Cloud Computing. In: International Conference on Information Networking (ICOIN) (2013)