
Czech Technical University in Prague
Faculty of Information Technology
Department of Computer Systems
Market-based Resource Allocation in Non-Dedicated Clusters
by
Ing. Michal Košťál
A thesis submitted to
the Faculty of Information Technology, Czech Technical University in Prague,
in partial fulfilment of the requirements for the degree of Doctor.
PhD programme: Informatics
Prague, September 2012
Thesis Supervisor:
Prof. Ing. Pavel Tvrdík, CSc.
Department of Computer Systems
Faculty of Information Technology
Czech Technical University in Prague
Thákurova 9
160 00 Prague 6
Czech Republic
Copyright © 2012 Ing. Michal Košťál
Abstract and contributions
Fast progress in computer technologies allows building personal computers and workstations that are cheap and powerful at the same time. However, their utilization is often very low. The concept of non-dedicated clusters allows joining such poorly utilized computers into one virtual cluster machine that can be used for high-performance parallel and distributed computing. Decentralized resource allocation in such systems is a hard problem, and classical centralized scheduling algorithms cannot be used. A promising approach is to use microeconomic mechanisms.
We have designed and implemented a micropayment infrastructure, an auction mechanism, and an auditor mechanism on top of the non-dedicated cluster architecture CLoNDIKe. Resource holders can join a cluster and profit from selling their resources. This profit can later be used to allocate other resources contributed by other participants in the cluster. Such an approach leads to high utilization. Satisfied users will cultivate the cluster without the need for a central authority. In this thesis, we investigate market-based process scheduling in a non-dedicated cluster. We discuss the principles of such a solution and describe the architecture, the decisions made during its design, the prototype implementation, and the results of experiments.
In particular, the main contributions of the doctoral thesis are as follows:
1. Design of a framework for market-based resource allocation in non-dedicated clusters.
2. A functional prototype and evaluation of the design by several sets of experiments.
3. Proposal of an auditing mechanism that can establish trust between independent nodes.
4. Description of a distributed multi-level architecture based on the principles proposed
by the thesis.
Keywords:
digital economy, non-dedicated clusters, high-performance computing, microeconomic
resource allocation, market-based resource allocation
Acknowledgements
First of all, I would like to express my gratitude to my dissertation thesis supervisor, Prof.
Ing. Pavel Tvrdík, CSc. He has been a constant source of encouragement and insight during
my research and helped me with numerous problems and professional advancements.
Special thanks go to the staff of the Department of Computer Systems who maintained
a pleasant and flexible environment for my research. I would like to express special thanks
to the department management for providing most of the funding for my research. My
research has also been partially supported by the Ministry of Education, Youth and Sports of the Czech Republic under research program MSM 6840770014, and by the Czech Grant Agency under grant No. 102/06/0943. I would also like to thank my colleagues from the Parallel Computing Group for their valuable comments and proofreading.
Finally, my greatest thanks go to my family members for their infinite patience and
care.
Contents

1 Introduction
  1.1 Motivation
      1.1.1 The CLoNDIKe Overview
  1.2 Problem Statement
  1.3 Contributions of the Thesis
  1.4 Structure of the Thesis

2 Background and State-of-the-Art
  2.1 Enterprise (1988)
  2.2 Ferguson et al. (1988)
  2.3 Spawn (1992)
  2.4 Tycoon (2004)
  2.5 Mirage (2005)
  2.6 GRid Architecture for Computational Economy (GRACE)
  2.7 Virtualisation Technologies

3 Objectives of the Thesis
  3.1 Scheduling and Price Estimation
  3.2 Monetary System
  3.3 Migration

4 Overview of Our Approach
  4.1 Design Criteria for Digital Market
  4.2 The Proposed Architecture
  4.3 Bidding Strategies
      4.3.1 Constant-price Strategy
      4.3.2 Escalating-price Strategy
      4.3.3 Minimal-price Strategy
  4.4 Auditing Mechanism
  4.5 Distributed Architecture

5 Experiments and Evaluation of Results
  5.1 Auction Mechanism
      5.1.1 First-price, Sealed Bid vs. Second-price, Sealed Bid Auction
      5.1.2 Auction-based vs. Standard Processor Allocation
      5.1.3 Constant-price Bidding Strategies
      5.1.4 Minimal-price vs. Constant-price Strategy
      5.1.5 Minimal-price vs. Constant vs. Escalating Strategy
      5.1.6 Escalating Strategy vs. Minimal Strategy
      5.1.7 Escalating vs. Constant Strategy
      5.1.8 Computational Overhead of the Auction Mechanisms
  5.2 Auditing Mechanism
  5.3 Distributed Architecture
      5.3.1 Bidding Strategies
      5.3.2 Scalability and Heterogeneity
      5.3.3 Robustness

6 Conclusion
  6.1 Fulfillment of Objectives
  6.2 Scheduling and Price Estimation
  6.3 Monetary System
  6.4 Migration
  6.5 Summary
  6.6 Future Work
      6.6.1 Other Kinds of Resources
      6.6.2 Non-dedicated Cluster Platform
      6.6.3 Security and Trustworthiness
      6.6.4 Other Market Mechanisms

Bibliography

Publications of the Author
Chapter 1
Introduction
1.1 Motivation
It is quite common to see a powerful computer lying idle on someone's desktop, fully utilized for only a few minutes a day. In general, the utilization of desktop computers is very low.
The idea of utilizing the idle cycles of these computers has led to systems such as the SETI@home project [38] or the Condor system [41]. A user can join the system and provide their computer's idle cycles. In SETI@home, idle computers over the Internet run a special downloaded program. Condor can run master-slave parallel programs in clusters. These systems create only little motivation for participants and suffer from the well-known "tragedy of the commons". The restrictive programming model, low generality, and central management of these systems have led to approaches that use market-based algorithms. First experiments indicate that these mechanisms can lead to very good results.
More universal architectures are grids [19] or non-dedicated clusters. Both are aimed at interconnected collections of heterogeneous nodes. Grids represent a platform for pooling a large amount of resources of different types, even on a world-wide scale [34]. Even though the distinction between grids and non-dedicated clusters has various interpretations, non-dedicated clusters typically implement a single system image shared between the participants. As a result, the set of programs that can be run in such a system is wider.
Non-dedicated clusters [24, 6, 41, 1, 43, 28, 25] promise high utilization of independent heterogeneous computers, while the single system image [33] promises easier management of the program execution environment. Anyone can join a non-dedicated cluster and offer their computing resources in an economical manner [14, 46, 10, 35, 8]. A participant seeking a computational resource can bid for any available resource in an auction [9, 14, 39] and, after successful bidding, use it for her own computation. The bid price that she offers in an auction is based on her own priorities. In other words, the bid price represents her valuation of the target resource. A market environment motivates cluster contributors to take care of their resources and to utilize them in an economically effective manner.
In clusters with dedicated computers and central management, one can truly estimate
values of particular nodes, even if they are heterogeneous. On the other hand, a non-dedicated cluster consists of mutually untrusted nodes with ordinary operating systems and local independent management [24]. In such an environment, precise evaluation requires trustworthy cooperation between participants. This is quite improbable, because every participant has her own individual selfish plan, and if such cooperation does not bring her any profit, she has no reason to cooperate.
When a buyer bids for a resource, she expects two major requirements to be fulfilled:

That data integrity will not be breached, i.e., that a purchased resource will not fake results or manipulate data in any way other than described in the program. Even in desktop grid systems like [27, 5], tampering can appear. Some participants modify or replace the client software so that it returns incorrect results. These attackers are typically motivated by getting "credit", improving their position in the ranked list of participants. We expect serious breaking attempts in market-based computation environments.

That the buyer will gain a precise assessment of the target resource. When a resource first appears on the market, a buyer has no information about its performance and load. A buyer, therefore, needs to get the most precise information about the target resource.
There are several standard methods to ensure data integrity [27]. Digital signatures are often used to "sign" data, ensuring that they were actually produced by the client software and were not altered before being returned to the server. Although this method protects data from unintentional errors, it does not protect them from the malicious intentions of a resource owner.
Similarly, the sandboxing techniques used in XtremWeb [11] prevent damage from a misbehaving application, but not from a malicious user.
An appropriate degree of homogeneous redundancy [42], where data are computed redundantly and the results compared, preserves data integrity even if a malicious user appears. Each computation is replicated to several separate computers, and a result is accepted only if the results from all computers are equal.
In systems where a reward, in the form of "credit", is gained for successfully computed tasks [12, 5, 11, 27, 23, 37], knowledge about the performance of individual nodes is not necessary. It does not matter how long a task will run. On the other hand, in market-based systems [46, 10, 35, 45, 26], computational resources are offered at an auction and individual participants bid for resources to execute their tasks on them. Every time quantum of a bought resource must be paid for.
Tampering with results and modifying task data can earn a lot of money. Therefore, a buyer must be very careful about where she executes her tasks. Although complete security is not achievable, because the code of a task and its data can always be revealed, most applications need only a certain degree of security to be satisfied. Most of them only need assurance that a target resource will not tamper with the results and that it will provide as much CPU time as declared before the transaction.
Because the buyer and the seller are placed in an economic environment, a seller tries to sell its resources for as much as possible. Therefore, a buyer cannot fully trust the seller's information about performance and load. Instead, a buyer must use a control mechanism to adjust these reports.
In our system, we propose an auditing mechanism that allows a user to correct information from a seller and thus establish suitable prices for computational resources in an untrusted economic environment. In the process of price evaluation, a user can easily detect resources that behave incorrectly and should be black-listed.
We integrated the auditing mechanism into our framework as described in Section 4.4.
1.1.1 The CLoNDIKe Overview
CLoNDIKe [24] (Cluster of Non-Dedicated Inter-operating Kernels) is a system joining Linux workstations into one non-dedicated cluster. The ambition of the project is to utilize idle personal computers connected to a cluster.
An important feature of the CLoNDIKe architecture is the separate administration of the individual workstations connected to the cluster, so that the workstation owners retain full control over their resources without the need to share administration with someone else.
A CLoNDIKe cluster is composed of two types of computing nodes: core nodes and detached nodes. Core nodes form the cluster backbone. A core node is fully dedicated to the cluster and cannot be used as a regular workstation simultaneously. It is controlled by the cluster administrator and can, to a certain extent, be considered trustworthy. The main role of a core node is to provide basic cluster services and features, such as the distributed filesystem, process space, user accounts, and access rights.
All the remaining nodes are called detached nodes. A detached node can serve as a regular workstation and has its own administrator. It can be connected to or disconnected from the cluster at any time without negative impact on the cluster. These workstations run their own Linux operating systems and applications, and one important criterion of the CLoNDIKe design is to minimize the impact of cluster participation on the local workstation users.
Every process running in the CLoNDIKe cluster environment is related to two, potentially different, nodes. The first one is the core node where the process appears to be running, which is in this context referred to as the home node of the process. The second one is its execution node, which is the real location of the process's execution. This may be either a core or a detached node.
To exploit all cluster resources and to balance the load of all nodes, processes can be migrated transparently between nodes. There are two types of process migration: preemptive and non-preemptive. While the former can move a running process in virtually any situation, the latter is restricted to the moment of the execve() system call.
A process is always started at a core node that acts both as its home and execution
node. Depending on the CLoNDIKe system load, the process might need to be migrated to
its new execution node, where it is dubbed a guest process (to emphasize that it is ”hosted”
by the new execution node). Before that, a special process, called a shadow process, is started at the home node. It acts as a deputy for interaction between the guest process and its home node environment and performs some actions on behalf of the guest process running remotely.
It is obvious that preemptive migration is more complex than non-preemptive migration. It is an important building block for the desired dynamics of a cluster. Simply
speaking, it involves creating a checkpoint of the process to be migrated and executing it
at a different cluster node. The detailed description of the mechanism is beyond the scope
of this text and is explained in [44].
There are several projects similar or related to CLoNDIKe. The most notable examples are OpenSSI [33], Mosix [6], openMosix [32], and those already mentioned above: the SETI@home project [38], the Condor system [41], and grid systems [19]. However, there are important differences between these systems and CLoNDIKe, described in [24]. These systems are either implemented at a different architecture level, built of dedicated computers, have a different application domain, or require programs to be relinked with special modules.
1.2 Problem Statement
Resource management is the core function of a non-dedicated cluster such as CLoNDIKe. To adhere to the rule of decentralized management of workstations, we have designed and implemented market-based resource allocation mechanisms.
We expected that the use of market-based mechanisms for the allocation of cluster resources would lead to a more efficient system in many ways [31]:
Effective behavior patterns. If users have to pay for the resources their processes use, they will behave economically. They will buy resources for the lowest available price and, therefore, will postpone the execution of their processes until the resource price is appropriate to the assigned process priority. Wasteful usage of the system will be penalized by no profit, while careful use will be rewarded by a higher profit.
Incentive for workstation owners. If workstation owners are paid for offering their resources, they will be motivated to do so. This, rather than their altruism, can be relied upon as an effective incentive. Renting one's idle resources commercially is an easy business opportunity, and if there is sufficient demand, adequate supply will follow.
Prices reflecting cost. With multiple workstation owners connected to the system, one can expect competition to arise. This will make the prices of resources converge to their real cost. If there is unsatisfied demand for resources, their price will rise, lowering the demand and increasing the supply until they meet. At this point, the price of the resources will reflect their real market cost. If, on the other hand, there
is an excess of supply, one can expect resources to be offered for free (or for a price
that covers the overhead costs of the workstation owner).
Effective and automatic load balancing. If users (or the software agents representing them) are sensitive to opportunity, the nodes with the best performance will be the most prestigious and hence the most expensive ones. More important (and hence more solvent) process managers will fight for the execution of their processes at these prominent nodes. Less important (poorer) processes will be unable to keep up in this price race and will migrate to cheaper nodes, balancing the load over the system effectively and automatically. The least important processes might even have to be swapped out and wait until the price drops to a level they are able and willing to meet.
Fairness. Since prices reflect the costs, the system is fair. The buyer should pay and the
seller should receive the real value of what has been purchased.
1.3 Contributions of the Thesis
In particular, the main contributions of the doctoral thesis are as follows:
1. A design of a framework for market-based resource allocation in non-dedicated clusters.

2. A functional prototype of the framework and a proof of concept of the framework design criteria, obtained by conducting several series of experiments.

3. A design and verification of an auditing mechanism that can establish trust between independent nodes and eliminate the influence of potentially untrustworthy or malicious users in the environment of non-dedicated clusters.

4. A design of a distributed multi-level architecture based on the principles developed in the thesis.
1.4 Structure of the Thesis
The thesis is organized as follows. Chapter 2 summarizes previous results. Chapter 3 states the objectives of the thesis. Chapter 4 describes our solution to the problem. Chapter 5 discusses the achieved results. Chapter 6 concludes and outlines future work.
Chapter 2
Background and State-of-the-Art
Computing in general can be classified into two kinds: centralized and distributed. In centralized computing, all computation is controlled through central terminal servers, which centrally provide the processing, programs, and storage. The workstations are used only for input and display purposes; they connect to the servers, where all tasks are performed. Resources like memory, disk space, and processor time slices are allocated to individual computations using a centralized resource allocation algorithm.
In distributed computing, there are no central terminal servers; rather, all workstations can be used to provide the resources needed for computing. As individual workstations became so cheap that they were massively used in homes, the first approaches to utilize them when not used by their owners appeared, such as the SETI@home project [38] or the Condor system [41].
Later, more universal architectures such as grids [19] or non-dedicated clusters evolved. Both are aimed at interconnected collections of heterogeneous nodes. Grids represent a platform for pooling a large amount of resources of different types, even on a world-wide scale [34]. Even though the distinction between grids and non-dedicated clusters has various interpretations, non-dedicated clusters typically implement a single system image shared between the participants. As a result, the set of programs that can be run in such a system is wider.
It has been found that traditional resource management techniques, mainly used for centralized systems, are not suitable for grid and non-dedicated cluster architectures. One interesting approach is to use market mechanisms borrowed from the domain of economics [7].
The design of market-based resource allocation systems has a long history. Although there are many related projects [9, 10, 35, 8, 29, 16, 45, 25, 40, 47, 4], we discuss those most related to our CLoNDIKe market-based effort.
2.1 Enterprise (1988)
One of the first systems implementing the ideas of market management in a distributed environment on modern computers was Enterprise [29]. Its main purpose was
to balance the workload on personal workstations connected by a local area network. It
includes a scheduling protocol whose task is to assign each process to the best machine
available at any given time. The “Distributed Scheduling Protocol” (DSP) is used for
communication between processors and revolves around a three-step sequence: request —
bid — contract.
When a user wants to execute a task, her CPU (the client) broadcasts a "request for bids" to the other machines in the network. This request includes a numerical priority of the task to be executed. The idle machines in the network (the contractors) may respond with their "bids", giving their estimated completion times. The client then evaluates these bids and proceeds to the third step of the sequence: it sends the task to the best bidder and sends cancel messages to the rest of the bidders. If a later bid is significantly better than the best early one, the client may cancel the task on the early bidder and send it to the new contractor.
The scheme has two weaknesses: it requires the contractors to estimate their processing times and the clients to submit their priorities. Disregarding the technical burden of these requirements, they pose a threat to the very functionality of the system. If a selfish user decides to abuse the system in her own interest, the tools are in her hands. Thus, the whole scheme counts heavily on the honesty of the parties involved. This makes it inapplicable in a system where selfish users pursuing their selfish goals may exist.
2.2 Ferguson et al. (1988)
An influential model of microeconomic algorithms for load balancing in distributed systems was proposed in [16]. This system solves the problem of load balancing by using an auction for competition between two types of agents: processor agents selling their time slices and process agents buying the available resources. While a processor agent tries to sell its resources for the highest price, a process agent tries to buy the needed resources for the lowest price. The system does not allow preemptive migration.
2.3 Spawn (1992)
Spawn [45] continues in this tradition while introducing a couple of new concepts. It views distributed computations, represented by collections of interacting processes, as computational ecosystems [22, 21], following the concept of open systems [20].
Spawn was the first to demonstrate that the use of monetary funding as priority can be very effective and that price information can be used to control the adaptive expansion and contraction of process trees in concurrent applications.
Although we were inspired by Spawn in many ways, it fails to address certain requirements essential for a practical distributed computational system.
As it is executed like an ordinary user-mode UNIX application, Spawn has no support
for preemptive process migration. Once a process runs on one of the computers, it has to stay there until the computation is over. Just as with Enterprise, a process could choose to stop its execution and restart at a different network node. However, if multiple processes co-operate in a parallel computation, restarting one of them is infeasible. The resources of concurrent co-operating processes are allocated in an uncorrelated manner. This is a significant limitation to the potential use of Spawn as a host of general parallel computation, restricting the parallel applications to coarse-grained asynchronous ones requiring little or no communication (e.g., Monte Carlo simulations). Last but not least, Spawn is insecure. The authors concede that they made no attempt to prevent malice in the system. This leaves an opportunity for malicious users to tamper with the economy by forging the currency or using cheating agents.
In market-based CLoNDIKe, these issues were the key design requirements.
2.4 Tycoon (2004)
The importance of market-based resource allocation systems is increasing, as the research efforts of large ICT companies (such as Sun Microsystems, IBM, Intel, and HP) indicate. Their vision is to create a cluster system whose performance can be commercially distributed among a high number of third-party users. They are conducting research on market-based algorithms suitable for rapidly changing and unpredictable demands and test these algorithms from the viewpoints of economic efficiency, utilization, risk, and fairness. A typical representative, Tycoon [25, 26], was designed at HP Labs.
The architecture follows much along the same lines as that of Ferguson et al., making use of both process and processor agents. The processor agents (often referred to as auctioneers) perform auctions in which they sell processor-time slices to the process agents (or user agents). The model comprises the following key agents: a service location service, a bank, an auctioneer, and a user agent.
Although Tycoon proposes concepts similar to ours, its goal is to provide trusted cluster resources to third-party users. In contrast, we propose a mechanism usable for joining potentially untrusted participants in non-dedicated clusters. This fundamental difference implies many contrasts between our design and Tycoon's.
While we allow any user to participate with her resources in the market business, in Tycoon a user can only buy the resources provided by the system. In systems where the selling parties cannot be fully trusted, the whole design must emphasize security, from the micropayment scheme and the auction mechanism to the determination of resource qualities. These measures are not needed in Tycoon.
2.5 Mirage (2005)
The last representative from the family of market-based solutions to the problem of distributed resource allocation is the project Mirage [13]. It was designed for allocating
wireless SensorNet testbeds, allowing users to compete for testbed resources by submitting bids which specify the resource combinations of interest in space and time. It includes a specific bidding language, which is expressive enough to capture the full range of resource requests that users would like to make. Its microeconomic allocation scheme is based on a repeated combinatorial auction.
In contrast to our design, where we expect frequent fine-grained resource allocation for short-lived applications, Mirage was designed for SensorNet applications, which take much longer to complete and are considerably more costly. This explains the differences in the market algorithms and in the scale of the two systems. The Mirage infrastructure is also much less automated, thus involving the human user in the market decisions.
2.6 GRid Architecture for Computational Economy (GRACE)
GRACE [10] is a generic distributed grid architecture for a computational economy. Using its basic key components, it provides a platform to accommodate different economic models used for resource trading. The key components of the Grid include the following:
1. Grid User With Applications
2. Programming Environments
3. User-Level Middleware and Tools
4. Core Grid Middleware (services for resource trading and coupling distributed wide-area resources)
5. Grid Service Providers
GRACE defines services that help resource owners maximize profit from their resources and users minimize the computing time and price of the required resources. Resource brokers work for the consumers and attempt to maximize user utility, while resource traders offer the given resources according to defined policies.
There are many grid resource allocation techniques, including various types of auction models such as the English auction, the Dutch auction, or the double auction.
2.7 Virtualisation Technologies
The fast spread of virtualisation solutions (such as VMware [2]) and cloud services (such as Amazon Cloud [3]) allows the migration of unmodified applications encapsulated in virtual machines. Modern virtualisation techniques make it possible to encapsulate the whole VM state, such as open network connections, the CPU state, the memory state, and I/O operations. A VM can easily be saved and transferred to another workstation that supports a compatible virtualisation technique.
Chapter 3
Objectives of the Thesis
The main objective of the thesis is to create a computational framework within the designed architecture that meets the conditions discussed in the following sections.
3.1 Scheduling and Price Estimation
Scheduling of resources in a non-dedicated cluster should be user-centric, i.e., a user must be able to define priorities within her own tasks, while being able to negotiate with other users the priorities of all tasks executed in the cluster.
In a market-based environment, priorities can be adjusted via the prices of available resources. A user can overpay other users to prioritize her own tasks. The basic technique for negotiation between independent cluster contributors is the auction. An auditing system is used for evaluating the utilization of available resources. The purpose of an audit is to appraise cluster resources so that their buyers can choose the right price. Exploring various types of auctions and integrating them into the architecture is an objective of this thesis.
3.2 Monetary System
For the proper use of auctions, a monetary system is required, including authorized emission of currency. There can be various configurations of monetary systems. Depending on the context or the part of the architecture, one can create an open or a closed monetary system with its own currency for trading. Designing a proper monetary system for the final architecture is one of the objectives of this thesis.
3.3 Migration
Resource allocation of multiple resources distributed in a cluster requires a migration mechanism to move a task from one node to another. The mechanism must allow saving the
exact state of the task into a checkpoint. Such a checkpoint can then be moved from one node to another and the task resumed there.
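As an illustration only, the required mechanism can be summarized by a two-operation interface. The following is a minimal sketch with hypothetical names, not the actual CLoNDIKe API:

    from abc import ABC, abstractmethod

    class MigrationMechanism(ABC):
        """Abstract shape of the migration mechanism required above."""

        @abstractmethod
        def checkpoint(self, task_id: str) -> bytes:
            """Freeze the task and serialize its exact state into an image."""

        @abstractmethod
        def restore(self, node: str, image: bytes) -> None:
            """Transfer the checkpoint image to the node and resume the task."""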
Chapter 4
Overview of Our Approach
4.1 Design Criteria for Digital Market
When designing the market-based resource allocation system for non-dedicated clusters, we complied with several design requirements:

Transparent process execution. CLoNDIKe, from the user's viewpoint, is a virtual computer with varying computational capabilities. The interconnections between the individual computers joined in a cluster are hidden from her. On the other hand, the market-based allocation subsystem should be transparent to the process; therefore, a user does not need to recompile or relink a running process. The user only chooses the agent that will work on her behalf on the market.

In fact, there are two types of agents: a processor manager, selling computer resources, and a process manager, buying those resources on the market. We consider auctions the basic market mechanism. The separation of process execution from market strategies allows a user to change strategies depending on the individual market environment. Different agents can allow different business behaviors. A variety of different strategies should be designed, implemented, and provided to users. The system is designed in such a way that users can implement their own agents with the desired market behaviors.
Replacement of priorities by monetary funding. In a classical distributed system, assigning a priority to a process in some global manner is hard, even impossible. Even if a user were motivated to assign it as correctly as he can, he cannot know the priority levels of the other users. Therefore, in such systems, a central authority governing the policies of priorities is needed. However, for large systems, this is problematic. In contrast, markets allow users to follow their own business, offering their own prices for resources and thus evaluating their own priorities. If a process has a low priority, i.e., is less funded, it has less chance to get a processor. There is no need for a central authority assigning prices to resources.
We designed a micropayment scheme to allow transferring currency between two participants in the CLoNDIKe system. Based on the experience of other researchers [30], we expect small amounts of money to circulate in the system. The overhead of a money transfer must be significantly smaller than the amount being transferred. We designed a paying scheme similar to the one used in Tycoon [26] (see Section 2.4 for more details).

The mediator between the resource sellers (processor managers) and the resource buyers (process managers) is the auction. Many auction schemes [36] with different characteristics, suitable for different cases, have been invented.

We designed an auction module to experiment with two different auction schemes: the first-price sealed-bid auction and the second-price sealed-bid auction (also known as the Vickrey auction). The latter seems to provide better results, as also stated in previous papers [16, 17].
Minimal computational overhead. An important design criterion was the minimal
computational overhead of the market mechanism, not only on individual workstations, but also on core nodes.
Security. A market system must always guarantee correct behavior to allow effective cooperation of individual participants that follow their own interests.

Because there is a currency circulating in the system, security issues have to be considered carefully. The processes and their communication may be tampered with in the exposed environment of untrustworthy workstations.
Modularity and scalability. The system will presumably serve as a testbed for non-traditional operating-system paradigms. For this purpose, it should be designed in a modular way to allow testing of different components of each type. Also, the system should be efficiently scalable to hundreds of nodes.
4.2 The Proposed Architecture
We have designed a digital market, depicted in Fig. 4.1; see papers [A.4, A.3, A.2, A.1]. It can be divided into three main parts. A trading mechanism is used by participants to trade computational resources. Additional components and mechanisms supporting the market are needed to ensure the appropriate requirements for a digital market. These include the monetary mechanisms, mechanisms to ensure security, mechanisms to ensure the availability of services, and others.
In this design of a digital market, we chose an auction mechanism as the basic mechanism for trading computational resources between market participants: a processor manager sells processor resources, which a process manager buys for its process. The market support consists of a bank, yellow pages components, and an auditing mechanism.
[Figure 4.1: Market — buyers/sellers trade through the auction and use the cooperating market support components.]
[Figure 4.2: Architecture of the auction mechanism. Processor managers register with the yellow pages (YP) and submit tickets to the auction (1); task managers find the auction, bid (2), and pay via the bank; the auction sends sold tickets to the winners (3), who hand a ticket to the target processor manager (4).]
The processor manager has two tasks: to sell the processor resources it owns and, afterwards, to control the execution of processes that use these resources. We define tickets [18] to represent processor quanta at some defined time moment. A ticket gives its owner the authority to run a process on a target processor at a given time, for a given time quantum. A processor manager must issue tickets in advance so that they can be successfully merchandised at an auction (step 1 in Fig. 4.2). The number of tickets and the lengths of the time quanta are defined by the processor manager, and they should reflect the processor load.
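A minimal sketch of the data such a ticket might carry (the field names are illustrative, not the prototype's actual data structure):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Ticket:
        """One processor quantum offered for sale (illustrative fields)."""
        issuer: str        # processor manager that issued the ticket
        start_time: float  # moment at which the quantum begins
        quantum: float     # length of the quantum in seconds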
Of all the CLoNDIKe components, the processor manager comes closest to being an operating-system component. It takes care of processor allocation on top of the native process scheduler. To achieve this, it utilizes the migration framework of the CLoNDIKe system [44].
[Figure 4.3: Auction procedure — the bidding, evaluation, and announcement phases of successive rounds overlap.]
The auction mechanism works continuously in three phases, depicted in Fig. 4.3. In the bidding phase, the auction receives tickets from processor managers and bids from process managers. The bids are then evaluated in the evaluation phase, and all sold tickets are paid for and sent to their winners in the announcement phase. The bidding phase of a given round always overlaps with the evaluation and announcement phases of the previous round.
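The overlap of the phases can be pictured as a simple pipeline. The following sketch, using hypothetical callables collect_round, evaluate, and announce rather than the prototype code, finishes round k-1 in the background while round k collects tickets and bids:

    import threading
    import time

    def auction_loop(collect_round, evaluate, announce, period=3.0):
        """Continuous three-phase auction: the bidding phase of round k
        overlaps the evaluation and announcement phases of round k-1."""
        previous = None
        while True:
            if previous is not None:
                # evaluate and announce the previous round concurrently
                # with the bidding phase of the current one
                threading.Thread(
                    target=lambda r=previous: announce(evaluate(r))
                ).start()
            # gather tickets and bids until the period expires
            previous = collect_round(deadline=time.time() + period)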
The process manager supplies its process with available resources and initiates migration to the gained processor. It sends bids for resources to an auction (step 2 in Fig. 4.2). The auction periodically evaluates them and sends the winning tickets to the winners (step 3 in Fig. 4.2). A successful process manager can afterwards use the gained resources for its own purposes. It can use them to execute a process or its checkpoint on a target processor by sending it a ticket (step 4 in Fig. 4.2). The target processor should afterwards execute the demanded process at the time specified by the ticket.
The main (and most complicated) part of a process manager is its bidding strategy. Any CLoNDIKe user can create her own process manager to match her specific requirements. We have created a simple framework for experimenting with various strategies and have implemented several simple strategies, explained below, to show the correctness of the whole concept.
A correct and effective bidding strategy should be able to adapt to the trend of prices at an auction. An auction can therefore provide a process manager with the last winning prices. Thus, there are various inputs that a process manager can consider: the history of winning bids, its own history of bids, and the requirements of the managed process.
A bank maintains accounts for all market participants (process and processor managers). We use an online scheme based on bank accounts [30]. It allows transfers only between two accounts, which solves the security problem. Although the bank needs to participate in every money transfer, this turned out not to be a significant performance problem in practice, because of the high efficiency of the bank subsystem. Also, several banks can coexist in the system, which solves the problem of scalability.
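A minimal sketch of such an online, account-based scheme follows; it is illustrative only, since the real subsystem additionally authenticates participants and protects transfers cryptographically:

    class Bank:
        """Online micropayment scheme: the bank mediates every transfer."""

        def __init__(self):
            self.accounts = {}  # participant id -> balance

        def open_account(self, pid, initial_balance=0):
            self.accounts[pid] = initial_balance

        def transfer(self, payer, payee, amount):
            # the bank sees every payment, so overdrafts and double
            # spending are ruled out centrally
            if amount <= 0 or self.accounts.get(payer, 0) < amount:
                return False
            self.accounts[payer] -= amount
            self.accounts[payee] += amount
            return True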
A yellow pages component provides a service for advertising individual components. Any participant, like an auction, a bank, or other yellow pages, can register, allowing a broader audience to use its service. Usually, processor and process managers use this service to find the location of a proper auction or bank. Every participant registering with the service needs to submit its certificate to verify its identity.
The Yellow Pages record the following information about every service:
• ID. Every service must have an ID unique in the system. This ID represents its
business name. The Yellow Pages will protect the ID of a service once it is registered.
• Password. The password used by the service to identify itself.
• Role. The role that corresponds to different agent types.
• Address. The current address where the service can be contacted.
A component (process manager, processor manager, etc.) can query the Yellow Pages
either by ID (if looking for a specific service) or by role (if looking for any service with a
given role).
4.3 Bidding Strategies
Process managers coming to an auction can use various bidding strategies.
4.3.1 Constant-price Strategy
Bidding a constant price on a chosen processor is the simplest strategy that a process manager can adopt. However, it is a risky strategy. If a process manager underestimates the price of a processor resource, it will never win an auction. On the contrary, if it overestimates it, it will end up paying more than necessary. We implement this strategy using two prices, initial and host, since we distinguish two bidding situations:

1. The process is not running on any processor. Then its process manager bids the initial price for all active processors.

2. The process is being executed on a processor. Then its process manager bids the host price, only for the host processor. The initial price should be lower than the host price.
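A sketch of one bidding round of this strategy (hypothetical names; the function returns the bids a process manager submits to the auction):

    def constant_price_bids(processors, running_on, initial_price, host_price):
        """Constant-price strategy: a fixed bid for each of the two situations.

        processors -- ids of all active processors
        running_on -- id of the current host processor, or None
        """
        if running_on is None:
            # case 1: not placed yet, bid the initial price everywhere
            return {p: initial_price for p in processors}
        # case 2: already executing, defend only the host processor
        return {running_on: host_price}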
4.3.2 Escalating-price Strategy
To deal with this issue, we adopted an escalating-bid strategy. In our scheme, in contrast to the escalating-bid strategy proposed in [15], processor managers do not determine prices.
This strategy is based on continuously increasing the bid price until it succeeds in the auction. The increase rate is one of the parameters of this strategy. A process manager starts at a minimal bidding price for each processor. If the current auction price of every processor is higher, it fails to buy any processor and increases the bid. It stops when the process manager wins one or more processors.
Since success in an auction cannot be turned down, the process manager has to pay for all won processors, even if it has no way to use all of them. Hence, it chooses one of them to execute the process and stops bidding for the other processors. The process manager keeps this last price in further bidding rounds for the same processor, until a failure of its bidding spurs the process manager to new activity, i.e., to increase the bidding price again.
This strategy offers a number of interesting variations. If, for example, a process manager wants to lower the chance of being expelled from a processor, it may keep increasing the bid price. When it is expelled, it might want to make a greater jump with the other bids, which are now at zero. If a user wants to prevent overspending, he should give the process manager an upper limit on how much it can bid.
Depending on the price increase rate, this strategy can lead to a quick and efficient determination of the current auction price of a resource. One of the problems with this strategy is the fact that it never decreases prices, i.e., it does not adapt to changes in the environment. Even if the demand for a processor rapidly decreases, the process manager still bids the same price. This is not so disadvantageous in conjunction with the sealed-bid second-price auction, because the real traded prices reflect the demand better.
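A sketch of the escalating-price strategy as a stateful bidder (hypothetical names; won is the processor gained in the previous round, or None):

    class EscalatingStrategy:
        """Escalate bids until a ticket is won; prices are never decreased."""

        def __init__(self, processors, start_price, increase_rate, limit=None):
            self.bids = {p: start_price for p in processors}
            self.rate = increase_rate
            self.limit = limit  # optional cap against overspending

        def next_bids(self, won=None):
            if won is not None:
                # success: keep the last price and bid only for the chosen host
                return {won: self.bids[won]}
            # failure: escalate every bid by the configured rate
            for p in self.bids:
                raised = self.bids[p] + self.rate
                self.bids[p] = raised if self.limit is None else min(raised, self.limit)
            return dict(self.bids)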
4.3.3 Minimal-price Strategy
Learning from the experience with the escalating-price strategy, we have designed a strategy that utilizes the history of the last successful auction prices.
When a process manager has not yet won a processor, it chooses as the next bidding price the minimal winning processor price from the previous auction round, increased by a constant that corresponds to the priority of the process, and uses it for bidding on all available processors.
The situation changes when a process manager wins an auction and gains some processor. In further auction rounds, it continues bidding on this processor with the winning price increased by the price of the migration. However, it still continues to bid for all other processors, in case some cheaper processor appears. It bids for them with the price decreased by a constant, which can be defined as a parameter of the strategy.
Intuitively, this strategy might outperform the escalating-price strategy in two ways. From the first round, it sets a competitive bidding price, which is the minimum of the last successful auction prices. And in contrast to the escalating-price strategy, it adapts quickly to changes in the market environment.
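One round of the minimal-price strategy can be sketched as follows (hypothetical parameter names; min_win_price is the minimal winning price of the previous auction round):

    def minimal_price_bids(processors, running_on, last_price,
                           min_win_price, priority_const, migration_cost):
        """Minimal-price strategy: bid around the recent winning prices."""
        if running_on is None:
            # not placed yet: cheapest recent winning price plus a
            # priority-dependent constant, bid on every processor
            return {p: min_win_price + priority_const for p in processors}
        bids = {}
        for p in processors:
            if p == running_on:
                # defend the host: last price plus the cost of a migration
                bids[p] = last_price + migration_cost
            else:
                # keep probing for cheaper processors with a lowered bid
                bids[p] = max(0, last_price - priority_const)
        return bids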
4.4 Auditing Mechanism
In the environment of a digital market, we designed and integrated an auditing mechanism. Auditing is a technique whereby an auditor assesses the performance and trustworthiness characteristics of a node. The result of an audit consists of a performance history, which can be used to predict future values, and an indicator of trustworthiness.
An auditor is a process manager that repeatedly runs a special auditing process on an audited node and records its performance and the correctness of its results. An auditing mechanism must, therefore, satisfy the following characteristics:
• An auditing process should not be easily distinguishable by the processor manager that executes it, while it should still be easy to generate. If a processor manager could easily recognize an auditing process, it could counterfeit the result without executing it.

• The execution time of an auditing process must be long enough to be measured.

• The execution time of an auditing process must be predictable.

• The result of an auditing process execution must be cheap for the auditor to verify. The predictability of the auditing process result is used to ensure data integrity. When the result of an executed audit process does not equal the expected one, the processor manager that executed it is black-listed.
Meeting these criteria yields performance and load characteristics of an audited node that are both correct and cheap to obtain.
Every time a processor is offered on the market, its processor manager reports a prediction of the performance at the time the resource will be sold. This prediction should match the real performance. The prediction can deviate from the real performance for two reasons: the resource was loaded more than the seller expected, or the seller tried to gain more profit and therefore overrated the real performance. An auditor repeatedly compares the predicted performance with the real performance of executed auditing processes. The resulting assessment reflects the real processor performance and can easily be compared to the assessment of any other resource.
Let R(t) denote the prediction of resource performance reported by a processor manager at time t, and A(t) the performance measured by an auditing process. We can then define the auditor's prediction of the real performance, L(t), as:

L(t) = R(t) + C(t),

where C(t) is a correction function that returns a correction of the reported performance. It aligns the prediction to a reference value chosen by the auditor and accounts for any deviation from the auditing process performance. For simplicity, we consider the correction function to depend only on the time when a processor quantum was issued.
A good correction function must fulfill the following properties:

• it should accurately predict the correction of the reported values,

• it should prioritize those resources whose predictions are closer to the measured performance,

• it should disadvantage those resources whose predictions are overrated.
Our correction function is based on the difference between the real performance measured by an auditing process and the values predicted by the resource itself:

sub(t) = A(t − t_s) − R(t − t_s),

where t_s is a time-shift constant.
To align a performance prediction to the reference value defined by the auditor, we need to find the average value of this difference:

avg(t) = (1/h) · Σ_{i=1}^{h} sub(t − i),

where h is a constant that determines the time window over which the audit is performed.

The simplest correction function could, therefore, look as follows:

C(t) = avg(t).
But such a correction function would fail in the economic environment of a non-dedicated cluster, because it would give an advantage to malicious users who report inaccurate predictions. We, therefore, need to filter out such noise using the standard deviation:

σ(t) = sqrt( (1/h) · Σ_{i=1}^{h} (sub(t − i) − avg(t))² ).
The new correction function then becomes:

C(t) = avg(t) − σ(t).

Finally, we can disadvantage those predictions that overrate the performance over those that do not:

σ(t) = sqrt( (1/h) · Σ_{i=1}^{h} (W(sub(t − i)) − avg(t))² ),

where w is a penalty weight and

W(x) = w · x  if x ≥ 0,
W(x) = x      otherwise.

This last modification of the correction function was used in our experiments, and it proved sufficiently good for computing the auditing results.
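The whole correction pipeline fits in a few lines. The following sketch (illustrative names, assuming the reconstruction of W given above) computes C(t) and the auditor's estimate L(t) from the last h audit differences:

    import math

    def correction(subs, w):
        """C(t) computed from the differences sub(t-i) = A - R (sketch).

        subs -- [sub(t-1), ..., sub(t-h)], non-empty
        w    -- penalty weight applied by W to non-negative differences
        """
        h = len(subs)
        avg = sum(subs) / h
        weighted = [w * s if s >= 0 else s for s in subs]  # W(x)
        sigma = math.sqrt(sum((ws - avg) ** 2 for ws in weighted) / h)
        return avg - sigma  # C(t) = avg(t) - sigma(t)

    def audited_performance(reported, subs, w):
        """Auditor's estimate L(t) = R(t) + C(t) of the real performance."""
        return reported + correction(subs, w)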
4.5 Distributed Architecture
We integrated the discussed auditing mechanism [A.4] and the auction market [A.1] into one market-based processor allocation system aimed at a completely distributed and untrusted non-dedicated cluster.
The resulting system was designed to fulfill the following requirements:

1. The system should be scalable over the Internet or local computer networks. It should scale to any number of auctions, process managers, processor managers, banks, yellow pages, and auditors.

2. Not only the Internet but also local networks can contain various types of computing resources with different qualities; therefore, the system should inherently support the heterogeneity of computing resources.

3. The wide variability and unpredictability of user expectations demand a sufficient degree of flexibility from the system. The possibility of choosing appropriate market mechanisms (bidding strategies and auctions, banks) is essential to the system.

4. The system must be robust and secure enough to sustain the various behavior patterns of malicious users and hackers.
In this thesis, we describe the final integrated system. To meet the requirement of scalability, we have designed the system as a 2-level distributed architecture, see Fig. 4.4. The support for heterogeneity of processors is currently limited to the IA-32 architecture, due to the current state of the CLoNDIKe project implementation. However, the integrated system has basically unlimited flexibility at the system level, see below. The system allows each user to create her own process manager, to choose an auction and a bidding strategy, her own bank, and her own auditor, and these choices can change dynamically in time. We have designed and implemented a parametrizable auditing algorithm able to check the correctness of the computing results and to estimate the real performance of a processor.
The integrated system allows the cooperation of various diverse markets, each containing its own trading mechanism and computing resources.
[Figure 4.4: The 2-level distributed architecture. Subclusters (a dedicated cluster, a LAN, and volunteer nodes on the Internet) with their processor managers, auctions, banks, yellow pages (YP), and auditors.]
Let us briefly describe the heterogeneity of our system. Fig. 4.4 shows an example of the 2-level distributed non-dedicated system that consists of several local markets, subclusters, connected by any computer network at the physical level and by yellow pages at the system level. Each of the subclusters has an independent single system image, shared among the local market participants. Every such subcluster can offer an independent market mechanism to trade the computing resources.
As seen in Fig. 4.4, the processor managers are the most important members of every subcluster. In our example, we have three types of subclusters. The most general usage of our system is in volunteer computing over the Internet. Various separate users join a non-dedicated cluster and share a given system image. They do not know each other, and they do not know the computing resources the others offer. Therefore, a good and efficient auditing infrastructure must exist. The users will probably use several independent banks that cooperate with each other.
The second type of subcluster is a typical local area network (LAN). Here, the configuration of a subcluster will probably be more stable. When the users of such a network constitute some corporate group, more rules will be adhered to automatically and, therefore, there will be fewer requirements for auditing; also, only one bank might be needed.
The flexibility of our framework even allows integrating a third type of subcluster, a dedicated cluster. For example, a high-performance cluster with a shared filesystem can be offered in this way. Dedicated algorithms to determine the price, or even more conventional schedulers, can be used here. In fact, a dedicated cluster can act as one complex computing resource.
Chapter 5
Experiments and Evaluation of Results

5.1 Auction Mechanism
To verify the functionality of the proposed market-based resource allocation mechanisms, we built a prototype infrastructure and conducted many experiments to prove the correctness and efficiency of the implemented prototype and the proposed mechanisms, algorithms, and strategies.
The prototype infrastructure consisted of 11 single-processor homogeneous nodes with Intel Pentium III 733 MHz processors and 256 MB RAM. The current experimental CLoNDIKe implementation allows using only one core node. Therefore, the core node hosts an auction, the yellow pages, and a bank, while the detached nodes host the processor managers. Based on multiple experiments, we chose the following parameters, which proved to be a good set: the period of the auction is 3 seconds, and each processor manager issues exactly 1 ticket for every such period, 6 seconds ahead.
5.1.1 First-price, Sealed Bid vs. Second-price, Sealed Bid Auction
In our prototype, we have experimented first with auction algorithms for trading computational resources among market participants. We compared two slightly different auction
schemes:
• First-price, sealed bid auction, where a winning bidder pays exactly the amount he
bid.
• Second-price, sealed bid auction (Vickrey), where a winning bidder pays a price equal
to the second-highest bid.
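To make the difference between the two payment rules concrete, the following minimal Python sketch resolves one sealed-bid round under both rules. It is an illustration of the auction schemes, not the prototype code; the bid representation is hypothetical.

    def resolve_auction(bids, scheme="vickrey"):
        """Resolve one sealed-bid round; bids maps bidder id -> price.
        Returns (winner, price paid)."""
        if not bids:
            return None, 0.0
        ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
        winner, highest = ranked[0]
        if scheme == "first-price":
            return winner, highest         # winner pays his own bid
        # Vickrey: the winner pays the second-highest bid; with no
        # competitor the price drops to zero, which matches the
        # observation that low demand results in a zero price.
        second = ranked[1][1] if len(ranked) > 1 else 0.0
        return winner, second

    bids = {"pm1": 9.0, "pm2": 6.0, "pm3": 3.0}
    print(resolve_auction(bids, "first-price"))   # ('pm1', 9.0)
    print(resolve_auction(bids, "vickrey"))       # ('pm1', 6.0)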
[Figure: average CPU price vs. time (0–180 s) for the second-price (Vickrey) and first-price auctions.]
Figure 5.1: Comparison of first-price and second-price (Vickrey) sealed-bid auctions.
In Fig. 5.1, we compare the average processor prices resulting from an auction using the same constant-price strategy in the process managers. We can see that the prices resulting from the Vickrey auction are lower than those of the first-price auction. The prices also reflect real market demand more closely; demand lower than supply results in a zero price. We therefore chose the Vickrey auction as the reference implementation, while planning more thorough experiments with various auction schemes.
5.1.2 Auction-based vs. Standard Processor Allocation

[Figure: total execution time in seconds vs. the number of processors (1–10) for the auction, sequential, and FCFS scenarios; panel (a) for processes with running times 150–250 seconds, panel (b) for running times 1–3 seconds.]
Figure 5.2: Comparison of 3 different scenarios: sequential execution, First Come First Serve policy, and the auction mechanism.
In the first set of experiments, we measured the total execution time for a set of processes
with various running times on p processors (p = 1, . . . , 10). We compared 3 scenarios:
1. Sequential execution of all processes on one processor.
2. First come first serve (FCFS) policy executed on the core node, which takes processes from the queue and executes them remotely on the first available detached node (a minimal makespan sketch of this policy follows the list).
3. Auction mechanism run on the core node. We used the Vickrey auction and the process managers with the constant-price bidding strategy.
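The FCFS baseline of scenario 2 can be captured in a few lines. The following Python sketch, with hypothetical inputs, computes the total execution time of a workload by handing each queued process to the earliest-free detached node; the auction scenario behaves like this baseline plus roughly one 3-second auction period per allocation.

    import heapq
    import random

    def fcfs_makespan(durations, p):
        """Total time to run the given process durations on p detached
        nodes under FCFS: the next queued process always goes to the
        node that becomes free first."""
        free_at = [0.0] * p
        heapq.heapify(free_at)
        for d in durations:
            t = heapq.heappop(free_at)   # first available node
            heapq.heappush(free_at, t + d)
        return max(free_at)

    random.seed(1)
    jobs = [random.uniform(150, 250) for _ in range(20)]  # as in Fig. 5.2(a)
    print(round(fcfs_makespan(jobs, 10), 1))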
Fig. 5.2(a) shows the results of the experiments for 20 processes with running times in the range of 150–250 seconds. The difference between the FCFS and auction-based execution times is constant, about 240 seconds for p = 1, . . . , 10 processors. It is caused by the 3-second delay of the auction mechanism. Due to the small number of processes with relatively long running times, the delay caused by the auction had little impact on the total execution time.
Of course, the impact of the auction on the total execution time grows if the workload consists of shorter processes. The extreme case is shown in Fig. 5.2(b). We ran 40 processes with very short running times in the range of 1–3 seconds. Although the auction overhead has a clear impact on the total execution time, the average process waiting time is still only about 2 seconds, which seems reasonable. However, such a process workload is unlikely in practice.
5.1.3 Constant-price Bidding Strategies
The second set of experiments illustrates the results of the competition of processes for processors using 3 constant-price strategies with different initial and host prices, as follows:
1. Strategy 1. The initial price is 3 and the host price is 5.
2. Strategy 2. The initial price is 6 and the host price is 8.
3. Strategy 3. The initial price is 9 and the host price is 10.
This experiment basically shows how the auction-based mechanisms substitute for the classical process priority. That is, one strategy corresponds to one priority: strategy 1 to the lowest priority 1, strategy 3 to the highest priority 3 (a sketch of the constant-price behavior follows below).
On each of the ten detached nodes, 20 process managers were run and randomly assigned strategies 1–3. Each process manager took care of 1 process with a random running time from the interval of 15 to 25 seconds.
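The two-price behavior of a constant-price strategy can be sketched in a few lines of Python. The class below is our illustration (bid the initial price until the first processor is won, then bid the host price), not the prototype code.

    class ConstantPriceStrategy:
        """Bid a fixed initial price until a processor is won, then keep
        bidding a higher host price to stay on the host processor."""

        def __init__(self, initial_price, host_price):
            self.initial_price = initial_price
            self.host_price = host_price
            self.has_processor = False

        def next_bid(self):
            return self.host_price if self.has_processor else self.initial_price

    # The three strategies of this experiment, ordered like priorities:
    strategies = [ConstantPriceStrategy(3, 5),    # strategy 1, lowest priority
                  ConstantPriceStrategy(6, 8),    # strategy 2
                  ConstantPriceStrategy(9, 10)]   # strategy 3, highest priority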
[Figure: four panels over time (0–800 s): (a) the number of processes in the system per strategy; (b) the number of currently running processes per strategy; (c) the behavior of individual processes (waiting vs. executing phases by process id); (d) the average processor price paid by each strategy.]
Figure 5.3: The competition of 3 constant-price strategies with 3 different bidding prices.
Fig. 5.3 shows the resulting behavior of the market system from different perspectives. Fig. 5.3(a) shows the number of processes in the system, including both waiting processes with no processor quanta gained and processes running on a host processor. In contrast, Fig. 5.3(b) shows only the processes executed on a host processor. The measured history clearly indicates that, in contrast to processes with lower priority, processes with the highest priority have the shortest bidding phase and total execution time. The worst execution characteristics belong to the processes with the lowest priority; i.e., better-funded processes are executed more quickly than those with less funding.
Of course, the better throughput must be paid for, as illustrated in Fig. 5.3(d). The process managers working with the highest-priority strategy paid more than the others. Therefore, the user must carefully choose the funding strategy for his processes with respect to his own requirements.
The execution of individual processes is shown in the execution diagram in Fig. 5.3(c). It shows the length of the phase when a process is executed on a host processor (the solid line) and the phase when it is not (the striped line). The diagram clearly indicates that the bidding phase of the processes with the lowest priority (in the lowest part of the diagram) is much longer than that of the processes with the highest priority (in the top part of the diagram).
5.1.4 Minimal-price vs. Constant-price Strategy
In the third set of experiments, we compared two strategies, a minimal-price and a constant-price one, with the following parameters:
1. Minimal-price strategy: bidding prices are increased or decreased by 0.5, the price of
the migration is 1, and the maximum bidding price is 15.
2. Constant-price strategy: The initial price is 3 and the host price is 5.
On each of the ten detached nodes, 20 process managers were run and randomly assigned one of these 2 strategies. Each process manager took care of 1 process with a random running time from the interval of 30 to 50 seconds.
Fig. 5.4 shows the measured results. It turned out that the minimal-price strategy adapts to the market and overbids the constant-price one, minimizing the execution time of the processes with this strategy (Fig. 5.4(a)), as we expected. The constant-price strategy wins the auction only when the demand from the minimal-price strategies is less than the number of available processors, which can be seen in Fig. 5.4(b). Later, the diagram shows the development after the demand from the minimal-price-strategy process managers has fallen, so the constant-price strategies started to be successful in the bidding.
Fig. 5.4(c) demonstrates the time development of the average processor price. It illustrates the adaptability of the minimal-price strategy. Initially, the demand for the processors from the minimal-price-strategy managers is high and the only successful strategies are the minimal-price ones. They pushed the price to the maximum. Later, in the second half of the experiment, only a few process managers with this strategy remained active; they overbid each other only by the minimal-price rate and, therefore, the price they paid fell to the minimum. In contrast, the constant-price-strategy process managers paid higher prices.
The execution of individual processes is shown in the execution diagram in Fig. 5.4(d). It shows the length of the phase when a process is executed on a host processor (the solid line) and the phase when it is not (the striped line). The diagram clearly indicates that the bidding phase of the processes with the constant-price strategy (in the top part of the diagram) is much longer than that of the processes with the minimal-price strategy (in the bottom part of the diagram).
5.1.5 Minimal-price vs. Constant vs. Escalating Strategy
In the fourth set of experiments, we compared three strategies, a minimal, a constant, and an escalating one, with the following parameters:
1. Minimal strategy (see Section 4.3.3) with both the price increase and decrease set to 0.5 and the price of migration evaluated to 1. The maximum bid price is set to 15.
2. Escalating strategy with the maximal price set to 30, the escalating level set to 1.5, and the starting bid price set to 1 (a sketch of this strategy follows the list).
3. Constant strategy with initial price 3 and host price 5.
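The escalating strategy referenced in the list above can be sketched as follows; the Python class is our illustration of its three parameters (start bid price, escalating level, maximal price), not the prototype code.

    class EscalatingStrategy:
        """Start with a low bid and multiply it by the escalating level
        after every lost auction round, up to a maximal price."""

        def __init__(self, start_price=1.0, level=1.5, max_price=30.0):
            self.price = start_price
            self.level = level
            self.max_price = max_price

        def next_bid(self, won_last_round):
            if not won_last_round:
                # Escalate aggressively until the bid is capped.
                self.price = min(self.price * self.level, self.max_price)
            return self.price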
On each of the ten processors, we randomly ran 20 processes with random priorities and running times from 20 to 40 seconds.
It is clear at first glance from Fig. 5.5(c) that the escalating strategy became very successful. This is not very surprising, because it has a higher price bound.
[Figure: four panels over time (0–1400 s) comparing the minimal-price and constant-price strategies: (a) the number of processes in the system; (b) the number of currently running processes; (c) the average processor price; (d) the behavior of individual processes (waiting vs. executing by process id).]
Figure 5.4: The competition of a constant-price strategy and a minimal-price strategy.
[Figure 5.5(a): the number of processes in the system over time (0–4000 s) for the minimal, constant, and escalating strategies.]
A closer look at Fig. 5.5(a) reveals that the minimal-price bidding was very successful in determining the market price, but after it reached the maximum price bound, it was quickly overbid by the escalating strategy. After the escalating strategies end their execution, the competition is between the minimal and constant strategies, with a predictable progress similar to the one described in the previous experiment.
Fig. 5.5(d) reveals that the success of the escalating strategy is caused by the very high price it offers. On the other hand, the minimal-price strategy pays only a little above the price paid by the constant-price strategy.
The execution of individual processes is shown in the execution diagram in Fig. 5.5(b). It shows the length of the phase when a process is executed on a host processor (the solid line) and the phase when it is not (the striped line). The diagram clearly indicates that the bidding phase of the processes with the escalating strategy (in the top part of the diagram) is the shortest, while the processes with the constant-price strategy (in the middle part of the diagram) have the longest bidding phase.
5.1.6 Escalating Strategy vs. Minimal Strategy
In the next set of experiments, we compare two strategies, an escalating and a minimal one, with the following parameters:
1. Escalating strategy with maximal price set to 30, escalating level set to 1.5, and the
start bid price set to 1.
2. Minimal strategy (see Section 4.3.3) with both the price increase and decrease set to 0.5 and the price of migration evaluated to 1. The maximum price the strategy bids is set to 15.
On four out of the ten processors, we ran 20 processes with random strategies and execution times between 40 and 60 seconds.
[Figure 5.5, remaining panels over time (0–4000 s): (b) the behavior of individual processes (waiting vs. executing by process id); (c) the processes running on an available processor; (d) the average processor price paid by each strategy.]
Figure 5.5: The competition of the minimal-price, constant-price, and escalating strategies.
[Figure: four panels over time (0–1200 s) comparing the escalating and minimal strategies: (a) the number of processes in the system; (b) the processes running on an available processor; (c) the behavior of individual processes (waiting vs. executing by process id); (d) the average processor price paid by each strategy.]
Figure 5.6: The competition of the escalating and minimal-price strategies.
Fig. 5.6(b) shows that the escalating strategy became very successful. This is not too surprising, because it has a higher price bound. A closer look at Fig. 5.6(b) reveals that the minimal-price bidding was very successful in determining the market price, but after it reached the maximum price bound, it was quickly overbid by the escalating strategy.
Fig. 5.6(d) shows the prices which the escalating strategy used to gain its success. They are much higher than the costs paid by the minimal strategy.
The execution of individual processes is shown in the execution diagram in Fig. 5.6(c). It shows the length of the phase when a process is executed on a host processor (the solid line) and the phase when it is not (the striped line). The diagram clearly indicates that the bidding phase of the processes with the minimal-price strategy (in the top part of the diagram) is longer than that of the processes with the escalating strategy (in the bottom part of the diagram).
5.1.7 Escalating vs. Constant Strategy
In the next set of experiments, we compare two strategies, an escalating and a constant one, with the following parameters:
1. Escalating strategy with maximal price set to 30, escalating level set to 1.5, and the
start bid price set to 1.
2. Constant strategy with initial price 3 and host price 5.
On four out of the ten processors, we ran 20 processes with random strategies and execution times between 40 and 60 seconds.
Fig. 5.7(b) shows that the processes using the escalating strategy execute very quickly, without long waiting for an available processor (Fig. 5.7(d)). The processes with the constant strategy start executing only after the escalating ones finish.
[Figure: four panels over time (0–1200 s) comparing the escalating and constant strategies: (a) the number of processes in the system; (b) the processes running on an available processor; (c) the average processor price paid by each strategy; (d) the behavior of individual processes (waiting vs. executing by process id).]
Figure 5.7: The competition of the escalating and constant-price strategies.
Fig. 5.7(c) shows the prices which the escalating strategy used to gain its success. The price quickly rises to the maximum and stays there as long as there are processes with the escalating strategy overbidding each other.
5.1.8 Computational Overhead of the Auction Mechanisms
Process managers |  50 | 100 | 150 | 200 | 250 | 300 | 350 | 400
CPU load [%]     |   3 |   4 |   7 |   9 |  11 |  15 |  21 |  31

Table 5.1: The dependence of the core node processor load due to the auction computation on the number of bidding process managers in the system.
We made several experiments to understand how the auction execution complexity depends on the number of bidding process managers in the system (see Table 5.1). Even for 300 active bidders, which is in fact quite a high number, the processor load is very low, below 15%, which we believe is a very good result. Moreover, we assume that it can be further reduced by optimizing the implementation.
5.2 Auditing Mechanism
In our experiments, we used the problem of RSA number factorization for the auditing processes. The factorization of RSA numbers has pleasant characteristics:
• generating a new audit process is as simple as multiplying two suitable prime numbers,
• the factorization of a number can have a long execution time,
• the execution time of a factorization can be predicted precisely.
A small sketch of such an audit task follows.
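The following Python sketch illustrates why this workload suits auditing: generation is a single multiplication, while the factorization time is driven by the smaller prime and is therefore predictable. It is our illustration, not the auditor implementation, and the chosen primes are arbitrary examples.

    import time

    def factor(n):
        """Trial-division factorization of a product of two primes."""
        d = 2
        while d * d <= n:
            if n % d == 0:
                return d, n // d
            d += 1
        return n, 1

    def make_audit_task(p, q):
        # Generating a new audit task is just multiplying two primes.
        return p * q

    n = make_audit_task(999983, 1000003)
    start = time.perf_counter()
    p, q = factor(n)
    print(p, q, f"{time.perf_counter() - start:.3f}s")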
In the next experiment, we show the ability of an auditor to predict a future resource assessment, using information reported by a processor manager.
Fig. 5.8 shows a situation where a processor manager tries to precisely predict the resource load based on its own history. We can see that its prediction R(t) is very close to the real values, and therefore the auditor assessment is also very close to the real values. A seller gets as much money as the resource is worth, and a buyer gets the performance he paid for.
On the other hand, in Fig. 5.9, we can see a situation where a processor manager does not predict its resource performance at all, reporting only a constant value as the processor performance prediction. In this case, only the measured values are considered; therefore, in the case of quick changes, we can see high peaks, caused by the long history h we take into consideration. When a performance leap finally leaves the history window, the assessment stabilizes back to the real value.
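The assessments in Figs. 5.8–5.11 combine the reported predictions R(t) with the audit measurements A(t) over a history window h. The exact formula is an implementation detail not repeated here, so the following Python sketch only illustrates the principle with a scoring rule of our own: the score tracks the measured performance and is penalized by the deviation between R(t) and A(t), so both non-predicting and lying managers end up with a lower assessment.

    from statistics import mean, stdev

    def assess(reported, measured):
        """Illustrative auditor assessment over one history window.
        reported: predictions R(t) given by the processor manager;
        measured: audit measurements A(t) over the same window."""
        errors = [r - a for r, a in zip(reported, measured)]
        penalty = stdev(errors) if len(errors) > 1 else abs(errors[0])
        return max(0.0, mean(measured) - penalty)

    # An honest manager (R close to A) scores near the real performance;
    # a random or inverted R(t) yields a much lower assessment.
    print(assess([0.80, 0.82, 0.79], [0.80, 0.81, 0.80]))  # ~0.79
    print(assess([0.10, 0.90, 0.20], [0.80, 0.81, 0.80]))  # much lower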
In a market environment, it is quite probable that a malicious processor manager, thirsty for a higher profit, will report a false performance.
[Figure: performance (0–1) vs. time (0–25000 s) showing R(t), A(t), and L(t).]
Figure 5.8: The resource prediction R(t) is quite a good prediction of the resource performance and is close to the measured value A(t). The auditor result L(t) is therefore close to the real value, from which it deviates only when the reported prediction R(t) fails.
[Figure: performance (0–1) vs. time (0–25000 s) showing a constant R(t) together with A(t) and L(t).]
Figure 5.9: The processor manager does not predict the resource performance, which is represented by a constant R(t) function. The auditing results L(t) are computed only from the measured values A(t).
[Figure: performance (0–1) vs. time (0–25000 s) showing a random R(t) together with A(t) and a low L(t).]
Figure 5.10: To gain more profit, a malicious processor manager can try to report a false performance, represented for simplicity by a random function R(t). Because of the high standard deviation between the audit measurements A(t) and R(t), the audit results in a very low assessment L(t).
We simulate this by a random performance prediction, as described in Fig. 5.10. Using the audit measurements, the auditor finds out that the standard deviation of the reported performance is very high. Therefore, the assessment will be very low, and such behavior does not bring the user any added profit.
[Figure: performance (0–1) vs. time (0–25000 s) showing R(t) = 1 − A(t) together with A(t) and a very low L(t).]
Figure 5.11: The processor manager always predicts R(t) as the opposite value (1 − A(t)) of the real performance values A(t). The audit results in a very poor assessment L(t).
Fig. 5.11 describes the most extreme situation. A processor manager always predicts the inverse of the real value. This leads to a very poor audit assessment because of the very high deviation between the reported and real values.
We see that such auditing reports are quite useful, because they favor sellers that cooperate with buyers and report quite precise performance predictions over those that try to cheat a buyer and sell their resource for more profit than it is worth.
In the final experiment, we ran a non-dedicated cluster with the auction-based economic framework to measure the price of the auditing mechanism. We ran two heterogeneous nodes, one with an Intel Pentium 4 3.2 GHz processor and one with an Intel Celeron 900 MHz processor. The framework ran with a central auction, which was the only way to buy processor time. We ran processor-intensive processes that bid for the available processors. In the first case, we did not use any auditing mechanism, i.e., the process managers had no mechanism to compare individual processors.
[Figure: price (0–10) vs. time (0–10000 s) for the Celeron and Pentium 4 nodes.]
Figure 5.12: Prices resulting from an auction when no auditors were integrated.
Fig. 5.12 shows the prices coming out of the auction. The prices are the same for both processors. This is an obvious disadvantage for those process managers who accidentally bought the Intel Celeron processor.
[Figure: price (0–10) vs. time (0–10000 s) for the Celeron and Pentium 4 nodes.]
Figure 5.13: Prices resulting from an auction when the auditing mechanism was integrated.
Therefore, in the second case, we integrated the audit mechanism into the process managers, so that they execute one auditing process every 140 seconds. Fig. 5.13 indicates that in this case, the Intel Celeron was sold for a much lower price because of its lower performance. Using auditors can thus save a lot of expenses. But the usage of auditors is not
free: the execution of auditing processes must be paid for just like ordinary computation processes, so the profit gained by using audits can be defined as

p_n (1 − t_a f_a) / p_a = (p_n / p_a)(1 − t_a f_a),

where
p_a is the processor market price in an environment where auditors appear;
p_n is the processor market price when no auditors were used;
f_a is the execution frequency of the auditing processes; and
t_a is the average execution time of an auditing process.
This equation indicates that using auditors does not automatically save expenses. If p_n is similar to p_a, i.e., if the performance characteristics of all computers in a cluster are very similar, then the auditing mechanism would increase the expenses. However, in practice, such a situation occurs very rarely in heterogeneous environments like non-dedicated clusters.
In our configuration, where an auditing process is executed every 140 seconds and runs for around 3 seconds, the profit can be computed as

2 (1 − 3 · 1/140) ≈ 1.96.

Auditing can therefore save process managers considerable expense.
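The arithmetic can be checked with a one-line helper. The function below is our own sketch; the ratio p_n/p_a = 2 is taken from the example above.

    def audit_profit(p_n, p_a, t_a, f_a):
        """Relative gain of auditing: (p_n / p_a) * (1 - t_a * f_a).
        Values above 1 mean that auditing pays off."""
        return (p_n / p_a) * (1.0 - t_a * f_a)

    # An ~3 s audit every 140 s, with the processor price halved:
    print(round(audit_profit(p_n=2.0, p_a=1.0, t_a=3.0, f_a=1.0 / 140.0), 2))  # 1.96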
5.3 Distributed Architecture
We conducted several experiments to inspect the usability and efficiency of the implemented prototype. We compared various auction mechanisms and bidding strategies and chose some of them to demonstrate in this thesis.
5.3.1 Bidding Strategies
We compared two bidding strategies, a minimal-price and a constant-price one.
[Figure: average CPU price vs. time (0–1400 s) for the minimal-price and constant-price strategies.]
Figure 5.14: The average price of processors over time.
The constant-price bidding strategy is very simple. Initially, a process manager bids a constant price defined at startup, and after winning a first processor, it uses a second, higher price. The minimal-price bidding strategy is a bit more complicated and has 3 parameters. When a process manager has not won a processor yet, it chooses as the next bidding price the minimal winning processor price from the previous auction round, increased by the priority of the process, and uses it to bid for all available processors. The situation changes when a process manager wins an auction and gains a processor. In the next auction round, it continues to bid for this processor using the winning price increased by the cost of migration. Besides that, it also continues to bid for all other processors, in case some cheaper processor appears; it bids for them with the price decreased by a constant number defined as a parameter of the strategy. Bidding prices are limited by some maximum value.
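As we read these rules, the minimal-price strategy can be sketched as follows. The class and parameter names are our own illustration, not the prototype code.

    class MinimalPriceStrategy:
        """Sketch of the minimal-price bidding rules described above."""

        def __init__(self, priority=0.5, migration_cost=1.0,
                     decrease=0.5, max_price=15.0):
            self.priority = priority              # added while still bidding
            self.migration_cost = migration_cost  # added to keep the host
            self.decrease = decrease              # subtracted when probing
            self.max_price = max_price            # cap on any bid

        def bids(self, min_winning_price, held_price=None):
            """Return (bid for the held processor or None, bid for others)."""
            if held_price is None:
                # Not running yet: overbid the minimal winning price of
                # the previous auction round by the process priority.
                return None, min(min_winning_price + self.priority,
                                 self.max_price)
            # Keep the current host (paying up to the migration cost more),
            # but keep probing the other processors for a cheaper one.
            keep = min(held_price + self.migration_cost, self.max_price)
            probe = max(held_price - self.decrease, 0.0)
            return keep, probe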
For the experiments, we used a cluster of 26 nodes (22 Intel Pentium IV 3.2 GHz and 4 Intel Core Duo 2.4 GHz). We chose strategies with the following parameters:
1. The constant-price strategy: The initial price is 3 and the second price is 5.
2. The minimal-price strategy: The priority is 0.5, the migration cost is 1, and the
maximum bidding price is 15.
On every node, we randomly ran processes with running times from 30 to 50 seconds.
Fig. 5.14 demonstrates the time development of the average processor price. It illustrates the better adaptability of the minimal-price strategy. Initially, the demand for the processors from the minimal-price-strategy managers is high and the only successful strategy is the minimal-price one. The process managers pushed the price to the maximum. Later, in the second half of the experiment, only a few process managers with this strategy remained active; they overbid each other only by the minimal-price rate and, therefore, the price they paid fell to the minimum. In contrast, the constant-price-strategy process managers paid higher prices.
5.3.2 Scalability and Heterogeneity
In the next experiment, we tested a system consisting of 2 independent subclusters (local markets). One is a cluster of 22 homogeneous Intel Pentium IV 3.2 GHz nodes. The second is a non-dedicated cluster consisting of workstations joined by Ethernet into one LAN. We will refer to the first part as the cluster and to the second as the LAN. To simplify the scalability, we used binary-compatible systems on all computers, so every process can easily be executed on any of the nodes.
We measured how the price (Fig. 5.15) changes in each of the markets while changing the load and quality in one of them. We systematically loaded the system by running processes from a computer joined through the Internet.
In the first phase (see Fig. 5.15), which took around 1000 s, we slightly preferred the cluster market to simulate real users' decisions.
[Figure: average price vs. time (0–3000 s); panel (a) for the heterogeneous cluster, panel (b) for the LAN.]
Figure 5.15: The average processor prices in the different markets.
Because the cluster is usually more stable than the non-dedicated LAN, we predicted that the users would prefer it. Therefore, the prices are slightly higher in the cluster market.
In the next phase, lasting from 1000 to 2000 seconds, we excluded some of the cluster's nodes to simulate internal usage of the cluster. Because fewer processors were available, the price of each processor time quantum increased.
In the last phase, lasting from 2000 to 3000 seconds, we included the missing nodes back and started to prefer the LAN processors over the cluster ones. The prices fell in both markets because of the higher availability of processors. At the same time, the prices in the LAN market were higher than in the cluster market because of its preference.
In this experiment, we showed the scalability of the system. Although the scalability of individual markets is limited by the cluster system (e.g., [24]) and the market mechanism they use, the whole integrated system, consisting of many markets, is immensely scalable. The heterogeneity of computing resources is essential to our system; the user only needs to ensure the execution compatibility of the processes with the target processors.
5.3.3 Robustness
[Figure: average evaluation (0–0.9) vs. time (0–4000 s).]
Figure 5.16: The average values of processors in the LAN evaluated by an auditor.
In our last experiment, we measured how malicious users affect the assessment of processors in the system. We used the same experimental configuration as in the previous experiment. Since clusters are usually more stable and secure, we focused only on the characteristics of the LAN market, where malicious users can appear quite often.
Fig. 5.16 shows the average evaluation of a processor in the LAN market. In the first 1000 s, there was a usual load, so the assessment oscillated around a relatively high value.
In the next phase, lasting from 1000 to 2000 seconds, we executed a high number of processes in this market. This led to a rapidly lower assessment of all processors in the LAN configuration because of the high load.
After this phase, we returned the load to the average values, so the assessment again oscillated around much higher values.
In the last phase, lasting from 3000 to 4000 seconds, we simulated a couple of malicious users that reported bad performance characteristics. Fig. 5.16 illustrates that although the load was at the usual level, the assessment fell rapidly.
The experiment indicates that the system is robust enough to sustain some malicious behavior of the participants. A user that acts maliciously is disadvantaged; thus, the system creates a motivation for participants to act cooperatively.
Chapter 6
Conclusion
6.1 Fulfillment of Objectives
In this thesis, we described the design of a market-based ecosystem for non-dedicated systems. In the following sections, we discuss the objectives we defined in Section 3.
6.2 Scheduling and Price Estimation
We made a few experiments to show some of the countless possibilities of price estimation. We implemented a first-price, sealed-bid auction and a second-price, sealed-bid auction (Vickrey). The task managers compete for the resources needed to satisfy the tasks they manage and can negotiate with other task managers using the auction platform. In general, it can take some time to finish the negotiation between task and processor managers, so this must be taken into consideration when planning the execution of tasks. We showed that an auction may be a very efficient way to set the proper price in an environment where the demand exceeds the supply of available resources. We showed how to implement and integrate such a platform in our architecture.
On the other hand, auctions are not the only way to find the price of resources. While in auctions the task managers negotiate with other task managers, a processor manager can instead set a fixed price for its resources. This is favourable for highly responsive tasks, which can directly choose the processor to use without any need to negotiate.
In an environment with higher supply than demand, it could make sense to instead negotiate the price between the processor managers, so that they would offer the lowest price to execute the available tasks.
6.3 Monetary System
To be able to negotiate, participants must be able to establish a trusted relationship with each other and with the currency in which both sides decide to pay for the used resources. This relationship can emerge through more complicated third-party relations. We proposed a bank participant that can be used to establish trusted relations between mutually untrusted participants. The bank allows the participants to trust the currency circulation.
On the other hand, there must be a mechanism to build first-hand trust between task and processor managers. One must be able to find out whether the resources a processor manager offers have the qualities he proclaimed. We proposed an auditing mechanism to show how one can create a mutually trustful relation between two selfish participants.
6.4 Migration
The non-dedicated cluster CLoNDIKe offers a checkpointing mechanism that we used in our design. We demonstrated the usability of this concept. Although it proves to be an elegant and useful concept, it is tied to the CLoNDIKe architecture. As an alternative, one could use virtual machine mechanisms and represent a task by a virtual machine state that can easily be checkpointed and moved from one node to another. However, this is a heavy-weight mechanism; its main advantage is its relative independence of the chosen architecture.
6.5 Summary
We have designed a framework for resource allocation in non-dedicated clusters. Based on the CLoNDIKe architecture, we have built a functional prototype and reported on the results of several sets of experiments. All experimental results confirmed our assumptions about the correctness of the design decisions and the correctness of the currency flow concept for decentralized resource allocation mechanisms.
We proposed, implemented, and integrated an auditing mechanism into our market-based framework for non-dedicated clusters. The auditing mechanism allows a user to assess a node and detect a malicious node at the same time. A malicious node will lose the trust of the users and will face a harder situation on the market. The assessments of all resources are related to the same reference value, so they can easily be compared to each other.
Every participant can perform its own auditing, based on the execution characteristics of its processes. An independent auditing of all nodes by all participants would be quite unnecessary and expensive; therefore, we expect an emergence of trust and cooperation between various users, which would give rise to a new merchandised commodity on our digital market: the auditing results. We are now examining such relations and searching for the best platform to establish trusted relationships between cluster participants.
The experiments indicate that our ideas and proposals can really be used to create a high-performance non-dedicated cluster shared between independent and mutually untrusted participants. In contrast to related projects like Tycoon, our architecture offers a fully two-sided economy, so any user can join the cluster and either use cluster resources or offer its own.
We have designed and implemented a 2-level market-based integrated processor allocation system that may consist of local markets cooperating together. Each of the markets can contain individual trading mechanisms, a micropayment infrastructure, or auditing mechanisms.
We sketched the features of the resulting system. We made a series of experiments and presented the results of some of them to explain how those features were achieved.
We showed the flexibility of the system. The user can use various market strategies. The market mechanisms in every local market can be different. The different markets can be connected together to create a scalable global digital economy. Each of the markets can contain heterogeneous computing resources. The only requirement the users must satisfy is the execution compatibility of the processes with the target processors. We showed that the system is robust enough to sustain the malicious behavior of some participants.
Our solution is a basis for further research in this area of resource utilization. It has provided mechanisms that can be used for general and efficient Internet computing. In this thesis, we have considered the usage of an auction mechanism for trading, but other mechanisms can be used as well; this is the subject of further research. Also, we have implemented only the support for trading processor time quanta, but the system can support the allocation of any computer resources.
6.6 Future Work
There are still many areas that should be enhanced and explored in more detail.
6.6.1 Other Kinds of Resources
In this thesis, we describe only the algorithms for allocating processors. However, most of the algorithms could be modified for other kinds of resources like memory, disk storage, graphics processing units, etc.
6.6.2 Non-dedicated Cluster Platform
To use the architecture proposed in this thesis in real life, the functionality of the underlying non-dedicated cluster still needs to be enhanced. We need more robust mechanisms. We need an implementation that is able to migrate processes with open communication links and that implements execution transactions, so that a process can be reverted and restarted from any previous point in time.
6.6.3 Security and Trustworthiness
Our market environment consists of mutually untrusted parties, each of them following its own interests. Although we showed how mutual trust can be established,
there is still a huge area for research to examine how a more global ecosystem can be established.
We believe that many complex systems of relationships will emerge, such as a market for auditing results.
6.6.4 Other Market Mechanisms
In this thesis, we use mainly an auction mechanism to allocate the resources. Although our framework defines principles that can be used for other market mechanisms as well, there is still a huge area of other market mechanisms, like commodity exchanges, to be examined.
Bibliography
[1] PlanetLab: Version 3.0. Technical Report PDN–04–023, PlanetLab Consortium, October 2004.
[2] Keith Adams and Ole Agesen. A comparison of software and hardware techniques for
x86 virtualization. In ASPLOS ’06: Proceedings of the 12th international conference
on Architectural support for programming languages and operating systems, pages 2–
13, New York, NY, USA, 2006. ACM Press.
[3] Amazon.com, Inc. Amazon Elastic Compute Cloud (Amazon EC2), 2007. http://aws.amazon.com/ec2.
[4] Bo An, Chunyan Miao, and Zhiqi Shen. Market-based resource allocation with incomplete information. 2007.
[5] David P. Anderson. Boinc: A system for public-resource computing and storage. In
5th IEEE/ACM International Workshop on Grid Computing, November 2004.
[6] Amnon Barak, Oren La’adan, and Amnon Shiloh. Scalable cluster computing with
MOSIX for LINUX, 1999.
[7] James Broberg, Srikumar Vnugopal, and Rajkumar Buyya. Market-oriented grids and
utility computing: The state-of-the-art and future directions. 2007.
[8] Rajkumar Buyya, David Abramson, and Jonathan Giddy. Nimrod-g resource broker
for service-oriented grid computing. In IEEE Distributed Systems Online, volume 2,
November 2001.
[9] Rajkumar Buyya, David Abramson, Jonathan Giddy, and Heinz Stockinger. Economic
models for resource management and scheduling in grid computing. In Special Issue
on Grid Computing Environment, Concurrency and Computation: Practice and Experience (CCPE) Journal, volume 14, pages 1507–1542. Wiley Press, USA, December
2002.
[10] Rajkumar Buyya, David Abramson, and Srikumar Venugopal. The grid economy. In
Special Issue on Grid Computing, volume 93, pages 698–714. IEEE Press, New York,
USA, March 2005.
[11] Franck Cappello, Samir Djilali, Gilles Fedak, Thomas Herault, Frederic Magniette,
Vincent Neri, and Oleg Lodygensky. Computing on large scale distributed systems:
Xtermweb architecture, programming models, security, tests and convergence with
grid. In Future Generation Computer Science, 2004.
[12] Andrew Chien, Brad Calder, Stephen Elbert, and Karan Bhatia. Entropia: architecture and performance of an enterprise desktop grid system. In Journal of Parallel and
Distributed Computing, volume 63, pages 597–610. Academic Press, 2003.
[13] Brent N. Chun, Philip Buonadonna, Alvin AuYoung, Chaki Ng, David C. Parkes,
Jeffrey Shneidman, Alex C. Snoeren, and Amin Vahdat. Mirage: A microeconomic
resource allocation system for SensorNet testbeds. In Proceedings of 2nd IEEE Workshop on Embedded Networked Sensors (EmNetsII), 2005.
[14] Anubhav Das and Daniel Grosu. Combinatorial auction-based protocols for resource allocation in grids. In Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, 2005.
[15] K. Eric Drexler and Mark S. Miller. Incentive engineering: for computational resource
management. In Bernardo A. Huberman, editor, The Ecology of Computation, pages
231–266. Elsevier Science Publishers, North-Holland, 1988.
[16] Donald Ferguson, Yechiam Yemini, and Christos Nikolaou. Microeconomic algorithms for load balancing in distributed computer systems. In Proceedings of the 8th International Conference on Distributed Computer Systems, pages 491–499, June 1988.
[17] Donald Francis Ferguson. The application of microeconomics to the design of resource
allocation and control algorithms. PhD thesis, Columbia University, 1989.
[18] Yun Fu, Jeffrey Chase, Brent Chun, Stephen Schwab, and Amin Vahdat. SHARP: an
architecture for secure resource peering. In SOSP ’03: Proceedings of the nineteenth
ACM symposium on Operating systems principles, pages 133–148, New York, NY,
USA, 2003. ACM Press.
[19] Grid computing info centre. http://www.gridcomputing.com/.
[20] Carl Hewitt. The challenge of open systems. Byte, 10:223–242, April 1985.
[21] Carl Hewitt. Dynamics of computational ecosystems. Physical Review A, 40:404–421,
1989.
[22] Bernardo A. Huberman and Tad Hogg. The behavior of computational ecologies. The
Ecology of Computation, pages 77–115, 1988.
[23] P. Kacsuk, N. Podhorszki, and T. Kiss. Scalable desktop grid system. Technical
Report TR-0006, Institute on System Architecture, May 2005.
[24] M. Kačer and P. Tvrdı́k. Clondike: Linux cluster of non-dedicated workstations. In
Proceedings of the Fifth IEEE International Symposium on Cluster Computing and
the Grid, May 2005.
[25] Kevin Lai, Bernardo A. Huberman, and Leslie Fine. Tycoon: A distributed market-based resource allocation system. Technical Report arXiv:cs.DC/0404013, HP Labs,
Palo Alto, CA, USA, April 2004.
[26] Kevin Lai, Lars Rasmusson, Eytan Adar, Stephen Sorkin, Li Zhang, and Bernardo A. Huberman. Tycoon: an implementation of a distributed market-based resource allocation system. Technical Report arXiv:cs.DC/0412038, HP Labs, Palo Alto, CA, USA, December 2004.
[27] Stefan M. Larson, Christopher D. Snow, Michael Shirts, and Vijay S. Pande. Folding@home and genome@home: Using distributed computing to tackle previously intractable problems in computational biology. Computational Genomics, 2002.
[28] M.J. Litzkow, M. Livny, and M.W. Mutka. Condor - a hunter of idle workstations.
In Proceedings of the 8th International Conference of Distributed Computing Systems,
pages 104–111, June 1988.
[29] T. W. Malone, R. E. Fikes, K. R. Grant, and M. T. Howard. Enterprise: A market-like
task scheduler for distributed computing environments. The Ecology of Computation,
pages 177–205, 1988.
[30] Silvio Micali and Ronald L. Rivest. Micropayments revisited. In CT-RSA ’02: Proceedings of the The Cryptographer’s Track at the RSA Conference on Topics in Cryptology,
pages 149–163, London, UK, 2002. Springer-Verlag.
[31] Mark S. Miller and K. Eric Drexler. Markets and computation: Agoric open systems. The Ecology of Computation, pages 133–176, 1988.
[32] openMosix project. http://openmosix.sourceforge.net/.
[33] Openssi clusters for Linux. http://www.openssi.org/.
[34] PlanetLab. http://www.planet-lab.org.
[35] O. Regev and N. Nisan. The popcorn market – an online market for computational
resources. In Proceedings of the 1st International Conference on Information and
Computation Economies, October 1998.
[36] Kate Reynolds. Going, going, gone! A survey of auction types, 1996. http://www.agorics.com/library/auctions.html/.
[37] Luis F. G. Sarmenta. Bayanihan: Web-based volunteer computing using java. In
Lecture Notes in Computer Science 1368, pages 444–461. Springer-Verlag, March 1998.
[38] Seti@home. http://setiathome.ssl.berkeley.edu/.
[39] Reid G. Smith. The contract net protocol: High-level communication and control in
a distributed problem solver. 1980.
[40] Zhu Tan. Market-Based Grid Resource Allocation Using a Stable Continuous Double
Auction. PhD thesis, University of Manchester, 2007.
[41] Todd Tannenbaum, Derek Wright, Karen Miller, and Miron Livny. Condor—a distributed job scheduler. In Thomas Sterling, editor, Beowulf Cluster Computing with
Linux. MIT Press, October 2001.
[42] M. Taufer, D. Anderson, P. Cicotti, and C. L. Brooks III. Homogeneous redundancy:
a technique to ensure integrity of molecular simulation results using public computing.
International Parallel and Distributed Processing Symposium, 02:119a, 2005.
[43] Douglas Thain, Todd Tannenbaum, and Miron Livny. Distributed computing in practice: The condor experience. In Concurrency and Computation: Practice and Experience, number 2-4, pages 323–356, February-April 2005.
[44] Jan Čapek. Preemptive process migration in a cluster of non-dedicated workstations.
Master’s thesis, Czech Technical University, Prague, 2005.
[45] Carl A. Waldspurger, Tad Hogg, Bernardo A. Huberman, Jeffrey O. Kephart, and
W. Scott Stornetta. Spawn: A distributed computational economy. IEEE Transactions on Software Engineering, 18(2):103–117, February 1992.
[46] Rich Wolski, James S. Plank, Todd Bryan, and John Brevik. G-commerce: Market formulations controlling resource allocation on the computational grid. In Proceedings of the 15th IEEE International Parallel and Distributed Processing Symposium, 2001.
[47] Jia Yuan Yu and Shie Mannor. Efficiency of Market-Based Resource Allocation Among Many Participants. 2007.
Publications of the Author
[A.1] M. Košťál, P. Tvrdı́k. Evaluation of heterogeneous nodes in a nondedicated cluster.
Proceedings of the 18th IASTED International Conference of Parallel and Distributed
Computing and Systems, pages 335-340, Dallas, Texas, USA, 2006.
[A.2] M. Košťál, P. Tvrdı́k. System for trading processors quanta on a digital market.
Proceedings of the ISCA 20th International Conference on Parallel and Distributed
Computing Systems, pages 133-138, ISCA, Las Vegas, Nevada, USA, 2007.
[A.3] M. Košťál, P. Tvrdı́k. Market-based Batch System. Proceedings of the 3rd International Conference on Digital Society, pages 202–207, Cancun, Mexico, 2009.
[A.4] M. Košťál, P. Tvrdı́k. Digital market for non-dedicated cluster. Proceedings of the 2nd International Workshop on Grid Computing for Complex Problems, pages 96–103, Bratislava, Slovak Republic, 2006.
[A.5] M. Košťál. Digital Market for Non-dedicated Clusters. Ph.D. Minimum Thesis,
Faculty of Information Technology, Prague, Czech Republic, 2006.