CLUSTER COMPUTING

ABSTRACT
Very often applications need more computing power than a sequential computer
can provide. One way of overcoming this limitation is to improve the operating
speed of processors and other components so that they can offer the power required
by computationally intensive applications. Even though this is currently possible to a
certain extent, future improvements are constrained by the speed of light,
thermodynamic laws, and the high financial cost of processor fabrication. A
viable and cost-effective alternative is to connect multiple processors
together and coordinate their computational efforts. The resulting systems are
popularly known as parallel computers, and they allow a computational task to be
shared among multiple processors.
INTRODUCTION
The needs and expectations of modern-day applications are changing in the sense
that they not only need computing resources (be they processing power, memory, or
disk space), but also the ability to remain available to service user requests almost
constantly, 24 hours a day and 365 days a year. These needs and expectations of
today's applications result in challenging research and development efforts in both
computer hardware and software.
It seems that as applications evolve they inevitably consume more and more
computing resources. To some extent we can overcome these limitations. For
example, we can create faster processors and install larger memories. But future
improvements are constrained by a number of factors, including physical ones,
such as the speed of light and the constraints imposed by various thermodynamic
laws, as well as financial ones, such as the huge investment needed to fabricate
new processors and integrated circuits. The obvious solution to overcoming these
problems is to connect multiple processors and systems together and coordinate
their efforts. The resulting systems are popularly known as parallel computers, and
they allow the sharing of a computational task among multiple processors.
Parallel supercomputers have been in the mainstream of high-performance
computing for the last ten years. However, their popularity is waning. The reasons
for this decline are many, but include being expensive to purchase and run,
potentially difficult to program, slow to evolve in the face of emerging hardware
technologies, and generally difficult to upgrade without replacing the whole system.
The decline of the dedicated parallel supercomputer has been compounded by the
emergence of commodity off-the-shelf clusters of PCs and workstations. The idea of
the cluster is not new, but certain recent technical capabilities,
particularly in the area of networking, have brought this class of machine to the
vanguard as a platform to run all types of parallel and distributed applications.
The emergence of cluster platforms was driven by a number of academic projects,
such as Beowulf [1], Berkeley NOW [2], and HPVM [3]. These projects helped to
prove the advantages of clusters over traditional platforms. These advantages
included low entry costs for access to supercomputing-level performance, the ability
to track technologies, an incrementally upgradeable system, an open-source
development platform, and freedom from lock-in to particular vendor products.
Today, the overwhelming price/performance advantage of this type of platform
over other proprietary ones, as well as the other key benefits mentioned earlier,
means that clusters have infiltrated not only the traditional science and engineering
marketplaces for research and development, but also the huge commercial
marketplaces of commerce and industry. It should be noted that this class of
machine is not only being used for high-performance computation, but
increasingly as a platform to provide highly available services for applications
such as Web and database servers.
A cluster is a type of parallel or distributed computer system, which consists of a
collection of inter-connected stand-alone computers working together as a single
integrated computing resource.
HISTORY OF CLUSTER COMPUTING
The history of cluster computing is best captured by a footnote in Greg Pfister's In
Search of Clusters: “Virtually every press release from DEC mentioning clusters
says ‘DEC, who invented clusters...’. IBM did not invent them either. Customers
invented clusters, as soon as they could not fit all their work on one computer, or
needed a backup. The date of the first is unknown, but it would be surprising if it
was not in the 1960s, or even late 1950s.”
The formal engineering basis of cluster computing as a means of doing parallel
work of any sort was arguably invented by Gene Amdahl of IBM, who in 1967
published what has come to be regarded as the seminal paper on parallel
processing: Amdahl's Law. Amdahl's Law describes mathematically the speedup
one can expect from parallelizing any given otherwise serially performed task on a
parallel architecture. This article defined the engineering basis for both
multiprocessor computing and cluster computing, where the primary differentiator
is whether the interprocessor communications are supported "inside" the
computer (for example, on a customized internal communications bus or network)
or "outside" the computer on a commodity network.
Consequently, the history of early computer clusters is more or less directly tied
to the history of early networks, as one of the primary motivations for the
development of a network was to link computing resources, creating a de facto
computer cluster. Packet-switched networks were conceptually invented by the
RAND Corporation in 1962. Using the concept of a packet-switched network, the
ARPANET project succeeded in creating in 1969 what was arguably the world's
first commodity-network based computer cluster by linking four different
computer centers (each of which was something of a "cluster" in its own right, but
probably not a commodity cluster). The ARPANET project grew into the
Internet—which can be thought of as "the mother of all computer clusters" (as the
union of nearly all of the compute resources, including clusters, that happen to be
connected). It also established the paradigm in use by all computer clusters in the
world today—the use of packet-switched networks to perform interprocessor
communications between processor (sets) located in otherwise disconnected
frames.
The development of customer-built and research clusters proceeded hand in hand
with that of both networks and the Unix operating system from the early 1970s, as
both TCP/IP and the Xerox PARC project created and formalized protocols for
network-based communications. The Hydra operating system was built for a
cluster of DEC PDP-11 minicomputers called C.mmp at Carnegie Mellon
University in 1971. However, it was not until circa 1983 that the protocols and
tools for easily doing remote job distribution and file sharing were defined (largely
within the context of BSD Unix, as implemented by Sun Microsystems) and hence
became generally available commercially, along with a shared filesystem.
ARCHITECTURE OF A CLUSTER
The typical architecture of a cluster computer is shown in Figure 1. The
key components of a cluster include multiple standalone computers (PCs,
workstations, or SMPs), an operating system, a high-performance interconnect,
communication software, middleware, and applications.
Figure 1. A Cluster Architecture.
DESIGNING A CLUSTER COMPUTER
- Choosing a processor
The first step in designing a cluster is to choose the building block. The processing
power, memory, and disk space of each node as well as the communication
bandwidth between the nodes are all factors that can be chosen. You will need to
decide which are important based on the mixture of applications you intend to run
on the cluster, and the amount of money you have to spend.
Best performance for the price ==> PC (currently dual-Xeon systems)
If maximizing memory and/or disk is important, choose faster workstations
For maximum bandwidth, more expensive workstations may be needed
PCs running Linux are by far the most common choice. They currently provide the
best performance for the price, offering good CPU speed with cheap memory and
disk space. However, they have smaller L2 caches than some more expensive
workstations, which can limit SMP performance, and less main memory bandwidth,
which can limit performance for applications that do not reuse data cache well. The
availability of 64-bit PCI-X slots and memory of up to 16 GB removes several
bottlenecks, but new 64-bit architectures will still perform better for large-memory
applications.
For applications that require more networking than Gigabit Ethernet can provide,
more expensive workstations may be the way to go. You will have fewer but faster
nodes, requiring less overall communication, and the memory subsystem can
support communication rates in the range of 200-800 MB/sec.
When in doubt, it is always a good idea to benchmark your code on the machines
that you are considering. If that is not possible, there are many generic benchmarks
that you can look at to help you decide. The HINT benchmark developed at the
SCL, or a similar benchmark based on the DAXPY kernel shown below, shows the
performance of each processor for a range of problem sizes.
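For reference, the DAXPY kernel is simply the loop y[i] = a*x[i] + y[i]. A minimal C
harness along the following lines (the problem-size sweep and timing code here are
illustrative, not the SCL's actual benchmark) sweeps the vector length from
cache-resident to memory-resident sizes and reports a rate for each:

/* DAXPY: y[i] = a * x[i] + y[i], a memory-bandwidth-bound kernel often used
   to compare per-processor performance across problem sizes. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void daxpy(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    /* Sweep problem sizes from cache-resident to memory-resident. */
    for (size_t n = 1024; n <= (size_t)1 << 24; n *= 4) {
        double *x = malloc(n * sizeof *x);
        double *y = malloc(n * sizeof *y);
        if (!x || !y)
            return 1;
        for (size_t i = 0; i < n; i++) { x[i] = 1.0; y[i] = 2.0; }

        clock_t t0 = clock();
        daxpy(n, 3.0, x, y);
        clock_t t1 = clock();

        double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
        /* Two floating-point operations (multiply + add) per element. */
        printf("n = %8zu   %.1f MFLOP/s\n",
               n, secs > 0 ? 2.0 * n / secs / 1e6 : 0.0);
        free(x);
        free(y);
    }
    return 0;
}

Plotting the reported rate against the problem size gives the kind of graph discussed
next: small sizes run from cache, large sizes are limited by main memory.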
If your application uses little memory, or heavily reuses data cache, it will operate
mainly on the left (small-problem) side of such a graph. Here the clock rate is
important, and the compiler choice can make a big difference. If your application is
large and does not reuse data much, the right (large-problem) side will be more
representative and memory speed will be the dominant factor.
- Designing the network
Along with the basic building block, you will need to choose the fabric that
connects the nodes. As explained above, this depends greatly on the applications
you intend to run, the processors you choose, and how much money you have to
spend.
Gigabit Ethernet is clearly the cheapest option. If your application can function with
a lower level of communication performance, it is cheap and reliable, but it scales
only to around 14 nodes using a flat switch (a completely connected cluster, with no
topology).
- Which OS?
The choice of an OS is largely dictated by the machine that you choose. Linux is
always an option on any machine, and is the most common choice. Many of the
cluster computing tools were developed under Linux. Linux, and many compilers
that run on it, are also available free.
With all that being said, there are PC clusters running Windows NT, IBM clusters
running AIX, and we have even built a G4 cluster running Linux.
- Loading up the software
I would recommend choosing one MPI implementation and going with that. PVM
is still around, but MPI is the way to go (IMHO). LAM/MPI is distributed as an RPM,
so it is the easiest to install. It also performs reasonably well on clusters.
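Whichever implementation you install, a minimal MPI program is a good first test
that the compilers, libraries, and network are working together. The sketch below is
generic MPI code, not specific to LAM/MPI; it is typically compiled with mpicc and
launched with mpirun, though the exact launch command depends on the
implementation:

/* Minimal MPI check: each process reports its rank and the host it runs on. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                  /* start the MPI runtime      */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank        */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes  */
    MPI_Get_processor_name(name, &len);      /* host this process runs on  */

    printf("Process %d of %d running on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}

If each node in the cluster reports its own hostname, the MPI installation and the
interconnect are at least nominally working.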
There are many free compilers available, and the availability will of course depend
on the OS you choose. For PCs running Linux, the GNU compilers are acceptable.
The Intel compilers provide better performance in most cases for the Intel
processors, and pricing is reasonable. The Intel or PGI compilers may help on the
AMD processors. However, the cluster licenses for the PGI compilers are
prohibitively expensive at this point. For Linux on the Alpha processors, Compaq
freely distributes the same compilers that are available under Tru64 Unix.
There are also many parallel libraries available, such as ScaLAPACK. For Linux
PCs, you may also want to install a tuned BLAS library such as the Intel MKL or the
one developed at Sandia.
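As a small illustration of what these libraries provide, the following sketch calls the
CBLAS interface to DGEMM to multiply two small matrices. The header name and
link flags differ between MKL, ATLAS, and other BLAS builds; a generic cblas.h is
assumed here:

/* C = alpha * A * B + beta * C via the CBLAS interface to DGEMM. */
#include <cblas.h>
#include <stdio.h>

int main(void)
{
    /* A is 2x3, B is 3x2, C is 2x2, all stored row-major. */
    double A[6] = { 1,  2,  3,
                    4,  5,  6 };
    double B[6] = { 7,  8,
                    9, 10,
                   11, 12 };
    double C[4] = { 0,  0,
                    0,  0 };

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 3,        /* M, N, K       */
                1.0, A, 3,      /* alpha, A, lda */
                B, 2,           /* B, ldb        */
                0.0, C, 2);     /* beta, C, ldc  */

    printf("C = [ %g %g ; %g %g ]\n", C[0], C[1], C[2], C[3]);
    return 0;
}

The expected result is the 2x2 matrix [58 64; 139 154]; a vendor-tuned BLAS gives the
same answer as the reference BLAS but at a much higher rate for large matrices.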
If you have many users on a cluster, it may be worthwhile to install a queueing
system. PBS (the Portable Batch System) is currently the most advanced and is under
heavy development. DQS can also handle multiprocessor jobs, but is not quite as
efficient.
You will also want users to have a quick view of the status of the cluster as a
whole. There are several status monitors freely available, such as statmon
developed locally. None are up to where I'd like them to be yet, although
commercial versions give a more active and interactive view.
- Assembling the cluster
A freestanding rack costs around $100, and can hold 16 PCs. If you want to get
fancier and reduce the footprint of your system, most machines can be ordered
with rackmount attachments.
You will also need a way to connect a keyboard and monitor to each machine for
when things go wrong. You can do this manually, or spend a little money on a
KVM (keyboard, video, mouse) switch that makes it easy to access any computer.
- Pre-built clusters
If you have no desire to assemble a system yourself, there are many vendors who
sell complete clusters built to your design. These are 1U or 2U rackmounted nodes
preconfigured to your specifications. They are compact, easy to set up and maintain,
and usually come with good custom tools such as web-based status monitors. The
price really isn't much more than building your own systems now.
- Cluster administration
With large clusters, it is common to have a dedicated master node that is the only
machine connected to the outside world. This machine then acts as the file server
and the compile node. This provides a single-system image to the user, who launches
jobs from the master node without ever logging into the compute nodes.
There are boot disks available that can help in setting up the individual nodes of a
cluster. Once the master is configured, these boot disks can be configured to
perform a complete system installation for each node over the network. Most
cluster administrators also develop other utilities, like scripts that operate on every
node in the cluster. The rdist utility can also be very helpful.
If you purchase a cluster from a vendor, it should come with software installed to
make it easy to use and maintain the system. If you build your own system, there
are some software packages available to do the same. OSCAR is a fully integrated
software bundle designed to make it easy to build a cluster. Scyld Beowulf is a
commercial package that enhances the Linux kernel, providing system tools that
produce a cluster with a single system image.
If set up properly, a cluster can be relatively easy to maintain. The operations that
you would normally do on a single machine simply need to be replicated across
many machines. If you have a very large cluster, you should keep a few spare
machines to make it easy to recover from hardware problems.
HOW CLUSTER COMPUTING WORKS
The software architecture consists of a user interface layer, a scheduling layer, and
an execution layer.
The interface and scheduling layers reside on the head node. The execution layer
resides primarily on the compute nodes. The execution layer described here
includes the Microsoft implementation of MPI, called MS MPI, which was
developed for Windows and is included in the Microsoft® Compute Cluster Pack.
This implementation is based on the Argonne National Laboratory's MPICH2
implementation of the MPI-2 standard.
The user interface layer consists of the Compute Cluster Job Manager, the
Compute Cluster Administrator, and the Command Line Interface (CLI).
The Compute Cluster Job Manager is a Win32 graphical user interface to the Job
Scheduler that is used for job creation and submission.
The Compute Cluster Administrator is a Microsoft Management Console (MMC)
snap-in that is used for configuration and management of the cluster.
The Command Line Interface is a standard Windows command prompt that
provides a command-line alternative to the Job Manager and the Administrator.
The scheduling layer consists of the Job Scheduler, which is responsible for
queuing the jobs and tasks, reserving resources, and dispatching jobs to the
compute nodes.
In this example, the execution layer consists of the following components
replicated on each compute node: the Node Manager Service, the MS MPI
launcher mpiexec, and the MS MPI Service.
The Node Manager is a service that runs on all compute nodes in the cluster. The
Node Manager executes jobs on the node, sets task environment variables, and
sends a heartbeat (health check) signal to the Job Scheduler at specified intervals
(the default interval is one minute).
Mpiexec is the MPICH2-compatible, multithreaded executable within which all
MPI tasks are run.
The MS MPI Service is responsible for starting the job tasks on the various
processors.
CONCLUSION
Network clusters offer a high-performance computing alternative to SMP and
massively parallel computing systems. Aggregate system performance aside,
cluster architectures can also lead to more reliable computer systems through
redundancy. Choosing a hardware architecture is just the first step in building a
useful cluster: applications, performance optimization, and system management
issues must also be addressed.
REFERENCES
Bader, David; Pennington, Robert (June 1996). "Cluster Computing: Applications".
Georgia Tech College of Computing. Retrieved 2007-07-13.
TOP500 List - June 2006 (1-100). TOP500 Supercomputing Sites.
Farah, Joseph (2000-12-19). "Why Iraq's buying up Sony PlayStation 2s". World Net
Daily.
Pfister, Gregory (1998). In Search of Clusters (2nd ed.). Upper Saddle River, NJ:
Prentice Hall PTR. p. 36. ISBN 0-13-899709-8.
http://www.beowulf.org/overview/history.html
gridMathematica Cluster Integration.
Chari, Srini (2009). "Mastering the Odyssey of Scale from Nano to Peta: The Smart
Use of High Performance Computing (HPC) Inside IBM". Denbury, CT: IBM. p. 5.
Lucke, Robert W. (2005). Building Clustered Linux Systems. Prentice Hall.
ISBN 0-13-144853-6.
Marcus, Evan; Stern, Hal. Blueprints for High Availability: Designing Resilient
Distributed Systems. John Wiley & Sons. ISBN 0-471-35601-8.
Pfister, Greg. In Search of Clusters. Prentice Hall. ISBN 0-13-899709-8.
Buyya, Rajkumar (ed.) (1999). High Performance Cluster Computing: Architectures
and Systems, Volume 1. Prentice Hall, NJ, USA. ISBN 0-13-013784-7.
Buyya, Rajkumar (ed.) (1999). High Performance Cluster Computing: Programming
and Applications, Volume 2. Prentice Hall, NJ, USA. ISBN 0-13-013785-5.