
Building Your Own Super Computer
In this article, although a bit off topic, I will discuss how to build a generic Linux or Windows
supercomputer using the clustered computing concept. You will find out just how easy it is to
obtain supercomputer-class computational power by building Linux and Windows clusters. It is
beyond the scope of this article to discuss how to solve any computationally intensive
algorithmic problem or how to code such algorithms for a cluster architecture.
Definitions and Benefits of Clustering
Greg Pfister, in his wonderful book In Search of Clusters, defines a cluster as "a type of
parallel or distributed system that: consists of a collection of interconnected whole
computers, and is used as a single, unified computing resource".
Therefore, a cluster is a group of computers bound together into a common resource pool. A
given task can be executed on all computers or on any specific computer in the cluster. Let's
look at the benefits of clustering:

- Scientific applications: Enterprises running scientific applications on supercomputers
can benefit from migrating to a more cost-effective Linux cluster.
- Large ISPs and e-commerce enterprises with large databases: Internet service
providers or e-commerce web sites that require high availability, load balancing and
scalability.
- Graphics rendering and animation: Linux clusters have become important in the film
industry for rendering high-quality graphics. In the movie Titanic, a Linux cluster was
used to render the backgrounds in the ocean scenes. The same approach was used in
the movies True Lies and Interview with the Vampire.
We can also characterize clusters by their function:

- Distributed processing clusters: Tasks (small pieces of executable code) are broken
down and worked on by many small systems rather than one large system, often
deployed for work previously handled by supercomputers. This type of cluster is
well suited to scientific or financial analysis.
- Fail-over clusters: Clusters used to increase the availability and serviceability of
network services. When an application or server fails, its services are migrated to
another system, and the identity of the failed system is migrated as well. Fail-over
servers are typically used for database, mail or file servers.
- High availability load balancing clusters: A given application can run on all
computers, and a given computer can host multiple applications. The "outside world"
interacts with the cluster while the individual computers are "hidden". This type
supports large cluster pools, and applications do not need to be specialized. High
availability clustering works best with stateless applications and those that can run
concurrently.
Building Windows Clusters
Hardware
Before starting, you should have the following hardware and software:

- At least two computers running Windows NT (SP6), Windows 2000 or Windows XP,
networked with some sort of LAN equipment (hub, switch, etc.).
- Ensure during the Windows setup phase that TCP/IP and NetBEUI are installed,
that the network is started, and that all network cards are detected with the correct
drivers installed.
We will call these two computers a Windows cluster. You now need some sort of
software that will help you develop, deploy and execute applications over this cluster.
This software is the core of what makes a Windows cluster possible.
Software
The Message Passing Interface (MPI) is an evolving de facto standard for supporting
clustered computing based on message passing, and there are several implementations
of this standard.
In this article, we will use mpich2, which is freely available for Windows clustering;
download the distribution and its related documentation from the MPICH website.
Please read the accompanying PDF before starting the following steps.
Step 1: Download and unzip mpich2 into any folder, and share this folder with write
permission.
Step 2: Copy all files with the .dll extension from C:\MPICH2\lib to the
C:\Windows\system32 folder.
Step 3: Install the Cluster Manager Service on each host you want to use for remote
execution of MPI processes. To install it, start rcluma-install.bat (located in the
C:\MPICH2\bin directory) by double-clicking it from the local or network drive. You must
have administrator rights on the hosts to install this service.
Step 4: Follow steps 1 and 2 for each node in the cluster (we will refer to each computer in
the cluster as a node).
Step 5: Now start RexecShell (from the folder C:\MPICH2\bin) by double-clicking it.
Open the configuration dialog by pressing F2. The distribution contains a precompiled
example MPI program named cpi.exe (located in MPICH2/bin). Choose it as the actual
program. Make sure that each host can reach cpi.exe at the specified path.
Choose ch_wsock as the active plug-in. Select the hosts to compute on. On the
'Account' tab, enter your username, domain and password, which must be valid on each
chosen host. Press OK to confirm your selections. The Start button (in the RexecShell
window) is now enabled and can be pressed to start cpi.exe on all chosen hosts. The
output will be displayed in separate windows.
Congratulations -- your supercomputer (Windows cluster) is ready to run MPI programs!
Building a Linux Cluster
Linux clusters are generally more common, robust, efficient and cost-effective than
Windows clusters. We will now look at the steps involved in building a Linux cluster.
Step 1
Install a Linux distribution (I am using Red Hat 7.1 and working with two Linux boxes) on
each computer in your cluster. During the installation process, assign a hostname and, of
course, a unique IP address to each node in your cluster.
Usually, one node is designated as the master node (where you'll control the cluster,
write and run programs, etc.), with all the other nodes used as computational slaves. We
name one of our nodes Master and the other Slave.
Our cluster is private, so theoretically we could assign any valid IP address to our nodes
as long as each has a unique value. I have used IP address 192.168.0.190 for the master
node and 192.168.0.191 for the slave node.
If you already have Linux installed on each node in your cluster, then you don't have to
change your IP addresses or hostnames unless you want to. Changes (if needed) can be
made using your network configuration program (Linuxconf in Red Hat).
Finally, create identical user accounts on each node. In our case, we create the user
DevArticle on each node in our cluster. You can either create the identical user accounts
during installation, or you can use the adduser command as root, as sketched below.
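A minimal sketch, run as root on every node (DevArticle is simply the account name used
in this article; substitute your own):

adduser DevArticle
passwd DevArticle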
Step 2
We now need to configure rsh on each node in our cluster. Create .rhosts files in the home
directories of both the common user and root. Our .rhosts files for the DevArticle users are as follows:
Master DevArticle
Slave DevArticle
Moreover, the .rhosts files for root users are as follows:
Master root
Slave root
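As a hedged sketch of putting the user's copy in place (the root copy is analogous), note
that rsh implementations generally require .rhosts to be owned by the user and not
writable by others:

cat > /home/DevArticle/.rhosts << EOF
Master DevArticle
Slave DevArticle
EOF
chown DevArticle /home/DevArticle/.rhosts
chmod 600 /home/DevArticle/.rhosts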
Next, we create a hosts file in the /etc directory. Below is our hosts file for Master (the
master node):
192.168.0.190 Master.home.net Master
127.0.0.1 localhost
192.168.0.191 Slave
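By analogy (only the master's file is given above, so this mirror image is an assumption),
the /etc/hosts file on Slave would look like:

192.168.0.191 Slave.home.net Slave
127.0.0.1 localhost
192.168.0.190 Master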
Do not remove the 127.0.0.1 localhost line.
Step 3
The hosts.allow file on each node was modified by adding ALL+ as the only line in the
file. This gives anyone on any node permission to connect to any other node in our
private cluster. To allow root users to use rsh, I had to add the following lines to the
/etc/securetty file:
rsh
rlogin
rexec
pts/0
pts/1
Also, I modified the /etc/pam.d/rsh file:
#%PAM-1.0
# For root login to succeed here with pam_securetty, "rsh" must be
# listed in /etc/securetty.
auth     sufficient   /lib/security/pam_nologin.so
auth     optional     /lib/security/pam_securetty.so
auth     sufficient   /lib/security/pam_env.so
auth     sufficient   /lib/security/pam_rhosts_auth.so
account  sufficient   /lib/security/pam_stack.so service=system-auth
session  sufficient   /lib/security/pam_stack.so service=system-auth
Step 4
Rsh, rlogin, Telnet and rexec are disabled in Red Hat 7.1 by default. To change this, I
navigated to the /etc/xinetd.d directory and modified each of the service files (rsh,
rlogin, telnet and rexec), changing the disable = yes line to disable = no.
Once the changes were made to each file (and saved), I closed the editor and restarted
xinetd with service xinetd restart to enable rsh, rlogin, etc. A scripted version is sketched below.
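The same edits and restart as a one-shot sketch (perl is used for the in-place edit
because the sed shipped with Red Hat 7.1 predates the -i option):

cd /etc/xinetd.d
perl -pi -e 's/disable\s*=\s*yes/disable = no/' rsh rlogin telnet rexec
service xinetd restart

As a quick sanity check from the master, rsh Slave hostname should now print Slave
without prompting for a password (assuming the .rhosts and hosts.allow setup from
Steps 2 and 3).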
Step 5
Next, download the latest version of MPICH (the UNIX "all flavors" distribution) from the
MPICH website to the master node. Untar the file in either the common user's home
directory (that of the identical user you established on all nodes, DevArticle on our
cluster) or in the root directory (if you want to run the cluster as root).
Issue the following command:
tar zxfv mpich.tar.gz
Change into the newly created mpich-1.2.2.3 directory. Type ./configure, and when the
configuration is complete and you have a command prompt, type make.
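The whole build, start to finish, as a sketch (the directory name assumes the 1.2.2.3
release used in this article):

tar zxfv mpich.tar.gz
cd mpich-1.2.2.3
./configure
make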
The make may take a few minutes, depending on the speed of your master computer.
Once make has finished, add the mpich-1.2.2.3/bin and mpich-1.2.2.3/util directories to
your PATH in .bash_profile, or however you set your PATH environment variable.
The full root paths for the MPICH bin and util directories on our master node are
/root/mpich-1.2.2.3/bin and /root/mpich-1.2.2.3/util. For the DevArticle user on our
cluster, /root is replaced with /home/DevArticle in the path statements. Log out and
log back in to enable the modified PATH containing your MPICH directories.
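For example, the lines appended to /root/.bash_profile might look like this (a sketch;
adjust /root to /home/DevArticle for the user account):

PATH=$PATH:/root/mpich-1.2.2.3/bin:/root/mpich-1.2.2.3/util
export PATH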
Step 6
Next, make all of the example files and the MPE graphics files. First, navigate to the
mpich-1.2.2.3/examples/basic directory and type make to build all the basic example
files.
When this has finished, change to the mpich-1.2.2.3/mpe/contrib directory and make
some additional MPE example files, especially if you want to view graphics.
Within the mpe/contrib directory, you should see several subdirectories. The one we are
interested in is the mandel directory. Change into the mandel directory and type make
to create the pmandel executable. You are now ready to test your cluster.
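The example builds as a compact sketch (assuming MPICH was untarred in the current
user's home directory):

cd ~/mpich-1.2.2.3/examples/basic && make
cd ~/mpich-1.2.2.3/mpe/contrib/mandel && make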
Testing Your Linux Cluster
The first program we will run is cpilog. From within the mpich-1.2.2.3/examples/basic
directory, copy the cpilog executable (if this file isn't present, run make again) to
your top-level directory. On our cluster, this is either /root (if we are logged in as root) or
/home/DevArticle (if we are logged in as DevArticle; we have installed MPICH in both
places).
Next, from your top directory, rcp the cpilog file to each node in your cluster, placing the
file in the corresponding directory on each node. For example, if I am logged in as
DevArticle on the master node, I'll issue rcp cpilog Slave:/home/DevArticle to copy cpilog
to the DevArticle directory on Slave. I'll do the same for each node if there are more
than two nodes, as in the sketch below. If I want to run a program as root, then I'll copy
the cpilog file to the root directories of all nodes on the cluster.
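With several nodes, a small loop saves typing (a sketch; Slave2 and Slave3 are
hypothetical extra node names):

for node in Slave Slave2 Slave3; do
    rcp cpilog $node:/home/DevArticle/
done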
Once the files have been copied, congratulations -- your supercomputer (Linux cluster) is
ready to run MPI programs! To test it, I'll type the following from the top directory of my
master node:
mpirun -np 1 cpilog
This will run the cpilog program on the master node to check that the program works
correctly. Some MPI programs require at least two processors (-np 2), but cpilog will
work with only one. The output looks like this:
pi is approximately 3.1415926535899406,
Error is 0.0000000000001474
Process 0 is running on Master.home.net
wall clock time = 0.360909
Now try both nodes (or however many you want to use) by typing: mpirun -np 2 cpilog
and you'll see something like this:
pi is approximately 3.1415926535899406,
Error is 0.0000000000001474
Process 0 is running on Master.home.net
Process 1 is running on Slave.home.net
wall clock time = 0.0611228
The number following the -np parameter corresponds to the number of processes
(nodes) you want to use in running your program. This number may not exceed the
number of machines listed in your machines.LINUX file plus one (the master node is not
listed in the machines.LINUX file).
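The machines.LINUX file lists the hostnames of the worker nodes, one per line; in MPICH
1.2.x it lives under mpich-1.2.2.3/util/machines/ (a detail the steps above left out, so
verify the path against your install). For our two-node cluster it would contain the single
line:

Slave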
To see some graphics, we must run the pmandel program. Copy the pmandel exec file
(from the mpich-1.2.2.3/mpe/contrib/mandel directory) to your top-level directory and
then to each node (as you did for cpilog). Then, if X isn't already running, issue a startx
command. From a command console, type xhost + to allow any node to use your X
display, and then set your DISPLAY variable as follows:
DISPLAY=Master:0 (be sure to replace Master with the hostname of your master node).
Setting the DISPLAY variable directs all graphics output to your master node. Run
pmandel by typing: mpirun -np 2 pmandel
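The display setup and run as one sketch (run on the master; DISPLAY must be exported
so the MPI processes inherit it, and xhost + opens your display to all hosts, which is only
sensible on a private cluster):

xhost +
export DISPLAY=Master:0
mpirun -np 2 pmandel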
The pmandel program requires at least two processors to run correctly. You should see
the Mandelbrot set rendered on your master node.
Adding more processors (mpirun -np 10 pmandel) should increase the rendering speed
dramatically. The Mandelbrot set graphic is partitioned into small rectangles for
rendering by the individual nodes. You can actually see the nodes working as the
rectangles are filled in. If one node is a bit slow, the rectangles from that node will
be the last to fill in. It is quite fascinating to watch.