High Performance Computing: HPC tutorial
Thomas Debray ([email protected])
May 2, 2017
www.netstorm.be/tutorialHPC.pdf

Contents

1 Introduction
  1.1 Operating systems
    1.1.1 Windows
    1.1.2 Linux
    1.1.3 OS X
  1.2 The HPC cluster
    1.2.1 Architecture
    1.2.2 Directories
    1.2.3 Programming languages
    1.2.4 Fair share usage
  1.3 Notation

2 First steps
  2.1 Acquire user details
  2.2 Login on the HPC server
    2.2.1 Windows
    2.2.2 Linux
  2.3 Loading software modules
  2.4 Start an R session
  2.5 Writing a single-threaded R script
  2.6 Submitting a job
  2.7 Monitoring a job
  2.8 Processing a finished job
  2.9 Transferring files to/from the HPC cluster
    2.9.1 Windows
    2.9.2 Linux
  2.10 Logout from the HPC server

3 Advanced topics
  3.1 Automate login on the HPC server
    3.1.1 Windows
    3.1.2 Linux
  3.2 Access the HPC cluster from outside the UMC
  3.3 Installing new software
  3.4 Splitting a script into multiple threads
  3.5 Submitting extensive jobs
  3.6 Encrypting files [under construction]
    3.6.1 Linux
  3.7 Assessing HPC usage
    3.7.1 Usage by research group (PI)

4 Advanced programming with R
  4.1 Installing an older package version
  4.2 Writing a multithreaded script
  4.3 Submitting repetitive jobs
  4.4 Random seeds [under construction]
  4.5 Error recovery [under construction]
Chapter 1: Introduction

Recent developments in computer and information technologies have facilitated many new forms of scientific research, including genetic research and prediction research. Although the computational power of personal computers is generally sufficient for standard word processing and statistical analyses, it does not meet the requirements of more advanced research topics. Some causes of this limitation are listed below.

• The amount of data, i.e. the size of datasets, is growing faster than ever. In particular, records are being collected for a larger number of subjects and contain an increasing number of variables. For instance, since the successful sequencing of the human genome in 2000, genome-wide association studies are increasingly being used to identify positions within the human genome that are linked to a disease condition. Because the human genome represents a "word" of 3.2 billion letters, dimensionality reduction techniques are needed to simplify the research focus.

1.1 Operating systems

Before we introduce the concepts of High Performance Computing, it is important to be familiar with operating systems. An operating system (OS) is a collection of software that manages computer hardware resources and provides common services for computer programs. The operating system is a vital component of the system software in a computer system, and application programs usually require an operating system to function. Well-known operating systems are Windows, OS X and Linux.

1.1.1 Windows

1.1.2 Linux

Linux is an operating system that evolved from a kernel created by Linus Torvalds when he was a student at the University of Helsinki. Although Linux and Windows have very little in common, many software packages (including R) are available for both platforms. In this tutorial, we focus on the following Linux distributions:

• Linux Mint is based on Ubuntu and can be downloaded from http://www.linuxmint.com/. It features several desktop managers such as MATE, Cinnamon, KDE and Xfce. Although we use MATE in this tutorial, other choices should not affect the described procedures.
• CentOS (Community enterprise Operating System) is derived entirely from the Red Hat Enterprise Linux distribution and strives to maintain 100% binary compatibility with its upstream source, Red Hat.

Note: in contrast to Windows, the file system of Linux is case-sensitive. This implies that Linux treats uppercase and lowercase letters differently, so commands and filenames need to be verified carefully. Furthermore, Linux and Windows also adopt a different directory structure. For instance, the Windows home directory is typically located in C:\Users\username, whereas the Linux home directory usually resides in /home/username.
Notice that the slashes are forward slashes in Linux versus backslashes in Windows, and that there is no drive name (C:, D:, etc.) in Linux. At boot, all files, folders, devices and drives are mounted under the so-called root partition /.

1.1.3 OS X

Although we do not focus on OS X in this tutorial, most commands from Linux can be used in the terminal of OS X.

1.2 The HPC cluster

1.2.1 Architecture

The HPC grid consists of several computers, each fulfilling a different role. The operating system on all computers of the HPC grid is CentOS. Typically, the user communicates with hpcsubmit to submit one or more jobs to the grid. These jobs are then sent to a master server that handles job queuing, execution and monitoring. The master server forwards the job to one or more computing nodes. The user cannot directly communicate with the master server or any of the computing nodes. Finally, the submit hosts, the master and all computing nodes share a common storage server.

[Diagram: users log in on the submit hosts (HPCSUBMIT, HPCS03, HPCS04; login and submission at 2 Gb/s) or the transfer hosts (HPCT01, HPCT02; file transmission at 20 Gb/s); the master server (HPCM01) queues jobs on the computing nodes (HPCN001 to HPCN064); all nodes share the HPC storage and the bulk storage.]

In general, the HPC system comprises several computers with different roles:

• The submit hosts are intended to prepare and submit the different jobs that need to be executed. They can be accessed through the following addresses:
  hpcsubmit.op.umcutrecht.nl
  hpcs03.op.umcutrecht.nl
  hpcs04.op.umcutrecht.nl
• The transfer hosts are intended to facilitate frequent (and large) data transfers, and each have a 20 Gb/s network interface. These servers can be accessed through the following addresses:
  hpct01.op.umcutrecht.nl
  hpct02.op.umcutrecht.nl

Currently, there are 64 computing servers, each with 2 CPUs of 6 cores (resulting in 12 cores or so-called slots per server, and a total of 1544 available slots). Each server has a total memory of 128 GB; the available memory for individual slots is limited to 15 GB. The Julius Center owns 2 of these computing servers, but it is possible to use additional servers when they are available.

Figure 1.1: The HPC infrastructure; image taken from bioinformatics.holstegelab.nl.

1.2.2 Directories

The following directories are mounted on the HPC infrastructure:

• /home/group/username : The user home directory has a quota of 6 GB per user. This directory should only be used for small, personal files; not for input/output (e.g. log files, input files, output files) of the computing nodes.
• /hpc/local/CentOS7/group : Group-specific directory for installing binaries, libraries and manuals that are shared with other group members. Please do not remove or overwrite files without consulting the group coordinator. There is a quota of 1 TB over all groups. Members of the Julius Center can become a member of the following groups: julius_te (theoretical epidemiology), julius_bs (biostatistics) and julius_id (infectious diseases). Please read section 2.1 for more information.
• /hpc/shared/group : Group-specific directory for sharing large files such as datasets. There is a quota of 5 TB over all groups.
• /hpc/group : The bulk storage server is intended for archiving and back-up, and may for instance be used to store large datasets and results of analyses. Disk space can be rented for 320 euro per TB per year.
• /tmp : Temporary space available on each computing node (shared between all users, maximum size of 150 GB).

Input and output files for the computing nodes (e.g. result files) should be stored on the performant storage of /hpc/shared/group or /hpc/group.

1.2.3 Programming languages

By default, the HPC server and nodes are able to compile/interpret the following programming languages:

• Java
• Python
• R
• MPI

If necessary, users can also install their own software packages using module environments. More information on this is provided in the HPC wiki (https://hpcwiki.op.umcutrecht.nl).

1.2.4 Fair share usage

Jobs are scheduled according to a fair share usage scheme. Each group participating in the HPC project is given a number of share tickets depending on the financial investments made. Scheduling of jobs depends on the shares of a group and the accumulated past usage of that group. Usage is adjusted by a decay factor with a half-life of one week, such that "old" usage has less impact. The potential resource share of a group is constantly adjusted: jobs associated with groups that consumed fewer resources in the past are preferred in the scheduling. At the same time, full resource usage is guaranteed, because unused shares are available for pending jobs associated with other groups. In other words, if a group does not submit jobs during a given period, the resources are shared among those who do submit jobs.

1.3 Notation

In this tutorial, we provide several Linux-based scripts that need to be executed either on the local computer or on the HPC submission host. To distinguish between both types, we use the notation loc for commands that need to be executed locally (i.e. on the personal computer), and the notation hpc for commands that need to be executed on the HPC submit host. As an example, consider that we want to display the current working directory, which can be achieved using the command pwd. If the command should be executed on the local computer, we use

[loc ~]$ pwd

Conversely, if the command should be executed on the HPC server, we use

[hpc ~]$ pwd

The symbol ~ is used to indicate that the command should be executed in the user's home directory.

Chapter 2: First steps

2.1 Acquire user details

There are three High Performance Computing (HPC) groups in the Julius Center. You need to become a member of one of these groups in order to obtain a user account.

• Biostatistics (julius_bs), administered by René Eijkemans
• Theoretical Epidemiology (julius_te), administered by Thomas Debray
• Infectious Diseases (julius_id), administered by Mirjam Kretzschmar

Throughout this tutorial, we use a dummy username (username) and password (password123). Evidently, you have to replace these entries with your personal credentials.

Note: you can request a personal user account by filling in the form on http://www.netstorm.be/HPC-form.pdf and e-mailing it to the relevant group administrator.

Once you have received your HPC username and password, you should visit the HPC wiki at https://hpcwiki.op.umcutrecht.nl. The wiki contains useful information on the HPC infrastructure and provides basic help for first-time users. You can create a new user or browse the help contents.
You might also want to visit the website of the Utrecht Bioinformatics Center (UBC) at https://ubc.uu.nl/. This website contains additional information about the HPC infrastructure, and provides information about upcoming courses, workshops and seminars.

2.2 Login on the HPC server

It is possible to log in on the HPC server by means of Secure Shell (SSH) from the UMC's open and closed networks. These networks comprise the Julius Center wired network (IP 10.121.*.*) and the UMCU-PORTAL wireless network (IP 10.132.*.*). The HPC cluster can also be reached from outside the UMC open network by first connecting to an SSH gateway (see the HPC wiki and section 3.2).

2.2.1 Windows

Although Windows does not natively support SSH, several free clients are available. In this tutorial we will use PuTTY, which can be downloaded from http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe. A major advantage of PuTTY is that it does not require administrative rights to be installed. This implies that the software can run directly from your personal folders or a USB stick.

Save the file putty.exe and run it by double-clicking. If Windows shows a security warning, choose Uitvoeren (Run) to start PuTTY and open the main window. In the Session category, specify hpcsubmit.op.umcutrecht.nl as host name and choose Open at the bottom of the screen.

If this is the first time you connect, you will get a PuTTY Security Alert indicating that the server's host key is not cached in the registry. Choose Yes to add the server's rsa2 key fingerprint to PuTTY's cache and carry on connecting. Finally, a terminal will open where you will be prompted to provide your username and password. If you see the following command line

[username@hpcs ~]$

you are successfully logged in on the HPC server!

2.2.2 Linux

This section describes the HPC login procedure for Linux users. Skip this section if you are using Windows on your personal computer. Open the terminal and type the following command to log in on the HPC server:

[loc ~]$ ssh -l username hpcsubmit.op.umcutrecht.nl

Alternatively, you may use

[loc ~]$ ssh [email protected]

or, if you would like to make use of the X-server (e.g. to run RStudio):

[loc ~]$ ssh -X [email protected]

You will now be asked to provide your password. Once logged in, you should see the HPC command prompt.

2.3 Loading software modules

The submission server and computing nodes have been configured in such a way that many software packages are not available by default. In order to use a certain software package, the corresponding module first needs to be loaded. The following modules are pre-installed on the submission server and computing nodes:

• R
• RStudio
• Python
• Java
• OpenMPI

The available modules can also be displayed using the following command:

[hpc ~]$ module avail

An advantage of using modules is that switching between different software versions becomes easier. For instance, on the current system, there are three versions of the R software installed (3.2.2, 3.2.4 and 3.3.0). We can load R 3.3.0 using the following command:

[hpc ~]$ module load R/3.3.0
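If another R version was loaded earlier in the session, it can be inspected and replaced before loading the one you need. The following is only a brief sketch using standard Environment Modules subcommands (the exact set of subcommands may differ slightly per system):

[hpc ~]$ module list                     # show the modules that are currently loaded
[hpc ~]$ module unload R/3.2.2           # remove a version that was loaded earlier
[hpc ~]$ module switch R/3.2.4 R/3.3.0   # swap one loaded version for another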
2.4 Start an R session

Given that the proper R module has been loaded, we can start the software by typing R in the terminal. We will now install the packages doMC, multicore and foreach, as we will use them later in the examples.

> install.packages('doMC')
> install.packages('foreach')
> install.packages('multicore')

By default, R will install these packages in /home/group/username/R. It is also possible to specify a local path as follows:

> install.packages("yourLibrary", lib = "/hpc/local/version/group/path")
> library("yourLibrary", lib.loc = "/hpc/local/version/group/path")

Note that all required packages should be installed in order to allow the execution of your scripts. Because upgrades of R (e.g. from 2.15 to 3.0) are not immediately pushed, it is possible that some packages have become deprecated. For instance, the latest version of nlme is no longer available for R 2.15.2, and an older version (3.1-108) needs to be installed (see section 4.1).

You can quit R by typing quit().

Warning: the HPC server is only designed for text editing and submission of cluster jobs. Do NOT run jobs on this server, as it is not meant for doing any sort of computation. Any long-running jobs found running on the server will be KILLED WITHOUT NOTICE. You will lose any data and/or computations associated with the running job.

2.5 Writing a single-threaded R script

Below, we describe a small script that uses a for loop to calculate the square root of some numbers. A typical for loop would calculate these square roots one by one, on a single core.

> ptm <- proc.time()
> q <- array(NA, dim = 3)
> for (i in 1:100000) q[i] <- sqrt(i)
> proc.time() - ptm
   user  system elapsed
 13.517   0.884  14.419

It is possible to speed up the calculations by using the apply function:

> ptm <- proc.time()
> q <- apply(as.array(1:100000), 1, sqrt)
> proc.time() - ptm
   user  system elapsed
  0.440   0.004   0.444

Specifically, by replacing the for loop with the apply function, we have reduced the overall calculation time from 14 seconds to less than 1 second!

Although we can run the previous scripts on our personal computer, this strategy is not always desirable. For instance, some calculations may require substantial amounts of system memory, or may take several days to finish. The calculation of prime numbers is a good example. In such scenarios, the HPC system provides a neat solution. Below, we create our first script on the HPC server, to calculate all prime numbers up to 100 000, using the text editor nano (other installed text editors are nedit and emacs):

[hpc ~]$ mkdir Rcode
[hpc ~]$ cd Rcode
[hpc ~]$ nano myscript.r

Now type the following code:

myscript.r
prime <- function(n) {
  n <- as.integer(n)
  if (n > 1e8) stop('n too large')
  primes <- rep(TRUE, n)
  primes[1] <- FALSE
  last.prime <- 2L
  fsqr <- floor(sqrt(n))
  while (last.prime <= fsqr) {
    primes[seq.int(2L*last.prime, n, last.prime)] <- FALSE
    sel <- which(primes[(last.prime+1):(fsqr+1)])
    last.prime <- if (any(sel)) last.prime + min(sel) else fsqr+1
  }
  which(primes)
}

ptm <- proc.time()
primes <- prime(100000)
elapsed <- proc.time() - ptm
save.image()   # save workspace

Press Ctrl-O to save, and confirm with ENTER. Finally, press Ctrl-X to quit.
Note that the calculated prime numbers are stored in the primes variable, and the elapsed processing time in the elapsed variable.

2.6 Submitting a job

To submit our R script with qsub, we need to create a (single-threaded) job. We can define our job by writing a shell script in bash:

[hpc ~]$ nano runR.sh

The shell script needs to contain the following text:

runR.sh
#!/bin/bash
module load R/3.3.0
R CMD BATCH Rcode/myscript.r

Save and close the file. We can submit our R script to the Grid Engine queuing system as follows:

[hpc ~]$ qsub runR.sh

By default, submitted jobs will run for a maximum of 10 minutes. If the job (runR.sh) is not finished in time, it will be aborted. It may therefore often be necessary to request a specific runtime. This can be achieved by setting the qsub parameter h_rt. For instance, we can allow our script to run for 1 hour, 10 minutes and 5 seconds as follows:

[hpc ~]$ qsub -l h_rt=01:10:05 runR.sh

By default, a job gets 10 GB of memory. More information about requesting additional memory can be found in section 3.5.

2.7 Monitoring a job

It is possible to track the status of all our jobs with qstat:

[hpc ~]$ qstat

An overview of the information provided by qstat:

• job-ID : the job ID, which can for instance be used to remove a job:
  [hpc ~]$ qdel 32094
• prior : the priority of the job, determining its position in the pending jobs list (ranges between 0 and 1; the higher a job's priority value, the earlier it gets dispatched).
• name : the job name (i.e. runR.sh).
• user : the user name of the job owner (i.e. your user name username).
• state : the status of the job, one of d(eletion), E(rror), h(old), r(unning), R(estarted), s(uspended), S(uspended), t(ransfering), T(hreshold) or w(aiting).
• submit/start at : the submission or start time and date of the job.
• queue : the queue the job is assigned to (for running or suspended jobs only).
• slots : the number of job slots used by the job.
• ja-task-ID : the array job task ID. Will be empty for non-array jobs (not used in this example).

More information about qstat can be obtained through man (press 'q' to exit the manual):

[hpc ~]$ man qstat

It is also possible to track the R output generated on the cluster. The output of R can be found in the file myscript.r.Rout, which can be accessed as follows:

[hpc ~]$ cat myscript.r.Rout

2.8 Processing a finished job

Once the job is finished (i.e. the job is no longer visible in qstat), go back to your home directory. We can access the final R workspace by simply opening R (our session will be recovered from .RData) and listing the available objects with ls:

> ls()
[1] "elapsed" "prime"   "primes"  "ptm"
> tail(primes)
[1] 99923 99929 99961 99971 99989 99991

where the highest prime number below 100 000 is given as 99 991. Once all the results are processed, we can close R and delete the generated output files as follows:

[hpc ~]$ rm myscript.r.Rout runR.sh.*

We can also delete the saved workspace:

[hpc ~]$ rm .RData

2.9 Transferring files to/from the HPC cluster

We can copy files from our home directory on the HPC server to our personal computer and vice versa.
For this purpose, we can use a dedicated server for file transfer: hpct01.op.umcutrecht.nl or hpct02.op.umcutrecht.nl. These transfer hosts have a bandwidth of 20 Gb/s each (compared to 2 Gb/s for the login hosts such as hpcs01). Here, we use scp, which transfers the files over an encrypted SSH connection.

2.9.1 Windows

Download WinSCP from http://winscp.net/eng/index.php.

2.9.2 Linux

Open the terminal on your personal computer and type the following command to copy the .RData workspace from section 2.8 to your home directory:

[loc ~]$ scp [email protected]:~/.RData ~/.RData

We can copy a local file to our home directory on the HPC server as follows:

[loc ~]$ scp ~/localfile.r [email protected]:~/

2.10 Logout from the HPC server

To log out from the HPC cluster:

[hpc ~]$ exit

Chapter 3: Advanced topics

3.1 Automate login on the HPC server

It is possible to log in on the HPC server without having to provide a password each time. [1]

3.1.1 Windows

Download PuTTYgen and start it by double-clicking its executable file. [2] Choose SSH-2 RSA under Type of key and specify 2048 as the Number of bits in a generated key. Then click on Generate. Your personal key pair will now be generated based on mouse movements over the blank area in the PuTTYgen screen. Once the private and public key have been generated, you can provide them with additional information and a passphrase. You will need that passphrase to log in to SSH with your new key; we will adopt a dummy passphrase passphrase123 here. Then click on Save public key and save it as id_rsa.pub in a safe location on your computer. Then click on Save private key and save it as id_rsa.ppk. Finally, copy the public key from the PuTTYgen window.

Open PuTTY to log in on the HPC server and create a directory .ssh:

[hpc ~]$ mkdir .ssh
[hpc ~]$ chmod 0700 .ssh
[hpc ~]$ nano ~/.ssh/authorized_keys

Now paste the contents of your public key id_rsa.pub and save the file. Afterwards, change the file permissions so that the file is read/writable only by yourself:

[hpc ~]$ chmod 600 ~/.ssh/authorized_keys

Finally, open the PuTTY configuration window and enter your username (username) in the field Auto-login username at Connection, Data. Afterwards, load the private key id_rsa.ppk in Connection, SSH, Auth. Then go to Session again and click on Save. Now everything is ready for our first key-based login to the SSH server: click on Open to authenticate with the public key.

[1] This information was obtained from https://help.github.com/articles/generating-ssh-keys.
[2] This information was obtained from http://www.howtoforge.com/ssh_key_based_logins_putty.

3.1.2 Linux

Step 1: Check for SSH keys

First, we need to check for existing SSH keys on our personal computer. Open up the command line and run:

[loc ~]$ cd ~/.ssh

With this command, we check whether there is a directory named .ssh in our user directory. If it says "No such file or directory", skip to step 3. Otherwise continue with step 2.
Step 2: Back up and remove existing SSH keys

Since there is already an SSH directory, you will want to back the old keys up and remove them:

[loc ~]$ mkdir key_backup
[loc ~]$ cp id_rsa* key_backup
[loc ~]$ rm id_rsa*

Step 3: Generate a new SSH key

To generate a new SSH key, enter the command below:

[loc ~]$ ssh-keygen -t rsa -C "[email protected]"

Now you need to enter a passphrase to secure your private key. Here, we will use a dummy passphrase passphrase123 (this passphrase is not secure!). Subsequently, two files will be created in the .ssh directory:

• ~/.ssh/id_rsa : identification (private) key
• ~/.ssh/id_rsa.pub : public key

Step 4: Add your public key to the HPC server

Use scp to copy id_rsa.pub (the public key) to the HPC server as the authorized_keys file; this is known as installing the public key on the server:

[loc ~]$ ssh [email protected] "mkdir .ssh; chmod 0700 .ssh"
[loc ~]$ scp ~/.ssh/id_rsa.pub [email protected]:.ssh/authorized_keys

Finally, add the generated key to ssh-agent:

[loc ~]$ ssh-agent $BASH
[loc ~]$ ssh-add
[loc ~]$ ssh-agent sh -c 'ssh-add < /dev/null && bash'

Again, enter your passphrase passphrase123. From now on you can log in to the HPC server as username from your personal computer without a password:

[loc ~]$ ssh [email protected]

Note: you can avoid a passphrase prompt at each login session by creating a key pair without a passphrase. This strategy, however, implies that anyone with access to your computer can directly access the HPC server. Furthermore, if someone gets hold of your private key, they can access the HPC server whilst taking your identity.

3.2 Access the HPC cluster from outside the UMC

The HPC cluster can be reached from outside the UMC open network by first connecting to an SSH gateway, from which you can log in to the login/submission servers (see section 1.2). You can connect to the SSH gateway as described below.

Step 1: Generate a new SSH key

Please check section 3.1.

Step 2: Share your public SSH key

Look up your public SSH key and send it together with your name, address and telephone number to [email protected]. If you are not an employee of the UMC Utrecht, please also provide a contact within the UMC Utrecht.

Step 3: Login on the SSH gateway

Once SSH public key authentication is enabled, you can log in to the SSH gateway by using your private key (here, <gateway address> stands for the address of the SSH gateway; see the HPC wiki):

[loc ~]$ ssh -i ~/.ssh/id_rsa -X username@<gateway address>

You will be prompted for the passphrase that you entered during the ssh-keygen command. If, instead of a "passphrase" prompt, you see something like "otp-md5 349 ho8323 ext, Response:", please press Ctrl-C to abort the connection. This is the prompt for the "one-time password" authentication, and will lead to an automatic ban. From the gateway machine, continue to the HPC cluster:

[loc ~]$ ssh -X [email protected]
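If you frequently work from outside the UMC, the two-step login can be collapsed into a single command by adding the gateway to your local SSH configuration. The snippet below is only a sketch, assuming OpenSSH on your local machine; the host aliases hpcgw and hpc are arbitrary names, and <gateway address> again stands for the address of the SSH gateway.

# ~/.ssh/config (sketch)
Host hpcgw
    HostName <gateway address>
    User username
    IdentityFile ~/.ssh/id_rsa

Host hpc
    HostName hpcsubmit.op.umcutrecht.nl
    User username
    # route the connection through the gateway defined above
    ProxyCommand ssh -W %h:%p hpcgw

With this configuration in place, [loc ~]$ ssh -X hpc should open a session on the submit host directly, prompting for the key passphrase along the way.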
3.3 Installing new software

It is likely that the software installed by default will not always be sufficient. Here, we illustrate how to install JAGS on the HPC system. First, download JAGS from http://sourceforge.net/projects/mcmc-jags/; this will provide you with a file named like JAGS-3.4.0.tar.gz. Upload this file to a new directory tmp in your HPC account (see section 2.9) and then log in on the HPC server. Type the following commands to extract the package:

[hpc ~]$ cd ~/tmp
[hpc tmp]$ tar -zxvf JAGS-3.4.0.tar.gz
[hpc tmp]$ cd JAGS-3.4.0

Now we have to "configure" and "make" this package. Assuming you are a member of the group julius_te, we can configure the software to be installed in the relevant group directory as follows:

[hpc JAGS-3.4.0]$ ./configure --prefix=/hpc/local/CentOS6/julius_te/JAGS-3.4.0
[hpc JAGS-3.4.0]$ make
[hpc JAGS-3.4.0]$ make install

Once the software is installed, we can remove the temporary files and update the paths:

[hpc JAGS-3.4.0]$ cd ~
[hpc ~]$ rm -r ~/tmp
[hpc ~]$ nano ~/.bash_profile

We need to add one extra line before "export PATH" to ensure the JAGS binaries are found when logging in on the HPC system. In the file below, the PATH variable is amended to include the binaries of R 3.1 and JAGS 3.4.0.

.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# binaries for R
PATH=/hpc/local/CentOS6/julius_te/R-3.1.0/bin:$PATH
# binaries for JAGS
PATH=/hpc/local/CentOS6/julius_te/JAGS-3.4.0/bin:$PATH

export PATH

Save and exit using Ctrl-O and subsequently Ctrl-X. Reload .bash_profile using

[hpc ~]$ source ~/.bash_profile

It is possible to call JAGS from within R; we then need to install the R package rjags. Open R and type:

> install.packages(pkg = "rjags",
    lib = "/hpc/local/CentOS6/julius_te/R-3.1.0/lib64/R/library",
    repos = "http://cran.us.r-project.org",
    configure.args = "--with-jags-include=/hpc/local/CentOS6/julius_te/JAGS-3.4.0/include/JAGS --with-jags-lib=/hpc/local/CentOS6/julius_te/JAGS-3.4.0/lib --enable-rpath",
    dependencies = TRUE)

This command allows us to specify the directory of all relevant binaries, as the plain install.packages command may fail when you have installed a 64-bit version of R. In that scenario, the installer of rjags will look for a directory /hpc/local/CentOS6/julius_te/JAGS-3.4.0/lib64/ which does not exist. We can correct for this by explicitly specifying --with-jags-lib=/hpc/local/CentOS6/julius_te/JAGS-3.4.0/lib.

3.4 Splitting a script into multiple threads

Because multi-threading of R scripts may not always be feasible, it is sometimes preferable to divide a script into multiple smaller parts that can be executed independently of each other. It is then possible to submit a so-called array job, where multiple single-threaded scripts are submitted simultaneously. For instance, in the example of section 4.2, where we defined a parallel environment, a maximum of 12 slots could be used (dependent on the queue). Conversely, by submitting an array job, up to 1544 slots could be used (in theory, if the cluster is empty). A sketch of such an array job is given below.
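The following is a minimal sketch of an array job; the script name mychunk.r and the range 1-100 are hypothetical, whereas the -t option and the SGE_TASK_ID variable are standard Grid Engine features.

runArray.sh
#!/bin/bash
# Each task of the array job runs this script with its own task index,
# which Grid Engine exposes in the environment variable SGE_TASK_ID.
module load R/3.3.0
Rscript mychunk.r ${SGE_TASK_ID}

The array is submitted once, here as 100 independent single-threaded tasks:

[hpc ~]$ qsub -t 1-100 runArray.sh

Inside mychunk.r, the task index can be retrieved with as.numeric(commandArgs(trailingOnly = TRUE)[1]) and used to select the part of the work (e.g. the simulation scenario or data chunk) that this particular task should handle.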
3.5 Submitting extensive jobs

Although the HPC cluster is designed to prioritize the execution of small and short jobs, it is possible to submit jobs that require prolonged access to computational power and/or more random-access memory. Below, we highlight the possible strategies.

Increasing execution time

• Alter the maximum runtime of your script using h_rt (section 2.6):
  [hpc ~]$ qsub -l h_rt=HH:MM:SS myjob.sh
• Rewrite your job as a series of smaller jobs that can be run in parallel (section 4.2).

Increasing random-access memory

• By default, a job gets 10 GB of memory. More memory can be requested using the h_vmem parameter. In the new cluster setup, memory requests are per job, independent of the number of slots requested. For instance, we can request a job with 100 GB of memory as follows:
  [hpc ~]$ qsub -l h_vmem=100G myjob.sh

3.6 Encrypting files [under construction]

3.6.1 Linux

We will use the GNU Privacy Guard to encrypt files with a secret key. The approach is similar to SSH keys, and creates a key pair consisting of a secret and a public key:

$ gpg --gen-key

You will first need to specify an encryption algorithm (possible options are RSA/RSA or DSA/ElGamal) and a key length (e.g. 1024 bits). Although longer keys are more secure, they increase the encryption/decryption times; current guidelines recommend a key size of at least 2048 bits. The system then asks you to enter a name, a comment and an e-mail address. Finally, you need to provide a passphrase, which is required whenever you use your secret key.

3.7 Assessing HPC usage

You can always evaluate the HPC usage of yourself and other users or groups at http://hpcstats.op.umcutrecht.nl/ (log in with your HPC account).

3.7.1 Usage by research group (PI)

1. Select Jobs by PI in the menu on the left.
2. Choose Filter in the menu bar.
3. Select the group of interest (e.g. "julius").
4. Click OK.

You can compare the usage statistics of individual users by browsing to the item CPU Hours and then clicking on the option by User.

Chapter 4: Advanced programming with R

4.1 Installing an older package version

Sometimes, you need to install an older version of a package to get it working within R. This is because the R version on the HPC cluster is not immediately upgraded, whereas packages on CRAN may require the latest R version. As an example, we will install the latest version of nlme that is compatible with R 2.15.2.

First, visit the CRAN package page (http://cran.r-project.org/web/packages/nlme/index.html) and go to the item Old sources. Open the corresponding link and find the latest package version that is compatible with your version of R. You can check this by reading the package dependencies in the DESCRIPTION file of each archive. Notice that for nlme_3.1-108.tar.gz, we have:

Depends: graphics, stats, R (>= 2.14.0), R (< 3.0.0)

This package should work under R 2.15.2, so download it. Once the file is downloaded, copy it to your home directory on the HPC server (see section 2.9). Finally, log in on the HPC server, start R, and type the following:

> install.packages('nlme_3.1-108.tar.gz', repos = NULL, type = 'source')
> library(nlme)

You should now be able to use nlme within R!
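As an alternative to browsing the CRAN archive by hand, the devtools package can fetch a specific archived version directly. This is only a sketch; it assumes that devtools is installed and that the requested version is compatible with your R version:

> install.packages('devtools')
> devtools::install_version('nlme', version = '3.1-108',
                            repos = 'http://cran.us.r-project.org')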
4.2 Writing a multithreaded script

Although it is possible to run your original scripts directly on the HPC computer cluster, it is much more efficient to optimize your code such that all available resources are used for the calculations. Recall that in section 2.5 we already reduced the calculation time considerably; that improvement was only due to more efficient coding, and further improvements may be gained by dividing the calculations amongst the available cores. This can be achieved fairly straightforwardly with the package doMC, which provides a parallel for loop.

We will now combine the power of apply with the versatility of doMC. Here, it is important to realize that each HPC cluster node has a CPU with a fixed number of cores (typically 12). It is therefore important to divide the apply calculations equally amongst these cores. Although it is recommended to write R scripts on your personal computer, we can directly create an R script on the HPC server:

myscript.r
library(doMC)
registerDoMC()
npar <- getDoParWorkers()   # get the number of parallel workers

ptm <- proc.time()
q <- foreach(i = 1:npar) %dopar% {
  n0 <- ((i-1) * (100000 %/% npar)) + 1
  n1 <- if (i < npar) n0 - 1 + (100000 %/% npar) else 100000
  apply(as.array(n0:n1), 1, sqrt)
}
elapsed <- proc.time() - ptm
q <- unlist(q)
save.image()   # save workspace

This time, we submit the script as a job to the veryshort queue to ensure that a maximum number of cores can be used for parallelization. The job will be queued until a machine with 12 unused slots becomes available.

[hpc ~]$ qsub -pe threaded 12 -q veryshort runR.sh

Our script calculates how many cores are available for parallelization, and stores the resulting estimate in npar. Afterwards, each core is provided with a different sequence of numbers to be square-rooted. For instance, when 4 cores are available (i.e. npar = 4), this sequence will be as follows: 1:25000 (core 1), 25001:50000 (core 2), 50001:75000 (core 3) and 75001:100000 (core 4). The results are stored as a list in q and transformed back into a vector by means of unlist. Finally, we calculate the elapsed processing time (elapsed) and store the workspace. Note that it is not possible to use more than 12 cores for parallelization, because no single machine has more cores available.

4.3 Submitting repetitive jobs

In some situations, researchers need to execute a certain script multiple times with different setup parameters. For instance, when performing simulation studies, it is common to apply a series of methods to different scenarios. Although it is possible to write an R script and a corresponding shell script for each scenario, it is more elegant (and less work) to write one generic R script that can be called multiple times.

In the following example, we are interested in the distribution of the product of two variables a and b that follow a bivariate normal distribution. We will estimate the mean and standard deviation of this distribution by performing a Monte Carlo simulation for different scenarios.
Create the following R script to prepare the simulation:

myScript.r
args <- commandArgs(trailingOnly = TRUE)
library(mvtnorm)

meanA  <- as.numeric(args[1])
meanB  <- as.numeric(args[2])
sigmaA <- as.numeric(args[3])
sigmaB <- as.numeric(args[4])
rhoAB  <- as.numeric(args[5])

if (sigmaA < 0 | sigmaB < 0) {
  stop("Invalid value for sigma")
} else if (abs(rhoAB) > 1) {
  stop(paste("Invalid value for rho: ", rhoAB))
}

S <- matrix(NA, 2, 2)
S[1,1] <- sigmaA**2
S[2,2] <- sigmaB**2
S[1,2] <- S[2,1] <- sigmaA*sigmaB*rhoAB

samples <- rmvnorm(100000, mean = c(meanA, meanB), sigma = S)
mult <- samples[,1] * samples[,2]

print(mean(mult))
print(sd(mult))

We can now evaluate the situation where a and b are independent and are distributed according to a ~ N(10, 1.5^2) and b ~ N(15, 1.3^2) by creating the following shell script:

runRsim1.sh
#!/bin/bash
Rscript myScript.r 10 15 1.5 1.3 0

We can modify the shell script as follows to investigate the situation where a and b have a correlation of 0.3:

runRsim2.sh
#!/bin/bash
Rscript myScript.r 10 15 1.5 1.3 0.3
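Both scenarios can then be submitted with qsub (see section 2.6). When many scenarios have to be explored, a small helper script can generate the submissions. The sketch below is only an illustration; it assumes that arguments listed after the script name in the qsub call are passed on to that script, and the file names runRsim.sh and submitSims.sh are hypothetical.

runRsim.sh
#!/bin/bash
# generic version: forwards all command-line arguments to the R script
Rscript myScript.r "$@"

submitSims.sh
#!/bin/bash
# submit one job per correlation value; -N gives every job a recognisable name in qstat
for rho in 0 0.1 0.3 0.5
do
    qsub -N "sim_rho_${rho}" runRsim.sh 10 15 1.5 1.3 ${rho}
done

Running [hpc ~]$ sh submitSims.sh should then queue four jobs, one per scenario.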
4.4 Random seeds [under construction]

Using the doRNG package (a minimal sketch is given at the end of this chapter).

4.5 Error recovery [under construction]

Use try and check for errors as follows:

{ # example dopar iteration
  # fmla, ds and the mfp package are assumed to be defined/loaded elsewhere
  f <- try(mfp(fmla, family = cox, data = ds, select = 0.05, verbose = F))
  if (!inherits(f, "try-error")) {
    # Process results
  }
}
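For the random-seeds section above, the following is a minimal sketch of how reproducible parallel random numbers could be obtained with the doRNG package; it assumes doRNG is installed, and the loop body is purely illustrative.

library(doMC)
library(doRNG)

registerDoMC()
registerDoRNG(seed = 123)   # fix the random-number streams of the parallel workers

# every run of this loop should now return identical draws,
# independent of how the iterations are scheduled over the cores
x <- foreach(i = 1:4) %dorng% {
  rnorm(5, mean = i)
}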