Using the IRIDIA Cluster

See [http://majorana.ulb.ac.be/wordpress/ http://majorana.ulb.ac.be/wordpress/]
== Cluster composition ==
 
 
The IRIDIA cluster currently consists of two racks. The first contains one server (majorana) and 32 computational nodes (c0-0 to c0-31); the second contains 16 computational nodes (c1-0 to c1-15). Each of the older nodes (c0-0 to c0-31) has two AMD Opteron 244 CPUs (1 MB of L2 cache each) running at 1.75 GHz and 2 GB of RAM: nodes c0-0 to c0-15 have 4 modules of 512 MB 400 MHz DDR ECC REG DIMM, while nodes c0-16 to c0-31 have 8 modules of 256 MB 400 MHz DDR ECC REG DIMM. Each of the newer nodes (c1-0 to c1-15) has two dual-core AMD Opteron 2216 HE processors (2x1 MB of L2 cache each) running at 2.4 GHz and 4 GB of RAM. In total, the cluster has 96 CPUs (64 single-core and 32 dual-core) dedicated to computation, plus 2 CPUs for administrative purposes.
 
 
 
COMPLEX_NAME: '''opteron244'''
 
 
- 2 single-core AMD Opteron 244 @ 1.75 GHz
 
 
nodes: c0-0, c0-1, c0-2, c0-3, c0-4, c0-5, c0-6, c0-7, c0-8, c0-9, c0-10, c0-11, c0-12, c0-13, c0-14, c0-15, c0-16, c0-17, c0-18, c0-19, c0-20, c0-21, c0-22, c0-23, c0-24, c0-25, c0-26, c0-27, c0-28, c0-29, c0-30, c0-31
 
 
 
COMPLEX_NAME: '''opteron2216'''
 
 
- 2 dual-core AMD Opteron 2216 HE @ 2.4 GHz
 
 
nodes: c1-0, c1-1, c1-2, c1-3, c1-4, c1-5, c1-6, c1-7, c1-8, c1-9, c1-10, c1-11, c1-12, c1-13, c1-14, c1-15
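
The CPU totals quoted in the composition paragraph follow directly from the node counts. As a quick check, a few lines of shell arithmetic reproduce them. The variable names are illustrative only, and the total core count is derived here rather than stated in the text.

```shell
#!/bin/bash
# Node counts and CPUs per node, taken from the composition paragraph.
old_nodes=32      # c0-0 .. c0-31, 2 single-core CPUs each
new_nodes=16      # c1-0 .. c1-15, 2 dual-core CPUs each
cpus_per_node=2

single_core_cpus=$((old_nodes * cpus_per_node))
dual_core_cpus=$((new_nodes * cpus_per_node))
total_cpus=$((single_core_cpus + dual_core_cpus))
# Derived figure, not quoted in the text: one core per old CPU, two per new CPU.
total_cores=$((single_core_cpus + 2 * dual_core_cpus))

echo "CPUs: $total_cpus ($single_core_cpus single-core + $dual_core_cpus dual-core)"
echo "Cores: $total_cores"
```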
 
 
 
=== Cluster status ===
 
 
{| border=1 cellspacing=0 cellpadding=2
 
! Node !! Status !! Date !! Note
 
|-
 
| c0-0 || offline || 13 Jul 07 || cpu-fan broken
 
|-
 
| c0-1 || offline || 11 Jul 07 ||
 
|-
 
| c0-2 || || ||
 
|-
 
| c0-3 || || ||
 
|-
 
| c0-4 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-5 || offline || 13 Jul 07 || dead hard-disk
 
|-
 
| c0-6 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-7 || || ||
 
|-
 
| c0-8 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-9 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-10 || || ||
 
|-
 
| c0-11 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-12 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-13 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-14 || || ||
 
|-
 
| c0-15 || || ||
 
|-
 
| c0-16 || || ||
 
|-
 
| c0-17 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-18 || offline || 13 Jul 07 || dead hard-disk
 
|-
 
| c0-19 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-20 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-21 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-22 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-23 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-24 || offline || 13 Jul 07 || power supply broken
 
|-
 
| c0-25 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-26 || || ||
 
|-
 
| c0-27 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-28 || || ||
 
|-
 
| c0-29 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-30 || offline || 24 Aug 07 || halted for broken air-conditioner
 
|-
 
| c0-31 || || ||
 
|-
 
| c1-0 || || ||
 
|-
 
| c1-1 || || ||
 
|-
 
| c1-2 || || ||
 
|-
 
| c1-3 || || ||
 
|-
 
| c1-4 || || ||
 
|-
 
| c1-5 || || ||
 
|-
 
| c1-6 || || ||
 
|-
 
| c1-7 || || ||
 
|-
 
| c1-8 || || ||
 
|-
 
| c1-9 || || ||
 
|-
 
| c1-10 || || ||
 
|-
 
| c1-11 || || ||
 
|-
 
| c1-12 || || ||
 
|-
 
| c1-13 || || ||
 
|-
 
| c1-14 || || ||
 
|-
 
| c1-15 || || ||
 
|-
 
|}
 
 
== Queues ==
 
 
Each computational node has the following queues:
 
 
 
*'''<node>.short''': at most 2 jobs can run concurrently in this queue, at nice level 2. Each job is limited to '''24h of CPU time''' (the time the program actually spends executing, not counting the time the system needs for multitasking, etc.). A job still running after the 24th hour receives a SIGUSR1 signal and, some time later, a SIGKILL that terminates it.
 
 
*'''<node>.medium''': at most 2 jobs can run concurrently in this queue, at nice level 3 (lower priority than the short ones). Each job is limited to '''72h of CPU time''' (the time the program actually spends executing, not counting the time the system needs for multitasking, etc.). A job still running after the 72nd hour receives a SIGUSR1 signal and, some time later, a SIGKILL that terminates it.
 
 
*'''<node>.long''': at most 2 jobs can run concurrently in this queue, at nice level 3 (lower priority than the short ones). Each job is limited to '''168h of CPU time''' (the time the program actually spends executing, not counting the time the system needs for multitasking, etc.). A job still running after the 168th hour receives a SIGUSR1 signal and, some time later, a SIGKILL that terminates it.
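
Because a job receives SIGUSR1 some time before the final SIGKILL, the job script can use that warning to save its progress. The following is a minimal sketch, assuming the signal is delivered to the shell that runs the job script. The checkpoint file name and the three-step loop are placeholders, not part of the cluster setup.

```shell
#!/bin/bash
# Hypothetical checkpoint location; adjust to your own job.
CHECKPOINT_FILE="${CHECKPOINT_FILE:-checkpoint.txt}"
step=0

# Write the current progress so that a later job can resume from it.
save_checkpoint() {
    echo "$step" > "$CHECKPOINT_FILE"
}

# On SIGUSR1, save progress and stop cleanly before the SIGKILL arrives.
trap 'save_checkpoint; exit 0' USR1

# Stand-in for the real computation: three trivial steps.
while [ "$step" -lt 3 ]; do
    step=$((step + 1))
done
save_checkpoint
```

Note that bash runs a trap handler only after the command currently executing in the foreground has finished, so a long-running program started from the script should either handle SIGUSR1 itself or be run in short steps.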
 
 
In summary, each node can run up to 6 jobs concurrently (distributed over 2 CPUs), with an average of 341 MB of RAM per job on a 2 GB node. The queueing system can run at most 384 concurrent jobs on the whole cluster.
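
The per-node figures can be checked with shell integer arithmetic. This is only a sketch: the variable names are illustrative, and the 2048 MB figure assumes one of the older nodes with 2 GB of RAM.

```shell
#!/bin/bash
queues_per_node=3     # short, medium, long
jobs_per_queue=2      # maximum concurrent jobs in each queue
ram_mb=2048           # RAM of an older (c0-*) node, in MB

jobs_per_node=$((queues_per_node * jobs_per_queue))
# Integer division: the result is rounded down.
ram_per_job_mb=$((ram_mb / jobs_per_node))

echo "$jobs_per_node jobs per node, about $ram_per_job_mb MB of RAM each"
```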
 
 
 
'''YOU HAVE TO DESIGN YOUR COMPUTATIONS IN SUCH A WAY THAT EACH SINGLE JOB DOESN'T RUN FOR MORE THAN 7 DAYS (of CPU time)'''.
 
 
 
== How to submit a job ==
 
 
 
To submit a job, create a script (referred to here as SCRIPT_NAME.sh) containing the commands you want to execute. Then, from majorana, run the command qsub SCRIPT_NAME.sh
 
 
 
To submit a job that lasts up to 1 day on any node, your script should begin like this:
 
 
 #!/bin/bash
 #$ -N NAME_OF_JOB
 #$ -cwd
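
Putting the pieces together, the sketch below writes a complete job script and submits it. The job body (printing the host name) is only a placeholder, and the guard around qsub is there so that the snippet does nothing harmful on a machine without the queueing system.

```shell
#!/bin/bash
# Create the job script; the quoted 'EOF' keeps $(hostname) from expanding now.
cat > SCRIPT_NAME.sh <<'EOF'
#!/bin/bash
#$ -N NAME_OF_JOB
#$ -cwd

# Placeholder job body: replace with your own commands.
echo "Job running on $(hostname)"
EOF

# Submit from majorana; skipped where qsub is not installed.
if command -v qsub >/dev/null 2>&1; then
    qsub SCRIPT_NAME.sh
else
    echo "qsub not found: SCRIPT_NAME.sh was written but not submitted"
fi
```

Assuming the queueing system is Sun Grid Engine, as the #$ directive syntax suggests, the -cwd option makes the job run in the directory from which it was submitted.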
 
 
 
 
To run a job that lasts up to 1 day on a specific kind of node, add a complex name to your script with the line '''#$ -l opteron244''' or the line '''#$ -l opteron2216'''; otherwise the job can be scheduled on any node. You can also specify the queue with the line '''#$ -l short'''; otherwise the job can be scheduled in any queue. Here is an example script:
 
 
 #!/bin/bash
 #$ -N test_short
 #$ -l opteron244
 #$ -l short
 #$ -cwd
 
 
 
 
To submit a job that lasts up to 3 days, add the line '''#$ -l medium''' to the shell script passed to the qsub command, as in this example:
 
 
 #!/bin/bash
 #$ -N test_medium
 #$ -l opteron2216
 #$ -l medium
 #$ -cwd
 
 
 
To submit a job that lasts up to 7 days, add the line '''#$ -l long''' to the shell script passed to the qsub command, as in this example:
 
 
 #!/bin/bash
 #$ -N test_long
 #$ -l opteron2216
 #$ -l long
 #$ -cwd
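
The short, medium and long examples differ only in the job name, the complex and the queue. When many scripts have to be prepared, the header can be generated instead of typed. The helper below, make_header, is a hypothetical convenience function and not a tool installed on the cluster. An empty complex or queue argument simply omits the corresponding line, so the job can go to any node or any queue.

```shell
#!/bin/bash
# Print a job-script header.
# Arguments: job name, complex (may be empty), queue (may be empty).
make_header() {
    local name="$1" complex="$2" queue="$3"
    echo '#!/bin/bash'
    echo "#\$ -N $name"
    if [ -n "$complex" ]; then echo "#\$ -l $complex"; fi
    if [ -n "$queue" ]; then echo "#\$ -l $queue"; fi
    echo '#$ -cwd'
}

# Example: reproduce the header of the 7-day job.
make_header test_long opteron2216 long
```

The output can be redirected to a file, after which the commands of the job are appended to it.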
 
 
 
To submit a job that runs in the parallel environment (MPI), add the line '''#$ -pe mpi NUM_PROCESS''' to the shell script passed to the qsub command, as in this example:
 
 
 #!/bin/bash
 #$ -N test_parallel
 #$ -l opteron244
 #$ -l short
 #$ -pe mpi 10
 #$ -cwd
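
The header alone does not start the parallel program. A possible body is sketched below, under two assumptions: that the queueing system is Sun Grid Engine, which sets the NSLOTS variable to the number of granted slots, and that an mpirun launcher is available on the nodes. The program name ./my_mpi_program is hypothetical, and depending on how MPI is integrated with the scheduler, extra options such as a host file may be needed.

```shell
#!/bin/bash
#$ -N test_parallel
#$ -l opteron244
#$ -l short
#$ -pe mpi 10
#$ -cwd

# Number of slots granted by the scheduler; fall back to 1 outside a job.
nslots="${NSLOTS:-1}"

# ./my_mpi_program is a hypothetical binary: replace it with your own.
if command -v mpirun >/dev/null 2>&1 && [ -x ./my_mpi_program ]; then
    mpirun -np "$nslots" ./my_mpi_program
else
    echo "would run: mpirun -np $nslots ./my_mpi_program"
fi
```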
 
 
 
'''THE SCHEDULER WILL NOT RUN MORE THAN 64 JOBS OF THE SAME USER AT THE SAME TIME. IF YOU SUBMIT MORE THAN 64 JOBS, AT MOST 64 WILL BE RUNNING AT ANY ONE TIME'''.
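
To see how close you are to the 64-job limit, you can count your running jobs. The sketch below assumes Sun Grid Engine's qstat, whose -u option selects a user and whose -s r option selects running jobs, and it assumes the usual layout of two header lines followed by one line per job. The count_jobs helper is hypothetical.

```shell
#!/bin/bash
# Count job lines in qstat output read from standard input,
# skipping the two header lines (column titles and separator).
count_jobs() {
    awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Only query the scheduler where qstat is installed.
if command -v qstat >/dev/null 2>&1; then
    running=$(qstat -u "$USER" -s r | count_jobs)
    echo "$running of at most 64 jobs running"
fi
```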
 
 
== Submission tips for the cluster ==
 
 
If your job lasts less than 1 day, it does not matter which queue it ends up in, because no time limit will be exceeded.

In this case you may want it to take the first available queue, whichever it is. To do so, simply remove the -l queue_name line from your script.
 
 
 
== Programming tips for the cluster ==
 
 
If your job reads or writes a lot and often, it is better for the submission script to copy the input files to the /tmp directory (which is on the node's local hard drive) and to write the output files there as well, moving them to the /home/user_name directory only when the computation is over. This way your job does not use NFS for every read/write operation, which relieves majorana of some load (the /home partition is exported from there to all the nodes) and makes the job run faster.
 
 
'''REMEMBER TO REMOVE YOUR FILES FROM THE /TMP DIRECTORY ONCE THE COMPUTATION IS OVER'''
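
The staging pattern described in this section, including the final cleanup of /tmp, can be sketched as a shell function. The helper run_staged, the file names input.dat and output.dat, and the SCRATCH_ROOT variable are all hypothetical, and the sort command merely stands in for the real computation.

```shell
#!/bin/bash
# Run a computation in a private scratch directory under /tmp and move the
# result back at the end. Arguments: input file, directory for results.
run_staged() {
    local input="$1" result_dir="$2" scratch status
    # Private directory on the node's local disk
    # (SCRATCH_ROOT only exists to make the sketch easy to try elsewhere).
    scratch=$(mktemp -d "${SCRATCH_ROOT:-/tmp}/job.XXXXXX") || return 1
    if ! cp "$input" "$scratch/input.dat"; then
        rm -rf "$scratch"
        return 1
    fi

    # Placeholder computation: sort the input. Replace with your program.
    ( cd "$scratch" && sort input.dat > output.dat )
    status=$?

    # Move the result to its final place once, at the end.
    if [ "$status" -eq 0 ]; then
        mkdir -p "$result_dir" && mv "$scratch/output.dat" "$result_dir/"
        status=$?
    fi
    # Always remove the scratch files, even if the computation failed.
    rm -rf "$scratch"
    return "$status"
}
```

Inside a job script it could be called as run_staged "$HOME/input.dat" "$HOME/results", where both paths are examples only.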
 

Latest revision as of 09:21, 8 August 2012