Tuning your algorithm with irace on the IRIDIA cluster

Revision as of 14:43, 2 March 2012

Before anything else, read the README file and look at the examples and templates in the irace directory. This walk-through only gives a quick start and skips many options and configuration possibilities. You can start from this example and then adapt it to the algorithm you want to configure.

Installation

First, install the irace R package on majorana:

$ ssh majorana
$ export R_LIBS=~/R:${R_LIBS}
$ R
> install.packages("multicore")
> install.packages("irace")

Select the Belgian mirror and test the installation with:

> library(irace)
> CTRL+d

Once irace is installed, exit R and add the following lines at the end of your .bash_profile, .bashrc, or .profile:

export R_LIBS=~/R:${R_LIBS}
export IRACE_HOME=~/R/2.14/library/irace/
# export PATH=$IRACE_HOME/bin/:$PATH

The algorithm to be tuned

Create a directory in which to run the tuning:

$ mkdir ~/tuning
$ cd ~/tuning

Copy the program you are tuning into this directory. In this case it is just a simple C program:

$ cat > algo.c
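#include <stdio.h>
#include <stdlib.h>  /* atoi */
int main(int argc, char **argv) {
    // call me with ./algo -i instance --whatever <integer_parameter>
    if (argc < 5) return 1;  /* the parameter value must be in argv[4] */
    printf("Best %d\n", atoi(argv[4]));
    return 0;
}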
CTRL+d
$ make algo
cc     algo.c   -o algo

Prepare for the tuning

Copy some of the template and example files into the current (tuning) directory:

$ cd ~/tuning
$ mkdir temp
$ cp $IRACE_HOME/templates/tune-conf.tmpl tune-conf
$ cp $IRACE_HOME/examples/mpi/tune-main-cluster-mpi tune-main-cluster-mpi
$ cp $IRACE_HOME/examples/acotsp/hook-run .

In hook-run, set the two variables as shown below:

EXE=~/tuning/algo
FIXED_PARAMS=""
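To make the role of EXE concrete, here is a simplified sketch of what a hook-run style wrapper could look like. It is not the script shipped with irace: the function name `run_candidate` and its calling convention (executable, instance, candidate id, then the candidate's parameters) are assumptions made for illustration. The `-i instance` convention and the `Best <value>` output line are taken from the example program above.

```shell
# Hypothetical sketch, NOT the hook-run shipped with irace.
# Assumed usage: run_candidate <exe> <instance> <candidate-id> <params...>
run_candidate() {
    exe="$1"
    instance="$2"
    # $3 is the candidate id; the algorithm itself does not need it.
    shift 3 || return 1
    # Run the algorithm, then keep the number printed after the last "Best".
    "$exe" -i "$instance" "$@" |
        awk '$1 == "Best" { cost = $2 } END { if (cost != "") print cost }'
}
```

With the example program compiled as ~/tuning/algo, a call such as `run_candidate ~/tuning/algo Instances/1 7 --whatever 42` would be expected to print the cost that the program reports for that parameter value.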

In tune-main-cluster-mpi, add the following line to the qsub shell script, before the MPIRUN command:

export R_LIBS=~/R:${R_LIBS}

Create some dummy instances:

$ mkdir Instances
$ for i in {1..100}; do touch Instances/$i; done

Create a parameters file:

$ cat > parameters.txt
dummy_par        "--whatever "        i   (1, 100)
CTRL+d

Look in the examples/acotsp directory for a more complete example.
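The trailing space inside the quoted switch in parameters.txt matters, because irace pastes the switch and the sampled value together without adding a separator (this matches the `--whatever 1` command lines that irace reports at the end of a run). The sketch below imitates that concatenation in plain shell. `build_arg` is a hypothetical helper written for this illustration, not part of irace.

```shell
# Hypothetical helper: paste a switch and a value together with no
# separator in between, the way the command line is assembled.
build_arg() {
    printf '%s%s\n' "$1" "$2"
}

build_arg "--whatever " 42   # prints: --whatever 42
build_arg "--whatever" 42    # prints: --whatever42
```

Without the space, the example program would receive a single argument `--whatever42` and find nothing in argv[4].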

Tuning time!

Now you are ready to run irace:

$ ./tune-main-cluster-mpi $IRACE_HOME/bin temp --parallel 10

Take a look at tune-main-cluster-mpi and change the cluster queues and qsub parameters to better suit your needs. If you have issues with your code or with irace, try running it with:

$ ./tune-main-cluster-mpi $IRACE_HOME/bin temp --parallel 10 --debug-level 1

You can check whether the job is waiting, running, or complete with the qstat command. In the directory ~/tuning/temp you will find an irace-$PID.stdout and an irace-$PID.stderr file. The stdout file should contain output like the following:

-catch_rsh /opt/gridengine/default/spool/compute-3-14/active_jobs/7718068.1/pe_hostfile
compute-3-14
compute-3-14
compute-3-14
compute-3-14
compute-3-14
compute-3-14
compute-3-14
compute-3-14
compute-3-14
compute-3-14
compute-3-14
irace	version 1.0.560 

irace: An implementation in R of Iterated Race
Copyright (C) 2010, 2011
Manuel Lopez-Ibanez     <manuel.lopez-ibanez@ulb.ac.be>
Jeremie Dubois-Lacoste  <jeremie.dubois-lacoste@ulb.ac.be>

This is free software, and you are welcome to redistribute it under certain
conditions.  See the GNU General Public License for details. There is NO   
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Warning: A default configuration file ' ./tune-conf ' has been found and will be read
Note: Reading configuration file ' ./tune-conf '....... done!
###   CONFIGURATION STATE TO BE USED
configurationFile <- "./tune-conf" 
parameterFile <- "/home/mascia/tuning/./parameters.txt" 
execDir <- "temp" 
logFile <- "./irace.Rdata" 
instances <- "/home/mascia/tuning/./Instances//1" 
instanceDir <- "/home/mascia/tuning/./Instances" 
instanceFile <- "" 
candidatesFile <- "" 
hookRun <- "/home/mascia/tuning/./hook-run" 
expName <- "Experiment Name" 
expDescription <- "Experiment Description" 
maxExperiments <- 1000 
timeBudget <- 0 
timeEstimate <- 0 
digits <- 4 
debugLevel <- 0 
nbIterations <- 0 
nbExperimentsPerIteration <- 0 
sampleInstances <- TRUE 
testType <- "friedman" 
firstTest <- 5 
eachTest <- 1 
minNbSurvival <- 0 
nbCandidates <- 0 
mu <- 5 
seed <- "NA" 
parallel <- 10 
sgeCluster <- FALSE 
mpi <- TRUE 
softRestart <- TRUE 
### end of configuration
# 2012-03-02 14:41:13 CET: INITIALIZATION 
# nbIterations: 2
# minSurvival: 2
# nbParameters: 1
# Seed: 1110701261
# 2012-03-02 14:41:13 CET: ITERATION 1 of 2
# experimentsUsedSoFar: 0
# timeUsedSoFar: 0
# timeEstimate: 0
# remainingBudget: 1000
# currentBudget: 500
# nbCandidates: 83

Racing methods for the selection of the best
Copyright (C) 2003 Mauro Birattari
This software comes with ABSOLUTELY NO WARRANTY

Race name:                                                      Experiment Name 
Number of candidates:                                                        83 
Number of available tasks:                                                 1000 
Max number of experiments:                                                  500 
Statistical test:                                                 Friedman test 
Tasks seen before discarding:                                                 5 
Initialization function:                                                     ok 
 
 	Experiment Description 
 

                            Markers:                           
                               x No test is performed.         
                               - The test is performed and     
                                 some candidates are discarded.
                               = The test is performed but     
                                 no candidate is discarded.    
                                                               
                                                               
+-+-----------+-----------+-----------+-----------+-----------+
| |       Task|      Alive|       Best|  Mean best| Exp so far|
+-+-----------+-----------+-----------+-----------+-----------+
	10 slaves are spawned successfully. 0 failed.
master  (rank 0 , comm 1) of size 11 is running on: compute-3-14 
slave1  (rank 1 , comm 1) of size 11 is running on: compute-3-14 
slave2  (rank 2 , comm 1) of size 11 is running on: compute-3-14 
slave3  (rank 3 , comm 1) of size 11 is running on: compute-3-14 
slave4  (rank 4 , comm 1) of size 11 is running on: compute-3-14 
slave5  (rank 5 , comm 1) of size 11 is running on: compute-3-14 
slave6  (rank 6 , comm 1) of size 11 is running on: compute-3-14 
slave7  (rank 7 , comm 1) of size 11 is running on: compute-3-14 
slave8  (rank 8 , comm 1) of size 11 is running on: compute-3-14 
slave9  (rank 9 , comm 1) of size 11 is running on: compute-3-14 
slave10 (rank 10, comm 1) of size 11 is running on: compute-3-14 
|x|          1|         83|          7|          1|         83|
|x|          2|         83|          7|          1|        166|
|x|          3|         83|          7|          1|        249|
|x|          4|         83|          7|          1|        332|
|-|          5|          1|          7|          1|        415|
+-+-----------+-----------+-----------+-----------+-----------+

Selected candidate:           7	mean value:          1

Description of the selected candidate:
[1] 1


# Elite candidates:
  dummy_par
7         1
# 2012-03-02 14:41:17 CET: ITERATION 2 of 2
# experimentsUsedSoFar: 415
# timeUsedSoFar: 0
# timeEstimate: 0
# remainingBudget: 585
# currentBudget: 585
# nbCandidates: 83
# Computing similarity of candidates .................................................................................. DONE
# 2012-03-02 14:41:21 CET: Soft restart: 7 85 86 87 88 89 90 91 92 93 94 95 96 98 100 102 103 104 105 108 109 110 111 112 113 115 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 135 136 138 139 140 142 145 146 147 148 149 150 152 153 154 155 156 158 159 160 161 163 164 84 97 99 101 106 107 114 116 134 137 141 143 144 151 157 162 165 !

Racing methods for the selection of the best
Copyright (C) 2003 Mauro Birattari
This software comes with ABSOLUTELY NO WARRANTY

Race name:                                                      Experiment Name 
Number of candidates:                                                        83 
Number of available tasks:                                                 1000 
Max number of experiments:                                                  585 
Statistical test:                                                 Friedman test 
Tasks seen before discarding:                                                 5 
Initialization function:                                                     ok 
 
 	Experiment Description 
 

                            Markers:                           
                               x No test is performed.         
                               - The test is performed and     
                                 some candidates are discarded.
                               = The test is performed but     
                                 no candidate is discarded.    
                                                               
                                                               
+-+-----------+-----------+-----------+-----------+-----------+
| |       Task|      Alive|       Best|  Mean best| Exp so far|
+-+-----------+-----------+-----------+-----------+-----------+
|x|          1|         83|          1|          1|         83|
|x|          2|         83|          1|          1|        166|
|x|          3|         83|          1|          1|        249|
|x|          4|         83|          1|          1|        332|
|-|          5|         59|          1|          1|        415|
|=|          6|         59|          1|          1|        474|
|=|          7|         59|          1|          1|        533|
+-+-----------+-----------+-----------+-----------+-----------+

Selected candidate:           1	mean value:          1

Description of the selected candidate:
[1] 1


# Elite candidates:
   dummy_par
7          1
84         1
# 2012-03-02 14:41:25 CET: Limit of iterations reached
# 2012-03-02 14:41:25 CET: ITERATION 3 of 3
# experimentsUsedSoFar: 948
# timeUsedSoFar: 0
# timeEstimate: 0
# remainingBudget: 52
# currentBudget: 52
# nbCandidates: 6
# Computing similarity of candidates ..... DONE
# 2012-03-02 14:41:25 CET: Soft restart: 7 84 166 167 168 169 !

Racing methods for the selection of the best
Copyright (C) 2003 Mauro Birattari
This software comes with ABSOLUTELY NO WARRANTY

Race name:                                                      Experiment Name 
Number of candidates:                                                         6 
Number of available tasks:                                                 1000 
Max number of experiments:                                                   52 
Statistical test:                                                 Friedman test 
Tasks seen before discarding:                                                 5 
Initialization function:                                                     ok 
 
 	Experiment Description 
 

                            Markers:                           
                               x No test is performed.         
                               - The test is performed and     
                                 some candidates are discarded.
                               = The test is performed but     
                                 no candidate is discarded.    
                                                               
                                                               
+-+-----------+-----------+-----------+-----------+-----------+
| |       Task|      Alive|       Best|  Mean best| Exp so far|
+-+-----------+-----------+-----------+-----------+-----------+
|x|          1|          6|          1|          1|          6|
|x|          2|          6|          1|          1|         12|
|x|          3|          6|          1|          1|         18|
|x|          4|          6|          1|          1|         24|
|-|          5|          2|          1|          1|         30|
+-+-----------+-----------+-----------+-----------+-----------+

Selected candidate:           1	mean value:          1

Description of the selected candidate:
[1] 1


# Elite candidates:
   dummy_par
7          1
84         1
# 2012-03-02 14:41:25 CET: Limit of iterations reached
# 2012-03-02 14:41:25 CET: Stopped because there is no enough budget to sample new candidates
# number of elites: 2
# indexIteration: 4
# mu: 5
# nbIterations: 4
# experimentsUsedSoFar: 978
# timeUsedSoFar: 0
# timeEstimate: 0
# remainingBudget: 22
# currentBudget: 22
# nbCandidates: 2
# Best candidates
   dummy_par
7          1
84         1
# Best candidates (as commandlines)
         command
7   --whatever 1
84  --whatever 1
# Finalize MPI...
[1] "Exiting Rmpi. Rmpi cannot be used unless relaunching R."
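To pull the final result out of a log like the one above without scrolling through it, a small filter is enough. The sketch below prints the lines that follow the `# Best candidates (as commandlines)` marker, up to the next line starting with `#`. `best_commandlines` is a hypothetical helper name, and the sketch assumes the log layout shown above.

```shell
# Hypothetical helper: print the block that follows the
# "# Best candidates (as commandlines)" marker in an irace stdout file.
best_commandlines() {
    awk '
        /^# Best candidates \(as commandlines\)/ { show = 1; next }
        show && /^#/                             { show = 0 }
        show                                     { print }
    ' "$1"
}
```

Call it with the path of the stdout file in ~/tuning/temp, for example `best_commandlines ~/tuning/temp/irace-$PID.stdout` with $PID replaced by the actual number in the file name.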