Managing Experiments with Balsam

The Balsam command line

One way to discover the Balsam command line is to look at its help menu by running:

balsam --help

You should see output similar to the following:

usage: balsam [-h]

Balsam command line interface

optional arguments:
-h, --help            show this help message and exit

    app                 add a new application definition
    job                 add a new Balsam job
    dep                 add a dependency between two existing jobs
    ls                  list jobs, applications, or jobs-by-workflow
    modify              alter job or application
    rm                  remove jobs or applications from the database
    killjob             Kill a job without removing it from the DB
    mkchild             Create a child job of a specified job
    launcher            Start a local instance of the balsam launcher
    submit-launch       Submit a launcher job to the batch queue
    init                Create new balsam DB
    service             Start Balsam auto-scheduling service
    which               Get info on current/available DBs
    log                 Quick view of Balsam log files
    server              Control Balsam server at BALSAM_DB_PATH

The Balsam launcher

The Balsam submit launch command

Job states

Let’s say we just ran the polynome2 problem from deephyper.benchmark.hps.


As a reminder, you can create the job with:

balsam job --name test --application AMBS --workflow TEST --args '--evaluator balsam --problem deephyper.benchmark.hps.polynome2.Problem --run'

And submit the job with:

balsam submit-launch -n 4 -q debug-cache-quad -t 60 -A datascience --job-mode serial --wf-filter TEST

The polynome2 benchmark is a fast way to test the behavior of DeepHyper.

We can inspect the current state of our Balsam database with:

balsam ls --wf TEST


If you want to print other fields of the Balsam jobs, you can set the BALSAM_LS_FIELDS environment variable, for example:

export BALSAM_LS_FIELDS=num_nodes:args

We should expect something like this:

8bf2e5a1-ff11-4c96-8f32-f45fbfd80612 | task92  | TEST     | | JOB_FINISHED | 1         | '{"e0": 1, "e1": 9, "e2": 5, "e3": 6, "e4": 2, "e5": 6, "e6": -10, "e7": -6, "e8": 1, "e9": -9}'
3cb219ef-750c-4861-ad7b-441a6e01145d | task93  | TEST     | | JOB_FINISHED | 1         | '{"e0": 5, "e1": -10, "e2": 6, "e3": 8, "e4": 10, "e5": 5, "e6": -10, "e7": 2, "e8": -1, "e9": -7}'
d84ab601-6410-4173-a259-096723ca2e83 | task94  | TEST     | | JOB_FINISHED | 1         | '{"e0": 8, "e1": 9, "e2": 6, "e3": 8, "e4": -10, "e5": 9, "e6": -2, "e7": 10, "e8": 1, "e9": -9}'
65b605f0-98f8-4551-bcca-20a6c41b6be7 | task95  | TEST     | | JOB_FINISHED | 1         | '{"e0": 9, "e1": -5, "e2": 6, "e3": 10, "e4": -8, "e5": -8, "e6": -10, "e7": 0, "e8": 5, "e9": 7}'
4258d34b-8e1c-4a5f-8242-217158ba28a6 | task96  | TEST     | | JOB_FINISHED | 1         | '{"e0": 8, "e1": 9, "e2": 6, "e3": 10, "e4": 1, "e5": -1, "e6": -6, "e7": 3, "e8": 8, "e9": -9}'
72df1858-661a-4a12-901a-40e85c3efe36 | task98  | TEST     | | RESTART_READY | 1         | '{"e0": 10, "e1": -9, "e2": 9, "e3": 8, "e4": 3, "e5": -10, "e6": -8, "e7": -7, "e8": 9, "e9": -2}'
f8314971-c81c-404d-9863-d169ae147f08 | task99  | TEST     | | RESTART_READY | 1         | '{"e0": 10, "e1": 7, "e2": 6, "e3": 8, "e4": -7, "e5": 1, "e6": -9, "e7": 6, "e8": 9, "e9": -6}'
de03094c-df47-4ce6-9426-df3f2e63a40a | test    | TEST     | AMBS                                          | JOB_FINISHED | 1         | --evaluator balsam --problem deephyper.benchmark.hps.polynome2.Problem --run

As you can see, some jobs of our TEST workflow are in the RESTART_READY state. According to the Balsam documentation, these jobs will be run again if a new launcher is started for the TEST workflow.

This is why you should delete all jobs of a workflow before running the same experiment again. To do so, you can use:
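To see at a glance how many jobs sit in each state, the listing can be tallied with a few lines of Python. This is only a minimal sketch and not part of Balsam: it assumes the pipe-separated column order shown above (state in the fifth column) with no header line, and the helper name count_states is hypothetical.

```python
from collections import Counter


def count_states(listing: str) -> Counter:
    """Tally job states from pipe-separated `balsam ls` rows.

    Assumed column order: id | name | workflow | application | state | ...
    """
    states = Counter()
    for line in listing.splitlines():
        # Split on the first five separators only, so a "|" inside the
        # trailing args column cannot shift the state column.
        columns = [col.strip() for col in line.split("|", 5)]
        if len(columns) < 5:
            continue  # blank or malformed line
        states[columns[4]] += 1
    return states


# Illustrative rows modelled on the listing above (ids and args shortened).
sample = """
id-96 | task96 | TEST | | JOB_FINISHED | 1 | '{...}'
id-98 | task98 | TEST | | RESTART_READY | 1 | '{...}'
id-99 | task99 | TEST | | RESTART_READY | 1 | '{...}'
id-00 | test | TEST | AMBS | JOB_FINISHED | 1 | --run
"""

for state, count in sorted(count_states(sample).items()):
    print(f"{state}: {count}")
```

Any job counted under RESTART_READY is one that a new launcher on the same workflow would pick up again.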

balsam rm jobs --name $name | --id $id

For example, here I would run:

balsam rm jobs --name test


balsam rm jobs --name task


The --name argument matches every job whose name contains the given string.

You can also use the Balsam Django models directly. First, create a new Python script:


Then add this code:

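import sys
from balsam.launcher.dag import BalsamJob

# Command-line arguments: the workflow first, then the name substring
workflow, name = sys.argv[1], sys.argv[2]

# Delete every job of that workflow whose name contains the substring
BalsamJob.objects.filter(name__contains=name, workflow=workflow).delete()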

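Then execute the script, passing the workflow and the name filter as arguments (replace <script>.py with the name of the script you created):

python <script>.py TEST task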

This command deletes all jobs of the TEST workflow whose name contains task. By contrast, the earlier command balsam rm jobs --name $name does not filter by workflow. Hence, if other workflows contain jobs with similar names, such as task_$id (the generic name for evaluations generated by search algorithms), they will all be deleted as well.
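The difference between the two filters can be illustrated without touching a Balsam database. The sketch below is plain Python over a hypothetical in-memory job list; the OTHER workflow, its job, and both helper functions are assumptions made up for the example, and the substring test only mimics the matching described above.

```python
# Hypothetical jobs: two from the TEST workflow above, one from another workflow.
jobs = [
    {"name": "test", "workflow": "TEST"},
    {"name": "task98", "workflow": "TEST"},
    {"name": "task_1", "workflow": "OTHER"},
]


def match_by_name(jobs, name):
    """Mimic a name-only filter: every job whose name contains `name`."""
    return [job for job in jobs if name in job["name"]]


def match_by_name_and_workflow(jobs, name, workflow):
    """Mimic the script's filter: name substring and exact workflow."""
    return [
        job for job in jobs
        if name in job["name"] and job["workflow"] == workflow
    ]


# The name-only filter also selects the job from the OTHER workflow.
print([job["name"] for job in match_by_name(jobs, "task")])

# Adding the workflow restricts the selection to TEST.
print([job["name"] for job in match_by_name_and_workflow(jobs, "task", "TEST")])
```

Restricting by workflow is the safer choice whenever several experiments share one database, since generated evaluation names are likely to collide across workflows.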