NERSC: Powering Scientific Discovery Since 1974

Spearmint - Bayesian Hyperparameter Optimization

Spearmint is a Python Bayesian optimization codebase.

 

Using Spearmint

module load spearmint

spearmint -c path/to/config.json

config.json specifies where your code is and which hyperparameter ranges to search. It is described in more detail below.

Multinode Spearmint

    • set "max-concurrent" in your config.json file appropriately (almost always to 1, but see below)
    • pass the "--srun" flag to spearmint, as in the example batch script below
    • optionally, specify the number of nodes with -n <number of nodes> (the default is the number of nodes in your allocation)

Example Batch Script

#!/bin/bash -l
#SBATCH -N 15
#SBATCH -t 08:00:00
#SBATCH -p regular
#SBATCH -C haswell

module load spearmint
# the -n argument is optional.
# By default Spearmint will run on the number of nodes in your allocation
spearmint -c path/to/config.json --srun -n 15

Writing your config.json File

Example config.json

{
    "language"        : "PYTHON",
    "experiment-name" : "any name you want",
    "polling-time"    : 1,
    "resources" : {
        "my-machine" : {
            "scheduler"         : "local",
            "max-concurrent"    : 1,
            "max-finished-jobs" : 100
        }
    },
    "tasks": {
        "branin" : {
            "type"       : "OBJECTIVE",
            "likelihood" : "NOISELESS",
            "main-file"  : "your_python_script_name",
            "resources"  : ["my-machine"]
        }
    },
    "variables": {
        "hyperparameter1_name" : {
            "type" : "FLOAT",
            "size" : 1,
            "min"  : -5,
            "max"  : 10
        },
        "hyperparameter2_name" : {
            "type" : "FLOAT",
            "size" : 1,
            "min"  : 0,
            "max"  : 15
        }
    }
}
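
Before submitting a long job, it can help to sanity-check the file. The following is a minimal sketch, not part of Spearmint: check_config is a hypothetical helper, and it only checks the keys that appear in the example above (it assumes nothing else about what Spearmint requires). Loading the file with Python's json module also catches syntax slips such as trailing commas, which strict JSON does not allow.

```python
import json


def check_config(cfg):
    """Return a list of problems found in a parsed config.json dictionary.

    Hypothetical helper: it only knows about the keys used in the example
    config (resources, tasks, variables); it is not Spearmint's own validation.
    """
    problems = []
    resources = cfg.get("resources", {})

    # Every task should name a main-file and only use resources that are defined.
    for task_name, task in cfg.get("tasks", {}).items():
        if "main-file" not in task:
            problems.append(f"task {task_name!r} has no main-file")
        for res in task.get("resources", []):
            if res not in resources:
                problems.append(
                    f"task {task_name!r} uses undefined resource {res!r}"
                )

    # Every FLOAT variable should have a non-empty range.
    for var_name, var in cfg.get("variables", {}).items():
        if var.get("type") == "FLOAT" and "min" in var and "max" in var:
            if not var["min"] < var["max"]:
                problems.append(f"variable {var_name!r} has min >= max")

    return problems


# A cut-down config in the same shape as the example above.
EXAMPLE = json.loads("""
{
    "resources": {"my-machine": {"scheduler": "local", "max-concurrent": 1}},
    "tasks": {
        "branin": {"main-file": "your_python_script_name",
                   "resources": ["my-machine"]}
    },
    "variables": {
        "hyperparameter1_name": {"type": "FLOAT", "size": 1, "min": -5, "max": 10}
    }
}
""")

if __name__ == "__main__":
    # For a real file: cfg = json.load(open("path/to/config.json"))
    for problem in check_config(EXAMPLE) or ["no problems found"]:
        print(problem)
```

An empty list means none of these particular checks failed; it does not guarantee that Spearmint will accept the file.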

More config.json Tips

  • Make sure your "main-file" Python script is in the same directory as your config.json
  • Inside your "main-file", define a function called main that
    • takes a job_id as its first argument and a Python dictionary as its second argument; the dictionary contains a key-value pair for each hyperparameter declared as a variable in your config.json
    • returns the value you are trying to minimize
  • See examples of main-files and config.jsons
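
As an illustration of the points above, here is a minimal sketch of a main-file that matches the example config.json. It uses the Branin test function as a stand-in objective (the task in the example config is named "branin", and the variable ranges match that function's usual domain). The helper _as_float is hypothetical: it accepts either a plain number or a length-1 sequence, because the exact form in which the values arrive for a variable with "size" 1 is an assumption here.

```python
import math


def _as_float(value):
    """Return a float from either a scalar or a length-1 sequence/array."""
    try:
        return float(value[0])
    except (TypeError, IndexError):
        return float(value)


def main(job_id, params):
    """Objective function: return the value to be minimized.

    job_id -- identifier of this evaluation (unused here)
    params -- dictionary keyed by the variable names in config.json
    """
    x = _as_float(params["hyperparameter1_name"])  # range -5 .. 10
    y = _as_float(params["hyperparameter2_name"])  # range  0 .. 15

    # Branin function, a standard optimization benchmark.
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    t = 1.0 / (8.0 * math.pi)
    return (y - b * x ** 2 + c * x - 6.0) ** 2 + 10.0 * (1.0 - t) * math.cos(x) + 10.0


if __name__ == "__main__":
    # Quick local check, independent of Spearmint.
    print(main(0, {"hyperparameter1_name": [math.pi],
                   "hyperparameter2_name": [2.275]}))
```

Running the file directly evaluates the objective once, which is a cheap way to confirm that it works before handing it to Spearmint.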

How to Set "max-concurrent"

  • To run many single-core Python processes on a single node
    • set "max-concurrent" to about 30-32 for Cori (only do this if you are sure each function call uses only one core)
    • Do NOT set "max-concurrent" to more than 1 if
      • You are using any of the deep learning libraries (they are all inherently multicore)
      • You are doing large matrix operations with NumPy (it is linked against MKL)
      • You are using a scikit-learn function with n_jobs set to -1
      • One way to check is to run one instance of your function and look at the CPU% reported by top (if it is more than 100%, it is probably best to set "max-concurrent" to 1)
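
As an alternative to watching top, the same check can be approximated from inside Python by comparing CPU time with wall-clock time for one call. This is a rough sketch under stated assumptions: cpu_to_wall_ratio and busy_single_core are hypothetical names, and time.process_time() only counts threads of the current process, so work done in child processes is not captured.

```python
import time


def cpu_to_wall_ratio(func, *args, **kwargs):
    """Run func once and return CPU time divided by wall-clock time.

    A ratio near 1 suggests the call keeps about one core busy; a ratio
    well above 1 suggests it is multithreaded. CPU time spent in child
    processes is not included.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy_single_core(n=2_000_000):
    """Stand-in workload: a pure-Python loop that runs on a single core."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    # Replace busy_single_core with one call to your own objective function.
    ratio = cpu_to_wall_ratio(busy_single_core)
    print(f"approximate cores used: {ratio:.2f}")
```

If the ratio for your own function is clearly above 1, treat it like a CPU% above 100 in top and keep "max-concurrent" at 1.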

More Information

  • Spearmint only works for Python code
  • For more information, see the Spearmint GitHub page
  • Spearmint also has a mode in which each hyperparameter setting is a separate job automatically submitted to the queue with sbatch. That mode is currently not supported at NERSC.

Availability

Package    Platform  Category                   Version  Module         Install Date  Date Made Default
spearmint  cori      applications/ programming  0.1      spearmint/0.1  2016-02-09    2016-07-03