DeepHyper: Massively Parallel Hyperparameter Optimization for Machine Learning

DeepHyper is first and foremost a hyperparameter optimization (HPO) library. By leveraging this core HPO functionality, DeepHyper also provides neural architecture search, multi-fidelity, and ensemble capabilities. With DeepHyper, users can perform these tasks on a single machine or distributed across multiple machines, which makes it suitable for a variety of environments. Whether you’re a beginner looking to optimize your machine learning models or an experienced data scientist looking to streamline your workflow, DeepHyper has something to offer. So why wait? Start using DeepHyper today and take your machine learning skills to the next level!

The package is organized around the following modules:

  • deephyper.analysis: To analyse your results.

  • deephyper.ensemble: To build ensembles of predictive models possibly with disentangled uncertainty quantification.

  • deephyper.evaluator: To distribute the evaluation of tasks (e.g., training or inference).

  • deephyper.hpo: To perform hyperparameter optimization (HPO) and neural architecture search (NAS).

  • deephyper.predictor: To wrap predictive models from different libraries.

  • deephyper.stopper: To apply multi-fidelity or early discarding strategies for hyperparameter optimization (HPO) and neural architecture search (NAS).

Quick Start

Install with pip (requires Python >= 3.10):

pip install deephyper

# For the core set of features (TensorFlow/Keras2, PyTorch, Transfer-Learning for HPO and Learning Curve Extrapolation)
pip install "deephyper[core]"

More details about installation can be found on our Installation page.

We then present a simple example of how to use DeepHyper to optimize a black-box function with three hyperparameters: a real-valued parameter, a discrete parameter, and a categorical parameter.

To try this example, you can copy/paste the script and run it.

from deephyper.hpo import HpProblem, CBO
from deephyper.evaluator import Evaluator


def run(job):
    x = job.parameters["x"]
    b = job.parameters["b"]
    function = job.parameters["function"]

    if function == "linear":
        y = x + b
    elif function == "cubic":
        y = x**3 + b

    return y


def optimize():
    problem = HpProblem()
    problem.add_hyperparameter((-10.0, 10.0), "x")
    problem.add_hyperparameter((0, 10), "b")
    problem.add_hyperparameter(["linear", "cubic"], "function")

    evaluator = Evaluator.create(
        run,
        method="process",
        method_kwargs={
            "num_workers": 2,
        },
    )

    search = CBO(problem, evaluator, random_state=42)
    results = search.search(max_evals=100)

    return results

if __name__ == "__main__":
    results = optimize()
    print(results)

It outputs the following results, in which the best objective is obtained with function == "cubic", x close to 10 (about 9.999) and b == 10.

    p:b p:function       p:x    objective  job_id job_status  m:timestamp_submit  m:timestamp_gather
0      3      cubic  8.374450   590.312101       1       DONE            0.013266            1.697188
1      7      cubic -1.103350     5.656803       0       DONE            0.013165            1.697418
2      6      cubic  4.680560   108.540056       2       DONE            1.709580            1.710863
3      9     linear  8.787395    17.787395       3       DONE            1.709704            1.711059
4      2      cubic  4.012429    66.598442       5       DONE            1.721194            1.722261
..   ...        ...       ...          ...     ...        ...                 ...                 ...
96    10      cubic  9.982052  1004.625215      96       DONE           10.093236           10.192950
97    10      cubic  9.999315  1009.794458      97       DONE           10.192616           10.293964
98     4      cubic  9.887916   970.750164      98       DONE           10.293530           10.395159
99    10      cubic  9.986875  1006.067558      99       DONE           10.394701           10.495718
100    9      cubic  9.999787  1008.936159     100       DONE           10.495265           10.595172

Let us now provide step-by-step details about this example.

The black-box function is named run (it could be named anything, but by convention we call it the run-function). It takes an input job that contains the hyperparameters to optimize under job.parameters. In our case, the function takes three hyperparameters: x, b, and function. The run-function returns a value y computed from the values of the hyperparameters. The value of y is the objective that we want to maximize (by convention we do maximization; to do minimization, simply return the negative of your objective). The run-function can be any computationally expensive function that you want to optimize. For example, it can run plain Python code, open subprocesses, submit a SLURM job, or perform an HTTP request. The search algorithms learn to optimize the function based only on the observed input hyperparameters and output values.

def run(job):
    x = job.parameters["x"]
    b = job.parameters["b"]
    function = job.parameters["function"]

    if function == "linear":
        y = x + b
    elif function == "cubic":
        y = x**3 + b

    return y
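
Because the run-function only reads job.parameters, it can be checked by hand before launching a search. The sketch below is one way to do that: it uses a hypothetical stand-in object (types.SimpleNamespace) in place of the job that DeepHyper passes, and assumes nothing beyond the parameters attribute used above.

```python
from types import SimpleNamespace


def run(job):
    x = job.parameters["x"]
    b = job.parameters["b"]
    function = job.parameters["function"]

    if function == "linear":
        y = x + b
    elif function == "cubic":
        y = x**3 + b

    return y


# Hypothetical stand-ins for the job object: only the `parameters`
# attribute read by `run` is provided.
linear_job = SimpleNamespace(parameters={"x": 2.0, "b": 3, "function": "linear"})
cubic_job = SimpleNamespace(parameters={"x": 2.0, "b": 3, "function": "cubic"})

print(run(linear_job))  # 2.0 + 3 = 5.0
print(run(cubic_job))   # 2.0**3 + 3 = 11.0
```

A quick check like this catches typos in parameter names before any worker is started.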

Then, the optimize() function defines the creation and execution of the search.

problem = HpProblem()
problem.add_hyperparameter((-10.0, 10.0), "x")
problem.add_hyperparameter((0, 10), "b")
problem.add_hyperparameter(["linear", "cubic"], "function")

We start by defining the hyperparameter names, types and allowed ranges. For this we create a deephyper.hpo.HpProblem object. We add three hyperparameters to this problem: "x", "b" and "function". The "x" hyperparameter is defined by the continuous range [-10, 10]. The float type of the bounds is what lets DeepHyper infer that the hyperparameter is continuous. The "b" hyperparameter is defined by the integer range [0, 10]. Similarly, the int type of the bounds is what lets DeepHyper infer that the hyperparameter is discrete. Finally, the "function" hyperparameter is defined by a list of string values and is therefore a categorical nominal hyperparameter (i.e., without an order relation between its values). The problem can be printed with print(problem) to review its definition:

Configuration space object:
    Hyperparameters:
        b, Type: UniformInteger, Range: [0, 10], Default: 5
        function, Type: Categorical, Choices: {linear, cubic}, Default: linear
        x, Type: UniformFloat, Range: [-10.0, 10.0], Default: 0.0
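
The inference rule described above can be summarized in a few lines of plain Python. This is only an illustration of the rule, not DeepHyper's implementation; infer_kind is a hypothetical helper, and rejecting mixed int/float bounds is a choice made for this sketch.

```python
def infer_kind(values):
    """Classify a hyperparameter definition following the rule described above.

    Hypothetical helper for illustration only; not part of DeepHyper.
    """
    if isinstance(values, tuple) and len(values) == 2:
        low, high = values
        # bool is a subclass of int, so exclude it explicitly.
        if all(isinstance(v, int) and not isinstance(v, bool) for v in (low, high)):
            return "discrete"
        if all(isinstance(v, float) for v in (low, high)):
            return "continuous"
    if isinstance(values, list):
        return "categorical"
    raise ValueError(f"cannot infer a hyperparameter type from {values!r}")


print(infer_kind((-10.0, 10.0)))        # continuous
print(infer_kind((0, 10)))              # discrete
print(infer_kind(["linear", "cubic"]))  # categorical
```

The practical consequence is that writing (0, 10) instead of (0.0, 10.0) changes the search space from real values to integers, so the literal type of the bounds deserves a second look.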

After the problem, we create a deephyper.evaluator.Evaluator object. The Evaluator is in charge of asynchronously distributing the computation of multiple calls to the run-function. It provides a simple interface Evaluator.submit(tasks) and tasks_done = Evaluator.gather() to perform asynchronous calls to the run-function.

evaluator = Evaluator.create(
    run,
    method="process",
    method_kwargs={
        "num_workers": 2,
    },
)

The method argument lets us choose among the available parallel backends. In this example, we picked the "process" method. The method_kwargs argument lets us configure the Evaluator further. The keys of method_kwargs directly match the arguments of the corresponding Evaluator subclass. Some of these arguments are common to all Evaluator methods and some are specific. The deephyper.evaluator API reference can be used to review the available arguments of each subclass. In our case, method="process" corresponds to deephyper.evaluator.ProcessPoolEvaluator. The only argument set is "num_workers": 2, which defines two process-based workers for our Evaluator and allows two parallel calls to the run-function on a CPU with at least two hardware threads.
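
The submit/gather pattern is not specific to DeepHyper. As an analogy only, the sketch below reproduces it with the standard library's concurrent.futures. It uses threads rather than processes so that it runs in any environment, it does not go through the Evaluator at all, and slow_square is a hypothetical stand-in for an expensive evaluation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(x):
    # Stand-in for an expensive evaluation.
    time.sleep(0.01)
    return x * x


tasks = [1, 2, 3, 4]
gathered = {}

# Two workers: at most two evaluations are in flight at any time.
with ThreadPoolExecutor(max_workers=2) as pool:
    # "submit": hand the tasks over without waiting for their results.
    futures = {pool.submit(slow_square, x): x for x in tasks}
    # "gather": collect results as they complete, in completion order.
    for future in as_completed(futures):
        gathered[futures[future]] = future.result()

print(sorted(gathered.items()))  # [(1, 1), (2, 4), (3, 9), (4, 16)]
```

Results arrive in completion order rather than submission order, which is why each result is keyed by its input here.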

Finally comes the last piece of the puzzle. We create a deephyper.hpo.CBO object for centralized Bayesian optimization (CBO) with the problem and evaluator created previously. All search methods are subclasses of deephyper.hpo.Search. For reproducibility of this example, we also set random_state=42. Then, we execute the search using the max_evals termination criterion, which stops the search once max_evals results have been gathered.

search = CBO(problem, evaluator, random_state=42)
results = search.search(max_evals=100)
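
To judge whether a search does better than chance, a random-search baseline over the same space can be handy. The sketch below is plain random search written with the standard library; it is not what CBO does and it does not use DeepHyper. The random_search helper and the SimpleNamespace stand-in for the job object are assumptions made for this illustration.

```python
import random
from types import SimpleNamespace


def run(job):
    x = job.parameters["x"]
    b = job.parameters["b"]
    function = job.parameters["function"]
    if function == "linear":
        return x + b
    return x**3 + b  # "cubic"


def random_search(max_evals, seed=42):
    """Sample the space uniformly and keep the best evaluation seen."""
    rng = random.Random(seed)
    best_params, best_objective = None, float("-inf")
    for _ in range(max_evals):
        params = {
            "x": rng.uniform(-10.0, 10.0),
            "b": rng.randint(0, 10),  # both bounds inclusive
            "function": rng.choice(["linear", "cubic"]),
        }
        objective = run(SimpleNamespace(parameters=params))
        if objective > best_objective:
            best_params, best_objective = params, objective
    return best_params, best_objective


best_params, best_objective = random_search(max_evals=100)
print(best_params, best_objective)
```

The objective cannot exceed 10**3 + 10 = 1010 on this space, so the gap between a result and 1010 gives a rough sense of how close a search got.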

The returned results object is a pandas DataFrame that is also checkpointed locally in the current directory under results.csv (default value of the log_dir="." argument of Search subclasses). This DataFrame contains one row per run-function evaluation:

  • the columns that start with p: are the hyperparameters.

  • the objective is the value returned by the run-function.

  • the job_id is the Evaluator job id of the evaluation (an integer incremented by order of job creation).

  • the job_status is the Evaluator job status of the evaluation.

  • the columns that start with m: are metadata of each evaluation. Some are added by DeepHyper, but they can also be returned by the user as part of the run-function's return value.

    p:b p:function       p:x    objective  job_id job_status  m:timestamp_submit  m:timestamp_gather
0      3      cubic  8.374450   590.312101       1       DONE            0.013266            1.697188
1      7      cubic -1.103350     5.656803       0       DONE            0.013165            1.697418
2      6      cubic  4.680560   108.540056       2       DONE            1.709580            1.710863
3      9     linear  8.787395    17.787395       3       DONE            1.709704            1.711059
4      2      cubic  4.012429    66.598442       5       DONE            1.721194            1.722261
..   ...        ...       ...          ...     ...        ...                 ...                 ...
96    10      cubic  9.982052  1004.625215      96       DONE           10.093236           10.192950
97    10      cubic  9.999315  1009.794458      97       DONE           10.192616           10.293964
98     4      cubic  9.887916   970.750164      98       DONE           10.293530           10.395159
99    10      cubic  9.986875  1006.067558      99       DONE           10.394701           10.495718
100    9      cubic  9.999787  1008.936159     100       DONE           10.495265           10.595172
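
A common next step is to pull out the best evaluation. The sketch below does this with the standard library on a few rows copied from the table above and written as CSV text for illustration; the exact layout of the checkpointed file is not assumed here. With the DataFrame still in memory, the usual pandas idiom would be results.loc[results["objective"].idxmax()].

```python
import csv
import io

# A few rows copied from the table above, written as CSV for illustration.
csv_text = """p:b,p:function,p:x,objective,job_id
3,cubic,8.374450,590.312101,1
9,linear,8.787395,17.787395,3
10,cubic,9.999315,1009.794458,97
9,cubic,9.999787,1008.936159,100
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# The search maximizes, so the best row has the largest objective.
best = max(rows, key=lambda row: float(row["objective"]))

# Hyperparameters are the columns prefixed with "p:".
best_params = {k[2:]: v for k, v in best.items() if k.startswith("p:")}
print(best_params)        # {'b': '10', 'function': 'cubic', 'x': '9.999315'}
print(best["objective"])  # 1009.794458
```

The csv module returns every field as a string, so values are converted with float() for comparison and left as strings in the printed output.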

Warning

By convention in DeepHyper, all search algorithms are MAXIMIZING the objective function. If you want to MINIMIZE the objective function, you can simply return the negative of your objective value.
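
As a minimal sketch of this convention, the run-function below wraps a hypothetical loss that should be minimized and returns its negative. The loss function and the SimpleNamespace stand-in for the job object are assumptions made for this illustration.

```python
from types import SimpleNamespace


def loss(x):
    # Hypothetical quantity to minimize; its minimum is at x = 3.
    return (x - 3.0) ** 2


def run(job):
    # Return the negated loss: maximizing this objective minimizes the loss.
    return -loss(job.parameters["x"])


candidates = [0.0, 2.5, 5.0]
objectives = [run(SimpleNamespace(parameters={"x": x})) for x in candidates]
print(objectives)  # [-9.0, -0.25, -4.0]

# The candidate with the largest objective has the smallest loss.
best_x = candidates[objectives.index(max(objectives))]
print(best_x)  # 2.5
```

When reading the results afterwards, remember that the objective column holds the negated value, so the original loss is recovered by flipping the sign again.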

The next step to learn more about DeepHyper is to follow our Tutorials and Examples.
