5. An Introduction to Black-Box Multi-Objective Optimization with DeepHyper#
In this tutorial, we will explore how to run black-box multi-objective optimization (MOO). In this setting, the goal is to solve the following problem:

\[\min_x \left( f_1(x), f_2(x), \ldots, f_n(x) \right)\]

where \(x\) is the set of optimized variables and \(f_i\) are the different objectives. In DeepHyper, we use scalarization to transform such a multi-objective problem into a single-objective problem:

\[\min_x s_w\left( f_1(x), f_2(x), \ldots, f_n(x) \right)\]

where \(w\) is a set of weights that manages the trade-off between objectives and \(s_w : \mathbb{R}^n \rightarrow \mathbb{R}\) is the scalarization function. The weight vector \(w\) is randomized and re-sampled for each new batch of suggestions from the optimizer.
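To make the scalarization concrete, here is a minimal NumPy sketch of an augmented Chebyshev scalarization, the strategy used later in this tutorial. This is an illustration of the idea only, not DeepHyper's internal implementation, and the `rho` augmentation coefficient is an assumed example value:

```python
import numpy as np

def aug_chebyshev(objectives, weights, rho=0.05):
    # Augmented Chebyshev scalarization: collapse a vector of objectives
    # into a single scalar. Illustrative sketch, not DeepHyper's code.
    objectives = np.asarray(objectives, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weighted = weights * objectives
    # max term drives trade-off exploration; the small sum term breaks ties.
    return np.max(weighted) + rho * np.sum(weighted)

# A randomized weight vector, re-sampled per batch of suggestions.
rng = np.random.default_rng(42)
w = rng.random(2)
w /= w.sum()  # normalize the weights

print(aug_chebyshev([0.5, 1.2], w))
```

Randomizing \(w\) between batches is what lets a single-objective optimizer progressively cover different regions of the Pareto front.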
[1]:
# Installing DeepHyper if not present
try:
    import deephyper
    print(deephyper.__version__)
except (ImportError, ModuleNotFoundError):
    !pip install deephyper
    import deephyper
    print(deephyper.__version__)
# Installing DeepHyper/Benchmark if not present
try:
    import deephyper_benchmark as dhb
except (ImportError, ModuleNotFoundError):
    !pip install -e "git+https://github.com/deephyper/benchmark.git@main#egg=deephyper-benchmark"
0.9.0
We will look at the DTLZ benchmark suite, a classic in the multi-objective optimization (MOO) literature. This benchmark exhibits some characteristic cases of MOO. By default, this tutorial loads the DTLZ-II benchmark, which exhibits a Pareto front with a concave shape.
[2]:
import os
n_objectives = 2
# Configuration of the DTLZ Benchmark
os.environ["DEEPHYPER_BENCHMARK_DTLZ_PROB"] = str(2)
os.environ["DEEPHYPER_BENCHMARK_NDIMS"] = str(8)
os.environ["DEEPHYPER_BENCHMARK_NOBJS"] = str(n_objectives)
os.environ["DEEPHYPER_BENCHMARK_DTLZ_OFFSET"] = str(0.6)
os.environ["DEEPHYPER_BENCHMARK_FAILURES"] = str(0)
# Loading the DTLZ Benchmark
import deephyper_benchmark as dhb; dhb.load("DTLZ");
from deephyper_benchmark.lib.dtlz import hpo, metrics
We can display the variable search space of the benchmark we just loaded:
[3]:
hpo.problem
[3]:
Configuration space object:
Hyperparameters:
x0, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
x1, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
x2, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
x3, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
x4, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
x5, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
x6, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
x7, Type: UniformFloat, Range: [0.0, 1.0], Default: 0.5
Defining a black-box function for multi-objective optimization is very similar to the single-objective case, with the difference that the objective can now be a list of values. A first possibility is:
def run(job):
    ...
    return objective_0, objective_1, ..., objective_n
which simply returns the objectives to optimize as a tuple. If additional metadata are worth gathering for each evaluation, they can also be returned by following this format:
def run(job):
    ...
    return {
        "objective": [objective_0, objective_1, ..., objective_n],
        "metadata": {
            "flops": ...,
            "memory_footprint": ...,
            "duration": ...,
        },
    }
Each metadata value needs to be JSON-serializable and will be returned in the final results with a column name formatted as `m:metadata_key`, such as `m:duration`.
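As a concrete illustration, a toy `run` function returning two objectives and a `duration` metadata could look like this. The quadratic objectives are made up for the example; only the return format matters (DeepHyper maximizes objectives, hence the negation):

```python
import time

def run(job):
    # job.parameters maps hyperparameter names (e.g. "x0") to values.
    x0 = job.parameters["x0"]
    x1 = job.parameters["x1"]

    start = time.time()
    # Two made-up objectives to maximize (hypothetical, for illustration).
    objective_0 = -((x0 - 0.5) ** 2)
    objective_1 = -((x1 - 0.25) ** 2)
    duration = time.time() - start

    return {
        "objective": [objective_0, objective_1],
        "metadata": {"duration": duration},
    }
```

In this tutorial the benchmark already provides its own `hpo.run` function, so this sketch is only to show the expected interface.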
Now we can load the Centralized Bayesian Optimization search:
[4]:
from deephyper.hpo import CBO
from deephyper.evaluator import Evaluator
from deephyper.evaluator.callback import TqdmCallback
[5]:
# Interface to submit/gather parallel evaluations of the black-box function.
# The method argument is used to specify the parallelization method, in our case we use threads.
# The method_kwargs argument is used to specify the number of workers and the callbacks.
# The TqdmCallback is used to display a progress bar during the search.
evaluator = Evaluator.create(
    hpo.run,
    method="thread",
    method_kwargs={"num_workers": 4, "callbacks": [TqdmCallback()]},
)
# Search algorithm
# The acq_func argument is used to specify the acquisition function.
# The multi_point_strategy argument is used to specify the multi-point strategy,
# in our case we use qUCB instead of the default cl_max (constant-liar) to reduce overheads.
# The update_prior argument is used to specify whether the sampling-prior should
# be updated during the search.
# The update_prior_quantile argument is used to specify the quantile of the lower-bound
# used to update the sampling-prior.
# The moo_scalarization_strategy argument is used to specify the scalarization strategy.
# Chebyshev is capable of generating a diverse set of solutions for non-convex problems.
# The moo_scalarization_weight argument is used to specify the weight of the scalarization.
# random is used to generate a random weight vector for each iteration.
search = CBO(
    hpo.problem,
    evaluator,
    acq_func="UCBd",
    multi_point_strategy="qUCB",
    acq_optimizer="ga",
    acq_optimizer_freq=1,
    moo_scalarization_strategy="AugChebyshev",
    moo_scalarization_weight="random",
    objective_scaler="identity",
    n_jobs=-1,
    verbose=1,
)
# Launch the search for a given number of evaluations
# other stopping criteria can be used (e.g. timeout, early-stopping/convergence)
results = search.search(max_evals=500)
/Users/romainegele/Documents/Argonne/deephyper/src/deephyper/evaluator/_evaluator.py:148: UserWarning: Applying nest-asyncio patch for IPython Shell!
warnings.warn("Applying nest-asyncio patch for IPython Shell!", category=UserWarning)
WARNING:root:Results file already exists, it will be renamed to /Users/romainegele/Documents/Argonne/deephyper-tutorials/tutorials/colab/results_20241216-173116.csv
A Pandas table of results is returned by the search and is also saved at `./results.csv`. Another location can be specified by using `CBO(..., log_dir=...)`.
[6]:
results
[6]:
|     | p:x0 | p:x1 | p:x2 | p:x3 | p:x4 | p:x5 | p:x6 | p:x7 | objective_0 | objective_1 | job_id | job_status | m:timestamp_submit | m:timestamp_start | m:timestamp_end | m:timestamp_gather | pareto_efficient |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.733577 | 0.996025 | 0.427480 | 0.250198 | 0.970136 | 0.922925 | 0.619021 | 0.624698 | -0.630393 | -1.417346 | 0 | DONE | 0.066479 | 1.734367e+09 | 1.734367e+09 | 0.067478 | False |
| 1 | 0.141318 | 0.984725 | 0.242143 | 0.463537 | 0.410258 | 0.480738 | 0.917900 | 0.155808 | -1.602968 | -0.361791 | 3 | DONE | 0.066574 | 1.734367e+09 | 1.734367e+09 | 0.071181 | False |
| 2 | 0.284300 | 0.030004 | 0.585472 | 0.062159 | 0.600611 | 0.570432 | 0.525573 | 0.683184 | -1.468083 | -0.702979 | 1 | DONE | 0.066551 | 1.734367e+09 | 1.734367e+09 | 0.071348 | False |
| 3 | 0.631827 | 0.430463 | 0.940262 | 0.217656 | 0.761729 | 0.294427 | 0.053316 | 0.617908 | -0.934408 | -1.431436 | 2 | DONE | 0.066565 | 1.734367e+09 | 1.734367e+09 | 0.071876 | False |
| 4 | 0.694170 | 0.990872 | 0.667381 | 0.486913 | 0.176842 | 0.024297 | 0.463149 | 0.545284 | -0.786698 | -1.509645 | 4 | DONE | 0.118224 | 1.734367e+09 | 1.734367e+09 | 0.118815 | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 498 | 0.750795 | 0.035014 | 0.000138 | 0.734407 | 0.967597 | 0.994578 | 0.008805 | 0.615059 | -0.891888 | -2.160836 | 498 | DONE | 651.110875 | 1.734367e+09 | 1.734367e+09 | 651.111945 | False |
| 499 | 0.765154 | 0.012457 | 0.018812 | 0.990145 | 0.999928 | 0.904044 | 0.003490 | 0.604540 | -0.881065 | -2.279051 | 499 | DONE | 651.110882 | 1.734367e+09 | 1.734367e+09 | 654.832892 | False |
| 500 | 0.951276 | 0.968748 | 0.000301 | 0.999860 | 0.943918 | 0.527787 | 0.365972 | 0.954011 | -0.149794 | -1.953354 | 502 | DONE | 654.832583 | 1.734367e+09 | 1.734367e+09 | 655.715795 | False |
| 501 | 0.994524 | 0.975920 | 0.000249 | 0.976017 | 0.935475 | 0.570482 | 0.846514 | 0.976254 | -0.016844 | -1.958085 | 501 | DONE | 654.832574 | 1.734367e+09 | 1.734367e+09 | 655.716204 | False |
| 502 | 0.918244 | 0.941863 | 0.012481 | 0.779538 | 0.933113 | 0.564239 | 0.641541 | 0.951268 | -0.221769 | -1.717381 | 500 | DONE | 654.832553 | 1.734367e+09 | 1.734367e+09 | 655.716316 | False |
503 rows × 17 columns
In this table we retrieve:

- columns starting with `p:`, which are the optimized variables;
- `objective_{i}`, the objectives returned by the black-box function;
- `job_id`, the identifier of the executed evaluations;
- columns starting with `m:`, the metadata returned by the black-box function;
- `pareto_efficient`, a column only returned for MOO which specifies whether the evaluation is part of the set of optimal solutions.
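To give an intuition of what "part of the set of optimal solutions" means, here is a rough sketch of a non-dominance mask, assuming maximization (DeepHyper's convention, which is why the DTLZ objectives appear negated in the table). This is illustrative only, not DeepHyper's internal implementation:

```python
import numpy as np

def pareto_efficient_mask(objectives):
    # Boolean mask of non-dominated rows, assuming all objectives
    # are maximized. Illustrative O(n^2) sketch, not DeepHyper's code.
    objectives = np.asarray(objectives, dtype=float)
    n = len(objectives)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            # Point j dominates point i if it is at least as good
            # everywhere and strictly better somewhere.
            if (
                i != j
                and np.all(objectives[j] >= objectives[i])
                and np.any(objectives[j] > objectives[i])
            ):
                mask[i] = False
                break
    return mask

print(pareto_efficient_mask([[1.0, 1.0], [2.0, 2.0], [0.0, 3.0]]))
```

Here `[1.0, 1.0]` is dominated by `[2.0, 2.0]`, while `[0.0, 3.0]` trades one objective for the other and so remains Pareto-efficient.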
Let us use this table to visualize the evaluated objectives:
[7]:
import matplotlib.pyplot as plt

plt.figure()
plt.plot(
    -results[~results["pareto_efficient"]]["objective_0"],
    -results[~results["pareto_efficient"]]["objective_1"],
    "o",
    color="blue",
    alpha=0.7,
    label="Non Pareto-Efficient",
)
plt.plot(
    -results[results["pareto_efficient"]]["objective_0"],
    -results[results["pareto_efficient"]]["objective_1"],
    "o",
    color="red",
    alpha=0.7,
    label="Pareto-Efficient",
)
plt.grid()
plt.legend()
plt.xlabel("Objective 0")
plt.ylabel("Objective 1")
plt.show()

[ ]: