deephyper.search.hps.DBO
class deephyper.search.hps.DBO(problem, run_function, random_state: Optional[int] = None, log_dir: str = '.', verbose: int = 0, comm=None, run_function_kwargs: Optional[dict] = None, n_jobs: int = 1, surrogate_model: str = 'RF', surrogate_model_kwargs: Optional[dict] = None, n_initial_points: int = 10, lazy_socket_allocation: bool = False, communication_batch_size=2048, sync_communication: bool = False, sync_communication_freq: int = 10, checkpoint_file: str = 'results.csv', checkpoint_freq: int = 1, acq_func: str = 'UCB', acq_optimizer: str = 'auto', kappa: float = 1.96, xi: float = 0.001, sample_max_size: int = -1, sample_strategy: str = 'quantile')
Bases: object
Distributed Bayesian Optimization Search.
- Parameters
problem (HpProblem) – Hyperparameter problem describing the search space to explore.
run_function (callable) – Callable representing the black-box function to evaluate.
random_state (int, optional) – Random seed. Defaults to None.
log_dir (str, optional) – Log directory where the search's results are saved. Defaults to ".".
verbose (int, optional) – Verbosity level of the search. Defaults to 0.
comm (optional) – The MPI communicator to use. Defaults to None.
run_function_kwargs (dict) – Keyword arguments to pass to the run function. Defaults to None.
n_jobs (int, optional) – Number of parallel processes per rank used for optimization updates (e.g., model re-fitting). Defaults to 1.
surrogate_model (str, optional) – Type of surrogate model to use. "DUMMY" can be used for random search, "GP" for a Gaussian process (efficient for few iterations, such as a hundred sequential evaluations, but a bottleneck when scaling because of its cubic complexity with respect to the number of evaluations), "RF" for the random-forest regressor (log-linear complexity with respect to the number of evaluations). Defaults to "RF".
lazy_socket_allocation (bool, optional) – If True, MPI communication sockets are initialized only when they are first used; otherwise initialization is forced when the instance is created. Defaults to False.
sync_communication (bool, optional) – If True, workers communicate synchronously; otherwise they communicate asynchronously. Defaults to False.
sync_communication_freq (int, optional) – Frequency at which workers communicate their results when communication is synchronous. Defaults to 10.
checkpoint_file (str) – Name of the file in log_dir where results are checkpointed. Defaults to "results.csv".
checkpoint_freq (int) – Frequency at which results are checkpointed. Defaults to 1.
acq_func (str) – Acquisition function to use. If "UCB", the upper confidence bound is used; if "EI", the expected improvement is used; if "PI", the probability of improvement is used; if "gp_hedge", one of the above is chosen probabilistically. Defaults to "UCB".
acq_optimizer (str) – Method used to optimize the acquisition function. If "sampling", random samples are drawn and inferred for optimization; if "lbfgs", gradient descent is used. Defaults to "auto".
kappa (float) – Exploration/exploitation value for the UCB acquisition function: the higher the value, the more exploration; the lower, the more exploitation. Defaults to 1.96, which corresponds to a 95% confidence interval.
xi (float) – Exploration/exploitation value for the EI and PI acquisition functions: the higher the value, the more exploration; the lower, the more exploitation. Defaults to 0.001.
sample_max_size (int) – Maximum number of samples used to re-fit the surrogate model. Defaults to -1 for an unlimited sample size.
sample_strategy (str) – Sub-sampling strategy used to re-fit the surrogate model. If "quantile", sub-sampling is based on the quantiles of the collected objective values. Defaults to "quantile".
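The roles of kappa and xi in the acquisition functions named above can be illustrated with a minimal sketch. It assumes a maximization convention and a surrogate that returns a predictive mean `mu` and standard deviation `sigma` for a candidate point; the function names are illustrative and are not part of the DBO API, and the library's internal sign conventions may differ.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, std 1)


def ucb(mu: float, sigma: float, kappa: float = 1.96) -> float:
    """Upper confidence bound: a larger kappa rewards uncertain points more."""
    return mu + kappa * sigma


def expected_improvement(mu: float, sigma: float, best: float, xi: float = 0.001) -> float:
    """Expected improvement over `best`, with xi as the required margin."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * _STD_NORMAL.cdf(z) + sigma * _STD_NORMAL.pdf(z)


def probability_of_improvement(mu: float, sigma: float, best: float, xi: float = 0.001) -> float:
    """Probability that the candidate beats `best` by at least xi."""
    if sigma <= 0.0:
        return 1.0 if mu - best - xi > 0.0 else 0.0
    return _STD_NORMAL.cdf((mu - best - xi) / sigma)


# kappa = 1.96 is (approximately) the 97.5% quantile of the standard normal,
# i.e. the half-width of a two-sided 95% interval in units of sigma.
kappa_95 = _STD_NORMAL.inv_cdf(0.975)
```

With equal predicted means, a candidate with larger `sigma` scores higher under all three functions, which is the exploration effect that kappa and xi tune.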
Methods
broadcast
broadcast_to_root
Dump evaluations to a CSV file.
Dumps the context in the log folder.
fit_surrogate – Fit the surrogate model of the search from a checkpointed DataFrame.
gather_results
recv_any
search – Execute the search algorithm.
send_all
terminate – Terminate the search.
to_dict – Transform a list of hyperparameter values to a dict where keys are hyperparameter names and values are hyperparameter values.
Returns a JSON version of the search object.
fit_surrogate(df)
Fit the surrogate model of the search from a checkpointed DataFrame.
- Parameters
df (str|DataFrame) – A checkpoint from a previous search.
Example Usage:
>>> search = DBO(problem, run_function)
>>> search.fit_surrogate("results.csv")
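To give a sense of what a checkpoint provides for warm-starting, the sketch below splits a CSV of past evaluations into configurations and objective values using only the standard library. It is not the library's implementation: the helper `load_observations`, the column names (`x`, `y`, `objective`), and the sample rows are all hypothetical, and the real checkpoint layout depends on the problem and the DeepHyper version.

```python
import csv
import io

# Hypothetical checkpoint contents, standing in for a "results.csv" file.
CHECKPOINT = """x,y,objective
0.1,3,0.52
0.7,1,0.91
0.4,2,0.33
"""


def load_observations(source, objective_column="objective"):
    """Split checkpoint rows into hyperparameter dicts and objective values.

    `source` is any file-like object yielding CSV text. Hyperparameter
    values are kept as strings; no type conversion is attempted.
    """
    configs, objectives = [], []
    for row in csv.DictReader(source):
        objectives.append(float(row.pop(objective_column)))
        configs.append(row)
    return configs, objectives


configs, objectives = load_observations(io.StringIO(CHECKPOINT))
# Index of the best past evaluation, assuming the objective is maximized.
best_index = max(range(len(objectives)), key=objectives.__getitem__)
```

The resulting pairs of configuration and objective are the kind of data a surrogate model is fitted on before new points are suggested.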
search(max_evals: int = -1, timeout: Optional[int] = None)
Execute the search algorithm.
- Parameters
max_evals (int, optional) – Defaults to -1.
timeout (int, optional) – Defaults to None.
- Returns
a pandas DataFrame containing the evaluations performed.
- Return type
DataFrame
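The interplay of the two budget arguments can be sketched with a plain loop. This is an illustration under stated assumptions, not the DBO implementation: it assumes that a negative max_evals means "no evaluation limit" and that timeout is a duration in seconds; `run_with_budget` and `step` are hypothetical names.

```python
import time
from typing import Callable, List, Optional


def run_with_budget(
    step: Callable[[], float],
    max_evals: int = -1,
    timeout: Optional[float] = None,
) -> List[float]:
    """Call `step` until the evaluation budget or the time budget is spent."""
    if max_evals < 0 and timeout is None:
        # Design choice of this sketch: refuse to start an endless loop.
        raise ValueError("set max_evals or timeout, otherwise the loop never ends")
    deadline = None if timeout is None else time.monotonic() + timeout
    results: List[float] = []
    # A negative max_evals is treated as "unlimited" (assumption).
    while max_evals < 0 or len(results) < max_evals:
        if deadline is not None and time.monotonic() >= deadline:
            break
        results.append(step())
    return results


# Stops after exactly five evaluations.
history = run_with_budget(lambda: 1.0, max_evals=5)
# Stops once roughly 0.05 seconds have elapsed.
timed = run_with_budget(lambda: time.sleep(0.01) or 0.0, timeout=0.05)
```

Whichever budget is exhausted first ends the loop, and the collected results are returned in evaluation order.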
terminate()
Terminate the search.
- Raises
SearchTerminationError – Raised when the search is terminated with SIGALRM.
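The pattern of turning an alarm signal into an exception can be sketched with the standard `signal` module. This is a generic Unix-only illustration, not DeepHyper's code: the `SearchTerminationError` class below is a local stand-in for the exception named above, and `run_until_alarm` is a hypothetical helper.

```python
import signal
import time


class SearchTerminationError(RuntimeError):
    """Local stand-in for the exception named in the documentation."""


def _on_alarm(signum, frame):
    # Runs in the main thread when SIGALRM is delivered.
    raise SearchTerminationError("search interrupted by SIGALRM")


def run_until_alarm(seconds: float) -> int:
    """Loop until the timer fires; return the number of completed iterations."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # schedule one SIGALRM
    iterations = 0
    try:
        while True:
            time.sleep(0.005)  # stands in for one unit of search work
            iterations += 1
    except SearchTerminationError:
        return iterations
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending timer
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


completed = run_until_alarm(0.05)
```

Signal handlers can only be installed from the main thread, so this pattern suits a script's top-level loop rather than worker threads.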