deephyper.hpo.ExperimentalDesignSearch#
- class deephyper.hpo.ExperimentalDesignSearch(problem, evaluator, random_state: int = None, log_dir: str = '.', verbose: int = 0, stopper=None, checkpoint_history_to_csv: bool = True, solution_selection: Literal['argmax_obs', 'argmax_est'] | SolutionSelection | None = None, n_points: int = None, design: str = 'random', initial_points=None)[source]#
Bases: CBO
Centralized Experimental Design Search.
It follows a manager-workers architecture where the manager runs the sampling process and workers execute parallel evaluations of the black-box function.
| Single-Objective | Multi-Objectives | Failures |
|:---:|:---:|:---:|
| ✅ | ✅ | ✅ |
Example Usage:
>>> max_evals = 100
>>> search = ExperimentalDesignSearch(problem, evaluator, n_points=max_evals, design="grid")
>>> results = search.search(max_evals=100)
- Parameters:
- problem (HpProblem) – Hyperparameter problem describing the search space to explore.
- evaluator (Evaluator) – An Evaluator instance responsible for distributing the tasks.
- random_state (int, optional) – Random seed. Defaults to None.
- log_dir (str, optional) – Log directory where the search's results are saved. Defaults to ".".
- verbose (int, optional) – Verbosity level of the search. Defaults to 0.
- stopper (Stopper, optional) – A stopper to leverage multi-fidelity when evaluating the function. Defaults to None, which does not use any stopper.
- checkpoint_history_to_csv (bool, optional) – Whether the results from progressively collected evaluations should be checkpointed regularly to disk as a CSV. Defaults to True.
- solution_selection (Literal["argmax_obs", "argmax_est"] | SolutionSelection, optional) – The solution selection strategy. It can be a string, where "argmax_obs" selects the argmax of observed objective values, and "argmax_est" selects the argmax of objective values estimated through a predictive model.
- n_points (int, optional) – Number of points to sample. Defaults to None.
- design (str, optional) – Experimental design to use. It can be one of:
  - "random" for uniform random numbers.
  - "sobol" for a Sobol' sequence.
  - "halton" for a Halton sequence.
  - "hammersly" for a Hammersley sequence.
  - "lhs" for a Latin hypercube sequence.
  - "grid" for a uniform grid sequence.
  Defaults to "random".
- initial_points (list, optional) – List of initial points to evaluate. Defaults to None.
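The design parameter only changes how the n_points configurations are spread over the search space. A minimal pure-Python sketch (not DeepHyper's implementation) contrasting the "grid" and "random" designs on a 2D unit square:

```python
import itertools
import random

def grid_design(n_per_dim, dims=2):
    # Cartesian product of evenly spaced levels, like design="grid".
    levels = [i / (n_per_dim - 1) for i in range(n_per_dim)]
    return list(itertools.product(levels, repeat=dims))

def random_design(n_points, dims=2, seed=42):
    # Independent uniform samples, like design="random".
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dims)) for _ in range(n_points)]

grid = grid_design(4)     # 16 points on a regular 4x4 lattice
rand = random_design(16)  # 16 unstructured points in [0, 1)^2
```

The low-discrepancy designs ("sobol", "halton", "hammersly", "lhs") sit between these two extremes: unstructured like random sampling, but with more even coverage.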
Methods

- ask – Ask the search for new configurations to evaluate.
- check_evaluator
- dump_context – Dumps the context in the log folder.
- dump_jobs_done_to_csv – Dump jobs completed to CSV in log_dir.
- fit_generative_model – Fits a generative model for sampling during BO.
- fit_search_space – Apply prior-guided transfer learning based on a DataFrame of results.
- fit_surrogate – Fit the surrogate model of the search from a checkpointed Dataframe.
- search – Execute the search algorithm.
- tell – Tell the search the results of the evaluations.
- to_json – Returns a json version of the search object.

Attributes

- search_id – The identifier of the search used by the evaluator.
- ask(n: int = 1) → List[Dict]#
Ask the search for new configurations to evaluate.
- Parameters:
n (int, optional) – The number of configurations to ask. Defaults to 1.
- Returns:
a list of hyperparameter configurations to evaluate.
- Return type:
List[Dict]
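The ask/tell pair supports driving the evaluation loop manually instead of calling search(). A minimal sketch of that pattern, using a hypothetical ToySearch stand-in (not the real class) so the example is self-contained:

```python
import random
from typing import Dict, List

class ToySearch:
    """Stand-in with the same ask/tell shape as a DeepHyper search."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.observed = []

    def ask(self, n: int = 1) -> List[Dict]:
        # The manager samples one configuration per request.
        return [{"x": self._rng.uniform(-5, 5)} for _ in range(n)]

    def tell(self, results):
        # Record (configuration, objective) pairs reported by workers.
        self.observed.extend(results)

search = ToySearch()
for _ in range(3):
    configs = search.ask(n=2)                         # manager samples
    results = [(c, -(c["x"] ** 2)) for c in configs]  # workers evaluate
    search.tell(results)

print(len(search.observed))  # 6 (config, objective) pairs
```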
- dump_context()#
Dumps the context in the log folder.
- dump_jobs_done_to_csv(flush: bool = False)#
Dump jobs completed to CSV in log_dir.
- Parameters:
flush (bool, optional) – Force the dumping if set to True. Defaults to False.
- fit_generative_model(df, q=0.9, verbose=False)#
Fits a generative model for sampling during BO.
Learn the distribution of hyperparameters for the top-(1-q)x100% configurations and sample from this distribution. It can be used for transfer learning. For multi-objective problems, this function computes the top-(1-q)x100% configurations in terms of their ranking with respect to Pareto efficiency: all points on the first non-dominated Pareto front have rank 1 and, in general, points on the k-th non-dominated front have rank k.
Example Usage:
>>> search = CBO(problem, evaluator)
>>> search.fit_generative_model("results.csv")
- Parameters:
df (str|DataFrame) – a dataframe or path to CSV from a previous search.
q (float, optional) – The quantile defining the set of top configurations used to bias the search. Defaults to 0.90, which selects the top-10% configurations from df.
verbose (bool, optional) – If set to True, prints the score of the generative model. Defaults to False.
- Returns:
the generative model.
- Return type:
model
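The quantile q controls how aggressive the bias is. A small sketch of the single-objective selection step, assuming maximization (not DeepHyper's implementation; top_quantile is a hypothetical helper):

```python
def top_quantile(objectives, q=0.9):
    # Keep configurations whose objective is at or above the q-quantile,
    # i.e. the top (1 - q) * 100% when maximizing.
    ranked = sorted(objectives)
    cutoff = ranked[int(q * (len(ranked) - 1))]
    return [i for i, obj in enumerate(objectives) if obj >= cutoff]

scores = [0.1, 0.9, 0.5, 0.95, 0.3, 0.7, 0.2, 0.8, 0.6, 0.4]
best = top_quantile(scores, q=0.9)  # indices of the top-10% configurations
```

For multi-objective problems, the scalar objective is replaced by the Pareto-front rank described above (rank 1 is best), and the same quantile cut is applied to the ranks.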
- fit_search_space(df, fac_numerical=0.125, fac_categorical=10)#
Apply prior-guided transfer learning based on a DataFrame of results.
Example Usage:
>>> search = CBO(problem, evaluator)
>>> search.fit_search_space("results.csv")
- Parameters:
df (str|DataFrame) – a checkpoint from a previous search.
fac_numerical (float) – The factor used to compute the sigma of a truncated normal distribution via sigma = max(1.0, (upper - lower) * fac_numerical). A large factor increases exploration while a small factor increases exploitation around the best configuration from the df parameter.
fac_categorical (float) – The weight given to a categorical feature that is part of the best configuration. A large weight > 1 increases exploitation while a weight close to 1 increases exploration.
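To make the fac_numerical formula concrete, here is the documented sigma computation evaluated for two ranges (truncnorm_sigma is a hypothetical helper name, not DeepHyper API):

```python
def truncnorm_sigma(lower, upper, fac_numerical=0.125):
    # sigma = max(1.0, (upper - lower) * fac_numerical), as documented above.
    return max(1.0, (upper - lower) * fac_numerical)

wide = truncnorm_sigma(0.0, 100.0)  # 12.5: a wide range yields a wider prior
narrow = truncnorm_sigma(0.0, 4.0)  # 1.0: the floor of 1.0 applies
```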
- fit_surrogate(df)#
Fit the surrogate model of the search from a checkpointed Dataframe.
- Parameters:
df (str|DataFrame) – a checkpoint from a previous search.
Example Usage:
>>> search = CBO(problem, evaluator)
>>> search.fit_surrogate("results.csv")
- search(max_evals: int = -1, timeout: int | float | None = None, max_evals_strict: bool = False) → DataFrame#
Execute the search algorithm.
- Parameters:
max_evals (int, optional) – The maximum number of evaluations of the run function to perform before stopping the search. Defaults to -1, which will run indefinitely.
timeout (int, optional) – The time budget (in seconds) of the search before stopping. Defaults to None, which does not impose a time budget.
max_evals_strict (bool, optional) – If True, the search will not spawn more than max_evals jobs. Defaults to False.
- Returns:
- A pandas DataFrame containing the evaluations performed, or None if the search could not evaluate any configuration. This DataFrame contains the following columns:
  - p:HYPERPARAMETER_NAME: for each hyperparameter of the problem.
  - objective: for single-objective optimization.
  - objective_0, objective_1, …: for multi-objective optimization.
  - job_id: the identifier of the job.
  - job_status: the status of the job at the end of the search.
  - m:METADATA_NAME: for each metadata of the problem. Some metadata are always present, like m:timestamp_submit and m:timestamp_gather, which are the timestamps of the submission and gathering of the job.
- Return type:
pd.DataFrame
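To make the column layout concrete, here is a mock results DataFrame with the columns described above (all values invented) and a typical post-processing step:

```python
import pandas as pd

# Mock of the results layout returned by search() (values are illustrative).
results = pd.DataFrame({
    "p:lr": [0.001, 0.01, 0.1],
    "objective": [0.72, 0.88, 0.65],
    "job_id": [0, 1, 2],
    "job_status": ["DONE", "DONE", "DONE"],
    "m:timestamp_submit": [0.0, 0.1, 0.2],
    "m:timestamp_gather": [1.0, 1.1, 1.2],
})

# Pick the row with the best observed objective.
best_row = results.loc[results["objective"].idxmax()]
print(best_row["p:lr"])  # 0.01
```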
- property search_id#
The identifier of the search used by the evaluator.
- tell(results: List[HPOJob])#
Tell the search the results of the evaluations.
- Parameters:
results (List[HPOJob]) – a list of HPOJobs from which hyperparameters and objectives can be retrieved.
- to_json()#
Returns a json version of the search object.