deephyper.hpo.ExperimentalDesignSearch

deephyper.hpo.ExperimentalDesignSearch#

class deephyper.hpo.ExperimentalDesignSearch(problem, random_state: int = None, log_dir: str = '.', verbose: int = 0, stopper=None, checkpoint_history_to_csv: bool = True, solution_selection: Literal['argmax_obs', 'argmax_est'] | SolutionSelection | None = None, n_points: int = None, design: str = 'random', initial_points=None)[source]#

Bases: CBO

Centralized Experimental Design Search.

It follows a manager-workers architecture where the manager runs the sampling process and workers execute parallel evaluations of the black-box function.

Single-Objective	Multi-Objectives	Failures
✅	✅	✅

Example Usage:

>>> max_evals = 100
>>> search = ExperimentalDesignSearch(problem, evaluator, n_points=max_evals, design="grid")
>>> results = search.search(max_evals=100)

Parameters:

problem (HpProblem) – Hyperparameter problem describing the search space to explore.
random_state (int, optional) – Random seed. Defaults to None.
log_dir (str, optional) – Log directory where search’s results are saved. Defaults to ".".
verbose (int, optional) – Indicate the verbosity level of the search. Defaults to 0.
stopper (Stopper, optional) – a stopper to leverage multi-fidelity when evaluating the function. Defaults to None which does not use any stopper.
checkpoint_history_to_csv (bool, optional) – wether the results from progressively collected evaluations should be checkpointed regularly to disc as a csv. Defaults to True.
solution_selection (Literal["argmax_obs", "argmax_est"] | SolutionSelection, optional) – the solution selection strategy. It can be a string where "argmax_obs" would select the argmax of observed objective values, and "argmax_est" would select the argmax of estimated objective values (through a predictive model).
n_points (int, optional) – Number of points to sample. Defaults to None.
design (str, optional) – Experimental design to use, it can be one of: - "random" for uniform random numbers. - "sobol" for a Sobol’ sequence. - "halton" for a Halton sequence. - "hammersly" for a Hammersly sequence. - "lhs" for a latin hypercube sequence. - "grid" for a uniform grid sequence. Defaults to "random".
initial_points (list, optional) – List of initial points to evaluate. Defaults to None.

Methods

`ask`	Ask the search for new configurations to evaluate.
`check_evaluator`	Check if the input is a callable, an evaluator or else.
`dump_jobs_done_to_csv`	Dump jobs completed to CSV in log_dir.
`fit_generative_model`	Fits a generative model for sampling during BO.
`fit_search_space`	Apply prior-guided transfer learning based on a DataFrame of results.
`fit_surrogate`	Fit the surrogate model of the search from a checkpointed Dataframe.
`get_params`	Get parameters used for the search object.
`reload_checkpoint`
`save_params`	Save the search parameters to a JSON file in the log folder.
`search`	Execute the search algorithm.
`tell`	Tell the search the results of the evaluations.

Attributes

search_id

The identifier of the search used by the evaluator.

ask(n: int = 1) → List[Dict]#

Ask the search for new configurations to evaluate.

Parameters:: n (int, optional) – The number of configurations to ask. Defaults to 1.
Returns:: a list of hyperparameter configurations to evaluate.
Return type:: List[Dict]

check_evaluator(evaluator)#

Check if the input is a callable, an evaluator or else.

dump_jobs_done_to_csv(flush: bool = False)#

Dump jobs completed to CSV in log_dir.

Parameters:: flush (bool, optional) – Force the dumping if set to True. Defaults to False.

fit_generative_model(df: str | DataFrame, q: float = 0.9)#

Fits a generative model for sampling during BO.

Learn the distribution of hyperparameters for the top-(1-q)x100% configurations and sample from this distribution. It can be used for transfer learning. For multiobjective problems, this function computes the top-(1-q)x100% configurations in terms of their ranking with respect to pareto efficiency: all points on the first non-dominated pareto front have rank 1 and in general, points on the k’th non-dominated front have rank k.

Example Usage:

>>> search = CBO(problem)
>>> search.fit_surrogate("results.csv")
>>> search.search(evaluator, max_evals=100)

Parameters:

df (str | DataFrame) – a dataframe or path to CSV from a previous search.
q (float) – the quantile defined the set of top configurations used to bias the search. Defaults to 0.90 which select the top-10% configurations from df.

Returns:

the generative model.

Return type:

model

fit_search_space(df: str | DataFrame, fac_numerical: float = 0.125, fac_categorical: int = 10)#

Apply prior-guided transfer learning based on a DataFrame of results.

Example Usage:

>>> search = CBO(problem)
>>> search.fit_surrogate("results.csv")
>>> search.search(evaluator, max_evals=100)

Parameters:

df (str | DataFrame) – a checkpoint from a previous search.
fac_numerical (float) – the factor used to compute the sigma of a truncated normal distribution based on sigma = max(1.0, (upper - lower) * fac_numerical). A small large factor increase exploration while a small factor increase exploitation around the best-configuration from the df parameter.
fac_categorical (float) – the weight given to a categorical feature part of the best configuration. A large weight > 1 increase exploitation while a small factor close to 1 increase exploration.

fit_surrogate(df: str | DataFrame)#

Fit the surrogate model of the search from a checkpointed Dataframe.

Parameters:: df (str|DataFrame) – a checkpoint from a previous search.

Example Usage:

>>> search = CBO(problem)
>>> search.fit_surrogate("results.csv")
>>> search.search(evaluator, max_evals=100)

get_params() → dict[str, Any]#

Get parameters used for the search object.

Returns:: A dictionary of the search parameters.

save_params(filename: str = 'params.json')#

Save the search parameters to a JSON file in the log folder.

Parameters:: filename – Name of JSON file where search parameters are saved. Default is params.json.

search(evaluator, max_evals: int = -1, timeout: int | float | None = None, max_evals_strict: bool = False) → DataFrame#

Execute the search algorithm.

Parameters:

evaluator – object describing the evaluation process.
max_evals (int, optional) – The maximum number of evaluations of the run function to perform before stopping the search. Defaults to -1, will run indefinitely.
timeout (int, optional) – The time budget (in seconds) of the search before stopping. Defaults to None, will not impose a time budget.
max_evals_strict (bool, optional) – If True the search will not spawn more than max_evals jobs. Defaults to False.

Returns:

A pandas DataFrame containing the evaluations performed or None if the

search could not evaluate any configuration.

This DataFrame contains the following columns: - p:HYPERPARAMETER_NAME: for each hyperparameter of the problem. - objective: for single objective optimization. - objective_0, objective_1, …: for multi-objective optimization. - job_id: the identifier of the job. - job_status: the status of the job at the end of the search. - m:METADATA_NAME: for each metadata of the problem. Some metadata are always

present like m:timestamp_submit and m:timestamp_gather which are the timestamps of the submission and gathering of the job.

Return type:

pd.DataFrame

property search_id#: The identifier of the search used by the evaluator.

Tell the search the results of the evaluations.

Parameters:: results (list[tuple[dict[str, Optional[str | int | float]], str | int | float]]) – a dictionary containing the results of the evaluations.

deephyper.hpo.ExperimentalDesignSearch

Contents

deephyper.hpo.ExperimentalDesignSearch#