deephyper.ensemble

The ensemble module provides a way to build ensembles of checkpointed deep neural networks from tensorflow.keras, saved in .h5 format, in order to regularize the models, improve predictive performance, and obtain better uncertainty estimates.

class deephyper.ensemble.BaggingEnsembleClassifier(model_dir, loss=<function mse>, size=5, verbose=True, ray_address='', num_cpus=1, num_gpus=None, selection='topk')

Bases: deephyper.ensemble._bagging_ensemble.BaggingEnsemble

Ensemble for classification based on uniform averaging of the predictions of its members.

Parameters
  • model_dir (str) – Path to directory containing saved Keras models in .h5 format.

  • loss (callable) – A callable taking (y_true, y_pred) as input.

  • size (int, optional) – Number of unique models used in the ensemble. Defaults to 5.

  • verbose (bool, optional) – Verbose mode. Defaults to True.

  • ray_address (str, optional) – Address of the Ray cluster. If “auto” it will try to connect to an existing cluster. If “” it will start a local Ray cluster. Defaults to “”.

  • num_cpus (int, optional) – Number of CPUs allocated to load one model and predict. Defaults to 1.

  • num_gpus (int, optional) – Number of GPUs allocated to load one model and predict. Defaults to None.

  • batch_size (int, optional) – Batch size used to batch the inference of loaded models. Defaults to 32.

  • selection (str, optional) – Selection strategy used to build the ensemble. Must be one of ["topk"]. Defaults to "topk".
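As an illustration of what uniform averaging means for classification, here is a minimal NumPy sketch. It is not DeepHyper's implementation: the helper name uniform_average and the array layout (members, samples, classes) are assumptions made for this example only.

```python
import numpy as np


def uniform_average(member_probs):
    """Average class probabilities over ensemble members with equal weights.

    member_probs: array-like of shape (n_members, n_samples, n_classes).
    Hypothetical helper, not part of deephyper.ensemble.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    return member_probs.mean(axis=0)


# Three hypothetical members, two samples, two classes.
member_probs = np.array([
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.7, 0.3], [0.2, 0.8]],
    [[0.8, 0.2], [0.3, 0.7]],
])

ensemble_probs = uniform_average(member_probs)
# The predicted label is the class with the highest averaged probability.
predicted_labels = ensemble_probs.argmax(axis=1)
```

Averaging probabilities rather than hard labels keeps the ensemble output a valid probability vector for each sample.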

evaluate(X, y, metrics=None)

Compute metrics based on the provided data.

Parameters
  • X (array) – An array of input data.

  • y (array) – An array of true output data.

  • metrics (callable, optional) – A metric. Defaults to None.

fit(X, y)

Fit the current algorithm to the provided data.

Parameters
  • X (array) – The input data.

  • y (array) – The output data.

Returns

The current fitted instance.

Return type

BaseEnsemble

load(file: str) → None

Load an ensemble from a save.

Parameters

file (str) – Path to the save of the ensemble.

load_members_files(file: str = 'ensemble.json') → None

Load the members composing an ensemble.

Parameters

file (str, optional) – Path of the JSON file containing the ensemble members. All members need to be accessible in model_dir. Defaults to “ensemble.json”.

predict(X) → numpy.ndarray

Execute an inference of the ensemble for the provided data.

Parameters

X (array) – An array of input data.

Returns

The prediction.

Return type

array

save(file: Optional[str] = None) → None

Save an ensemble.

Parameters

file (str) – Path to the save of the ensemble.

save_members_files(file: str = 'ensemble.json') → None

Save the list of file names of the members of the ensemble in a JSON file.

Parameters

file (str, optional) – Path of the JSON file where the file names are saved. Defaults to “ensemble.json”.
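To make the idea of a members file concrete, the sketch below writes a list of model file names to JSON and reads it back. The on-disk layout used by DeepHyper is not specified in this page, so the flat JSON list, the helper names save_member_names and load_member_names, and the example file names are all assumptions.

```python
import json
import os
import tempfile


def save_member_names(member_files, path):
    """Write the member file names to a JSON file (assumed layout: a flat list)."""
    with open(path, "w") as f:
        json.dump(list(member_files), f)


def load_member_names(path):
    """Read the member file names back from the JSON file."""
    with open(path) as f:
        return json.load(f)


# Hypothetical .h5 file names that would live in model_dir.
members = ["model_0.h5", "model_3.h5", "model_7.h5"]

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "ensemble.json")
    save_member_names(members, json_path)
    restored = load_member_names(json_path)
```

Storing only file names keeps the JSON small, which is why the members themselves must remain accessible in model_dir when the ensemble is loaded again.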

class deephyper.ensemble.BaggingEnsembleRegressor(model_dir, loss=<function mse>, size=5, verbose=True, ray_address='', num_cpus=1, num_gpus=None, selection='topk')

Bases: deephyper.ensemble._bagging_ensemble.BaggingEnsemble

Ensemble for regression based on uniform averaging of the predictions of its members.

Parameters
  • model_dir (str) – Path to directory containing saved Keras models in .h5 format.

  • loss (callable) – A callable taking (y_true, y_pred) as input.

  • size (int, optional) – Number of unique models used in the ensemble. Defaults to 5.

  • verbose (bool, optional) – Verbose mode. Defaults to True.

  • ray_address (str, optional) – Address of the Ray cluster. If “auto” it will try to connect to an existing cluster. If “” it will start a local Ray cluster. Defaults to “”.

  • num_cpus (int, optional) – Number of CPUs allocated to load one model and predict. Defaults to 1.

  • num_gpus (int, optional) – Number of GPUs allocated to load one model and predict. Defaults to None.

  • batch_size (int, optional) – Batch size used to batch the inference of loaded models. Defaults to 32.

  • selection (str, optional) – Selection strategy used to build the ensemble. Must be one of ["topk"]. Defaults to "topk".
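The "topk" strategy can be pictured as ranking candidate models by their loss and keeping the best ones. The sketch below assumes exactly that reading (lowest loss on some held-out data wins) and then averages the selected predictions; the helpers mse and select_topk and the toy data are hypothetical and do not come from DeepHyper.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error, used here as the ranking loss."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def select_topk(member_preds, y_true, loss, k):
    """Return the indices of the k members with the lowest loss, best first."""
    losses = np.array([loss(y_true, pred) for pred in member_preds])
    return np.argsort(losses, kind="stable")[:k]


y_valid = np.array([1.0, 2.0, 3.0])
# Predictions of four hypothetical candidate models on the same data.
member_preds = np.array([
    [1.0, 2.0, 3.5],
    [0.0, 0.0, 0.0],
    [1.5, 2.5, 3.0],
    [1.0, 2.0, 3.0],
])

selected = select_topk(member_preds, y_valid, mse, k=2)
# Uniform averaging of the selected members' predictions.
ensemble_pred = member_preds[selected].mean(axis=0)
```

Here k plays the role of the size parameter: it bounds how many distinct models enter the average.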

evaluate(X, y, metrics=None)

Compute metrics based on the provided data.

Parameters
  • X (array) – An array of input data.

  • y (array) – An array of true output data.

  • metrics (callable, optional) – A metric. Defaults to None.

fit(X, y)

Fit the current algorithm to the provided data.

Parameters
  • X (array) – The input data.

  • y (array) – The output data.

Returns

The current fitted instance.

Return type

BaseEnsemble

load(file: str) → None

Load an ensemble from a save.

Parameters

file (str) – Path to the save of the ensemble.

load_members_files(file: str = 'ensemble.json') → None

Load the members composing an ensemble.

Parameters

file (str, optional) – Path of the JSON file containing the ensemble members. All members need to be accessible in model_dir. Defaults to “ensemble.json”.

predict(X) → numpy.ndarray

Execute an inference of the ensemble for the provided data.

Parameters

X (array) – An array of input data.

Returns

The prediction.

Return type

array

save(file: Optional[str] = None) → None

Save an ensemble.

Parameters

file (str) – Path to the save of the ensemble.

save_members_files(file: str = 'ensemble.json') → None

Save the list of file names of the members of the ensemble in a JSON file.

Parameters

file (str, optional) – Path of the JSON file where the file names are saved. Defaults to “ensemble.json”.

class deephyper.ensemble.BaseEnsemble(model_dir, loss, size=5, verbose=True, ray_address='', num_cpus=1, num_gpus=None, batch_size=32)

Bases: abc.ABC

Base class for ensembles. Every new ensemble algorithm needs to extend this class.

Parameters
  • model_dir (str) – Path to directory containing saved Keras models in .h5 format.

  • loss (callable) – A callable taking (y_true, y_pred) as input.

  • size (int, optional) – Number of unique models used in the ensemble. Defaults to 5.

  • verbose (bool, optional) – Verbose mode. Defaults to True.

  • ray_address (str, optional) – Address of the Ray cluster. If “auto” it will try to connect to an existing cluster. If “” it will start a local Ray cluster. Defaults to “”.

  • num_cpus (int, optional) – Number of CPUs allocated to load one model and predict. Defaults to 1.

  • num_gpus (int, optional) – Number of GPUs allocated to load one model and predict. Defaults to None.

  • batch_size (int, optional) – Batch size used to batch the inference of loaded models. Defaults to 32.
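The pattern of an abstract base class with fit, predict and evaluate left to subclasses can be sketched without DeepHyper installed. Everything below is a stand-in: SketchEnsemble mirrors only the three abstract method names documented on this page, and MeanOfCallables, which treats members as plain Python callables instead of saved Keras models, is a hypothetical toy subclass.

```python
from abc import ABC, abstractmethod

import numpy as np


class SketchEnsemble(ABC):
    """Hypothetical stand-in for an abstract ensemble interface (not DeepHyper's class)."""

    @abstractmethod
    def fit(self, X, y):
        """Select the members and return the fitted instance."""

    @abstractmethod
    def predict(self, X):
        """Return the ensemble prediction for X."""

    @abstractmethod
    def evaluate(self, X, y, metrics=None):
        """Score the ensemble prediction against y."""


class MeanOfCallables(SketchEnsemble):
    """Toy subclass: members are callables and the prediction is their mean."""

    def __init__(self, members):
        self.members = list(members)

    def fit(self, X, y):
        # Nothing to select in this toy example; return self like a fitted estimator.
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return np.mean([member(X) for member in self.members], axis=0)

    def evaluate(self, X, y, metrics=None):
        # Fall back to mean squared error when no metric is given (an assumption).
        if metrics is None:
            metrics = lambda y_true, y_pred: float(np.mean((y_true - y_pred) ** 2))
        return metrics(np.asarray(y, dtype=float), self.predict(X))


X = np.array([0.0, 1.0])
y = np.array([0.5, 2.5])
ensemble = MeanOfCallables([lambda x: 2.0 * x, lambda x: 2.0 * x + 1.0]).fit(X, y)
pred = ensemble.predict(X)
score = ensemble.evaluate(X, y)
```

Because the three methods are declared abstract, instantiating the base class directly raises a TypeError, which forces every new algorithm to provide its own implementations.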

abstract evaluate(X, y, metrics=None)

Compute metrics based on the provided data.

Parameters
  • X (array) – An array of input data.

  • y (array) – An array of true output data.

  • metrics (callable, optional) – A metric. Defaults to None.

abstract fit(X, y)

Fit the current algorithm to the provided data.

Parameters
  • X (array) – The input data.

  • y (array) – The output data.

Returns

The current fitted instance.

Return type

BaseEnsemble

load(file: str) → None

Load an ensemble from a save.

Parameters

file (str) – Path to the save of the ensemble.

load_members_files(file: str = 'ensemble.json') → None

Load the members composing an ensemble.

Parameters

file (str, optional) – Path of the JSON file containing the ensemble members. All members need to be accessible in model_dir. Defaults to “ensemble.json”.

abstract predict(X)

Execute an inference of the ensemble for the provided data.

Parameters

X (array) – An array of input data.

Returns

The prediction.

Return type

array

save(file: Optional[str] = None) → None

Save an ensemble.

Parameters

file (str) – Path to the save of the ensemble.

save_members_files(file: str = 'ensemble.json') → None

Save the list of file names of the members of the ensemble in a JSON file.

Parameters

file (str, optional) – Path of the JSON file where the file names are saved. Defaults to “ensemble.json”.

class deephyper.ensemble.UQBaggingEnsembleClassifier(model_dir, loss=<function cce>, size=5, verbose=True, ray_address='', num_cpus=1, num_gpus=None, batch_size=32, selection='topk')

Bases: deephyper.ensemble._uq_bagging_ensemble.UQBaggingEnsemble

Ensemble with uncertainty quantification for classification based on uniform averaging of the predictions of its members.

Parameters
  • model_dir (str) – Path to directory containing saved Keras models in .h5 format.

  • loss (callable) – A callable taking (y_true, y_pred) as input.

  • size (int, optional) – Number of unique models used in the ensemble. Defaults to 5.

  • verbose (bool, optional) – Verbose mode. Defaults to True.

  • ray_address (str, optional) – Address of the Ray cluster. If “auto” it will try to connect to an existing cluster. If “” it will start a local Ray cluster. Defaults to “”.

  • num_cpus (int, optional) – Number of CPUs allocated to load one model and predict. Defaults to 1.

  • num_gpus (int, optional) – Number of GPUs allocated to load one model and predict. Defaults to None.

  • batch_size (int, optional) – Batch size used to batch the inference of loaded models. Defaults to 32.

  • selection (str, optional) – Selection strategy used to build the ensemble. Must be one of ["topk", "caruana", "friedman"]. Defaults to "topk".
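This page does not describe what the "caruana" strategy does. One plausible reading, assumed here, is greedy forward selection with replacement in the style of Caruana et al.: at each step, add the candidate that most lowers the loss of the averaged ensemble. The sketch below implements that generic technique on toy data; the helpers cce and greedy_selection are hypothetical and are not taken from DeepHyper.

```python
import numpy as np


def cce(y_true, y_prob, eps=1e-12):
    """Categorical cross-entropy for one-hot targets."""
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))


def greedy_selection(member_probs, y_true, loss, n_steps):
    """Greedy forward selection with replacement.

    At each step, pick the member whose addition gives the lowest loss
    for the uniformly averaged ensemble. Returns the chosen indices in order.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    chosen = []
    running_sum = np.zeros_like(member_probs[0])
    for _ in range(n_steps):
        candidate_losses = [
            loss(y_true, (running_sum + probs) / (len(chosen) + 1))
            for probs in member_probs
        ]
        best = int(np.argmin(candidate_losses))
        chosen.append(best)
        running_sum += member_probs[best]
    return chosen


# One-hot labels for two samples and three hypothetical candidate models.
y_valid = np.array([[1.0, 0.0], [0.0, 1.0]])
member_probs = np.array([
    [[0.90, 0.10], [0.40, 0.60]],  # confident on sample 0, weaker on sample 1
    [[0.55, 0.45], [0.05, 0.95]],  # weaker on sample 0, confident on sample 1
    [[0.50, 0.50], [0.50, 0.50]],  # uninformative
])

chosen = greedy_selection(member_probs, y_valid, cce, n_steps=2)
```

Unlike top-k ranking, this procedure scores candidates by how much they help the current ensemble, so a model that is second best on its own can still be picked when its errors complement those already selected.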

evaluate(X, y, metrics=None, scaler_y=None)

Compute metrics based on the provided data.

Parameters
  • X (array) – An array of input data.

  • y (array) – An array of true output data.

  • metrics (callable, optional) – A metric. Defaults to None.

fit(X, y)

Fit the current algorithm to the provided data.

Parameters
  • X (array) – The input data.

  • y (array) – The output data.

Returns

The current fitted instance.

Return type

BaseEnsemble

load(file: str) → None

Load an ensemble from a save.

Parameters

file (str) – Path to the save of the ensemble.

load_members_files(file: str = 'ensemble.json') → None

Load the members composing an ensemble.

Parameters

file (str, optional) – Path of the JSON file containing the ensemble members. All members need to be accessible in model_dir. Defaults to “ensemble.json”.

predict(X) → numpy.ndarray

Execute an inference of the ensemble for the provided data.

Parameters

X (array) – An array of input data.

Returns

The prediction.

Return type

array

save(file: Optional[str] = None) → None

Save an ensemble.

Parameters

file (str) – Path to the save of the ensemble.

save_members_files(file: str = 'ensemble.json') → None

Save the list of file names of the members of the ensemble in a JSON file.

Parameters

file (str, optional) – Path of the JSON file where the file names are saved. Defaults to “ensemble.json”.

class deephyper.ensemble.UQBaggingEnsembleRegressor(model_dir, loss=<function nll>, size=5, verbose=True, ray_address='', num_cpus=1, num_gpus=None, batch_size=32, selection='topk')

Bases: deephyper.ensemble._uq_bagging_ensemble.UQBaggingEnsemble

Ensemble with uncertainty quantification for regression based on uniform averaging of the predictions of its members.

Parameters
  • model_dir (str) – Path to directory containing saved Keras models in .h5 format.

  • loss (callable) – A callable taking (y_true, y_pred) as input.

  • size (int, optional) – Number of unique models used in the ensemble. Defaults to 5.

  • verbose (bool, optional) – Verbose mode. Defaults to True.

  • ray_address (str, optional) – Address of the Ray cluster. If “auto” it will try to connect to an existing cluster. If “” it will start a local Ray cluster. Defaults to “”.

  • num_cpus (int, optional) – Number of CPUs allocated to load one model and predict. Defaults to 1.

  • num_gpus (int, optional) – Number of GPUs allocated to load one model and predict. Defaults to None.

  • batch_size (int, optional) – Batch size used to batch the inference of loaded models. Defaults to 32.

  • selection (str, optional) – Selection strategy used to build the ensemble. Must be one of ["topk", "caruana", "friedman"]. Defaults to "topk".

evaluate(X, y, metrics=None, scaler_y=None)

Compute metrics based on the provided data.

Parameters
  • X (array) – An array of input data.

  • y (array) – An array of true output data.

  • metrics (callable, optional) – A metric. Defaults to None.

fit(X, y)

Fit the current algorithm to the provided data.

Parameters
  • X (array) – The input data.

  • y (array) – The output data.

Returns

The current fitted instance.

Return type

BaseEnsemble

load(file: str) → None

Load an ensemble from a save.

Parameters

file (str) – Path to the save of the ensemble.

load_members_files(file: str = 'ensemble.json') → None

Load the members composing an ensemble.

Parameters

file (str, optional) – Path of the JSON file containing the ensemble members. All members need to be accessible in model_dir. Defaults to “ensemble.json”.

predict(X) → numpy.ndarray

Execute an inference of the ensemble for the provided data.

Parameters

X (array) – An array of input data.

Returns

The prediction.

Return type

array

predict_var_decomposition(X)

Run inference with the ensemble on the provided data and return uncertainty quantification estimates. The aleatoric uncertainty is the expected value, over the models composing the ensemble, of the learned variance \(\mathbf{E}[\sigma_\theta^2(\mathbf{x})]\). The epistemic uncertainty is the variance, over the same models, of the learned mean \(\mathbf{V}[\mu_\theta(\mathbf{x})]\).

Parameters

X (array) – An array of input data.

Returns

The triple (y, u1, u2), where y is the mixture distribution, u1 is the aleatoric component of the variance of y, and u2 is the epistemic component of the variance of y.

Return type

y, u1, u2
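The decomposition described above follows the law of total variance and can be reproduced with NumPy once each member's predicted mean and variance are available. The sketch below is not DeepHyper's code: the function variance_decomposition, the (members, samples) array layout, and the use of the population variance (ddof=0) for the epistemic term are assumptions, and it returns only the mixture mean rather than a full mixture distribution.

```python
import numpy as np


def variance_decomposition(member_means, member_vars):
    """Split predictive variance into aleatoric and epistemic parts.

    member_means, member_vars: arrays of shape (n_members, n_samples) holding
    the mean and variance predicted by each member (hypothetical inputs).
    """
    member_means = np.asarray(member_means, dtype=float)
    member_vars = np.asarray(member_vars, dtype=float)
    mean = member_means.mean(axis=0)       # mean of the uniform mixture
    aleatoric = member_vars.mean(axis=0)   # E[sigma_theta^2(x)]
    epistemic = member_means.var(axis=0)   # V[mu_theta(x)], population variance
    return mean, aleatoric, epistemic


# Two hypothetical members evaluated on two samples.
member_means = np.array([[1.0, 2.0], [3.0, 2.0]])
member_vars = np.array([[0.5, 0.1], [1.5, 0.3]])

mean, aleatoric, epistemic = variance_decomposition(member_means, member_vars)
total_variance = aleatoric + epistemic
```

On the first sample the members disagree about the mean, so the epistemic term is non-zero; on the second they agree exactly and all remaining variance is aleatoric.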

save(file: Optional[str] = None) → None

Save an ensemble.

Parameters

file (str) – Path to the save of the ensemble.

save_members_files(file: str = 'ensemble.json') → None

Save the list of file names of the members of the ensemble in a JSON file.

Parameters

file (str, optional) – Path of the JSON file where the file names are saved. Defaults to “ensemble.json”.