deephyper.ensemble.selector.GreedySelector


class deephyper.ensemble.selector.GreedySelector(loss_func: Callable, aggregator: Aggregator, k: int = 5, k_init: int = 5, max_it: int = -1, eps_tol: float = 0.001, with_replacement: bool = True, early_stopping: bool = True, bagging: bool = False, random_state=None, verbose: bool = False)

Bases: Selector

Selection method implementing Greedy (a.k.a., Caruana) selection. This method iteratively and greedily selects the predictors that minimize the loss when aggregated together.
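As a rough illustration of the idea (not DeepHyper's implementation), the following NumPy sketch adds, at each step and with replacement, the predictor whose inclusion gives the lowest loss for the averaged prediction. The names `mse` and `greedy_select`, the plain-mean aggregation, and the exact stopping rule are assumptions made for this example.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error, used here as an example loss."""
    return float(np.mean((y_true - y_pred) ** 2))


def greedy_select(y, y_predictors, loss_func=mse, max_it=10, eps_tol=1e-3):
    """Hypothetical sketch of Caruana-style greedy selection with replacement.

    Predictions are aggregated with a plain mean (an assumption of this sketch).
    Returns the list of selected predictor indices, possibly with repeats.
    """
    preds = [np.asarray(p, dtype=float) for p in y_predictors]
    selected = []
    running_sum = np.zeros_like(preds[0])
    best_loss = np.inf

    for _ in range(max_it):
        # Loss of the ensemble if each candidate were added next.
        losses = [
            loss_func(y, (running_sum + p) / (len(selected) + 1)) for p in preds
        ]
        i_best = int(np.argmin(losses))
        # Stop when the best candidate no longer improves the loss enough.
        if best_loss - losses[i_best] < eps_tol:
            break
        selected.append(i_best)
        running_sum += preds[i_best]
        best_loss = losses[i_best]

    return selected


y = np.array([0.0, 1.0, 2.0, 3.0])
y_predictors = [
    y + 1.0,  # biased upward
    y - 1.0,  # biased downward
    y + 5.0,  # far off
]
selected = greedy_select(y, y_predictors)
print(selected)
```

In this toy case the two predictors with opposite biases cancel out when averaged, so the sketch picks both and then stops because no further addition lowers the loss.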

Parameters:
  • loss_func (Callable or Loss) – A loss function that takes two arguments: the true target values and the predicted target values.

  • aggregator (Aggregator) – The aggregator to use to combine the predictions of the selected predictors.

  • k (int, optional) – The number of unique predictors to select for the ensemble. Defaults to 5.

  • k_init (int, optional) – Regularization parameter for greedy selection. It is the number of predictors to select in the initialization step. Defaults to 5.

  • max_it (int, optional) – Maximum number of iterations which also corresponds to the number of non-unique predictors added to the ensemble. Defaults to -1.

  • eps_tol (float, optional) – Tolerance for the stopping criterion. Defaults to 1e-3.

  • with_replacement (bool, optional) – Performs greedy selection with replacement, so that models already selected can be selected again. Defaults to True.

  • early_stopping (bool, optional) – Stops the ensemble selection as soon as the loss stops improving. Defaults to True.

  • bagging (bool, optional) – Performs bootstrap resampling of the available predictors at each iteration. This can be particularly useful when the dataset used for selection is small. Defaults to False.

  • verbose (bool, optional) – Turns on the verbose mode. Defaults to False.
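To make the `bagging` option more concrete, here is a minimal sketch of what bootstrap resampling of the candidate predictors at each iteration could look like. It is an assumption about the general technique, not a description of how `GreedySelector` implements it; the helper name `bootstrap_candidates` is hypothetical.

```python
import numpy as np


def bootstrap_candidates(n_predictors, rng):
    """Hypothetical helper: draw a bootstrap sample of predictor indices.

    Sampling is done with replacement, so some predictors may be left out
    of the candidate pool for the current iteration.
    """
    sample = rng.choice(n_predictors, size=n_predictors, replace=True)
    return np.unique(sample)


rng = np.random.default_rng(0)  # fixed seed so the draws are reproducible
for it in range(3):
    # A greedy step would then search only among these candidates.
    candidates = bootstrap_candidates(10, rng)
    print(it, candidates)
```

Restricting each greedy step to a random subset of predictors is one way to reduce overfitting to a small selection set, which matches the motivation given for the parameter.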

Methods

select

The selection algorithm.

select(y, y_predictors) → Sequence[int]

The selection algorithm.

Parameters:
  • y (np.ndarray) – The true target values.

  • y_predictors (list) – A sequence of predictions from the available predictors. It should be a list of length n_predictors, with each element being the prediction of one predictor.

Returns:

Sequence[int]: the sequence of selected predictors.
Sequence[float]: the sequence of weights associated with the selected predictors.

Return type:

Sequence[int]
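When selection is done with replacement, the same predictor can be picked several times. One common way to turn such a selection into unique indices and weights is to count occurrences and normalize the counts. The sketch below assumes this convention, which this page does not confirm for `GreedySelector`; the helper name `indices_to_weights` is hypothetical.

```python
import numpy as np


def indices_to_weights(selected):
    """Hypothetical helper: collapse repeated indices into (unique indices, weights)."""
    unique, counts = np.unique(np.asarray(selected, dtype=int), return_counts=True)
    # Each weight is the fraction of picks that went to that predictor.
    weights = counts / counts.sum()
    return unique.tolist(), weights.tolist()


# Predictor 2 was picked three times, predictor 0 once.
indices, weights = indices_to_weights([2, 0, 2, 2])
print(indices, weights)
```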