deephyper.skopt.learning.gaussian_process.GaussianProcessRegressor

class deephyper.skopt.learning.gaussian_process.GaussianProcessRegressor(*args: Any, **kwargs: Any)

Bases: GaussianProcessRegressor

GaussianProcessRegressor that allows noise tunability.

This implementation is based on Algorithm 2.1 of Gaussian Processes for Machine Learning (GPML) by Rasmussen and Williams.
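As a rough illustration of the Cholesky-based procedure that GPML Algorithm 2.1 describes, the following NumPy sketch computes the predictive mean, the predictive covariance, and the log-marginal likelihood. The helper names `rbf` and `gp_predict` are hypothetical and the kernel is a fixed squared-exponential; this is not the library's actual code path.

```python
import numpy as np


def rbf(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B (hypothetical helper)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_predict(X_train, y_train, X_test, alpha=1e-10):
    """Cholesky-based GP regression in the spirit of GPML Algorithm 2.1 (hypothetical helper)."""
    n = X_train.shape[0]
    K = rbf(X_train, X_train) + alpha * np.eye(n)  # alpha is added to the diagonal
    L = np.linalg.cholesky(K)                      # K = L @ L.T, L lower-triangular
    # Dual coefficients: solve K a = y via two solves against the Cholesky factor.
    a = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_star = rbf(X_train, X_test)                  # shape (n, n_test)
    mean = K_star.T @ a                            # predictive mean
    v = np.linalg.solve(L, K_star)
    cov = rbf(X_test, X_test) - v.T @ v            # predictive covariance
    log_ml = (
        -0.5 * y_train @ a
        - np.sum(np.log(np.diag(L)))
        - 0.5 * n * np.log(2.0 * np.pi)
    )
    return mean, cov, log_ml


X_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.sin(X_train).ravel()
mean, cov, log_ml = gp_predict(X_train, y_train, X_train)
```

With a very small `alpha`, predicting at the training inputs reproduces the targets almost exactly and the predictive variance there is close to zero.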

In addition to the standard scikit-learn estimator API, GaussianProcessRegressor:

  • Allows prediction without prior fitting (based on the GP prior).

  • Provides an additional method sample_y(X) to evaluate samples drawn from the GPR (prior or posterior) at given inputs.

  • Exposes a method log_marginal_likelihood(theta) that can be used externally for alternative hyperparameter selection strategies, such as Markov chain Monte Carlo.
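The three capabilities listed above can be tried out in a few lines. The sketch below uses scikit-learn's `GaussianProcessRegressor`, on the assumption that it is the base class named under "Bases"; the toy data and variable names are illustrative only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.linspace(0.0, 5.0, 8).reshape(-1, 1)
y = np.sin(X).ravel()

gpr = GaussianProcessRegressor(kernel=1.0 * RBF(1.0), random_state=0)

# Unfitted estimator: predictions come from the GP prior
# (zero mean, standard deviation given by the kernel diagonal).
prior_mean, prior_std = gpr.predict(X, return_std=True)

# Draw three functions from the prior at the query points.
prior_samples = gpr.sample_y(X, n_samples=3, random_state=0)

gpr.fit(X, y)

# After fitting, sample_y draws from the posterior instead.
posterior_samples = gpr.sample_y(X, n_samples=3, random_state=0)

# Log-marginal likelihood at the optimized (log-transformed) hyperparameters.
lml = gpr.log_marginal_likelihood(gpr.kernel_.theta)
```

Because `log_marginal_likelihood` accepts any `theta`, an external sampler or search routine can evaluate it repeatedly without refitting the model.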

Parameters:
  • kernel (Kernel) – The kernel specifying the covariance function of the GP. If None, the default kernel 1.0 * RBF(1.0) is used. Kernel hyperparameters are optimized during fitting.

  • alpha (float or array-like, optional) – Value added to the diagonal of the kernel matrix during fitting (default: 1e-10). Larger values correspond to higher assumed noise levels and improve numerical stability. If an array is provided, it must match the number of training samples and is interpreted as a per-sample noise level. Equivalent to adding a WhiteKernel with c = alpha. Provided mainly for convenience and consistency with Ridge regression.

  • optimizer (str or callable, optional) –

    Optimizer for kernel parameter tuning (default: “fmin_l_bfgs_b”). A string selects one of the built-in optimizers. A callable must follow the signature:

    def optimizer(obj_func, initial_theta, bounds):
        # obj_func: objective to minimize (the negative log-marginal
        #           likelihood); accepts theta and optionally eval_gradient
        # initial_theta: initial hyperparameter state
        # bounds: box constraints for theta
        return theta_opt, func_min
    

    If None, kernel parameters are kept fixed.

    Available built-in optimizers:
    • “fmin_l_bfgs_b”

  • n_restarts_optimizer (int, optional) – Number of optimizer restarts to maximize the log-marginal likelihood (default: 0). The first run uses the kernel’s initial parameters; additional runs start from random log-uniform samples. If > 0, all parameter bounds must be finite. A value of 0 implies a single run.

  • normalize_y (bool, optional) – Whether to normalize target values to have zero mean (default: False). Should be enabled when the target mean deviates significantly from zero. Note: this effectively alters the GP prior using the data, which contradicts the likelihood principle; thus the default is False.

  • copy_X_train (bool, optional) – If True (default), a persistent copy of the training data is stored. If False, only a reference is kept, so external modification of the data may affect predictions.

  • random_state (int or numpy.random.RandomState, optional) – Random generator used for initialization. If an integer is provided, it sets the seed. Defaults to NumPy’s global RNG.

  • noise (str, optional) – If set to “gaussian”, the model assumes that y is a noisy estimate of the latent function f(x) with Gaussian noise.
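To make the `optimizer`, `alpha`, and `n_restarts_optimizer` parameters concrete, here is a minimal sketch of a custom optimizer callable built on `scipy.optimize.minimize`. It is passed to scikit-learn's `GaussianProcessRegressor`, assumed here to share these constructor arguments as the base class; the `noise` option is not exercised, and `custom_optimizer` is a hypothetical name.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


def custom_optimizer(obj_func, initial_theta, bounds):
    """Hypothetical optimizer: minimize obj_func with L-BFGS-B."""
    result = minimize(
        obj_func,           # returns (value, gradient) by default
        initial_theta,
        method="L-BFGS-B",
        jac=True,           # obj_func supplies its own gradient
        bounds=bounds,
    )
    return result.x, result.fun


rng = np.random.RandomState(0)
X = rng.uniform(0.0, 5.0, size=(20, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(20)

gpr = GaussianProcessRegressor(
    kernel=1.0 * RBF(1.0),
    alpha=0.1**2,               # assumed noise variance added to the diagonal
    optimizer=custom_optimizer,
    n_restarts_optimizer=2,     # initial run plus two random restarts
    random_state=0,
)
gpr.fit(X, y)
```

The default kernel bounds are finite, so random restarts are allowed here; a kernel with an unbounded hyperparameter would need `n_restarts_optimizer=0`.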

X_train_

Training feature values.

Type:

array-like of shape (n_samples, n_features)

y_train_

Training target values.

Type:

array-like of shape (n_samples, [n_output_dims])

kernel_

The kernel used for prediction, identical in structure to the input kernel but with optimized hyperparameters.

Type:

Kernel

L_

Lower-triangular Cholesky decomposition of the kernel matrix evaluated at X_train_.

Type:

array-like of shape (n_samples, n_samples)

alpha_

Dual coefficients of training data points in kernel space.

Type:

array-like of shape (n_samples,)

log_marginal_likelihood_value_

Log-marginal likelihood evaluated at self.kernel_.theta.

Type:

float

noise_

Estimated Gaussian noise level. Only relevant when noise="gaussian".

Type:

float
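The relationship between the fitted attributes can be checked numerically. This sketch again relies on scikit-learn's base class (an assumption; the `noise_` attribute is specific to this subclass and is not covered) and verifies that `L_` factors the regularized kernel matrix and that `alpha_` holds the dual coefficients.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.linspace(0.0, 5.0, 6).reshape(-1, 1)
y = np.cos(X).ravel()

alpha = 1e-6
# optimizer=None keeps the kernel hyperparameters fixed.
gpr = GaussianProcessRegressor(kernel=RBF(1.0), alpha=alpha, optimizer=None).fit(X, y)

# Kernel matrix on the stored training inputs, with alpha on the diagonal.
K = gpr.kernel_(gpr.X_train_) + alpha * np.eye(len(X))

# L_ is the lower-triangular Cholesky factor of that matrix.
chol_ok = np.allclose(gpr.L_ @ gpr.L_.T, K)

# alpha_ solves K @ alpha_ = y_train_ (the dual coefficients).
dual_ok = np.allclose(K @ gpr.alpha_, gpr.y_train_)
```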

Methods

  • fit – Fit Gaussian process regression model.

  • predict – Predict output for X.

__call__(*args: Any, **kwargs: Any) → Any

Call self as a function.

fit(X, y)

Fit Gaussian process regression model.

Args:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (array-like of shape (n_samples, [n_output_dims])) – Target values.

Returns:
  • self – Returns an instance of self.

predict(X, return_std=False, return_cov=False, return_mean_grad=False, return_std_grad=False)

Predict output for X.

In addition to the mean of the predictive distribution, its standard deviation (return_std=True) or covariance (return_cov=True) can optionally be returned, as can the gradients of the mean and of the standard deviation with respect to X.

Args:
  • X (array-like of shape (n_samples, n_features)) – Query points where the GP is evaluated.

  • return_std (bool, default: False) – If True, the standard deviation of the predictive distribution at the query points is returned along with the mean.

  • return_cov (bool, default: False) – If True, the covariance of the joint predictive distribution at the query points is returned along with the mean.

  • return_mean_grad (bool, default: False) – Whether to return the gradient of the mean. Only valid when X is a single point.

  • return_std_grad (bool, default: False) – Whether to return the gradient of the standard deviation. Only valid when X is a single point.

Returns:
  • y_mean (array of shape (n_samples, [n_output_dims])) – Mean of the predictive distribution at the query points.

  • y_std (array of shape (n_samples,), optional) – Standard deviation of the predictive distribution at the query points. Only returned when return_std is True.

  • y_cov (array of shape (n_samples, n_samples), optional) – Covariance of the joint predictive distribution at the query points. Only returned when return_cov is True.

  • y_mean_grad (array of shape (n_samples, n_features)) – The gradient of the predicted mean.

  • y_std_grad (array of shape (n_samples, n_features)) – The gradient of the predicted standard deviation.
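One way to sanity-check the outputs described above is to compare the standard deviation with the covariance diagonal and to approximate the gradients by finite differences. The sketch below uses scikit-learn's base class, which is assumed to share `return_std` and `return_cov` but does not offer the gradient flags; `finite_difference_grads` is a hypothetical helper that only approximates what `return_mean_grad` and `return_std_grad` are documented to return at a single point.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.RandomState(0)
X = rng.uniform(-2.0, 2.0, size=(15, 2))
y = np.sin(X[:, 0]) + np.cos(X[:, 1])

# optimizer=None keeps the kernel fixed so the sketch stays deterministic.
gpr = GaussianProcessRegressor(kernel=RBF(1.0), alpha=1e-6, optimizer=None).fit(X, y)

X_query = rng.uniform(-2.0, 2.0, size=(4, 2))
y_mean, y_std = gpr.predict(X_query, return_std=True)
_, y_cov = gpr.predict(X_query, return_cov=True)

# The standard deviation is the square root of the covariance diagonal.
std_from_cov = np.sqrt(np.clip(np.diag(y_cov), 0.0, None))


def finite_difference_grads(model, x, eps=1e-5):
    """Central-difference gradients of the predictive mean and std at one point."""
    x = np.asarray(x, dtype=float)
    mean_grad = np.zeros_like(x)
    std_grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        m_hi, s_hi = model.predict((x + step)[None, :], return_std=True)
        m_lo, s_lo = model.predict((x - step)[None, :], return_std=True)
        mean_grad[i] = (m_hi[0] - m_lo[0]) / (2.0 * eps)
        std_grad[i] = (s_hi[0] - s_lo[0]) / (2.0 * eps)
    return mean_grad, std_grad


# Gradients with respect to a single query point, one entry per feature.
mean_grad, std_grad = finite_difference_grads(gpr, X_query[0])
```

Analytical gradients are cheaper and more accurate than this approximation, which is why they are useful when an acquisition function is optimized with a gradient-based method.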