Pure Python models

GPSat model based on a pure Python implementation of GP regression. It is based on code written to replicate the optimal interpolation algorithm in [G’21].

[G’21] Gregory, William, Isobel R. Lawrence, and Michel Tsamados. “A Bayesian approach towards daily pan-Arctic sea ice freeboard estimates from combined CryoSat-2 and Sentinel-3 satellite observations.” The Cryosphere, 2021.

GPSat.models.pure_python_gpr.GPR(x, y, xs, ell, sf2, sn2, mean, approx=False, M=None, returnprior=False)

Gaussian process regression function to predict radar freeboard.

Parameters:

x: training data of size n x 3 (3 corresponds to x, y, time)
y: training outputs of size n x 1 (observations of radar freeboard)
xs: test inputs of size ns x 3
ell: correlation length-scales of the covariance function (vector of length 3)
sf2: scaling pre-factor for covariance function (scalar)
sn2: noise variance (scalar)
mean: prior mean (scalar)
approx: Boolean, whether to use Nyström approximation method
M: number of training points to use in Nyström approx (integer scalar)

Returns:

fs: predictive mean
sfs2: predictive variance
np.sqrt(Kxs[0][0]): prior variance
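The computation described above can be illustrated with a minimal NumPy sketch of exact GP regression (no Nyström approximation). This is not the GPSat source: the names matern32 and gp_predict are hypothetical, and the sketch assumes a Matern 3/2 kernel in which sf2 multiplies the kernel directly and the prior mean is a constant.

```python
import numpy as np


def matern32(a, b, ell, sf2):
    """Matern 3/2 covariance between the rows of a and the rows of b."""
    # Pairwise differences scaled per dimension, shape (n_a, n_b, d)
    diff = (a[:, None, :] - b[None, :, :]) / np.asarray(ell, dtype=float)
    r = np.sqrt(np.sum(diff ** 2, axis=-1))
    return sf2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def gp_predict(x, y, xs, ell, sf2, sn2, mean=0.0):
    """Exact GP regression: predictive mean and variance at xs (illustrative)."""
    x = np.asarray(x, dtype=float)
    xs = np.asarray(xs, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1)
    K = matern32(x, x, ell, sf2) + sn2 * np.eye(len(x))  # noisy training covariance
    Ks = matern32(x, xs, ell, sf2)                       # train/test cross-covariance
    L = np.linalg.cholesky(K)                            # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - mean))
    fs = mean + Ks.T @ alpha                             # predictive mean
    v = np.linalg.solve(L, Ks)
    # Prior variance at each test point is sf2 for this kernel (r = 0)
    sfs2 = sf2 - np.sum(v ** 2, axis=0)                  # predictive variance of f
    return fs, sfs2
```

Far from all training data the cross-covariance vanishes, so the prediction falls back to the prior mean and the predictive variance returns to sf2.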

GPSat.models.pure_python_gpr.Nystroem(x, y, M, ell, sf2, sn2, seed=20, opt=False)

Nyström approximation for kernel machines (see, e.g., Williams and Seeger, 2001). Produces a rank-M approximation of K and finds its inverse via the Woodbury identity. This is a faster way of making predictions, but performance depends on the value of M.
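A minimal sketch of the idea, assuming a Matern 3/2 kernel and a uniformly random subset of M training points, is given here. The function names, the subset selection and the return values are illustrative assumptions and may differ from what Nystroem actually does. With K approximated by Knm Kmm^-1 Kmn, the Woodbury identity gives (K_approx + sn2 I)^-1 y = (y - Knm (sn2 Kmm + Kmn Knm)^-1 Kmn y) / sn2, which only requires solving an M x M system.

```python
import numpy as np


def matern32(a, b, ell, sf2):
    """Matern 3/2 covariance between the rows of a and the rows of b."""
    diff = (a[:, None, :] - b[None, :, :]) / np.asarray(ell, dtype=float)
    r = np.sqrt(np.sum(diff ** 2, axis=-1))
    return sf2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def nystroem_solve(x, y, M, ell, sf2, sn2, seed=20):
    """Apply (K_approx + sn2*I)^-1 to y using a rank-M Nystroem approximation.

    ASSUMPTION: the M landmark points are drawn uniformly at random.
    Returns the solution vector and the indices of the chosen points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(x), size=M, replace=False)
    Knm = matern32(x, x[idx], ell, sf2)       # n x M
    Kmm = matern32(x[idx], x[idx], ell, sf2)  # M x M
    # Woodbury identity: only an M x M linear system is solved
    A = sn2 * Kmm + Knm.T @ Knm
    alpha = (y - Knm @ np.linalg.solve(A, Knm.T @ y)) / sn2
    return alpha, idx
```

The cost is dominated by forming Knm.T @ Knm, which scales as n M^2 rather than the n^3 of an exact solve, so smaller M is faster but gives a coarser approximation of K.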

class GPSat.models.pure_python_gpr.PurePythonGPR(data=None, coords_col=None, obs_col=None, coords=None, obs=None, coords_scale=None, obs_scale=None, obs_mean=None, *, length_scales=1.0, kernel_var=1.0, likeli_var=1.0, kernel='Matern32', constraints_dict=None, **kwargs)

Bases: BaseGPRModel

Pure Python GPR class, used to hold model details from the pure Python implementation.

SGPkernel(**kwargs)
SMLII(hypers, x, y, approx=False, M=None, grad=True)
get_kernel_variance()
get_lengthscales()
get_likelihood_variance()
get_loglikelihood()
get_objective_function_value()

Get the value of the objective function used to train the model, e.g. the log marginal likelihood when using exact GPR. Any inheriting class should override this method.

get_transform_funcs(func, **kwargs)
optimise(opt_method='L-BFGS-B', jac=False)
optimise_parameters(opt_method='L-BFGS-B', jac=False)

An inheriting class should define a method for optimising (hyper/variational) parameters.

property param_names: list

Property method that returns the names of parameters in a list. Any inheriting class should override this method.

Each parameter name should have a get_* and set_* method. e.g. if param_names = ['A', 'B'] then methods get_A, set_A, get_B, set_B should be defined.

Additionally, one can specify a set_*_constraints method that imposes constraints on the parameters during training, if applicable.
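The get_* / set_* naming convention can be illustrated with a small stand-alone sketch. ToyModel, get_all_params and set_all_params are hypothetical names used only for illustration; they are not part of GPSat and do not reproduce how BaseGPRModel handles parameters internally.

```python
class ToyModel:
    """Hypothetical stand-in for a model class; not part of GPSat."""

    def __init__(self):
        self._lengthscales = 1.0
        self._kernel_variance = 1.0

    @property
    def param_names(self):
        # Each name listed here has a matching get_<name> and set_<name> method
        return ["lengthscales", "kernel_variance"]

    def get_lengthscales(self):
        return self._lengthscales

    def set_lengthscales(self, lengthscales):
        self._lengthscales = lengthscales

    def get_kernel_variance(self):
        return self._kernel_variance

    def set_kernel_variance(self, kernel_variance):
        self._kernel_variance = kernel_variance


def get_all_params(model):
    """Collect every parameter by calling get_<name> for each name."""
    return {name: getattr(model, f"get_{name}")() for name in model.param_names}


def set_all_params(model, **values):
    """Set parameters by dispatching to set_<name>; unknown names are rejected."""
    for name, value in values.items():
        if name not in model.param_names:
            raise KeyError(f"unknown parameter: {name}")
        getattr(model, f"set_{name}")(value)
```

Because the accessors are looked up by name, generic code can read or update every parameter of a model without knowing the concrete class.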

predict(coords, mean=0, apply_scale=True)

Method to generate prediction at given coords. Any inheriting class should override this method.

Parameters:
coords: numpy array

Coordinate values where we wish to make predictions.

Returns:
dict

Predictions at the given coordinate locations. Should be a dictionary containing the mean and variance of the predictions, as well as other variables one wishes to save.

set_kernel_variance(kernel_variance)
set_kernel_variance_constraints(func=None, move_within_tol=True, tol=0.01, **kwargs)
set_lengthscales(lengthscales)
set_lengthscales_constraints(func=None, move_within_tol=True, tol=0.01, scale=True, **kwargs)
set_likelihood_variance(likelihood_variance)
set_likelihood_variance_constraints(func=None, move_within_tol=True, tol=0.01, **kwargs)
GPSat.models.pure_python_gpr.SGPkernel(x, xs=None, grad=False, ell=1, sigma=1)

Return a Matern (3/2) covariance function for the given inputs.

Inputs:

x: training data of size n x 3 (3 corresponds to x, y, time)
xs: test inputs of size ns x 3
grad: Boolean, whether to return the gradients of the covariance function
ell: correlation length-scales of the covariance function
sigma: scaling pre-factor for covariance function

Returns:

sigma*k: scaled covariance function
sigma*dk: scaled matrix of gradients
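For reference, a Matern 3/2 covariance with one length-scale per input dimension can be sketched as follows. The function name matern32 is hypothetical and gradients are omitted; the sketch assumes the form sigma * (1 + sqrt(3) r) * exp(-sqrt(3) r), where r is the Euclidean distance after dividing each dimension by its length-scale, and it is not a copy of SGPkernel.

```python
import numpy as np


def matern32(x, xs=None, ell=1.0, sigma=1.0):
    """Matern 3/2 covariance between rows of x and rows of xs (or x itself)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    xs = x if xs is None else np.atleast_2d(np.asarray(xs, dtype=float))
    # A scalar ell is broadcast to one length-scale per input dimension
    ell = np.broadcast_to(np.asarray(ell, dtype=float), (x.shape[1],))
    # Scaled pairwise differences, shape (n, ns, d)
    diff = (x[:, None, :] - xs[None, :, :]) / ell
    r = np.sqrt(np.sum(diff ** 2, axis=-1))
    return sigma * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)
```

At zero distance the kernel equals sigma, and it decays monotonically as the scaled distance grows.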

GPSat.models.pure_python_gpr.SMLII_mod(hypers, x, y, approx=False, M=None, grad=True, use_log=True)

Objective function to minimise when optimising the model hyperparameters. This function is the negative log marginal likelihood.

Inputs:

hypers: initial guess of hyperparameters
x: inputs (vector of size n x 3)
y: outputs (freeboard values from all satellites, size n x 1)
approx: Boolean, whether to use Nyström approximation method
M: number of training points to use in Nyström approx (integer scalar)

Returns:

nlZ: negative log marginal likelihood
dnLZ: gradients of the negative log marginal likelihood
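As an illustration of the quantity being minimised, the negative log marginal likelihood of an exact GP is nlZ = 0.5 y^T K^-1 y + 0.5 log|K| + 0.5 n log(2 pi), where K includes the noise variance on its diagonal. The sketch below evaluates it via a Cholesky factorisation. It is an assumption-laden example rather than the GPSat code: the function names are hypothetical, the hyperparameters are assumed to be packed as log([ell_1, ..., ell_d, sf2, sn2]), a zero prior mean is assumed, and gradients are not computed.

```python
import numpy as np


def matern32(a, b, ell, sf2):
    """Matern 3/2 covariance between the rows of a and the rows of b."""
    diff = (a[:, None, :] - b[None, :, :]) / np.asarray(ell, dtype=float)
    r = np.sqrt(np.sum(diff ** 2, axis=-1))
    return sf2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def neg_log_marginal_likelihood(log_hypers, x, y):
    """Negative log marginal likelihood of an exact, zero-mean GP.

    ASSUMPTION: log_hypers = log([ell_1, ..., ell_d, sf2, sn2]); this ordering
    and the log transform are illustrative, not taken from GPSat.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1)
    n, d = x.shape
    hypers = np.exp(np.asarray(log_hypers, dtype=float))
    ell, sf2, sn2 = hypers[:d], hypers[d], hypers[d + 1]
    K = matern32(x, x, ell, sf2) + sn2 * np.eye(n)
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    data_fit = 0.5 * y @ alpha                      # 0.5 * y^T K^-1 y
    half_log_det = np.sum(np.log(np.diag(L)))       # 0.5 * log|K|
    return float(data_fit + half_log_det + 0.5 * n * np.log(2.0 * np.pi))
```

Working in log space keeps the length-scales and variances positive without explicit bound constraints, which is one common reason for such a transform.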