Postprocessing

This module contains functions to postprocess results.

class GPSat.postprocessing.SmoothingConfig(l_x: int | float = 1, l_y: int | float = 1, max: int | float = None, min: int | float = None)

Bases: object

Configuration used for hyperparameter smoothing.

Attributes:
l_x: int or float, default 1

The lengthscale (x-direction) parameter for Gaussian smoothing.

l_y: int or float, default 1

The lengthscale (y-direction) parameter for Gaussian smoothing.

max: int or float, optional

Maximal value that the hyperparameter can take.

min: int or float, optional

Minimal value that the hyperparameter can take.

Notes

This configuration is used to smooth 2D hyperparameter fields.
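
As a rough illustration of how these four fields could act on a 2D hyperparameter field, the sketch below applies a Gaussian kernel smoother with separate x and y lengthscales and then clips the result. The function smooth_field and its vmin/vmax arguments (standing in for min and max) are hypothetical names written only for this example; GPSat's own smoothing routine may differ in detail.

```python
import numpy as np


def smooth_field(field, x, y, l_x=1.0, l_y=1.0, vmax=None, vmin=None):
    """Gaussian kernel smoothing of field[i, j] located at (x[i], y[j]), then clipping.

    Illustrative only: not GPSat's implementation.
    """
    xx, yy = np.meshgrid(x, y, indexing="ij")
    px, py, vals = xx.ravel(), yy.ravel(), field.ravel()

    # Pairwise Gaussian weights with a separate lengthscale per direction.
    dx = px[:, None] - px[None, :]
    dy = py[:, None] - py[None, :]
    w = np.exp(-0.5 * (dx / l_x) ** 2 - 0.5 * (dy / l_y) ** 2)

    # Ignore missing values when averaging.
    valid = ~np.isnan(vals)
    w = w[:, valid]
    smoothed = (w @ vals[valid]) / w.sum(axis=1)

    # Clip only if at least one bound is given.
    if vmin is not None or vmax is not None:
        smoothed = np.clip(smoothed, vmin, vmax)
    return smoothed.reshape(field.shape)


# A flat field with a single spike in the middle.
x = np.arange(5.0)
y = np.arange(5.0)
field = np.ones((5, 5))
field[2, 2] = 10.0

out = smooth_field(field, x, y, l_x=1.0, l_y=1.0, vmax=2.0)
```

Larger lengthscales spread the spike over more neighbours, while the upper bound caps whatever remains of it after smoothing.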

GPSat.postprocessing.smooth_hyperparameters(result_file: str, params_to_smooth: List[str], smooth_config_dict: Dict[str, SmoothingConfig], xy_dims: List[str] = ['x', 'y'], reference_table_suffix: str = '', table_suffix: str = '_SMOOTHED', output_file: str = None, model_name: str = None, save_config_file: bool = True)

Smooth hyperparameters in an HDF5 results file using Gaussian smoothing.

Parameters:
result_file: str

The path to the HDF5 results file.

params_to_smooth: list of str

A list of hyperparameters to be smoothed.

smooth_config_dict: Dict[str, SmoothingConfig]

A dictionary mapping each hyperparameter name to a SmoothingConfig instance that specifies its smoothing parameters.

xy_dims: list of str, default ['x', 'y']

The dimensions to use for smoothing.

reference_table_suffix: str, default ''

The suffix to use for reference table names.

table_suffix: str, default '_SMOOTHED'

The suffix to add to smoothed hyperparameter table names.

output_file: str, optional

The path to the output HDF5 file to store smoothed hyperparameters.

model_name: str, optional

The name of the model for which hyperparameters are being smoothed.

save_config_file: bool, default True

Whether to save a configuration file for making predictions with smoothed values.

Returns:
None

Notes

  • This function applies Gaussian smoothing to specified hyperparameters in an HDF5 results file.

  • The output_file parameter allows you to specify a different output file for storing the smoothed hyperparameters.

  • If model_name is not provided, it will be determined from the input HDF5 file.

  • If save_config_file is True, a configuration file for making predictions with smoothed values will be saved.
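
A call following the signature above might look like the sketch below. The hyperparameter names ("lengthscales", "kernel_variance"), the numeric values, and the wrapper run_smoothing are placeholders chosen for illustration; substitute the names actually stored in your results file. The GPSat import sits inside the function so the snippet can be loaded without GPSat installed.

```python
# Hypothetical per-hyperparameter settings; keys must match params_to_smooth.
SMOOTHING_KWARGS = {
    "lengthscales": dict(l_x=2.0, l_y=2.0, max=10.0),
    "kernel_variance": dict(l_x=1.0, l_y=1.0, min=1e-3),
}


def run_smoothing(result_file, output_file=None):
    """Smooth the hyperparameters listed in SMOOTHING_KWARGS (illustrative wrapper)."""
    # Lazy import: only needed when the function is actually called.
    from GPSat.postprocessing import SmoothingConfig, smooth_hyperparameters

    smooth_config_dict = {
        name: SmoothingConfig(**kwargs) for name, kwargs in SMOOTHING_KWARGS.items()
    }
    smooth_hyperparameters(
        result_file=result_file,
        params_to_smooth=list(SMOOTHING_KWARGS),
        smooth_config_dict=smooth_config_dict,
        xy_dims=["x", "y"],
        table_suffix="_SMOOTHED",
        output_file=output_file,
        save_config_file=True,
    )
```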

GPSat.postprocessing.glue_local_predictions_1d(preds_df: DataFrame, pred_loc_col: str, xprt_loc_col: str, vars_to_glue: str | List[str], inference_radius: int | float | dict, R=3) → DataFrame

Glues together overlapping local expert predictions in 1D by Gaussian-weighted averaging.

Parameters:
preds_df: pandas dataframe

A dataframe containing the results of local expert predictions. The dataframe should have columns containing (1) the prediction locations, (2) the expert locations, and (3) any predicted variables we wish to glue (e.g. the predictive mean).

pred_loc_col: str

The column in the results dataframe corresponding to the prediction locations.

xprt_loc_col: str

The column in the results dataframe corresponding to the local expert locations.

vars_to_glue: str | list of strs

The column(s) corresponding to variables we wish to glue (e.g. the predictive mean and variance).

inference_radius: int | float | dict

The inference radius of each local expert. If given as a dict, the keys should be the expert locations and the values the inference radius of the corresponding expert. If given as an int or float, all experts are assumed to share the same inference radius.

R: int | float, default 3

A factor controlling the standard deviation of the Gaussian weights, which is given by std = inference_radius / R. The default value of 3 puts the inference radius at three standard deviations, so roughly 99.7% of the 1D Gaussian mass lies within it.

Returns:
pandas dataframe

A dataframe of glued predictions, whose columns contain (1) the prediction locations and (2) the glued variables.
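
The weighting described above can be sketched in a few lines of pandas. This is a minimal re-implementation for illustration, assuming each prediction is weighted by exp(-d² / (2 std²)) with d the distance between prediction and expert location and std = inference_radius / R; the function glue_1d_sketch and the column names are hypothetical, and the library function may handle edge cases differently.

```python
import numpy as np
import pandas as pd


def glue_1d_sketch(preds_df, pred_loc_col, xprt_loc_col, vars_to_glue,
                   inference_radius, R=3):
    """Gaussian-weighted average of overlapping expert predictions (illustrative)."""
    vars_to_glue = [vars_to_glue] if isinstance(vars_to_glue, str) else list(vars_to_glue)
    df = preds_df.copy()

    # A dict gives one radius per expert location; a number applies to all experts.
    if isinstance(inference_radius, dict):
        radius = df[xprt_loc_col].map(inference_radius)
    else:
        radius = inference_radius
    std = radius / R

    # Gaussian weight based on distance from the expert to the prediction point.
    dist = df[pred_loc_col] - df[xprt_loc_col]
    df["_w"] = np.exp(-0.5 * (dist / std) ** 2)

    # Weighted sum per prediction location, normalised by the total weight.
    for var in vars_to_glue:
        df[var] = df[var] * df["_w"]
    summed = df.groupby(pred_loc_col)[vars_to_glue + ["_w"]].sum()
    glued = summed[vars_to_glue].div(summed["_w"], axis=0)
    return glued.reset_index()


# Two experts (at 0 and 10) both predict at x = 0 and x = 5.
preds = pd.DataFrame({
    "x": [0.0, 5.0, 0.0, 5.0],
    "xprt_x": [0.0, 0.0, 10.0, 10.0],
    "f_mean": [1.0, 1.0, 3.0, 3.0],
})
glued = glue_1d_sketch(preds, "x", "xprt_x", "f_mean", inference_radius=10.0)
```

At x = 5 both experts are equally distant, so the glued value is the plain average of their predictions; at x = 0 the expert centred there dominates.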

GPSat.postprocessing.glue_local_predictions_2d(preds_df: DataFrame, pred_loc_cols: List[str], xprt_loc_cols: List[str], vars_to_glue: str | List[str], inference_radius: int | float | dict, R=3) → DataFrame

Glues together overlapping local expert predictions in 2D by Gaussian-weighted averaging.

Parameters:
preds_df: pandas dataframe

A dataframe containing the results of local expert predictions. The dataframe should have columns containing (1) the prediction locations, (2) the expert locations, and (3) any predicted variables we wish to glue (e.g. the predictive mean).

pred_loc_cols: list of strs

The xy-columns in the results dataframe corresponding to the prediction locations.

xprt_loc_cols: list of strs

The xy-columns in the results dataframe corresponding to the local expert locations.

vars_to_glue: str | list of strs

The column(s) corresponding to variables we wish to glue (e.g. the predictive mean and variance).

inference_radius: int | float

The inference radius of each local expert. All experts are assumed to share the same inference radius.

R: int | float, default 3

A factor controlling the standard deviation of the Gaussian weights, which is given by std = inference_radius / R. The default value of 3 places roughly 99% of the Gaussian mass within the inference radius.

Returns:
pandas dataframe

A dataframe of glued predictions, whose columns contain (1) the prediction locations and (2) the glued variables.
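
The 2D case can be sketched the same way, replacing the 1D distance with the Euclidean distance in the xy-plane. The code below assumes an isotropic Gaussian weight with std = inference_radius / R and a single radius shared by all experts; glue_2d_sketch and the column names are hypothetical and serve only as an illustration of the weighting, not as the library's implementation.

```python
import numpy as np
import pandas as pd


def glue_2d_sketch(preds_df, pred_loc_cols, xprt_loc_cols, vars_to_glue,
                   inference_radius, R=3):
    """Gaussian-weighted average of overlapping 2D expert predictions (illustrative)."""
    pred_loc_cols = list(pred_loc_cols)
    xprt_loc_cols = list(xprt_loc_cols)
    vars_to_glue = [vars_to_glue] if isinstance(vars_to_glue, str) else list(vars_to_glue)
    std = inference_radius / R

    # Squared Euclidean distance between each prediction point and its expert.
    diff = preds_df[pred_loc_cols].to_numpy() - preds_df[xprt_loc_cols].to_numpy()
    sq_dist = (diff ** 2).sum(axis=1)
    w = pd.Series(np.exp(-0.5 * sq_dist / std ** 2), index=preds_df.index)

    # Weighted sum per prediction location, normalised by the total weight.
    df = preds_df[pred_loc_cols + vars_to_glue].copy()
    df[vars_to_glue] = df[vars_to_glue].mul(w, axis=0)
    df["_w"] = w
    summed = df.groupby(pred_loc_cols).sum()
    glued_2d = summed[vars_to_glue].div(summed["_w"], axis=0)
    return glued_2d.reset_index()


# Two experts, at (0, 0) and (10, 0), both predict at (0, 0) and (5, 0).
preds_2d = pd.DataFrame({
    "x": [0.0, 5.0, 0.0, 5.0],
    "y": [0.0, 0.0, 0.0, 0.0],
    "xprt_x": [0.0, 0.0, 10.0, 10.0],
    "xprt_y": [0.0, 0.0, 0.0, 0.0],
    "f_mean": [1.0, 1.0, 3.0, 3.0],
})
glued_2d = glue_2d_sketch(
    preds_2d, ["x", "y"], ["xprt_x", "xprt_y"], "f_mean", inference_radius=10.0
)
```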