Postprocessing
This module contains functions to postprocess results.
- class GPSat.postprocessing.SmoothingConfig(l_x: int | float = 1, l_y: int | float = 1, max: int | float = None, min: int | float = None)
Bases: object
Configuration used for hyperparameter smoothing.
- Attributes:
- l_x: int or float, default 1
The lengthscale (x-direction) parameter for Gaussian smoothing.
- l_y: int or float, default 1
The lengthscale (y-direction) parameter for Gaussian smoothing.
- max: int or float, optional
Maximal value that the hyperparameter can take.
- min: int or float, optional
Minimal value that the hyperparameter can take.
Notes
This configuration is used to smooth 2D hyperparameter fields.
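As an illustration of how the four attributes could act on a field, the following dependency-free sketch applies a Gaussian-weighted average with separate x and y lengthscales and clips values to the configured bounds. It is not GPSat's implementation: `SmoothingConfigSketch` and `smooth_field` are hypothetical names, and clipping before (rather than after) smoothing is an assumption.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class SmoothingConfigSketch:
    """Hypothetical stand-in mirroring the documented attributes."""
    l_x: float = 1
    l_y: float = 1
    max: Optional[float] = None
    min: Optional[float] = None


def smooth_field(xs: Sequence[float], ys: Sequence[float],
                 values: Sequence[float],
                 cfg: SmoothingConfigSketch) -> List[float]:
    """Gaussian-weighted average of values located at points (xs[i], ys[i])."""
    # Assumption: values are clipped to [cfg.min, cfg.max] before smoothing.
    clipped = []
    for v in values:
        if cfg.max is not None:
            v = min(v, cfg.max)
        if cfg.min is not None:
            v = max(v, cfg.min)
        clipped.append(v)

    smoothed = []
    for xi, yi in zip(xs, ys):
        num = den = 0.0
        for xj, yj, vj in zip(xs, ys, clipped):
            # l_x and l_y scale the distance along each axis separately.
            d2 = ((xi - xj) / cfg.l_x) ** 2 + ((yi - yj) / cfg.l_y) ** 2
            w = math.exp(-0.5 * d2)
            num += w * vj
            den += w
        smoothed.append(num / den)
    return smoothed


# A spike of 10.0 is first capped at max=4.0, then averaged with its neighbours.
cfg = SmoothingConfigSketch(l_x=1, l_y=1, max=4.0)
field = smooth_field([0, 1, 2], [0, 0, 0], [1.0, 10.0, 1.0], cfg)
```

Larger lengthscales spread each point's influence further, so the smoothed field becomes flatter along that axis.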
- GPSat.postprocessing.smooth_hyperparameters(result_file: str, params_to_smooth: List[str], smooth_config_dict: Dict[str, SmoothingConfig], xy_dims: List[str] = ['x', 'y'], reference_table_suffix: str = '', table_suffix: str = '_SMOOTHED', output_file: str = None, model_name: str = None, save_config_file: bool = True)
Smooth hyperparameters in an HDF5 results file using Gaussian smoothing.
- Parameters:
- result_file: str
The path to the HDF5 results file.
- params_to_smooth: list of str
A list of hyperparameters to be smoothed.
- smooth_config_dict: Dict[str, SmoothingConfig]
A dictionary mapping each hyperparameter name to a SmoothingConfig instance that specifies its smoothing parameters.
- xy_dims: list of str, default ['x', 'y']
The dimensions to use for smoothing.
- reference_table_suffix: str, default ""
The suffix to use for reference table names.
- table_suffix: str, default "_SMOOTHED"
The suffix to add to smoothed hyperparameter table names.
- output_file: str, optional
The path to the output HDF5 file to store smoothed hyperparameters.
- model_name: str, optional
The name of the model for which hyperparameters are being smoothed.
- save_config_file: bool, default True
Whether to save a configuration file for making predictions with smoothed values.
- Returns:
- None
Notes
This function applies Gaussian smoothing to specified hyperparameters in an HDF5 results file.
The output_file parameter lets you store the smoothed hyperparameters in a different file.
If model_name is not provided, it is determined from the input HDF5 file.
If save_config_file is True, a configuration file for making predictions with the smoothed values is saved.
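A call might look like the sketch below, which uses only the parameters documented above. The hyperparameter names, the lengthscale and bound values, and the wrapper function `smooth_results` are hypothetical placeholders; substitute the names stored in your own results file.

```python
def smooth_results(result_file: str, output_file: str) -> None:
    """Smooth two hyperparameters and write them to a separate HDF5 file."""
    # Imported lazily so the sketch can be defined without GPSat installed.
    from GPSat.postprocessing import SmoothingConfig, smooth_hyperparameters

    # Hypothetical hyperparameter names.
    params = ["lengthscales", "kernel_variance"]

    # One SmoothingConfig per hyperparameter, keyed by hyperparameter name.
    smooth_config_dict = {
        "lengthscales": SmoothingConfig(l_x=2, l_y=2, max=10),
        "kernel_variance": SmoothingConfig(l_x=2, l_y=2, min=0),
    }

    smooth_hyperparameters(
        result_file=result_file,
        params_to_smooth=params,
        smooth_config_dict=smooth_config_dict,
        xy_dims=["x", "y"],
        table_suffix="_SMOOTHED",
        output_file=output_file,
        save_config_file=True,
    )
```

Giving each hyperparameter its own configuration allows, for example, an upper bound on one and a lower bound on another.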
- GPSat.postprocessing.glue_local_predictions_1d(preds_df: DataFrame, pred_loc_col: str, xprt_loc_col: str, vars_to_glue: str | List[str], inference_radius: int | float | dict, R=3) -> DataFrame
Glues together overlapping local expert predictions in 1D by Gaussian-weighted averaging.
- Parameters:
- preds_df: pandas dataframe
A dataframe containing the results of local expert predictions. The dataframe should have columns containing (1) the prediction locations, (2) the expert locations, and (3) any predicted variables we wish to glue (e.g. the predictive mean).
- pred_loc_col: str
The column in the results dataframe corresponding to the prediction locations.
- xprt_loc_col: str
The column in the results dataframe corresponding to the local expert locations.
- vars_to_glue: str | list of str
The column(s) corresponding to the variables we wish to glue (e.g. the predictive mean and variance).
- inference_radius: int | float | dict
The inference radius of each local expert. If given as a dict, the keys should be the expert locations and the values the inference radius of the corresponding expert. If given as an int or float, all experts are assumed to share that inference radius.
- R: int | float, default 3
A scaling factor controlling the standard deviation of the Gaussian weights, which is given by std = inference_radius / R. The default value of 3 places about 99.7% of the Gaussian mass within the inference radius.
- Returns:
- pandas dataframe
A dataframe of glued predictions, whose columns contain (1) the prediction locations and (2) the glued variables.
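The Gaussian-weighted averaging described above can be sketched without any dependencies as follows. This is a minimal illustration under stated assumptions, not the library's code: `glue_1d` is a hypothetical helper that takes plain tuples instead of a dataframe, and it assumes each expert's weight at a prediction location is a Gaussian in the distance between the two, with std = inference_radius / R.

```python
import math
from collections import defaultdict


def glue_1d(rows, inference_radius, R=3):
    """Combine overlapping expert predictions at each prediction location.

    rows: iterable of (pred_loc, xprt_loc, value) tuples.
    Returns a dict mapping each pred_loc to its glued value.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for pred_loc, xprt_loc, value in rows:
        # A dict gives one radius per expert; a number applies to all experts.
        if isinstance(inference_radius, dict):
            radius = inference_radius[xprt_loc]
        else:
            radius = inference_radius
        std = radius / R
        # Experts closer to the prediction location receive a larger weight.
        w = math.exp(-0.5 * ((pred_loc - xprt_loc) / std) ** 2)
        num[pred_loc] += w * value
        den[pred_loc] += w
    return {loc: num[loc] / den[loc] for loc in num}


# Two experts at 0.0 and 2.0 both predict at location 1.0.
rows = [
    (0.0, 0.0, 10.0), (1.0, 0.0, 10.0),
    (1.0, 2.0, 20.0), (2.0, 2.0, 20.0),
]
glued = glue_1d(rows, inference_radius=3)
```

At location 1.0 the two experts are equidistant and share the same radius, so the glued value is the plain average of their predictions. Where only one expert predicts, its value passes through unchanged.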
- GPSat.postprocessing.glue_local_predictions_2d(preds_df: DataFrame, pred_loc_cols: List[str], xprt_loc_cols: List[str], vars_to_glue: str | List[str], inference_radius: int | float | dict, R=3) -> DataFrame
Glues together overlapping local expert predictions in 2D by Gaussian-weighted averaging.
- Parameters:
- preds_df: pandas dataframe
A dataframe containing the results of local expert predictions. The dataframe should have columns containing (1) the prediction locations, (2) the expert locations, and (3) any predicted variables we wish to glue (e.g. the predictive mean).
- pred_loc_cols: list of str
The xy-columns in the results dataframe corresponding to the prediction locations.
- xprt_loc_cols: list of str
The xy-columns in the results dataframe corresponding to the local expert locations.
- vars_to_glue: str | list of str
The column(s) corresponding to the variables we wish to glue (e.g. the predictive mean and variance).
- inference_radius: int | float
The inference radius of each local expert. All experts are assumed to have the same inference radius.
- R: int | float, default 3
A scaling factor controlling the standard deviation of the Gaussian weights, which is given by std = inference_radius / R. The default value of 3 places about 99% of the Gaussian mass within the inference radius.
- Returns:
- pandas dataframe
A dataframe of glued predictions, whose columns contain (1) the prediction locations and (2) the glued variables.
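The statement that R = 3 places nearly all of the Gaussian mass inside the inference radius can be checked numerically. The sketch below computes the mass within R standard deviations for a 1D Gaussian and, assuming the 2D weights form an isotropic Gaussian in Euclidean distance (an assumption not confirmed by this page), for a 2D Gaussian. The function names are hypothetical.

```python
import math


def mass_within_radius_1d(R: float) -> float:
    """Mass of a 1D Gaussian within R standard deviations of its mean."""
    return math.erf(R / math.sqrt(2))


def mass_within_radius_2d(R: float) -> float:
    """Mass of an isotropic 2D Gaussian within a disc of radius R * std."""
    return 1 - math.exp(-R ** 2 / 2)


print(round(mass_within_radius_1d(3), 4))  # 0.9973
print(round(mass_within_radius_2d(3), 4))  # 0.9889
```

Smaller values of R widen the weights relative to the inference radius, so more of the Gaussian mass falls outside it.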