pisa.utils.hypersurface package

Submodules

pisa.utils.hypersurface.hyper_interpolator module

Classes and methods needed to do hypersurface interpolation over arbitrary parameters.

class pisa.utils.hypersurface.hyper_interpolator.HypersurfaceInterpolator(interpolation_param_spec, hs_fits, ignore_nan=True)[source]

Bases: object

Factory for interpolated hypersurfaces.

After being initialized with a set of hypersurface fits produced at different parameters, it uses interpolation to produce a Hypersurface object at a given point in parameter space using scipy’s RegularGridInterpolator.

The interpolation is piecewise-linear between points. All points must lie on a rectilinear ND grid.
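As a rough illustration of what piecewise-linear interpolation on a rectilinear grid means, the following standalone sketch interpolates a single coefficient in two dimensions. It does not use PISA or scipy; the function name `interp_rectilinear_2d` and the grid values are hypothetical and chosen only for this example.

```python
from bisect import bisect_right


def interp_rectilinear_2d(xs, ys, values, x, y):
    """Multilinear interpolation on a rectilinear 2-D grid.

    xs, ys : strictly increasing grid coordinates (spacing may be uneven)
    values : values[i][j] is the quantity stored at (xs[i], ys[j])
    """
    def locate(grid, v):
        # Index of the lower edge of the cell containing v, clamped so that
        # points outside the grid reuse the outermost cell (extrapolation).
        i = bisect_right(grid, v) - 1
        return min(max(i, 0), len(grid) - 2)

    i, j = locate(xs, x), locate(ys, y)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    # Interpolate along x on both y-edges of the cell, then along y.
    low = values[i][j] * (1 - tx) + values[i + 1][j] * tx
    high = values[i][j + 1] * (1 - tx) + values[i + 1][j + 1] * tx
    return low * (1 - ty) + high * ty


# Hypothetical coefficient of one bin, fitted at 3 x 2 grid points.
xs = [0.0, 1.0, 3.0]
ys = [10.0, 20.0]
coeff = [[1.0, 2.0], [3.0, 4.0], [7.0, 8.0]]
print(interp_rectilinear_2d(xs, ys, coeff, 2.0, 15.0))  # 5.5
```

The grid spacing does not need to be uniform, but every combination of grid values must have a fit, which is what "rectilinear" requires.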

Parameters:

interpolation_param_spec (dict) –

Specification of the interpolation parameter grid, of the form:

interpolation_param_spec = {
    'param1': {"values": [val1_1, val1_2, ...], "scales_log": True/False}
    'param2': {"values": [val2_1, val2_2, ...], "scales_log": True/False}
    ...
    'paramN': {"values": [valN_1, valN_2, ...], "scales_log": True/False}
}

where values are given as Quantity.
hs_fits (list of dict) – list of dicts with hypersurfaces that were fit at the points of the parameter mesh defined by interpolation_param_spec
ignore_nan (bool) – Ignore empty bins in hypersurfaces. The intercept in those bins is set to 1 and all slopes are set to 0.
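The effect of ignore_nan on a single bin can be sketched as follows. This is a minimal illustration, assuming that an empty bin shows up as NaN coefficients and that the coefficients are ordered as intercept followed by slopes; the helper name `patch_empty_bin` is hypothetical and not part of PISA.

```python
import math


def patch_empty_bin(coeffs):
    """Return neutral coefficients if any fitted coefficient is NaN.

    coeffs : [intercept, slope_1, ..., slope_n] for a single bin
    """
    if any(math.isnan(c) for c in coeffs):
        # Intercept 1 and all slopes 0, as described for ignore_nan=True.
        return [1.0] + [0.0] * (len(coeffs) - 1)
    return list(coeffs)


print(patch_empty_bin([float("nan"), float("nan"), float("nan")]))
print(patch_empty_bin([0.98, 0.05, -0.01]))
```

With an intercept of 1 and zero slopes, such a bin is left unchanged when the hypersurface is used as a re-weighting factor.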

Notes

Be sure to give a support that covers the entire relevant parameter range and a good distance beyond! To prevent minimization failure from NaNs, extrapolation is used if hypersurfaces outside the support are requested but needless to say these numbers are unreliable.
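One way to guard against silently relying on extrapolated values is to check the requested point against the grid before use. The sketch below is an assumption-laden illustration: it uses plain floats instead of Quantity objects, and the helper `outside_support` is hypothetical rather than a PISA function.

```python
def outside_support(grid_values, requested):
    """Return names of parameters whose requested value leaves the grid."""
    return [
        name
        for name, value in requested.items()
        if not min(grid_values[name]) <= value <= max(grid_values[name])
    ]


# Hypothetical grid values per interpolation parameter.
grid = {"param1": [0.5, 1.0, 1.5], "param2": [10.0, 100.0, 1000.0]}
print(outside_support(grid, {"param1": 1.2, "param2": 2000.0}))  # ['param2']
```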

See also

scipy.interpolate.RegularGridInterpolator: class used for interpolation

property binning

get_hypersurface(**param_kw)[source]

Get a Hypersurface object with interpolated coefficients.

Parameters: **param_kw – Parameters are given as keyword arguments, where the names of the arguments must match the names of the parameters over which the hypersurfaces are interpolated. The values are given as Quantity objects with units.

property interpolation_param_names

property num_interp_params

property param_names

plot_fits_in_bin(bin_idx, ax=None, n_steps=20, **param_kw)[source]

Plot the coefficients as well as covariance matrix elements as a function of the interpolation parameters.

Parameters:

bin_idx (tuple) – index of the bin for which to plot the fits
ax (2D array of axes, optional) – axes into which to place the plots. If None (default), appropriate axes will be generated. Must have at least size (n_coeff, n_coeff + 1).
n_steps (int, optional) – number of steps to plot between minimum and maximum
**param_kw – Parameters to be fixed when producing slices. If the interpolation is in N-D, then (N-2) parameters need to be fixed to produce 2D plots of the remaining 2 parameters and (N-1) need to be fixed to produce a 1D slice.

pisa.utils.hypersurface.hyper_interpolator.assemble_interpolated_fits(fit_directory, output_file, drop_fit_maps=False, leftout_param=None, leftout_surface=None)[source]

After all of the fits on the cluster are done, assemble the results to one JSON.

The JSON produced by this function is what load_interpolated_hypersurfaces expects.

pisa.utils.hypersurface.hyper_interpolator.get_incomplete_job_idx(fit_directory)[source]: Get job indices of fits that are not flagged as successful.

pisa.utils.hypersurface.hyper_interpolator.load_interpolated_hypersurfaces(input_file, expected_binning=None)[source]

Load a set of interpolated hypersurfaces from a file.

Analogously to “load_hypersurfaces”, this function returns a collection with a HypersurfaceInterpolator object for each Map.

Parameters:

input_file (str) –

A JSON input file as produced by fit_hypersurfaces if interpolation params were given. It has the form:

{
    interpolation_param_spec = {
        'param1': {"values": [val1_1, val1_2, ...], "scales_log": True/False}
        'param2': {"values": [val2_1, val2_2, ...], "scales_log": True/False}
        ...
        'paramN': {"values": [valN_1, valN_2, ...], "scales_log": True/False}
    },
    'hs_fits': [
        <list of dicts where keys are map names such as 'nue_cc' and values
        are hypersurface states>
    ]
}

Returns:

dictionary with a HypersurfaceInterpolator for each map

Return type:

collections.OrderedDict

pisa.utils.hypersurface.hyper_interpolator.pipeline_cfg_from_states(state_dict)[source]

Recover a pipeline cfg containing PISA objects from a raw state.

When a pipeline configuration is stored to JSON, the PISA objects turn into their serialized states. This function looks through the dictionary returned by from_json and recovers the PISA objects such as ParamSet and MultiDimBinning.

It should really become part of PISA file I/O functionality to read and write PISA objects inside dictionaries/lists into a JSON and be able to recover them…

pisa.utils.hypersurface.hyper_interpolator.prepare_interpolated_fit(nominal_dataset, sys_datasets, params, fit_directory, interpolation_param_spec, combine_regex=None, log=False, minimum_mc=0, **hypersurface_fit_kw)[source]

Writes steering files for fitting hypersurfaces on a grid of arbitrary parameters. The fits can then be run on a cluster with run_interpolated_fit.

Parameters:

nominal_dataset (dict) –
Definition of the nominal dataset. Specifies the pipeline with which the maps can be created, and the values of all systematic parameters used to produce the dataset. Format must be:

nominal_dataset = {
    "pipeline_cfg": <pipeline cfg file (either cfg file path or dict)>,
    "sys_params": {param_0_name: param_0_value_in_dataset, ..., param_N_name: param_N_value_in_dataset}
}

Sys params must correspond to the HypersurfaceParam instances provided in the params arg.
sys_datasets (list of dicts) – List of dicts, where each dict defines one of the systematics datasets to be fitted. The format of each dict is the same as explained for nominal_dataset
params (list of HypersurfaceParam) – List of HypersurfaceParam instances that define the hypersurface. Note that this defines ALL hypersurfaces fitted in this function, i.e. only a single parameterisation is supported for all maps (this is almost always what you want).
fit_directory (str) – Directory in which the fits will be run. Steering files for the fits to be run will be stored here.
combine_regex (list of str, or None) – List of regular expression strings that will be used for merging maps. Used to combine similar species. Must be something that can be passed to the MapSet.combine_re function (see that function's docs for more details). Choose None if you do not want to perform this merging.
interpolation_param_spec (collections.OrderedDict) –
Specification of parameter grid that hypersurfaces should be interpolated over. The dict should have the following form:
```
interpolation_param_spec = {
    'param1': {"values": [val1_1, val1_2, ...], "scales_log": True/False}
    'param2': {"values": [val2_1, val2_2, ...], "scales_log": True/False}
    ...
    'paramN': {"values": [valN_1, valN_2, ...], "scales_log": True/False}
}
```
The hypersurfaces will be fit on an N-dimensional rectilinear grid over parameters 1 to N. The flag scales_log indicates that the interpolation over that parameter should happen in log-space.
minimum_mc (int, optional) – Minimum number of un-weighted MC events required in each bin.
hypersurface_fit_kw (kwargs) – kwargs will be passed on to the calls to Hypersurface.fit
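To illustrate what the scales_log flag in interpolation_param_spec changes, the sketch below interpolates linearly between two grid points, once directly and once in the logarithm of the parameter. The function `interp_1d` is a hypothetical helper written for this example, not PISA code; the base of the logarithm does not affect the result.

```python
import math


def interp_1d(x0, x1, y0, y1, x, scales_log=False):
    """Linear interpolation between two grid points, optionally in log(x)."""
    if scales_log:
        x0, x1, x = math.log(x0), math.log(x1), math.log(x)
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


# Between grid values 1 and 100, the point 10 is halfway in log-space
# but only about 9% of the way in linear space.
print(interp_1d(1.0, 100.0, 0.0, 1.0, 10.0, scales_log=True))
print(interp_1d(1.0, 100.0, 0.0, 1.0, 10.0, scales_log=False))
```

Log-space interpolation is the natural choice when the grid values of a parameter span several orders of magnitude.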

pisa.utils.hypersurface.hyper_interpolator.run_interpolated_fit(fit_directory, job_idx, skip_successful=False)[source]

Run the hypersurface fit for a grid point.

If skip_successful is true, do not run if the fit_successful flag is already True.

pisa.utils.hypersurface.hyper_interpolator.serialize_pipeline_cfg(pipeline_cfg)[source]

Turn a pipeline configuration into something we can store to JSON.

It doesn’t work by default because tuples are not allowed as keys when storing to JSON. All we do is to turn the tuples into strings divided by a double underscore.
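The key conversion described above can be sketched with the standard library alone. This is an illustrative re-implementation under stated assumptions, not the PISA source: the helper name `tuple_keys_to_str`, the example stage key, and the file name are hypothetical, and tuple elements are assumed to be strings.

```python
import json


def tuple_keys_to_str(cfg, sep="__"):
    """Recursively join tuple keys into strings so json.dumps accepts them."""
    if isinstance(cfg, dict):
        return {
            (sep.join(key) if isinstance(key, tuple) else key):
                tuple_keys_to_str(value, sep)
            for key, value in cfg.items()
        }
    return cfg


# Hypothetical configuration keyed by a (stage, service) tuple.
cfg = {("data", "simple_data_loader"): {"events_file": "events.hdf5"}}
print(json.dumps(tuple_keys_to_str(cfg)))
```

Calling json.dumps on the original cfg raises a TypeError, because JSON object keys must be strings (or other basic scalar types).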

pisa.utils.hypersurface.hypersurface module

Tools for working with hypersurfaces, which are continuous functions in N-D with arbitrary functional forms.

Hypersurfaces can be used to model systematic uncertainties derived from discrete simulation datasets, for example for detector uncertainties.

class pisa.utils.hypersurface.hypersurface.Hypersurface(params, initial_intercept=None, log=False)[source]

Bases: object

A class defining the hypersurface

Contains:

A single common intercept
N systematic parameters, inside which the functional form is defined

This class can be configured to hold both the functional form of the hypersurface and values (likely fitted from simulation datasets) for the free parameters of this functional form.

Fitting functionality is provided to fit these free parameters.

This class can simultaneously hold hypersurfaces for every bin in a histogram (Map).

The functional form of the systematic parameters can be arbitrarily complex.

The class has a fit method for fitting the hypersurface to some data (e.g. discrete systematics sets).

Serialization functionality is included to allow fitted hypersurfaces to be stored to a file and re-loaded later (e.g. to be used in analysis).

The main use cases are:

Fit hypersurfaces
- Define the desired HypersurfaceParams (functional form, initial coefficient guesses).
- Instantiate the Hypersurface class, providing the hypersurface params and initial intercept guess.
- Use the Hypersurface.fit function (or, more likely, the fit_hypersurfaces helper function provided below) to fit the hypersurface coefficients to some provided datasets.
- Store to file
Evaluate an existing hypersurface
- Load existing fitted Hypersurface from a file (load_hypersurfaces helper function)
- Get the resulting hypersurface value for each bin for a given set of systematic param values using the Hypersurface.evaluate method.
- Use the hypersurface value for each bin to re-weight events

The class stores information about the datasets used to fit the hypersurfaces, including the Maps used and nominal and systematic parameter values.

Parameters:

params (list) – A list of HypersurfaceParam instances defining the hypersurface. The initial_fit_coeffts values in these instances will be used as the starting point for any fits.
initial_intercept (float) – Starting point for the hypersurface intercept in any fits
log (bool, optional) – Set hypersurface to log mode. The surface is fit to the log of the bin counts. The fitted surface is exponentiated during evaluation. Default: False

evaluate(param_values, bin_idx=None, return_uncertainty=False)[source]

Evaluate the hypersurface, using the systematic parameter values provided. Uses the current internal values for all functional form coefficients.

Parameters:

param_values (dict) –
A dict specifying the values of the systematic parameters to use in the evaluation. Format is:

{ sys_param_name_0 : sys_param_0_val, …, sys_param_name_N : sys_param_N_val }. The keys must be strings and correspond to the HypersurfaceParam instances. The values must be scalars.
bin_idx (tuple or None) – Optionally specify a particular bin (using numpy indexing). Otherwise, all bins are evaluated.
return_uncertainty (bool, optional) – return the uncertainty on the output (default: False)
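The evaluation step for a single bin might look like the toy sketch below. It assumes a purely linear functional form for every parameter, which is only one of the possibilities; the real functional forms are defined by the HypersurfaceParam instances. The function `evaluate_bin` and its `slopes` argument are hypothetical names for this illustration.

```python
import math


def evaluate_bin(intercept, slopes, param_values, log=False):
    """Evaluate a linear toy hypersurface in one bin.

    slopes       : {param_name: fitted slope} (assumed linear form)
    param_values : {param_name: scalar value}, same keys as slopes
    log          : if True, the surface describes the log of the bin
                   counts and is exponentiated on evaluation
    """
    value = intercept + sum(
        slopes[name] * param_values[name] for name in slopes
    )
    return math.exp(value) if log else value


print(evaluate_bin(1.0, {"sys_param_0": 0.1}, {"sys_param_0": 2.0}))
print(evaluate_bin(0.0, {"sys_param_0": 0.0}, {"sys_param_0": 2.0}, log=True))
```

In log mode, an intercept of 0 with vanishing slopes corresponds to a re-weighting factor of 1.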

fit(nominal_map, nominal_param_values, sys_maps, sys_param_values, norm=True, method='L-BFGS-B', fix_intercept=False, intercept_bounds=None, intercept_sigma=None, include_empty=False, keep_maps=True, ref_bin_idx=None, smooth_method=None, smooth_kw=None)[source]

Fit the hypersurface coefficients (in every bin) to best match the provided nominal and systematic datasets.

Writes the results directly into this data structure.

Parameters:

nominal_map (Map) – Map from the nominal dataset
nominal_param_values (dict) – Value of each systematic param used to generate the nominal dataset. Format: { param_0_name : param_0_nom_val, …, param_N_name : param_N_nom_val }
sys_maps (list of Maps) – List containing the Map from each systematic dataset
sys_param_values (list of dicts) – List where each element is a dict containing the values of each systematic param used to generate that dataset. Each list element specifies the parameters for the corresponding element in sys_maps.
norm (bool) – Normalise the maps to the nominal map. This is what you want to do when using the hypersurface to re-weight simulation (which is the main use case). In principle the hypersurfaces are more general, though, and could be used for other tasks too, hence this option.
method (str) – method arg to pass to scipy.optimize.minimize
fix_intercept (bool) – Fix intercept to the initial intercept.
intercept_bounds (2-tuple, optional) – Bounds on the intercept. Default is None (no bounds)
include_empty (bool) – Include empty bins in the fit. If True, empty bins are included with value 0 and sigma 1. Default: False
keep_maps (bool) – Keep maps used to make the fit. If False, maps will be set to None after the fit is complete. This helps to reduce the size of JSON files if the Hypersurface is to be stored on disk.
ref_bin_idx (tuple) – An index specifying a reference bin that will be used for logging

property fit_coefft_labels: Return labels for each fit coefficient

property fit_coeffts: Return all coefficients, in all bins, as a single array This is the overall intercept, plus the coefficients for each individual param Dimensions are: [binning …, fit coeffts]

property fit_maps: Return the `Map instances used for fitting These will be normalised if the fit was performend to normalised maps.

property fit_param_values: Return the stored systematic parameters from the datasets used for fitting Returns: { param_0_name : [ param_0_sys_val_0, …, param_0_sys_val_M ], …, param_N_name : [ param_N_sys_val_0, …, param_N_sys_val_M ] }

fluctuate(random_state=None)[source]

Return a new hypersurface object whose coefficients have been randomly fluctuated according to the fit covariance matrix.

Used for testing the impact of statistical uncertainty in the hypersurfaces fits on downstream analyses.
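A generic way to fluctuate coefficients according to a covariance matrix is to draw from a multivariate Gaussian via a Cholesky factor. The sketch below shows that technique for one bin using only the standard library. How PISA implements fluctuate internally is not stated here, so the helpers `cholesky` and `fluctuate_coeffs` and the example numbers are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L @ L.T == cov (cov symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def fluctuate_coeffs(coeffs, cov, rng):
    """Draw coefficients from a Gaussian with mean coeffs and covariance cov."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in coeffs]  # independent standard normals
    return [
        c + sum(L[i][k] * z[k] for k in range(i + 1))
        for i, c in enumerate(coeffs)
    ]


# Hypothetical intercept and slope of one bin with their fit covariance.
rng = random.Random(0)
print(fluctuate_coeffs([1.0, 0.1], [[0.04, 0.01], [0.01, 0.09]], rng))
```

Passing a seeded random.Random instance makes the draws reproducible, which plays a role similar to the random_state argument.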

classmethod from_state(state)[source]

Instantiate a new object from the contents of a serialized state dict

Parameters: state (dict) – A serialized state dict

pisa.utils.hypersurface.hypersurface_plotting module

Hypersurface Plotting functions

pisa.utils.hypersurface.hypersurface_plotting.plot_bin_fits(ax, hypersurface, bin_idx, param_name, color=None, label=None, hs_label=None, show_nominal=False, show_offaxis=True, show_onaxis=True, show_zero=False, show_uncertainty=True, xlim=None)[source]

Plot the hypersurface for a given bin, in 1D w.r.t. a single specified parameter. Plots the following:

on-axis data points used in the fit

hypersurface w.r.t. the specified parameter (1D)

nominal value of the specified parameter

Parameters:

ax (matplotlib.Axes) – matplotlib ax to draw the plot on
hypersurface (Hypersurface) – Hypersurface to make the plots from
bin_idx (tuple) – Index (numpy array indexing format) of the bin to plot
param_name (str) – Name of the parameter of interest
color (str) – color to use for hypersurface curve
label (str) – label to use for hypersurface curve
show_nominal (bool) – Indicate the nominal value of the param on the plot
show_uncertainty (bool) – Indicate the hypersurface uncertainty on the plot
show_onaxis (bool) – Plot the “on-axis” input datasets (meaning those whose only off-nominal parameter is the one being plotted).
show_offaxis (bool) – Plot the “off-axis” input datasets (meaning those with multiple off-nominal parameter values).
xlim (tuple or None) – Optionally, specify the xlim to span when plotting the hypersurface. If not specified, will span all input datasets.

pisa.utils.hypersurface.hypersurface_plotting.plot_bin_fits_2d(ax, hypersurface, bin_idx, param_names)[source]

Plot the hypersurface for a given bin, in 2D w.r.t. a pair of params. Plots the following:

All data points used in the fit

hypersurface w.r.t. the specified parameters (2D)

nominal value of the specified parameters

Parameters:

ax (matplotlib.Axes) – matplotlib ax to draw the plot on
hypersurface (Hypersurface) – Hypersurface to make the plots from
bin_idx (tuple) – Index (numpy array indexing format) of the bin to plot
param_names (list of str) – List containing the names of the two parameters of interest
