pisa.stages.utils package

Submodules

pisa.stages.utils.add_indices module

PISA module to prep incoming data into formats that are compatible with the mc_uncertainty likelihood formulation

This module takes in event containers from the pipeline and adds an array giving the index of the bin into which each event falls.

Module structure imported from the bootcamp example.

class pisa.stages.utils.add_indices.add_indices(**std_kwargs)[source]

Bases: Stage

PISA Pi stage to map each event to the index of the analysis bin into which it falls.

Parameters:

params – foo : Quantity; bar : Quantity with time dimension

Notes

- Input and calc specs are predetermined in the module (inputs from the config files will be disregarded).
- The stage appends an array quantity called bin_indices.
- The stage also appends an array mask to access events by bin index later in the pipeline.

setup_function()[source]

Calculate the index of the bin into which each event falls.

Create one mask for each analysis bin.
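The two steps above (find the bin index of each event, then build one mask per bin) can be pictured with a minimal pure-Python sketch. The helper names `find_bin_indices` and `make_bin_masks` are hypothetical, as is the choice to flag out-of-range events with -1; this is not the stage's actual implementation.

```python
from bisect import bisect_right

def find_bin_indices(values, bin_edges):
    """Return, for each value, the index of the bin it falls into.

    Values outside the binning get -1 (an assumption of this sketch).
    """
    n_bins = len(bin_edges) - 1
    indices = []
    for v in values:
        if v < bin_edges[0] or v > bin_edges[-1]:
            indices.append(-1)
            continue
        # bisect_right counts the edges <= v; the last edge is made inclusive
        idx = min(bisect_right(bin_edges, v) - 1, n_bins - 1)
        indices.append(idx)
    return indices

def make_bin_masks(indices, n_bins):
    """Build one boolean mask per bin, selecting the events in that bin."""
    return [[i == b for i in indices] for b in range(n_bins)]

edges = [0.0, 1.0, 2.0, 3.0]
energies = [0.5, 2.5, 1.0, 3.0, 7.0]
bin_indices = find_bin_indices(energies, edges)
masks = make_bin_masks(bin_indices, len(edges) - 1)
```

Storing the masks once means later stages can select the events of a bin without repeating the lookup.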

pisa.stages.utils.adhoc_sys module

Stage to implement an ad-hoc systematic that corrects the discrepancy between data and MC in one particular variable. This can be used to check how large the impact of such a hypothetical systematic would be on the physics parameters of an analysis.

class pisa.stages.utils.adhoc_sys.adhoc_sys(data=None, params=None, variable_name=None, scale_file=None, **std_kwargs)[source]

Bases: Stage

Stage to re-weight events according to factors derived from post-fit data/MC comparisons. The comparisons are produced externally and stored as a JSON file that encodes the binning used to make the comparison and the resulting scaling factors.

Parameters:

variable_name (str) – Name of the variable to correct data/MC agreement for. The variable must be loaded in the data loading stage and it must be present in the loaded JSON file.
scale_file (str) – Path to the file which contains the binning and the scale factors. The JSON file must contain a dictionary that stores, for each variable, a 1D binning and an array of factors. This file is produced externally from PISA.
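As an illustration of the idea, the sketch below re-weights events using per-bin factors read from a JSON dictionary. The key names `bin_edges` and `scales`, the variable name `reco_energy`, and the helper `reweight` are assumptions made for this example; the layout PISA actually expects may differ.

```python
import json
from bisect import bisect_right

# Hypothetical file content: the key names "bin_edges" and "scales" are
# assumptions of this sketch, not the format PISA requires.
scale_json = json.dumps({
    "reco_energy": {"bin_edges": [1.0, 10.0, 100.0], "scales": [1.1, 0.9]}
})

def reweight(values, weights, variable_name, scale_dict):
    """Multiply each weight by the factor of the bin its value falls into."""
    edges = scale_dict[variable_name]["bin_edges"]
    scales = scale_dict[variable_name]["scales"]
    new_weights = []
    for v, w in zip(values, weights):
        idx = bisect_right(edges, v) - 1
        # Events outside the binning keep their weight (sketch assumption).
        if 0 <= idx < len(scales):
            w = w * scales[idx]
        new_weights.append(w)
    return new_weights

scale_dict = json.loads(scale_json)
new_w = reweight([5.0, 50.0, 500.0], [2.0, 2.0, 2.0], "reco_energy", scale_dict)
```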

apply_function()[source]: Implement in services (subclasses of Stage)

setup_function()[source]: Implement in services (subclasses of Stage)

pisa.stages.utils.adhoc_sys.init_test(**param_kwargs)[source]: Instantiation example

pisa.stages.utils.bootstrap module

Make bootstrap samples of data.

This stage allows one to resample datasets to estimate MC uncertainties without having to decrease statistics. Bootstrap samples are produced by random selection with replacement, which is implemented in this stage by an equivalent re-weighting of events.
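The equivalence between selection with replacement and re-weighting can be sketched as follows: draw N event indices with replacement, count how often each event was drawn, and multiply its weight by that count. The function `bootstrap_weights` is a hypothetical helper that uses Python's standard `random` module; it is not the stage's own code.

```python
import random
from collections import Counter

def bootstrap_weights(weights, seed=None):
    """Re-weight events to mimic one bootstrap sample.

    Drawing N events with replacement is equivalent to multiplying each
    event's weight by the number of times it was drawn.
    """
    rng = random.Random(seed)
    n = len(weights)
    # Counter returns 0 for events that were never drawn.
    draws = Counter(rng.randrange(n) for _ in range(n))
    return [w * draws[i] for i, w in enumerate(weights)]

weights = [1.0, 0.5, 2.0, 1.5]
sample_a = bootstrap_weights(weights, seed=42)
sample_b = bootstrap_weights(weights, seed=42)
```

Because no events are dropped from the arrays, the array shapes stay fixed and the same seed reproduces the same sample.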

class pisa.stages.utils.bootstrap.bootstrap(seed=None, **std_kwargs)[source]

Bases: Stage

Stage to make bootstrap samples from input data.

Parameters: seed (int, optional) – Seed for the random number generator.

apply_function()[source]: Implement in services (subclasses of Stage)

setup_function()[source]: Implement in services (subclasses of Stage)

pisa.stages.utils.bootstrap.init_test(**param_kwargs)[source]: Instantiation example

pisa.stages.utils.bootstrap.insert_bootstrap_after_data_loader(cfg_dict, seed=None)[source]

Given a pipeline configuration parsed with parse_pipeline_config, insert the bootstrap stage directly after the simple_data_loader stage and return the modified config dict.

Parameters:

cfg_dict (collections.OrderedDict) – Pipeline configuration in the form of an ordered dictionary.
seed (int, optional) – Seed to be placed into the pipeline configuration.

Returns:

A deepcopy of the original input cfg_dict with the configuration of the bootstrap stage inserted after the data loader.

Return type:

collections.OrderedDict
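The insertion can be pictured with a generic helper that copies an ordered dictionary and places a new entry directly after a chosen key. The tuple keys and the configuration contents below are placeholders assumed for this sketch, and `insert_after` is a hypothetical function; the real keys produced by parse_pipeline_config may look different.

```python
from collections import OrderedDict
from copy import deepcopy

def insert_after(cfg_dict, anchor_key, new_key, new_cfg):
    """Return a deep copy of cfg_dict with new_key inserted after anchor_key."""
    if anchor_key not in cfg_dict:
        raise KeyError(f"{anchor_key!r} not found in configuration")
    out = OrderedDict()
    for key, value in cfg_dict.items():
        out[key] = deepcopy(value)
        if key == anchor_key:
            out[new_key] = deepcopy(new_cfg)
    return out

# Hypothetical configuration; key names and contents are placeholders.
cfg = OrderedDict([
    (("data", "simple_data_loader"), {"events_file": "events.hdf5"}),
    (("utils", "hist"), {}),
])
new_cfg = insert_after(cfg, ("data", "simple_data_loader"),
                       ("utils", "bootstrap"), {"seed": 0})
```

Working on a deep copy leaves the input configuration untouched, which matches the documented return value.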

pisa.stages.utils.bootstrap.test_bootstrap()[source]: Unit test for the bootstrap stage.

pisa.stages.utils.fix_error module

Stage to take the initial errors of MC and keep them for all minimization.

Needed for the DRAGON nutau appearance analysis.

class pisa.stages.utils.fix_error.fix_error(**std_kwargs)[source]

Bases: Stage

Stage to fix the error returned by template_maker.

apply_function()[source]: Implement in services (subclasses of Stage)

compute_function()[source]: Implement in services (subclasses of Stage)

setup_function()[source]: Implement in services (subclasses of Stage)

pisa.stages.utils.hist module

Stage to transform arrays with weights into actual histograms that represent event counts

class pisa.stages.utils.hist.hist(apply_unc_weights=False, unweighted=False, **std_kwargs)[source]

Bases: Stage

Stage to histogram events

Parameters:

unweighted (bool, default False) – Return un-weighted event counts in each bin
apply_unc_weights (bool, default False) –
Expected container keys are:
```
"weights"
"unc_weights" (if `apply_unc_weights`)
```
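A minimal sketch of what histogramming weighted events means: each event adds its weight (or 1 when `unweighted` is set) to the bin its value falls into. The function `histogram` is a hypothetical one-dimensional helper with half-open bins, written for illustration only.

```python
from bisect import bisect_right

def histogram(values, weights, bin_edges, unweighted=False):
    """Sum weights (or raw counts) per bin; bins are half-open [lo, hi)."""
    counts = [0.0] * (len(bin_edges) - 1)
    for v, w in zip(values, weights):
        idx = bisect_right(bin_edges, v) - 1
        # Events outside the binning are ignored.
        if 0 <= idx < len(counts):
            counts[idx] += 1.0 if unweighted else w
    return counts

values = [0.5, 1.5, 1.7, 5.0]
weights = [2.0, 0.5, 0.25, 1.0]
edges = [0.0, 1.0, 2.0]
weighted_counts = histogram(values, weights, edges)
raw_counts = histogram(values, weights, edges, unweighted=True)
```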

apply_function()[source]: Implement in services (subclasses of Stage)

setup_function()[source]: Implement in services (subclasses of Stage)

pisa.stages.utils.hist.init_test(**param_kwargs)[source]: Instantiation example

pisa.stages.utils.kde module

Stage to transform arrays with weights into KDE maps that represent event counts

pisa.stages.utils.kde.init_test(**param_kwargs)[source]: Initialisation example

class pisa.stages.utils.kde.kde(bw_method='silverman', coszen_name='reco_coszen', oversample=10, coszen_reflection=0.25, alpha=0.1, stack_pid=True, stash_hists=False, bootstrap=False, bootstrap_niter=10, bootstrap_seed=None, linearize_log_dims=True, **std_kargs)[source]

Bases: Stage

Stage to KDE-map events.

Parameters:

bw_method (string) – ‘scott’ or ‘silverman’ (see kde module)
coszen_name (string) – Binning name to identify the coszen bin that needs to undergo special treatment for reflection
oversample (int) – Evaluate the KDE at more points per bin; this takes longer but is more accurate.
stash_hists (bool) – Evaluate KDE only once and stash the result. This effectively ignores all changes from earlier stages, but greatly increases speed. Useful for muons where only over-all weight and detector systematic variations matter, which can both be applied on the histograms after this stage.
bootstrap (bool) – Use the bootstrapping technique to estimate errors on the KDE histograms.
linearize_log_dims (bool) – If True (default), calculate the KDE for a dimension that is binned logarithmically on the logarithm of the sample values. This generally results in better agreement of the total normalization of the KDE’d histograms to the sum of weights.

Notes

Make sure enough events are present with reco energy below and above the binning range; otherwise, events will only “bleed out”.
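The role of `oversample` can be illustrated with a one-dimensional weighted Gaussian KDE that is evaluated at several points inside each bin and averaged. The fixed `bandwidth` argument and the helper `kde_histogram` are assumptions of this sketch; it implements neither the 'scott' or 'silverman' rules nor the coszen reflection.

```python
import math

def kde_histogram(samples, weights, bin_edges, bandwidth, oversample=10):
    """Estimate expected counts per bin from a weighted Gaussian KDE.

    The density is evaluated at `oversample` evenly spaced points inside
    each bin, averaged, and multiplied by the bin width.
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return sum(
            w * norm * math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for s, w in zip(samples, weights)
        )

    counts = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        width = hi - lo
        # Midpoints of `oversample` equal sub-intervals of the bin.
        points = [lo + (k + 0.5) * width / oversample for k in range(oversample)]
        counts.append(width * sum(density(x) for x in points) / oversample)
    return counts

samples = [-0.5, 0.0, 0.3]
weights = [1.0, 2.0, 1.0]
edges = [-5.0 + i for i in range(11)]  # ten unit-width bins from -5 to 5
counts = kde_histogram(samples, weights, edges, bandwidth=0.5)
```

With a binning much wider than the kernel, the bin contents add up to the sum of weights. A binning that cuts into the sample distribution loses the part of each kernel that extends past the edges, which is the "bleed out" the note warns about.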

apply()[source]: This is special, we want the actual event weights in the kde therefor we’re overwritting the apply function normally in a stage you would implement the apply_function method and not the apply method! We also have to reimplement the profiling functionality in apply of the Base class

apply_function()[source]: Implement in services (subclasses of Stage)

setup_function()[source]: Implement in services (subclasses of Stage)

pisa.stages.utils.kfold module

Make K-folds of data.

This stage can be used to split MC into chunks of equal size and to select only one chunk to make histograms from. It uses the KFold class from scikit-learn to make “test” and “train” indices for the dataset and sets all weights in the “train” indices to zero. Optionally, weights can be re-scaled by the number of splits to renormalize the total rates.
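The selection logic can be sketched without scikit-learn. The hypothetical helper `kfold_weights` below uses a hand-written contiguous split in place of the KFold class, keeps the weights of the selected chunk, zeroes all others, and optionally multiplies by the number of splits.

```python
import random

def kfold_weights(weights, n_splits, select_split=0, shuffle=False,
                  seed=None, renormalize=False):
    """Zero all weights outside the selected fold (hand-written split)."""
    n = len(weights)
    indices = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one event.
    base, extra = divmod(n, n_splits)
    sizes = [base + 1 if k < extra else base for k in range(n_splits)]
    start = sum(sizes[:select_split])
    selected = set(indices[start:start + sizes[select_split]])
    scale = n_splits if renormalize else 1
    return [w * scale if i in selected else 0.0 for i, w in enumerate(weights)]

weights = [1.0] * 6
fold = kfold_weights(weights, n_splits=3, select_split=1, renormalize=True)
```

With equal weights and renormalization, the selected third of the events carries the total rate of the full sample.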

pisa.stages.utils.kfold.init_test(**param_kwargs)[source]: Initialisation example

class pisa.stages.utils.kfold.kfold(n_splits, select_split=0, seed=None, renormalize=False, shuffle=False, save_mask=False, **std_kwargs)[source]

Bases: Stage

Stage to make splits of the MC set and select one split to make histograms. The weights of all indices not belonging to the selected split are set to zero.

Parameters:

n_splits (int)
select_split (int, optional)
seed (int, optional)
renormalize (bool, optional) – Re-scale weights by the number of splits.
shuffle (bool, optional) – Shuffle indices before splitting.
save_mask (bool, optional)

apply_function()[source]: Implement in services (subclasses of Stage)

setup_function()[source]: Implement in services (subclasses of Stage)

pisa.stages.utils.resample module

Stage to transform binned data from one binning to another while also dealing with uncertainty estimates in a reasonable way. In particular, this allows up-sampling from a more coarse binning to a finer binning.

The implementation is similar to that of the hist stage, hence the over-writing of the apply method.
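To make the idea of moving between a coarse and a fine binning concrete, the sketch below splits each coarse bin into equal-width fine bins and merges them back. The helpers `upsample` and `downsample` are hypothetical, they assume that events are spread uniformly inside a coarse bin, and they leave the treatment of uncertainties aside; the stage's own procedure may differ.

```python
def upsample(coarse_counts, factor):
    """Split each coarse bin into `factor` equal-width fine bins.

    Assumes events are spread uniformly inside a coarse bin, so each fine
    bin receives an equal share and the total count is preserved.
    """
    fine = []
    for c in coarse_counts:
        fine.extend([c / factor] * factor)
    return fine

def downsample(fine_counts, factor):
    """Merge every `factor` neighbouring fine bins by summing their counts."""
    if len(fine_counts) % factor:
        raise ValueError("number of fine bins must be a multiple of factor")
    return [sum(fine_counts[i:i + factor])
            for i in range(0, len(fine_counts), factor)]

coarse = [4.0, 10.0]
fine = upsample(coarse, 2)
round_trip = downsample(fine, 2)
```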

class pisa.stages.utils.resample.ResampleMode(value)[source]

Bases: Enum

Enumerates sampling methods of the resample stage.

ARB = 3

DOWN = 2

UP = 1

pisa.stages.utils.resample.init_test(**param_kwargs)[source]: Initialisation example

class pisa.stages.utils.resample.resample(scale_errors=True, **std_kwargs)[source]

Bases: Stage

Stage to resample weighted MC histograms from one binning to another.

The origin binning is given as calc_mode and the output binning is given in apply_mode.

Parameters: scale_errors (bool, optional) – If True (default), apply scaling to errors.

apply()[source]

setup_function()[source]: Implement in services (subclasses of Stage)

pisa.stages.utils.resample.test_resample()[source]: Unit test for the resampling stage.

pisa.stages.utils.set_variance module

Override errors and replace with manually chosen error fraction.

pisa.stages.utils.set_variance.apply_floor(val, out)[source]

pisa.stages.utils.set_variance.init_test(**param_kwargs)[source]: Instantiation example

pisa.stages.utils.set_variance.set_constant(val, out)[source]

class pisa.stages.utils.set_variance.set_variance(variance_scale=1.0, variance_floor=None, expected_total_mc=None, divide_total_mc=False, **std_kwargs)[source]

Bases: Stage

Override errors and replace with manually chosen variance.

apply_function()[source]: Implement in services (subclasses of Stage)

compute_function()[source]: Implement in services (subclasses of Stage)

setup_function()[source]: Implement in services (subclasses of Stage)

Module contents