Loading data fields from data files

An analysis typically needs to load a set of data fields from a data file. The required fields are defined in the ['datafields'] section of the Config dictionary instance. Each field is assigned a stage, which states when the data field is required. There are two main stages: data preparation and analysis. Since a data field can exist in either an experimental data file or a Monte Carlo data file, each main stage is split into an EXP and an MC variant. Hence, the following stages exist:

DATAPREPARATION_EXP
DATAPREPARATION_MC
ANALYSIS_EXP
ANALYSIS_MC

All stages are defined in the skyllh.core.datafields.DataFieldStages class.

After the data of a Dataset instance has been loaded and prepared, only data fields with the stage ANALYSIS_EXP or ANALYSIS_MC remain available to the analysis. Data fields marked with the stage DATAPREPARATION_EXP or DATAPREPARATION_MC are available during the data preparation stage but not afterwards.

The following code shows how to define the data fields my_exp_field and my_mc_field, which should be loaded from the experimental and Monte Carlo data files, respectively.

[5]:
from skyllh.core.config import Config
from skyllh.core.datafields import DataFieldStages as DFS
[6]:
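cfg = Config()
# Keep my_exp_field from the experimental data file for the analysis stage.
cfg['datafields']['my_exp_field'] = DFS.ANALYSIS_EXP
# Load my_mc_field from the Monte Carlo data file for data preparation only.
cfg['datafields']['my_mc_field'] = DFS.DATAPREPARATION_MC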

my_exp_field will be available after the data files have been loaded and the data has been prepared by the optional data preparation functions, whereas my_mc_field will be available only at the data preparation stage and not at the analysis stage.
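To make the effect of the stage assignment concrete, the following self-contained sketch models which of the two fields would survive into the analysis stage. It does not use SkyLLH: the Stage enum, the ANALYSIS constant, and the fields_kept_for_analysis function are hypothetical stand-ins introduced only for illustration, and SkyLLH's actual implementation may differ.

```python
from enum import Flag, auto


class Stage(Flag):
    """Hypothetical stand-in for the four stages; not SkyLLH's own class."""
    DATAPREPARATION_EXP = auto()
    DATAPREPARATION_MC = auto()
    ANALYSIS_EXP = auto()
    ANALYSIS_MC = auto()


# Assumption: a field is kept for the analysis if its stage includes
# ANALYSIS_EXP or ANALYSIS_MC.
ANALYSIS = Stage.ANALYSIS_EXP | Stage.ANALYSIS_MC


def fields_kept_for_analysis(datafields):
    """Return the names of the fields whose stage overlaps an analysis stage."""
    return [name for name, stage in datafields.items() if stage & ANALYSIS]


# Same field names and stages as in the configuration example above.
datafields = {
    'my_exp_field': Stage.ANALYSIS_EXP,
    'my_mc_field': Stage.DATAPREPARATION_MC,
}

kept = fields_kept_for_analysis(datafields)
dropped = [name for name in datafields if name not in kept]

print(kept)     # ['my_exp_field']
print(dropped)  # ['my_mc_field']
```

In this model, my_mc_field is visible only while the data is being prepared, which matches the behavior described above.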

Note

Everything after the skyllh.core.dataset.Dataset.load_and_prepare_data() call is referred to as the analysis stage.

Datasets can define their own required data fields by setting the skyllh.core.dataset.Dataset.datafields property in the same way as in the configuration.