Loading data fields from data files
An analysis needs to load a set of data fields from a data file. These fields
are defined in the ['datafields'] section of the Config dictionary instance.
Each field is assigned a stage, which states when the data field is required.
There are two main stages: data preparation and analysis. Since data fields
can exist in either an experimental data file or a Monte Carlo data file, each
of these two main stages is divided into EXP and MC. Hence, the following
stages exist:
DATAPREPARATION_EXP
DATAPREPARATION_MC
ANALYSIS_EXP
ANALYSIS_MC
All stages are defined in the skyllh.core.datafields.DataFieldStages class.
After the data of a Dataset instance has been loaded and prepared, only data
fields with the stage ANALYSIS_EXP or ANALYSIS_MC remain available for use in
the analysis. Data fields marked with the stage DATAPREPARATION_EXP or
DATAPREPARATION_MC are available only during the data preparation stage.
The following code shows how to define the data fields my_exp_field and
my_mc_field, which should be loaded from the experimental and Monte Carlo data
files, respectively.
[5]:
from skyllh.core.config import Config
from skyllh.core.datafields import DataFieldStages as DFS
[6]:
cfg = Config()
cfg['datafields']['my_exp_field'] = DFS.ANALYSIS_EXP
cfg['datafields']['my_mc_field'] = DFS.DATAPREPARATION_MC
The field my_exp_field will be available after the data files have been loaded
and the data has been prepared by the optional data preparation functions,
whereas my_mc_field will be available only at the data preparation stage and
not at the analysis stage.
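The effect of the stage assignment can be pictured with a small stand-alone sketch. It does not import SkyLLH and is not how the library implements the filtering: the `Stage` flag class, its numeric values, and the `fields_kept_for_analysis` helper are hypothetical stand-ins chosen only for illustration.

```python
from enum import IntFlag


class Stage(IntFlag):
    """Hypothetical stand-in for the four data field stages (values assumed)."""
    DATAPREPARATION_EXP = 1
    DATAPREPARATION_MC = 2
    ANALYSIS_EXP = 4
    ANALYSIS_MC = 8


def fields_kept_for_analysis(datafields):
    """Return the names of the fields that carry an analysis stage."""
    analysis_stages = Stage.ANALYSIS_EXP | Stage.ANALYSIS_MC
    return [name for name, stage in datafields.items() if stage & analysis_stages]


# Mirrors the two fields configured above.
datafields = {
    'my_exp_field': Stage.ANALYSIS_EXP,
    'my_mc_field': Stage.DATAPREPARATION_MC,
}

# Fields still present once data preparation has finished.
kept = fields_kept_for_analysis(datafields)

# Fields that were only needed while preparing the data.
dropped = [name for name in datafields if name not in kept]
```

In this sketch `kept` contains only `my_exp_field`, while `my_mc_field` ends up in `dropped`, matching the behaviour described above.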
Note
Everything after the skyllh.core.dataset.Dataset.load_and_prepare_data() call
is referred to as the analysis stage.
Datasets can define their own required data fields by setting the
skyllh.core.dataset.Dataset.datafields property in the same way as in the
configuration.