Dataset collections
A DatasetCollection instance describes a collection of different datasets that can be combined in one analysis.
As usual, we first need to instantiate a Config:
[1]:
from skyllh.core.config import Config
cfg = Config()
The available data samples are accessible through the data_samples dictionary of the skyllh.datasets module:
[2]:
from skyllh.datasets import data_samples
data_samples
[2]:
{'IceTracks-DR1': <module 'skyllh.datasets.i3.PublicData_10y_ps' from '/Users/chiarabellenghi/Work/skyllh_pre_release/skyllh/skyllh/datasets/i3/PublicData_10y_ps.py'>,
'IceTracks-DR2': <module 'skyllh.datasets.i3.PublicData_14y_ps' from '/Users/chiarabellenghi/Work/skyllh_pre_release/skyllh/skyllh/datasets/i3/PublicData_14y_ps.py'>,
'IceTracks-DR1_wMC': <module 'skyllh.datasets.i3.PublicData_10y_ps_wMC' from '/Users/chiarabellenghi/Work/skyllh_pre_release/skyllh/skyllh/datasets/i3/PublicData_10y_ps_wMC.py'>,
'TestData': <module 'skyllh.datasets.i3.TestData' from '/Users/chiarabellenghi/Work/skyllh_pre_release/skyllh/skyllh/datasets/i3/TestData.py'>}
To access the IceTracks-DR2 dataset collection:
[3]:
dsc = data_samples['IceTracks-DR2'].create_dataset_collection(cfg=cfg)
The dataset_names property provides a list of all the datasets defined in the dataset collection:
[4]:
dsc.dataset_names
[4]:
['IC40',
'IC59',
'IC79',
'IC86_I',
'IC86_I-XI',
'IC86_II',
'IC86_III',
'IC86_IV',
'IC86_IX',
'IC86_V',
'IC86_VI',
'IC86_VII',
'IC86_VIII',
'IC86_X',
'IC86_XI']
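Since a selection has to use names that appear in dataset_names, it can be convenient to check a requested selection before creating any datasets. The following is a minimal sketch in plain Python. The helper find_unknown_names is hypothetical and not part of SkyLLH, and the list of available names is copied from the output above instead of being queried from the library.

```python
# Names copied from the dsc.dataset_names output above (an assumption of
# this sketch; in a real session one would use dsc.dataset_names directly).
available_names = [
    'IC40', 'IC59', 'IC79', 'IC86_I', 'IC86_I-XI', 'IC86_II', 'IC86_III',
    'IC86_IV', 'IC86_IX', 'IC86_V', 'IC86_VI', 'IC86_VII', 'IC86_VIII',
    'IC86_X', 'IC86_XI',
]


def find_unknown_names(requested, available):
    """Hypothetical helper: return the requested names that are not
    present in the list of available dataset names, in request order."""
    available_set = set(available)
    return [name for name in requested if name not in available_set]


# 'IC86_XII' is a deliberately invalid name used for illustration.
unknown = find_unknown_names(['IC86_XI', 'IC86_XII'], available_names)
print(unknown)
```

An empty result means that every requested name exists in the collection.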
When importing data to create an analysis object, one can choose which specific datasets to use through the names argument:
[5]:
import skyllh
datasets = skyllh.create_datasets('IceTracks-DR2', cfg=cfg, names=['IC86_XI'])
datasets
[5]:
[<skyllh.i3.dataset.I3Dataset at 0x110132b10>]
Information about the selected data can be printed to the console for each of the selected datasets:
[6]:
print(datasets[0])
Dataset "IC86_XI": v001p00
{ livetime = UNDEFINED }
Experimental data:
[FOUND] /Users/chiarabellenghi/.cache/skyllh/icecube_14year_ps/events/IC86_XI_exp.csv
MC data:
Auxiliary data:
eff_area_datafile:
[FOUND] /Users/chiarabellenghi/.cache/skyllh/icecube_14year_ps/irfs/IC86_effectiveArea.csv
smearing_datafile:
[FOUND] /Users/chiarabellenghi/.cache/skyllh/icecube_14year_ps/irfs/IC86_smearing.csv
GRL data:
[FOUND] /Users/chiarabellenghi/.cache/skyllh/icecube_14year_ps/uptime/IC86_XI_exp.csv
If no names are passed to the skyllh.create_datasets function, the default combination of datasets for the dataset collection is imported.
This default combination is defined by the DATASET_NAMES attribute of the data sample module, e.g.:
[7]:
data_samples['IceTracks-DR2'].DATASET_NAMES
[7]:
('IC40', 'IC59', 'IC79', 'IC86_I-XI')
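The fallback to the default combination can be pictured with a small sketch in plain Python. It only illustrates the behavior described above and is not SkyLLH's implementation. The function resolve_names is hypothetical, and the default tuple is copied from the DATASET_NAMES output above.

```python
# Default combination copied from the DATASET_NAMES output above.
DEFAULT_DATASET_NAMES = ('IC40', 'IC59', 'IC79', 'IC86_I-XI')


def resolve_names(names=None, default=DEFAULT_DATASET_NAMES):
    """Hypothetical helper: return the explicitly requested names, or the
    default combination when no names are given."""
    if names is None:
        return list(default)
    return list(names)


selected_default = resolve_names()
selected_single = resolve_names(['IC86_XI'])
print(selected_default)
print(selected_single)
```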
Note
The dataset collection contains all available datasets. For performance reasons we concatenate multiple seasons with the same detector configuration into one dataset. For example, the IC86_I-XI dataset contains all seasons from IC86_I to IC86_XI. The individual seasons are still available as separate datasets by specifying their names for the names argument.
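Because a combined dataset such as IC86_I-XI already contains its individual seasons, a selection that mixes the combined dataset with one of those seasons would presumably include the same data twice. The sketch below shows one way to detect such overlaps from the names alone. It is plain Python and not part of SkyLLH: the helpers expand_combined_name and find_overlaps are hypothetical, and the sketch assumes the naming pattern `<detector>_<first>-<last>` with Roman-numeral seasons, as seen in the names listed above.

```python
# Roman numerals for the seasons that appear in the dataset names above.
ROMAN_SEASONS = ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX',
                 'X', 'XI']


def expand_combined_name(name):
    """Hypothetical helper: expand a combined name like 'IC86_I-XI' into
    its individual season names; other names are returned unchanged."""
    prefix, _, seasons = name.partition('_')
    if '-' not in seasons:
        return [name]
    first, last = seasons.split('-')
    start = ROMAN_SEASONS.index(first)
    stop = ROMAN_SEASONS.index(last)
    return [f'{prefix}_{season}' for season in ROMAN_SEASONS[start:stop + 1]]


def find_overlaps(names):
    """Hypothetical helper: return (season, first_name, second_name)
    tuples for seasons covered by more than one selected dataset name."""
    covered_by = {}
    overlaps = []
    for name in names:
        for season in expand_combined_name(name):
            if season in covered_by:
                overlaps.append((season, covered_by[season], name))
            else:
                covered_by[season] = name
    return overlaps


overlaps = find_overlaps(['IC40', 'IC86_I-XI', 'IC86_XI'])
print(overlaps)
```

An empty list indicates that no season is covered by more than one of the selected names.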