{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Loading data fields from data files"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "An analysis will need to load a set of data fields from a data file. \n",
    "Which fields these are is defined in the ``['datafields']`` section of the\n",
    ":py:class:`~skyllh.core.config.Config` dictionary instance. Each field has a stage\n",
    "assigned which states at what stage the data field is required. There are\n",
    "two main stages: data preparation, and analysis. Since data fields\n",
    "can exist either in an experimental data file or a monte-carlo data file, these\n",
    "two main stages are divided into EXP and MC. Hence, the following stages \n",
    "exist::\n",
    "\n",
    "    DATAPREPARATION_EXP\n",
    "    DATAPREPARATION_MC\n",
    "    ANALYSIS_EXP\n",
    "    ANALYSIS_MC\n",
    "\n",
    "All stages are defines in the :py:class:`skyllh.core.datafields.DataFieldStages`\n",
    "class."
   ]
  },
  {
   "attachments": {},
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "After loading the data of a :py:class:`~skyllh.core.dataset.Dataset` instance, \n",
    "only data fields with the stage ``ANALYSIS_EXP`` and ``ANALYSIS_MC`` will be\n",
    "left to use in the analysis. Data fields marked with stage \n",
    "``DATAPREPARATION_EXP`` or ``DATAPREPARATION_MC`` will be available for the data\n",
    "preparation stage. "
   ]
  },
  {
   "attachments": {},
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "The following code shows how to define the data fields ``my_exp_field`` and \n",
    "``my_mc_field`` that should be loaded from the experimental and monte-carlo data \n",
    "files, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from skyllh.core.config import Config\n",
    "from skyllh.core.datafields import DataFieldStages as DFS"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "cfg = Config()\n",
    "cfg['datafields']['my_exp_field'] = DFS.ANALYSIS_EXP\n",
    "cfg['datafields']['my_mc_field'] = DFS.DATAPREPARATION_MC"
   ]
  },
  {
   "attachments": {},
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "The ``my_exp_field`` will be available after the data files have been loaded\n",
    "and the data has been prepared by optional data preparation functions, whereas\n",
    "the ``my_mc_field`` will be available only at the data preparation stage and not\n",
    "at the analysis stage.\n",
    "\n",
    ".. note::\n",
    "\n",
    "    Everything after the \n",
    "    :py:meth:`skyllh.core.dataset.Dataset.load_and_prepare_data` call is \n",
    "    referred to as analysis stage. \n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Datasets can define their own required data fields via setting the \n",
    ":py:attr:`skyllh.core.dataset.Dataset.datafields` property in the same way as \n",
    "in the configuration."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}