Fitting a steady point-source with the public 14-year IceCube track data

This tutorial shows how to fit a steady point-like neutrino source (time-integrated analysis) using the IceCube public 14-year point-source data with SkyLLH. At the end of the notebook we also show how the same fit works with the older 10-year dataset.

We assume that you have installed the SkyLLH package in your environment. If that’s not the case you’ll need to add the path to the SkyLLH folder to your PYTHONPATH.

[1]:
import numpy as np

Initializing a configuration dictionary

First we have to create a config instance.

[2]:
from skyllh.core.config import Config

cfg = Config()

The Config object stores a dictionary with global settings, including computing, logging, and analysis settings. It is a required argument to several methods.

[3]:
cfg
[3]:
{'multiproc': {'ncpu': None},
 'logging': {'log_level': 'INFO',
  'log_format': '%(asctime)s %(processName)s %(name)s %(levelname)s: %(message)s',
  'enable_tracing': False},
 'project': {'working_directory': '.'},
 'repository': {'base_path': '/Users/chiarabellenghi/.cache/skyllh',
  'download_from_origin': True},
 'units': {'internal': {'angle': Unit("rad"),
   'energy': Unit("GeV"),
   'length': Unit("cm"),
   'time': Unit("s")},
  'defaults': {'fluxes': {'angle': Unit("rad"),
    'energy': Unit("GeV"),
    'length': Unit("cm"),
    'time': Unit("s")}}},
 'datafields': {'run': 4,
  'ra': 4,
  'dec': 4,
  'ang_err': 4,
  'time': 4,
  'log_energy': 4,
  'true_ra': 8,
  'true_dec': 8,
  'true_energy': 8,
  'mcweight': 8},
 'caching': {'pdf': {'MultiDimGridPDF': False}}}
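Individual settings can be changed by assigning to the nested dictionary entries before the analysis is built. The sketch below mirrors two of the entries shown above in a plain Python dict (the path is a made-up example); on a real Config instance the assignments look identical:

```python
# Plain dict mirroring part of the Config structure printed above; on a
# real Config instance the assignments below look exactly the same.
cfg_sketch = {
    'multiproc': {'ncpu': None},
    'repository': {'base_path': None, 'download_from_origin': True},
}

# Use four CPU cores for parallelizable computations.
cfg_sketch['multiproc']['ncpu'] = 4

# Hypothetical custom location for the public datasets.
cfg_sketch['repository']['base_path'] = '/data/icecube/public'

print(cfg_sketch['multiproc']['ncpu'])        # 4
print(cfg_sketch['repository']['base_path'])  # /data/icecube/public
```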

Setting up logging

The user sets up the main logger. The logger name, logging format and level, output file, and other options can be configured.

The default logging format and level are defined in the Config instance, but they can be overridden by the arguments of logging.setup_logging.

[ ]:
from skyllh.core.logging import setup_logging

logger = setup_logging(cfg=cfg, name='fitting_a_source', log_level='INFO')

Getting the datasets

The datasets are automatically downloaded from dataverse.harvard.edu to a local cache directory (~/.cache/skyllh) on first run. To use a custom dataset location, set cfg['repository']['base_path'] to the desired path.

Note

IceTracks-DR2 is a fairly large dataset (~5 GB), so initializing the analysis may take a while when the data has to be downloaded for the first time. Be patient; on subsequent runs it will be loaded from ~/.cache/skyllh :)

To create the correct dataset objects, we import the skyllh package, which provides access to the 14-year dataset definitions:

[5]:
import skyllh

The list of datasets is created with skyllh.create_datasets. With the default configuration, the data is downloaded automatically from dataverse.harvard.edu.

[9]:
help(skyllh.create_datasets)
Help on function create_datasets in module skyllh.datasets.datasets:

create_datasets(sample_name, cfg, names=None, base_path=None, sub_path_fmt=None)
    Creates a list of Dataset instances for a named data sample.

    Parameters
    ----------
    sample_name : str
        The name of the data sample. Available samples are the keys of
        ``skyllh.datasets.data_samples``.
    cfg : instance of Config
        The instance of Config holding the local configuration.
    names : sequence of str | None
        The dataset names to return. If None, the module's ``DATASET_NAMES``
        default is used (combined IC86 seasons where applicable).
    base_path : str | None
        The base path of the data files. If None, uses
        ``cfg['repository']['base_path']``.
    sub_path_fmt : str | None
        The sub path format override. If None, uses the module default.

    Returns
    -------
    datasets : list of Dataset

If no names are passed to skyllh.create_datasets, it loads all datasets that are part of the IceTracks-DR2 dataset collection.

A tutorial on how to access specific information about which datasets are included in, e.g., IceTracks-DR2 is provided in ..todo::dataset_collections tutorial.

[ ]:
datasets = skyllh.create_datasets('IceTracks-DR2', cfg=cfg)

By default, this imports all available data sets in IceTracks-DR2 (all 14 years of data).

Creating the analysis object

The analysis for doing a point-source search as in, e.g., https://doi.org/10.1103/PhysRevLett.124.051103 is pre-defined. This analysis instance can be created via the create_analysis method of the time_integrated_ps module of the public data interface.

The function requires the config, the datasets, and a source object. It also takes a number of additional optional arguments for which we refer the reader to the docstring.

[11]:
from skyllh.analyses.i3.publicdata_ps.time_integrated_ps import create_analysis

help(create_analysis)
Help on function create_analysis in module skyllh.analyses.i3.publicdata_ps.time_integrated_ps:

create_analysis(
    cfg,
    datasets,
    source,
    refplflux_Phi0=1,
    refplflux_E0=1000.0,
    refplflux_gamma=2.0,
    refplflux_Ec=inf,
    energy_range=None,
    ns_seed=10.0,
    ns_min=0.0,
    ns_max=1000.0,
    gamma_seed=3.0,
    gamma_min=1.0,
    gamma_max=4.0,
    kde_smoothing=False,
    minimizer_impl='LBFGS',
    minimizer_max_rep=100,
    compress_data=False,
    keep_data_fields=None,
    evt_sel_delta_angle_deg=10,
    construct_sig_generator=True,
    tl=None,
    ppbar=None,
    logger_name=None
)
    Creates the Analysis instance for this particular analysis.

    Parameters
    ----------
    cfg : instance of Config
        The instance of Config holding the local configuration.
    datasets : list of Dataset instances
        The list of Dataset instances, which should be used in the
        analysis.
    source : PointLikeSource instance
        The PointLikeSource instance defining the point source position.
    refplflux_Phi0 : float
        The flux normalization to use for the reference power law flux model.
    refplflux_E0 : float
        The reference energy to use for the reference power law flux model.
    refplflux_gamma : float
        The spectral index to use for the reference power law flux model.
    refplflux_Ec: float,
        The cutoff energy for the cutoff power law flux model.
    energy_range: 2-element tuple of float (low energy, high energy) | None
        The energy range for signal generation. Both low and high energies are
        given in GeV. If set to ``None``, the entire energy range of the
        dataset is used.
    ns_seed : float
        Value to seed the minimizer with for the ns fit.
    ns_min : float
        Lower bound for ns fit.
    ns_max : float
        Upper bound for ns fit.
    gamma_seed : float | None
        Value to seed the minimizer with for the gamma fit. If set to None,
        the refplflux_gamma value will be set as gamma_seed.
    gamma_min : float
        Lower bound for gamma fit.
    gamma_max : float
        Upper bound for gamma fit.
    kde_smoothing : bool
        Deprecated: use of ``kde_smoothing=True`` is deprecated and will be
            removed in a future version.
        Apply a KDE-based smoothing to the data-driven background pdf.
        Default: False.
    minimizer_impl : str
        Minimizer implementation to be used. Supported options are ``"LBFGS"``
        (L-BFG-S minimizer used from the :mod:`scipy.optimize` module),
        or ``"minuit"`` (Minuit minimizer used by the :mod:`iminuit` module).
        Default: "LBFGS".
    minimizer_max_rep : int
        In case the minimization process did not converge at the first time
        this option specifies the maximum number of repetitions with
        different initials. Default is 100.
    compress_data : bool
        Flag if the data should get converted from float64 into float32.
    keep_data_fields : list of str | None
        List of additional data field names that should get kept when loading
        the data.
    evt_sel_delta_angle_deg : float
        The delta angle in degrees for the event selection optimization methods.
    construct_sig_generator : bool
        Flag if the signal generator should be constructed (``True``) or not
        (``False``).
    tl : TimeLord instance | None
        The TimeLord instance to use to time the creation of the analysis.
    ppbar : ProgressBar instance | None
        The instance of ProgressBar for the optional parent progress bar.
    logger_name : str | None
        The name of the logger to be used. If set to ``None``, ``__name__`` will
        be used.

    Returns
    -------
    ana : instance of SingleSourceMultiDatasetLLHRatioAnalysis
        The Analysis instance for this analysis.

Defining the Source: Here we use the location of NGC 1068 as an example source.

Instantiating a PointLikeSource object requires the source location (R.A. and Dec.) in radians. Optionally, the source can have a name and a weight. The latter is in principle useful when performing a stacking analysis, which is not yet supported for public data.

[15]:
from skyllh.core.source_model import PointLikeSource

help(PointLikeSource)
Help on class PointLikeSource in module skyllh.core.source_model:

class PointLikeSource(SourceModel, IsPointlike)
 |  PointLikeSource(ra, dec, name=None, weight=None, **kwargs)
 |
 |  The PointLikeSource class is a source model for a point-like source
 |  object in the sky at a given location (right-ascention and declination).
 |
 |  Method resolution order:
 |      PointLikeSource
 |      SourceModel
 |      skyllh.core.model.Model
 |      IsPointlike
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __init__(self, ra, dec, name=None, weight=None, **kwargs)
 |      Creates a new PointLikeSource instance for defining a point-like
 |      source.
 |
 |      Parameters
 |      ----------
 |      ra : float
 |          The right-ascention coordinate of the source in radians.
 |      dec : float
 |          The declination coordinate of the source in radians.
 |      name : str | None
 |          The name of the source.
 |      weight : float | None
 |          The relative weight of the source w.r.t. other sources.
 |          If set to None, unity will be used.
 |
 |  __str__(self)
 |      Pretty string representation.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from SourceModel:
 |
 |  classification
 |      The astronomical classification of the source.
 |
 |  weight
 |      The weight of the source.
 |
 |  ----------------------------------------------------------------------
 |  Readonly properties inherited from skyllh.core.model.Model:
 |
 |  id
 |      (read-only) The ID of the model. It's an integer generated with
 |      Python's `id` function. Hence, it's related to the memory address
 |      of the object.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from skyllh.core.model.Model:
 |
 |  __dict__
 |      dictionary for instance variables
 |
 |  __weakref__
 |      list of weak references to the object
 |
 |  name
 |      The name of the model.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from IsPointlike:
 |
 |  dec
 |      The declination coordinate of the point-like source.
 |
 |  ra
 |      The right-ascention coordinate of the point-like source.

[16]:
src_ra = 40.67  # degrees
src_dec = -0.01  # degrees

source = PointLikeSource(ra=np.radians(src_ra), dec=np.radians(src_dec))

If not otherwise specified through the optional arguments, the time_integrated_ps.create_analysis method assumes a signal hypothesis in which the defined source emits a power-law flux with spectral index \(\gamma=2.0\):

\[\phi(E_{\nu}) = \mathrm{refplflux\_Phi0} \cdot \left(\frac{E_{\nu}}{\mathrm{refplflux\_E0}}\right)^{-\mathrm{refplflux\_gamma}}\]
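To make the formula concrete, here it is as a small stand-alone Python function (this is not a SkyLLH API, just the expression above with the default refplflux_* values):

```python
def power_law_flux(E, Phi0=1.0, E0=1000.0, gamma=2.0):
    """Power-law flux Phi0 * (E / E0)**(-gamma); the defaults correspond
    to the refplflux_* defaults of create_analysis (E in GeV)."""
    return Phi0 * (E / E0) ** (-gamma)

# At the reference energy E0 the flux equals the normalization Phi0.
print(f'{power_law_flux(1000.0):.2f}')   # 1.00
# One decade above E0 a gamma=2 flux is suppressed by a factor 100.
print(f'{power_law_flux(10000.0):.2f}')  # 0.01
```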
[17]:
ana = create_analysis(cfg=cfg, datasets=datasets, source=source)
2026-05-13 14:49:35,082 MainProcess skyllh.analyses.i3.publicdata_ps.time_integrated_ps INFO: SourceHypoGroupManager
    Source Hypothesis Groups:
        0: SourceHypoGroup:
            sources (1):
                0: PointLikeSource: "4590251664": { ra=40.670 deg, dec=-0.010 deg }
            fluxmodel:
                1.000e+00 * (E / (1000 GeV))^-2 * 1 (GeV cm^2 s)^-1
            detector signal yield builders (1):
                PDSingleParamFluxPointLikeSourceI3DetSigYieldBuilder
            signal generation method:
                NoneType
2026-05-13 14:49:35,083 MainProcess skyllh.analyses.i3.publicdata_ps.time_integrated_ps INFO: ParameterModelMapper: 2 global parameters, 2 models (1 source)
    Parameters:
        ns [floating (0 <= 10 <= 1000)]
            in models:
            - IceCube: ns

        gamma [floating (1 <= 3 <= 4)]
            in models:
            - 4590251664: gamma


Understanding how the analysis works

One usually wants to get best-fit values (maximum likelihood estimators) for the source parameters and a test-statistic, using methods like ana.do_trial or ana.unblind (see below). However, here we describe what happens under the hood when the analysis is run either on simulated or real experimental data.

Initializing a trial

After the Analysis instance has been created, the point-source analysis can be run. To do so, the analysis is first initialized with trial data. Here we initialize it with the experimental data, but the same applies to simulated data.

The Analysis object provides the method initialize_trial to initialize a trial with data. It takes a list of DataFieldRecordArray instances holding the events. If we want to initialize a trial with the experimental data, we can get that list from the Analysis instance itself:

[19]:
events_list = [data.exp for data in ana.data_list]
ana.initialize_trial(events_list)

Maximizing the log-likelihood ratio function

After initializing a trial, we can maximize the LLH ratio function using the ana.llhratio.maximize method. This method requires a RandomStateService instance in case the minimizer does not succeed and a new set of initial values for the fit parameters needs to be generated. The method returns a 3-element tuple: the first element is the value of the LLH ratio function at its maximum, the second is the set of fit parameters that maximize the LLH ratio, and the third is the status dictionary of the minimizer.

[20]:
from skyllh.core.random import RandomStateService

rss = RandomStateService(seed=1)
[11]:
(log_lambda_max, fitparam_values, status) = ana.llhratio.maximize(rss)
[12]:
print(f'log_lambda_max = {log_lambda_max}')
print(f'fitparam_values = {fitparam_values}')
print(f'status = {status}')
log_lambda_max = 14.576758408087755
fitparam_values = [80.09381868  3.20884803]
status = {'grad': array([-3.62369895e-06, -2.31360821e-04]), 'task': 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH', 'funcalls': 13, 'nit': 10, 'warnflag': 0, 'skyllh_minimizer_n_reps': 0, 'n_llhratio_func_calls': 13}

Calculating the test-statistic

Using the maximum of the LLH ratio function and the fit parameter values at the maximum we can calculate the test-statistic using the ana.calculate_test_statistic method:

[13]:
TS = ana.calculate_test_statistic(log_lambda_max, fitparam_values)
print(f'TS = {TS:.3f}')
TS = 29.154
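As a cross-check of the number above: for a single source the test statistic in this kind of analysis is commonly defined as twice the maximal log-likelihood ratio (with a sign flip for negative \(n_{\text{s}}\)). This is a plain-Python check using the values printed above, not a SkyLLH call:

```python
# Values obtained from the maximization above.
log_lambda_max = 14.576758408087755
ns_hat = 80.09381868  # best-fit number of signal events (positive)

# TS = 2 * sign(ns_hat) * log_lambda_max; for ns_hat > 0 this reduces
# to twice the maximal log-likelihood ratio.
TS = 2.0 * (1.0 if ns_hat > 0 else -1.0) * log_lambda_max
print(f'TS = {TS:.3f}')  # TS = 29.154
```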

Unblinding the data

Above, we perform the analysis following pedagogical steps: we initialize the analysis with a trial of the experimental data, maximize the log-likelihood ratio function for all given experimental data events, and calculate the test-statistic value.

However, the same calculations can be performed all together using the ana.unblind method of the analysis instance, which returns the test statistic and the best-fit parameters directly.

Note that the ana.unblind method runs the analysis on the experimental data.

The methods to generate and analyze pseudo-data are illustrated in the generating_pseudo_experiments.ipynb notebook.

[17]:
help(ana.unblind)
Help on method unblind in module skyllh.core.analysis:

unblind(minimizer_rss, tl=None) method of skyllh.core.analysis.SingleSourceMultiDatasetLLHRatioAnalysis instance
    Evaluates the unscrambled data, i.e. unblinds the data.

    Parameters
    ----------
    minimizer_rss : instance of RandomStateService
        The instance of RandomStateService that should be used by the
        minimizer to generate new random initial fit parameter values.
    tl : instance of TimeLord | None
        The optional instance of TimeLord that should be used to time the
        maximization of the LLH ratio function.

    Returns
    -------
    TS : float
        The test-statistic value.
    global_params_dict : dict
        The dictionary holding the global parameter names and their
        best fit values. It includes fixed and floating parameters.
    status : dict
        The status dictionary with information about the performed
        minimization process of the negative of the log-likelihood ratio
        function.

[ ]:
from skyllh.core.random import RandomStateService

rss = RandomStateService(seed=1)

(ts, fitparam_values, status) = ana.unblind(rss)
[ ]:
print(f'TS = {ts:.3f}')
print(f'ns = {fitparam_values["ns"]:.2f}')
print(f'gamma = {fitparam_values["gamma"]:.2f}')
status
TS = 29.154
ns = 80.09
gamma = 3.21
{'grad': array([-3.62369979e-06, -2.31360814e-04]),
 'task': 'CONVERGENCE: RELATIVE REDUCTION OF F <= FACTR*EPSMCH',
 'funcalls': 13,
 'nit': 10,
 'warnflag': 0,
 'skyllh_minimizer_n_reps': 0,
 'n_llhratio_func_calls': 13}
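The status dictionary stems from scipy's L-BFGS-B minimizer (the default minimizer_impl="LBFGS"): warnflag == 0 signals convergence, nit is the number of iterations, and skyllh_minimizer_n_reps counts the repetitions with new initial values (cf. the minimizer_max_rep argument). A minimal sketch of a convergence check on the dictionary shown above:

```python
# Subset of the status dictionary as printed above.
status = {
    'warnflag': 0,                 # 0 == converged (scipy L-BFGS-B convention)
    'nit': 10,                     # number of L-BFGS-B iterations
    'skyllh_minimizer_n_reps': 0,  # fit restarts with new initial values
}

converged = (status['warnflag'] == 0)
print(f"converged={converged}, repetitions={status['skyllh_minimizer_n_reps']}")
# converged=True, repetitions=0
```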

Converting the fit parameters into a flux normalization

Further utility functions are available:

  1. ana.mu2flux(mu) returns the flux normalization corresponding to a mean number of signal events mu, at the spectral index chosen during the initialization of the analysis instance.

  2. ana.flux2mu(flux_norm) returns the mean number of signal events for the given flux normalization value at the chosen spectral index.

By default the analysis is created with a flux normalization of 1 \(\text{GeV}^{-1}\text{s}^{-1}\text{cm}^{-2}\text{sr}^{-1}\) at 1 TeV (see the refplflux_Phi0 and refplflux_E0 arguments of the create_analysis method). The analysis instance provides the method calculate_fluxmodel_scaling_factor, which calculates the scaling factor by which the reference flux normalization has to be multiplied to represent a given analysis result, i.e. a pair of \(n_{\text{s}}\) and \(\gamma\) values. This function takes the fitted \(n_{\text{s}}\) value as first argument and the list of source parameter values as second argument:

[ ]:
scaling_factor = ana.calculate_fluxmodel_scaling_factor(
    fitparam_values['ns'], [fitparam_values['ns'], fitparam_values['gamma']]
)
print(f'Flux scaling factor = {scaling_factor:.3e}')
Flux scaling factor = 2.869e-14

Hence, our result corresponds to a power-law flux of:

[ ]:
print(f'{scaling_factor:.3e} (E/1000 GeV)^{{ -{fitparam_values["gamma"]:.2f} }} 1/(GeV s cm^2 sr)')
2.869e-14 (E/1000 GeV)^{-3.21} 1/(GeV s cm^2 sr)

Illustrating the difference between the 10-year and the 14-year fits

To compute the test statistic and the best-fit parameters for the 10-year dataset, we apply the same workflow as used for the 14-year dataset.

[22]:
from skyllh.core.config import Config

cfg_10 = Config()
[23]:
from skyllh.datasets.i3.PublicData_10y_ps import create_dataset_collection
[ ]:
dsc = create_dataset_collection(cfg=cfg_10)

datasets = dsc['IC40', 'IC59', 'IC79', 'IC86_I', 'IC86_II-VII']
[25]:
from skyllh.analyses.i3.publicdata_ps.time_integrated_ps import create_analysis
from skyllh.core.source_model import PointLikeSource

src_ra = 40.67  # degrees
src_dec = -0.01  # degrees

source = PointLikeSource(ra=np.radians(src_ra), dec=np.radians(src_dec))
ana = create_analysis(cfg=cfg_10, datasets=datasets, source=source)

events_list = [data.exp for data in ana.data_list]
100%|██████████| 33/33 [00:11<00:00,  2.80it/s]
100%|██████████| 33/33 [00:09<00:00,  3.33it/s]
100%|██████████| 33/33 [00:10<00:00,  3.00it/s]
100%|██████████| 33/33 [00:10<00:00,  3.17it/s]
100%|██████████| 33/33 [00:11<00:00,  2.76it/s]
100%|██████████| 5/5 [01:40<00:00, 20.13s/it]
100%|██████████| 170/170 [00:00<00:00, 509.21it/s]
[ ]:
from skyllh.core.random import RandomStateService

rss = RandomStateService(seed=1)

(ts, x, status) = ana.unblind(rss)
print(f'TS = {ts:.3f}')
print(f'ns = {x["ns"]:.2f}')
print(f'gamma = {x["gamma"]:.2f}')
## Calculating the corresponding flux normalization
scaling_factor = ana.calculate_fluxmodel_scaling_factor(x['ns'], [x['ns'], x['gamma']])
print(f'Flux scaling factor = {scaling_factor:.3e}')

print(f'{scaling_factor:.3e} (E/1000 GeV)^{{ -{x["gamma"]:.2f} }} 1/(GeV s cm^2 sr)')
TS = 19.379
ns = 54.75
gamma = 3.16
Flux scaling factor = 3.017e-14
3.017e-14 (E/1000 GeV)^{-3.16} 1/(GeV s cm^2 sr)