D3.3 Analysis Report on Standards and Libraries

One of the main priorities of the OBELICS work package is to enable interoperability and software re-use for the data generation, integration and analysis of the ASTERICS ESFRI and pathfinder facilities.

Work has been carried out to identify synergies between experiments that could foster the use of open data standards enabling interoperability and software re-use. To achieve this, OBELICS first identified ESFRI projects and pathfinder facilities and then classified them according to the type of data they produce: image-based, event-based or signal-based.

Image-based experiments

Three projects were identified: the ground-based telescopes E-ELT and LSST and the satellite mission Euclid. All of these projects stream and process large amounts of data, which will be archived and made available in restricted form through the LSST Data Management System and in public form through the widely used FITS open standard.

Event-based experiments

Projects identified in this area include IACT gamma-ray observatories and neutrino observatories. “Event data” in this context refers to all the recorded data associated with a given cosmic particle detected by the observatory.

These observatories require a very high storage rate and are looking for unified standards, usually based on the widespread FITS file format, for higher-level data ready for distribution.
In particular, the FITS format can be used to process CTA data level 3 (DL3), the first data level that is nearly independent of the particularities of the detector and therefore the most suitable for science analysis.
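To make the notion of a FITS-based open standard more concrete, the following minimal sketch builds a FITS primary header using only the Python standard library. It relies on two properties of the FITS standard: header cards are 80 ASCII characters long, and header units are padded to a multiple of 2880 bytes. The helper names (card, header_unit) and the TELESCOP value are illustrative assumptions, not part of any observatory's software, and a real DL3 file would normally be written with a dedicated FITS library rather than by hand.

```python
"""Minimal sketch of a FITS header unit built with the standard library only."""

CARD_LEN = 80      # every FITS header card is exactly 80 ASCII characters
BLOCK_LEN = 2880   # header units are padded to a multiple of 2880 bytes


def card(keyword, value=None, comment=""):
    """Format one fixed-format header card (hypothetical helper)."""
    if value is None:                      # e.g. the END card carries no value
        return keyword.ljust(CARD_LEN)
    if isinstance(value, bool):            # logical: T or F in column 30
        text = ("T" if value else "F").rjust(20)
    elif isinstance(value, int):           # integer: right-justified in columns 11-30
        text = str(value).rjust(20)
    else:                                  # string: quoted, padded to at least 8 characters
        text = ("'" + str(value).replace("'", "''").ljust(8) + "'").ljust(20)
    body = keyword.ljust(8) + "= " + text
    if comment:
        body += " / " + comment
    if len(body) > CARD_LEN:
        raise ValueError("card longer than 80 characters: " + keyword)
    return body.ljust(CARD_LEN)


def header_unit(cards):
    """Join cards, append END, and pad with spaces to a 2880-byte boundary."""
    raw = "".join(cards) + card("END")
    padding = (-len(raw)) % BLOCK_LEN
    return (raw + " " * padding).encode("ascii")


# Hypothetical primary header for a file whose data would live in an extension.
primary_header = header_unit([
    card("SIMPLE", True, "conforms to the FITS standard"),
    card("BITPIX", 8),
    card("NAXIS", 0, "no primary data array"),
    card("EXTEND", True, "extensions may follow"),
    card("TELESCOP", "EXAMPLE", "placeholder, not a real observatory"),
])
```

Because the header is plain ASCII with a fixed layout, any tool that follows the standard can read the metadata without knowing which detector produced the file, which is the property that makes the format attractive for detector-independent data levels.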

IACT observatories are integrating themselves into the Virtual Observatory (VO) framework and are already making their final science products and some legacy observatory data (e.g., light curves, spectra and source catalogues) available in formats compliant with VO tools. Neutrino observatories are also making their high-level data available through ASCII tables, dedicated web pages and open formats based on Python.

Signal-based experiments

Projects identified in this area include SKA, EVN and LOFAR. EVN consists of 21 reflector telescopes (dishes) across Europe and beyond and makes its data publicly available through its archive once the 12-month proprietary period has expired. LOFAR consists of about 7000 omni-directional antennas and also makes its data publicly available through its archive. The SKA project is currently in its development phase and will consist of many thousands of connected radio telescopes. The file formats it will use to store data are still to be defined.

Possible synergies

High-level data – More and more experiments are making their science-ready data products publicly available in formats compliant with the VO standards. The VO framework already allows gamma-ray observatories to share science-ready data products, while other astroparticle observatories are developing systems that use the VOEvent protocol within the VO framework to communicate real-time alerts to follow-up observers.
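As an illustration of what a real-time alert in the VOEvent style might look like, the sketch below assembles a small XML packet with the Python standard library. It assumes the VOEvent 2.0 namespace and the Who and What sections; the identifiers (the ivo://example.org/... IVORNs), the parameter names and the build_alert helper are hypothetical, the sky position is carried as simple parameters rather than in the WhereWhen section that a complete packet would normally use, and the result has not been validated against the official schema.

```python
import xml.etree.ElementTree as ET

# Namespace of VOEvent 2.0 (assumed here; check against the IVOA recommendation).
VOE_NS = "http://www.ivoa.net/xml/VOEvent/v2.0"
ET.register_namespace("voe", VOE_NS)


def build_alert(ivorn, author_ivorn, date_iso, ra_deg, dec_deg):
    """Return a simplified VOEvent-style packet as a string (hypothetical helper)."""
    root = ET.Element(
        "{%s}VOEvent" % VOE_NS,
        {"ivorn": ivorn, "role": "test", "version": "2.0"},
    )
    # Who: identifies the author of the packet and when it was issued.
    who = ET.SubElement(root, "Who")
    ET.SubElement(who, "AuthorIVORN").text = author_ivorn
    ET.SubElement(who, "Date").text = date_iso
    # What: free-form parameters; the names used here are illustrative only.
    what = ET.SubElement(root, "What")
    ET.SubElement(what, "Param", {
        "name": "ra", "value": "%.4f" % ra_deg, "unit": "deg", "ucd": "pos.eq.ra",
    })
    ET.SubElement(what, "Param", {
        "name": "dec", "value": "%.4f" % dec_deg, "unit": "deg", "ucd": "pos.eq.dec",
    })
    return ET.tostring(root, encoding="unicode")


# Placeholder values, not a real alert.
packet = build_alert(
    ivorn="ivo://example.org/alerts#0001",
    author_ivorn="ivo://example.org/observatory",
    date_iso="2018-01-01T00:00:00",
    ra_deg=83.6331,
    dec_deg=22.0145,
)
```

The role attribute is set to "test" so that a receiver would not mistake the packet for a genuine observation.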

A common standard based on the FITS file format is being developed for DL3-level data in the context of gamma-ray astronomy; this could readily be extended to neutrino and cosmic-ray observatories.

Low-level data – The case for common formats for low-level data is less clear: here, synergies between experiments are most likely limited to those belonging to the same category. Most current event-based experiments use ROOT-based data formats and analysis tools, but no standards have been defined yet. Nevertheless, possible synergies exist between experiments producing similar types of data. In particular, the low- and mid-level data generated by all event-based experiments have a hierarchical structure, so hierarchical formats such as HDF5 are suitable for them.
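The hierarchical structure mentioned above can be sketched without any HDF5 library: in the example below, nested dictionaries stand in for HDF5 groups, leaf values stand in for datasets, and each item is addressed by a slash-separated path in the way HDF5 addresses its objects. The layout (run, telescope, event) and all field names are assumptions chosen for illustration, not a format proposed in the report.

```python
def walk(node, prefix=""):
    """Yield (path, dataset) pairs, mimicking HDF5 slash-separated paths."""
    for name, child in node.items():
        path = prefix + "/" + name
        if isinstance(child, dict):      # a dict plays the role of an HDF5 group
            yield from walk(child, path)
        else:                            # anything else plays the role of a dataset
            yield path, child


# Hypothetical low-level data: one run, one telescope, two recorded events.
run = {
    "run_0001": {
        "telescope_01": {
            "event_000001": {"timestamp_ns": 1000, "waveform": [3, 7, 12, 6]},
            "event_000002": {"timestamp_ns": 2500, "waveform": [2, 9, 15, 4]},
        },
    },
}

# Flatten the hierarchy into a path -> dataset mapping.
paths = dict(walk(run))
first_waveform = paths["/run_0001/telescope_01/event_000001/waveform"]
```

With an HDF5 library such as h5py, each dictionary would map onto a group and each leaf onto a dataset, so a per-event or per-telescope subset could be read without loading the rest of the file.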

Next steps of the ASTERICS project

An objective of ASTERICS is to enable interoperability between Research Infrastructures, ESFRI projects and international experiments in the fields of astronomy, astrophysics and astroparticle physics. This is an important step in supporting the development of the European Open Science Cloud (EOSC) and the global movement for open science data.

Up until April 2019, as a continuation of this work, ASTERICS will:

  • contribute to the development and implementation of the VO framework by organising technology forums and training events through the DADI Work Package;
  • explore the extension of the unified FITS file format to event-based experiments (i.e., neutrino and cosmic-ray observatories) and potentially to gravitational wave detectors through the UCM-ASTERICS group;
  • propose and test a unified HDF5 format for CTA raw data as an alternative to existing ROOT-based formats, through the UCM-ASTERICS group;