News

Three winning proposals for ‘big science’ call

The 3 winning projects of the eTEC-BIG call aim to accelerate the search for dark matter, improve an all-sky radio telescope and detection facility for short-lived astronomical events (‘transients’), and scale up a promising pangenome approach for sequencing plant genomes.

The call Innovative eScience Technologies for ‘Big Science’ (eTEC-BIG), issued by SURF and the Netherlands eScience Center, aims to support the development of innovative eScience technologies and software for processing and analyzing big data, along with related computational methods. The research fields that need these applications are referred to as 'big science'.

Each proposal was assigned to one of these 3 technological research directions: scalable machine learning & AI; processing of streaming data; large-scale (distributed) data organization, management & semantics.

Each winning project receives a grant consisting of funding and support from research engineers of the eScience Center and from technology and e-infrastructure experts of SURFsara.

The winning projects

(The summaries are taken verbatim from the proposals, in their original English.)

DarkGenerators – Interpretable Large Scale Deep Generative Models for Dark Matter Searches

Dr. Christoph Weniger (University of Amsterdam)

Dark matter is five times more abundant in the universe than visible matter. Yet, its nature remains unknown and constitutes one of the most exciting and complex research questions today. This project will use advanced data science methods to enhance and accelerate the interpretation of astrophysical and collider data in the search for signals of dark matter. As such, deep generative models and differentiable probabilistic programming will be used to construct a framework for the fast and precise inference of high-dimensional data models.

Technological research direction: Scalable Machine Learning & AI

The PetaFLOP AARTFAAC Data-Reduction Engine

Dr. John Romein (Netherlands Institute for Radio Astronomy)

AARTFAAC is an all-sky radio telescope and transient-detection facility. It piggybacks on raw data from a limited number of antennas of the LOFAR telescope. Last year, the AARTFAAC 2.0 program started, which combines a planned telescope upgrade with better transient-detection capabilities and new science cases. The project will improve the AARTFAAC processing pipeline in order to: 

  • incorporate algorithmic improvements and new GPU technologies to permit scaling to larger collecting area, larger bandwidth and higher resolution;
  • detect transients well within 7 seconds to allow triggering of the TBBs (Transient Buffer Boards) and alert other instruments;
  • provide near real-time calibrated data/images for space-weather and ionospheric monitoring;
  • facilitate other science cases by providing intermediate data products.

Technological research direction: Processing of Streaming Data

Scaling up Pangenomics for Plant Breeding

Dr. Sandra Smit (Wageningen University)

Modern plant research is being transformed into a data-driven endeavor. A main driver of this development is the continuous reduction in DNA sequencing costs: reconstructing the complete genome of a plant from short DNA sequences, or finding genetic variants with respect to a reference genome, are applications where large amounts of sequencing data are generated and applied to study plants and to accelerate and improve breeding. Traditional approaches to comparing genomes, centered on a single reference, no longer suffice, and the field of genomics is therefore switching to so-called pangenome approaches. Several novel graph-based data structures and algorithms are under development, but none of these can handle the numbers of large plant genomes required in modern research and in plant-breeding applications. This project will improve the scalability of a promising pangenome approach, called PanTools, using eScience technologies.

Technological research direction: Large-Scale (Distributed) Data Organization, Management & Semantics