Endless amounts of data are available nowadays. But how can you process, analyse and (re)use them safely and securely? The projects in this Labs theme explore these aspects.
Big data science driven technologies
More and more complex data is becoming available to researchers. This offers unprecedented opportunities, but processing this growing quantity of data is a challenge. In this project, we explore techniques to process, analyse and publish all those exabytes of data.
Why are we doing this project?
Exponential data growth
Big data is one of the major drivers of IT innovation. The complexity, diversity and quantity of data are increasing exponentially in almost all research fields, and we expect this growth to continue. With more data, researchers can conduct more accurate research and also explore entirely new research paths. Examples include the search for new fundamental particles (e.g. High-Luminosity LHC), the exploration of the universe (e.g. Square Kilometre Array) and research into the quality of life on Earth (e.g. Destination Earth).
Processing exabytes of data properly
But the growing quantity of data also poses problems for researchers. In the coming years, individual instruments will produce exabytes (1 exabyte = 1 million terabytes) of high-quality data per year. This data must be efficiently processed, analysed and published. SURF is working with researchers to make this possible through IT innovation.
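To give a sense of scale, the arithmetic behind an exabyte per year can be sketched as follows. This is a rough illustration only, assuming decimal (SI) units and a 365-day year; the function name is hypothetical and not part of any SURF tooling.

```python
# Rough scale estimate.
# Assumptions: SI units (1 EB = 10**18 bytes) and a 365-day year.
BYTES_PER_TERABYTE = 10**12
BYTES_PER_EXABYTE = 10**18
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def sustained_rate_gbit_per_s(exabytes_per_year: float) -> float:
    """Average data rate (Gbit/s) needed to keep up with a yearly data volume."""
    bits_per_year = exabytes_per_year * BYTES_PER_EXABYTE * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9


# Number of terabytes in one exabyte
print(BYTES_PER_EXABYTE // BYTES_PER_TERABYTE)

# Average rate for an instrument producing 1 EB per year
print(round(sustained_rate_gbit_per_s(1.0), 1))
```

Under these assumptions, an instrument producing one exabyte per year corresponds to an average stream of roughly 250 Gbit/s, around the clock.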
Optimisation and acceleration
In this multi-year project, we are working on four themes:
- Optimisation of the traditional data processing chain
- High-speed data logistics
- Accelerators and cutting-edge hardware innovation
- Solutions for federated data processing
What are the main activities?
The following activities are planned for 2022 and 2023:
- Research on high-speed international data transfer. For this purpose, we are testing data transfer nodes (DTNs) that work with an optical connection of 400 Gbit/s. DTNs are servers specifically intended for efficiently sending and receiving data over networks.
- Research into using SURF Research Access Management (SRAM), via an LDAP link and synchronisation, to set up secure, federated platforms for data processing.
- Research into the application of GPUs, DPUs (data processing units) and other accelerators in a virtualised and optimised cloud environment for data processing.
- Research into new data storage techniques, data transfer protocols and clients.
- Research into new algorithms and techniques for achieving faster and more accurate results for data-intensive workflows, including through machine learning and AI.
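As a back-of-the-envelope illustration of why link speed matters for the data transfer research listed above, the following sketch estimates how long a dataset would take to move over a 400 Gbit/s connection. It assumes the link is used fully and continuously with no protocol or storage overhead, which real transfers will not achieve; the function name is hypothetical.

```python
# Idealised transfer-time estimate.
# Assumptions: SI units, full line rate sustained, no protocol or disk overhead.
PETABYTE = 10**15  # bytes
EXABYTE = 10**18   # bytes
SECONDS_PER_DAY = 24 * 60 * 60


def transfer_time_days(dataset_bytes: float, link_gbit_per_s: float) -> float:
    """Days needed to send a dataset over a link at its nominal rate."""
    bits = dataset_bytes * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / SECONDS_PER_DAY


print(transfer_time_days(PETABYTE, 400))  # one petabyte at 400 Gbit/s
print(transfer_time_days(EXABYTE, 400))   # one exabyte at 400 Gbit/s
```

Even in this ideal case, a petabyte takes about five and a half hours and an exabyte about 231 days on a single 400 Gbit/s link, so measured throughput in practice will be a lower bound on these times.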
Who are we collaborating with?
In this project, we work together with NIKHEF, ASTRON and the Centre for Information Technology (CIT) of the University of Groningen (RUG). Are you a researcher or IT staff member at an institution affiliated with SURF? If so, you are welcome to participate in this project. For more information, please contact Raymond Oonk at firstname.lastname@example.org or go to servicedesk.surfsara.nl.