Machine learning research

We are investigating whether machine learning can improve traditional high-performance computing. We also investigate scalable ways of training neural networks, for example for image recognition. This project is part of the SURF Open Innovation Lab. Machine learning involves a computer learning independently from data and input, rather than being explicitly programmed for a task.

Improving HPC applications with deep learning

The main workloads running on a supercomputer typically consist of various forms of numerical simulation. Scientists have recently started applying machine learning techniques to improve traditional computational simulations, such as weather forecasting. Initial results indicate that these hybrid models, which combine machine learning with traditional simulation, can potentially improve both accuracy and speed.

In this project, we investigate whether and how machine learning and deep learning can be used to improve, speed up or replace scientific workloads such as numerical simulations. We expect that, as scientists become more familiar with this approach and as the methodologies become more robust, machine learning has the potential to become standard tooling in many scientific fields.
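
To make this concrete, the sketch below trains a small neural network as a surrogate for an expensive simulation kernel, so that the trained network can stand in for the costly call inside a simulation loop. The placeholder function expensive_kernel and the toy setup are illustrative assumptions, not one of the project's actual models.

```python
# Minimal sketch: train a small neural network as a surrogate for an
# expensive simulation kernel. `expensive_kernel` is a placeholder standing
# in for a costly numerical routine (an illustrative assumption).
import numpy as np
import tensorflow as tf

def expensive_kernel(x):
    # Placeholder for the costly routine the surrogate should emulate.
    return np.sin(3.0 * x) * np.exp(-x**2)

# Generate training data by running the expensive routine offline.
x_train = np.random.uniform(-2.0, 2.0, size=(10000, 1)).astype("float32")
y_train = expensive_kernel(x_train).astype("float32")

# Small fully connected network acting as the surrogate.
surrogate = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(1),
])
surrogate.compile(optimizer="adam", loss="mse")
surrogate.fit(x_train, y_train, epochs=20, batch_size=256, verbose=0)

# Inside a simulation loop, the trained surrogate replaces the expensive call.
x_new = np.linspace(-2.0, 2.0, 5, dtype="float32").reshape(-1, 1)
print(surrogate.predict(x_new, verbose=0))
```

Once trained, evaluating the surrogate is typically far cheaper than running the original routine, which is where the potential speed-up comes from.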

Use cases

To test the potential of this approach, we are supporting a number of use cases where traditional HPC simulations are enhanced with machine learning algorithms. We do this in close cooperation with scientific research groups. Four research proposals have been selected in different scientific fields:

  • Chiel van Heerwaarden (WUR): Machine-learning turbulence in next-generation weather models
  • Sascha Caron (Radboud University): The Quantum-Event-Creator: Generating physics events without an event generator
  • Alexandre Bonvin (Utrecht University): 3DeepFace: Distinguishing biological interfaces from crystal artifacts in biomolecular complexes using deep learning
  • Simon Portegies Zwart (Leiden University): Machine learning for accelerating planetary dynamics in stellar clusters

Scalable high-performance training of deep neural networks

Caffe is one of the most popular deep learning frameworks for image recognition. Intel has contributed to this framework by improving the performance of Caffe on Intel Xeon processors. The goal of this project is to improve the scalability of Intel Caffe on supercomputing systems for large-scale neural network training.
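
The sketch below illustrates the basic pattern behind this kind of multi-node scaling: synchronous data-parallel training, in which each MPI rank computes gradients on its own data shard and an allreduce averages them, so every rank applies the same update. The toy linear model is an assumption for illustration only and does not reflect Caffe's internals.

```python
# Minimal sketch of synchronous data-parallel training with MPI.
# Each rank computes gradients on its own data shard; an allreduce then
# averages the gradients so all ranks apply the same update.
# The linear least-squares model below is a toy example (an assumption).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(seed=rank)            # each rank owns a different shard
X = rng.normal(size=(256, 8))
true_w = np.arange(8, dtype=np.float64)
y = X @ true_w + 0.01 * rng.normal(size=256)

w = np.zeros(8)                                    # identical initial weights on all ranks
lr = 0.05
for step in range(100):
    grad_local = 2.0 * X.T @ (X @ w - y) / len(y)  # local gradient on this shard
    grad_global = np.empty_like(grad_local)
    comm.Allreduce(grad_local, grad_global, op=MPI.SUM)
    grad_global /= size                            # average over all ranks
    w -= lr * grad_global                          # same update everywhere

if rank == 0:
    print("learned weights:", np.round(w, 3))
```

Launched with, for example, mpirun -np 4 python data_parallel_sgd.py, every rank ends up with identical weights; the same averaging pattern underlies multi-node training of deep networks.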

Our focus is on highly scalable, high-performance training of deep neural networks and its application to scientific challenges such as lung disease diagnosis, plant classification and high-energy physics. For example, we are porting large-batch Stochastic Gradient Descent (SGD) training techniques to the popular TensorFlow framework. A particular area of interest is the rapidly developing field of medical imaging, where the huge amounts of data demand high bandwidth and large compute and memory capacity.
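
A common ingredient of such large-batch training is to scale the base learning rate linearly with the batch size and to ramp it up over a short warmup period. The sketch below shows how such a schedule could be expressed in TensorFlow/Keras; the batch size, base learning rate and warmup length are illustrative assumptions, not the project's actual hyperparameters.

```python
# Sketch of a large-batch SGD ingredient: linear learning-rate scaling
# with a warmup ramp. Hyperparameter values are illustrative assumptions.
import tensorflow as tf

BASE_BATCH, BASE_LR = 256, 0.1
batch_size = 8192                                  # large global batch (assumption)
scaled_lr = BASE_LR * batch_size / BASE_BATCH      # linear scaling rule
warmup_steps = 500

class WarmupLinearScale(tf.keras.optimizers.schedules.LearningRateSchedule):
    """Ramp linearly from 0 to the scaled learning rate, then hold it."""
    def __init__(self, peak_lr, warmup_steps):
        self.peak_lr = peak_lr
        self.warmup_steps = warmup_steps
    def __call__(self, step):
        step = tf.cast(step, tf.float32)
        warmup = self.peak_lr * step / self.warmup_steps
        return tf.minimum(warmup, self.peak_lr)

optimizer = tf.keras.optimizers.SGD(
    learning_rate=WarmupLinearScale(scaled_lr, warmup_steps), momentum=0.9)
```

The resulting optimizer can then be passed to a standard model.compile(...) call.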

We have already succeeded in drastically reducing the time-to-train of several deep convolutional neural networks on state-of-the-art computer vision datasets. Highlights of 2017 include a training time of less than 30 minutes on the popular ImageNet-1K dataset, as well as state-of-the-art accuracy on other datasets such as the full ImageNet and Places-365.

SURF project team

Valeriu Codreanu

Damian Podareanu

Caspar van Leeuwen