SURF aims to help fulfil the promise of AI in many different sciences, by supporting individual institutions and gathering expertise in this promising technology. Now, the Dutch medical institutions are about to pool their data through the introduction of federated learning.
What is the current challenge we are looking at?
Sharing and pooling data
The quality of a machine learning model depends heavily on the amount of data it is trained on. While the likes of Google and Facebook have plenty of data to feed their models, they are exceptions in the world of artificial intelligence. To get meaningful results, you often have to pool data from many different organisations. This is a significant hurdle for the progress of AI.
Privacy issues and competition
A prime example is the medical sector: given sufficient data, AI might find answers to the most challenging medical questions much faster than we do now. Just think of new diseases like COVID-19. But hospitals are bound to keep the data of their patients secret. Apart from privacy issues, competition can also stop organisations from sharing their data. The result is the same: models that aren’t as accurate as they might have been. The lack of cooperation also adds to the scarcity of experienced AI developers.
What are we working on in this project?
Personal Health Train
Over the last few years, a solution has emerged in the form of federated learning. In this approach, AI models ‘visit’ individual organisations to be trained on their data, so the data themselves stay where they are. In effect, the visiting model acts as a trusted third party. The Personal Health Train will do just that. This solution, part of the Dutch Health RI initiative, consists of several elements. Next to the actual model, which could be taken from the many existing AI models, there’s server software that runs at the individual institutions. The server receives the ‘train’ that carries the model, tweaks the model a bit based on the data of the institution, and sends the train, with the improved model, on its way again.
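The journey of the ‘train’ described above can be sketched in a few lines of Python. This is a minimal illustration, not the Personal Health Train software itself: the `Station` class, the hospital names, the toy records and the one-parameter model `y = weight * x` are all assumptions made for the example. What it shows is that only the model weight travels between institutions, while the records stay put.

```python
from dataclasses import dataclass


@dataclass
class Station:
    """Hypothetical institution holding private (x, y) records."""
    name: str
    records: list

    def train_locally(self, weight: float, lr: float = 0.01, steps: int = 20) -> float:
        """Tweak the visiting model on local data; the records never leave."""
        for _ in range(steps):
            # Gradient of the mean squared error for the toy model y = weight * x.
            grad = sum(2 * (weight * x - y) * x for x, y in self.records) / len(self.records)
            weight -= lr * grad
        return weight


def run_train(stations: list, weight: float = 0.0, rounds: int = 5) -> float:
    """Send the model along the route; only the weight moves between stations."""
    for _ in range(rounds):
        for station in stations:
            weight = station.train_locally(weight)
    return weight


# Made-up data that roughly follows y = 2 * x at both institutions.
stations = [
    Station("hospital_a", [(1.0, 2.1), (2.0, 3.9)]),
    Station("hospital_b", [(3.0, 6.2), (4.0, 7.8)]),
]
final_weight = run_train(stations)
print(f"weight after the round trip: {final_weight:.3f}")
```

In this sketch the stations are visited one after another; a real deployment would add authentication, secure transport and a far richer model.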
SURF helping out: expertise and computing power
The Personal Health Train looks very promising, not only for the medical sector but ultimately also for other sciences. However, depending on the complexity of the AI model and the amount of data, it may require considerable computing power and expertise at the individual institutions to set up and run a server for the system.
Cooperation with Erasmus MC
SURF is already involved in the Health RI initiative. Together with Erasmus MC, it is now looking at use cases for helping out institutions that don’t have enough capacity of their own. This could be done by processing the data at SURF: lots of research data are already processed at SURF under the custody of their owners, so no one else has access to them.
What is the wider potential of federated learning?
By helping to set up and run Personal Health Train servers, SURF can build up expertise that could ultimately be used to support different sciences in pooling data resources for AI. We see another promise of federated learning: it scales more easily, because the computing resources of multiple institutions are combined to deal with larger amounts of data. The technology has potential for the Internet of Things too: instead of moving lots of raw data over low-bandwidth connections, models may be trained at the endpoints, which then send only the resulting parameters to be combined.
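Combining parameters that were trained at separate endpoints is often done by averaging them, weighted by how much data each endpoint used. The sketch below assumes that approach (commonly called federated averaging); the function name `federated_average` and the example numbers are hypothetical and not taken from any SURF or Personal Health Train component.

```python
def federated_average(updates: list) -> list:
    """Combine parameter vectors, weighting each by its local sample count.

    Each update is a (parameters, sample_count) pair sent by one endpoint.
    """
    total = sum(count for _, count in updates)
    size = len(updates[0][0])
    return [
        sum(params[i] * count for params, count in updates) / total
        for i in range(size)
    ]


# Hypothetical updates from two endpoints: (trained parameters, local samples).
updates = [
    ([1.0, 0.0], 100),
    ([3.0, 2.0], 300),
]
global_params = federated_average(updates)
print(global_params)
```

Only these short parameter lists cross the network, which is why the approach suits endpoints with limited bandwidth.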
What are the main activities?
- Running servers at SURF as nodes in the Personal Health Train
- Building up expertise in federated learning
Who are we collaborating with?