Data management & data processing

Sensitive data management in practice

What does a secure and trusted environment for working with sensitive data look like? 

Together with a wide range of collaborative partners, SURF is building a framework for sensitive data management that supports aligning guidelines, formulating general standards and facilitating shared systems. In this, SURF has a connecting role as a neutral and trusted partner that brings collaborative partners together.

In this process, we have identified eight process steps that are generally desirable for most sensitive data workflows that we have encountered:

  1. Publish metadata
  2. Find data
  3. Request access
  4. Request Trusted Research Environment (TRE) project
  5. Transfer data
  6. Process data
  7. Output check
  8. Publish results 

For each process step, we provide an explanation and highlight the projects in which SURF is collaborating.

1. Publish metadata 

The process begins with making data available via a metadata portal. A large number of organisations, such as governments, companies, knowledge institutions and even citizens, make datasets available. They often do this via a metadata provider. This provider collects data from various sources, then manages and structures it by adding the appropriate metadata, which makes the information easier to find, understand and use.

2. Find data

For the researcher, the journey starts with finding suitable data for their research. A searchable metadata portal is the most common place for researchers to explore which data exists and how to apply for access to it.

A metadata portal is a web page or application designed to manage and present metadata. It helps organisations organise and manage their digital data more effectively and make it searchable.
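As a rough illustration of what searching a metadata portal amounts to, the sketch below models a catalogue as a list of records and runs a keyword search over it. The `MetadataRecord` fields, the `search` function and the example entries are all hypothetical and do not reflect the schema of any of the portals listed here.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical description of a dataset; the data itself is not included."""
    title: str
    provider: str
    keywords: list[str] = field(default_factory=list)
    access_procedure: str = "unspecified"  # how to apply for access


def search(records: list[MetadataRecord], term: str) -> list[MetadataRecord]:
    """Return records whose title or keywords contain the term (case-insensitive)."""
    needle = term.lower()
    return [
        r for r in records
        if needle in r.title.lower() or any(needle in k.lower() for k in r.keywords)
    ]


# Invented example entries, for illustration only.
catalogue = [
    MetadataRecord("Household income survey", "Example Institute",
                   ["income", "households"], "access request via provider"),
    MetadataRecord("Hospital admissions", "Example Hospital",
                   ["health", "admissions"], "access request via broker"),
]

hits = search(catalogue, "health")
for record in hits:
    print(record.title, "|", record.access_procedure)
```

Note that the search only touches the descriptions: the researcher learns that a dataset exists and how to apply for it, without seeing any of the sensitive data.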

We list a few metadata portals: 

  • The ODISSEI Metadata portal: an open data infrastructure for social science and economic innovations. It offers a secure HPC enclave to work with CBS Microdata.
  • The CLARIAH Media Suite: a common infrastructure for the humanities and social sciences.
  • Health-RI: the Dutch National Health Data Catalogue for health and life science data.

3. Request access

When the right dataset is found and it contains sensitive data, this dataset can’t simply be downloaded. Researchers must submit an access request that explains: 

  • Who they are
  • What they want to use the data for
  • How they plan to protect it

This request is assessed by the data provider, often with help from a service such as a Data Access Broker. A Data Access Broker acts as an intermediary between researchers and various data sources. It simplifies the process of obtaining datasets for research or analysis purposes by:

  • Standardising access procedures
  • Enabling automation where possible
  • Giving providers tools to review and approve requests
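The request and its review can be sketched as a small data structure plus an automated completeness check. This is a minimal sketch under assumptions: the `AccessRequest` class, its field names and the `review` function are hypothetical and are not the interface of any actual Data Access Broker.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    applicant: str        # who they are
    purpose: str          # what they want to use the data for
    protection_plan: str  # how they plan to protect it


def missing_fields(request: AccessRequest) -> list[str]:
    """Automated pre-check: list the fields that were left empty."""
    return [name for name, value in vars(request).items() if not value.strip()]


def review(request: AccessRequest, provider_approves: bool) -> str:
    """Return 'incomplete', 'approved' or 'rejected'.

    The provider's decision is passed in. The sketch only automates the
    completeness check, not the assessment itself.
    """
    if missing_fields(request):
        return "incomplete"
    return "approved" if provider_approves else "rejected"


draft = AccessRequest("A. Example, Example University", "Study income mobility", "")
print(missing_fields(draft))  # the protection plan has not been filled in yet
```

Keeping the decision with the data provider and automating only the routine checks mirrors the division of work described above.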

We list a few data access brokers: 

  • SSHOC-NL: Digital Infrastructure for Social Sciences and Humanities
  • Health Data Access Body - NL: This entity is currently under development and should be operational for the Netherlands from 2029 onwards. All European countries will have their own Data Access Body.

4. Request Trusted Research Environment (TRE) project 

To request a Trusted Research Environment (TRE) project, the researcher applies through a specific TRE provider. The application outlines the benefit of the research and is subject to rigorous checks. Once approved, the researcher gets remote access to the sensitive data within a secure environment. The data never leaves that environment, and only vetted results are exported.

5. Transfer data

In most sensitive data workflows, the data moves into a controlled environment, or the researcher gets secure access to the environment where the data already lives. There are several common ways to give access to the data:

  1. Uploading data into a trusted research environment where the researcher logs in securely. A Trusted Research Environment, also known as a Secure Data Environment or Data Safe Haven, is a secure, digital environment in which researchers can access sensitive data for scientific research.
  2. Granting access to existing data in place via a secure connection.
  3. Only in low-risk cases: allowing controlled downloads (e.g. pseudonymised, aggregated data under license).  

SURF provides several services that support secure data movement:

  • SURFfilesender: For secure and encrypted file transfers
  • Research Drive: Collaborative storage for research projects under policy controls
  • SURFdrive: Personal cloud storage under policy controls with a maximum of 1 TB

6. Process data

Next, the research data is processed in a trusted research environment: a secure virtual workspace with strict access controls. Trusted research environments make sure that:

  • Only authorized users can access the data
  • All activity is logged and auditable
  • Data cannot leave the environment without approval 
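The three guarantees above can be illustrated with a toy model. The `Workspace` class below is purely hypothetical and is not how SANE, OSSC or any other SURF environment is implemented. It only shows how an access check, an audit log and an approval gate on exports fit together.

```python
from datetime import datetime, timezone


class Workspace:
    """Toy model of a secure workspace (illustrative assumption only)."""

    def __init__(self, authorised_users):
        self.authorised_users = set(authorised_users)
        self.audit_log = []  # every attempt is recorded, including denied ones

    def _log(self, user, action, allowed):
        self.audit_log.append(
            {"time": datetime.now(timezone.utc), "user": user,
             "action": action, "allowed": allowed}
        )

    def read_data(self, user):
        """Only authorised users can access the data."""
        allowed = user in self.authorised_users
        self._log(user, "read", allowed)
        if not allowed:
            raise PermissionError(f"{user} is not authorised")
        return "sensitive records"

    def export(self, user, approved_by_reviewer):
        """Data cannot leave the environment without approval."""
        allowed = user in self.authorised_users and approved_by_reviewer
        self._log(user, "export", allowed)
        if not allowed:
            raise PermissionError("export requires an authorised user and approval")
        return "approved output"


workspace = Workspace(["alice"])
workspace.read_data("alice")
print(len(workspace.audit_log), "logged action(s)")
```

Denied attempts are logged before the error is raised, so the audit trail also covers activity that was blocked.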

SURF provides several trusted research environments for different research needs: 

  • SANE – flexible, cloud-based trusted research environment on the SURF Research Cloud
  • OSSC – high-performance trusted research environment on the Snellius supercomputer
  • Alzheimer genetics hub – dedicated secure cluster for Alzheimer’s genetics research

These environments are ISO 27001-certified and designed to support researcher workflows while meeting data provider requirements.

7. Output check

When the analysis is ready, researchers want to take their results, in the form of graphs, summaries or models, out of the trusted research environment. But those outputs may still carry sensitive traces. That is why most trusted research environments include a step called output control. Before anything leaves the environment, it is checked by the data provider or a designated reviewer to ensure:

  • No personal data is exposed
  • Results are sufficiently aggregated or anonymized
  • Output aligns with the approved use 

This step helps data providers stay in control, while still allowing researchers to publish meaningful insights.
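One check a reviewer might apply to the "sufficiently aggregated" criterion is a minimum cell count on frequency tables. The sketch below is an assumption for illustration: the threshold of 10 and the function names are hypothetical, and actual output rules are set by each data provider.

```python
def small_cells(table: dict[str, int], min_count: int = 10) -> list[str]:
    """Return the categories whose count falls below min_count."""
    return [label for label, count in table.items() if count < min_count]


def output_may_leave(table: dict[str, int], min_count: int = 10) -> bool:
    """A table passes this check only if no cell is below the threshold."""
    return not small_cells(table, min_count)


# Invented counts per region, for illustration only.
counts = {"region A": 120, "region B": 4, "region C": 35}
print(small_cells(counts))       # cells a reviewer would flag
print(output_may_leave(counts))  # whether the table passes as-is
```

A flagged table is not necessarily lost: the researcher can merge small categories and resubmit. A check like this supports the human reviewer and does not replace the assessment against the approved use.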

8. Publish results

Once the output has been checked, the results are released. They can be sent to the researcher, or they can become part of the catalogue of datasets so that they can be reused in further research.

SURF offers these services for publishing and storing data