Research Infrastructure
All researchers should be able to focus on their research without worries, supported by state-of-the-art IT facilities and expertise. This applies both to researchers who have been using large-scale digital infrastructure for years and to those with less experience.
Research with advanced ICT

A climate model is a kind of digital copy of the Earth. This kind of research is only really possible with the help of a supercomputer.

Leo van Kampenhout, Utrecht University

SANE: secure environment for analysing sensitive data

Privacy, copyright and competition barriers limit the sharing of sensitive data for scientific purposes. Together with several partners, SURF is working on a Secure ANalysis Environment (SANE): a secured environment in which the researcher can analyse sensitive data while the data provider retains full control of the data. This enables new research possibilities.

Sensitive data remain unused

Conducting research with sensitive data poses significant challenges across various research domains, impacting both researchers and data providers. Researchers often find themselves deterred by the complex processes required to access the necessary data. Simultaneously, data providers frequently lack the requisite infrastructure or technical expertise to offer controlled and secure environments for data access. Consequently, there is a notable absence of a collaborative platform that would enable the finding, sharing, and processing of sensitive data in an efficient and secure manner.

SANE Secure Analysis Environment

In order to tackle this challenge, we developed a secure and controlled environment to allow researchers to work with sensitive data from various data providers: Secure ANalysis Environment (SANE).

SANE makes use of ISO 27001 certified SURF services and has undergone thorough penetration tests to guarantee data providers a high level of safety. SANE is currently available on SURF Research Cloud, which uses SRAM to form collaborations between data providers and researchers.

There are two variants of SANE currently available: Tinker and Blind.

  • With Tinker SANE the researcher is presented with a virtual desktop to see and manipulate the data. It is however not possible to transfer the data outside the Tinker environment. The software that is available within Tinker SANE is managed by the data provider.

     
  • With Blind SANE the researcher submits a non-interactive analysis, which is executed on the sensitive data in a controlled environment. The researcher therefore never sees the data itself. The analysis can be either a script or a Docker container.
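To give an idea of what a non-interactive analysis for Blind SANE could look like, here is a minimal Python sketch. It is not SANE's actual submission interface: the function names, the CSV input format, the column name and the JSON output file are all assumptions made for illustration. The point it shows is that the script runs without user interaction and writes only aggregate results, never individual records.

```python
import csv
import json
import statistics
from pathlib import Path


def summarise(input_csv: Path, column: str) -> dict:
    """Return aggregate statistics for one numeric column (no row-level data)."""
    values = []
    with input_csv.open(newline="") as handle:
        for row in csv.DictReader(handle):
            # Skip rows where the column is missing or empty.
            cell = (row.get(column) or "").strip()
            if cell:
                values.append(float(cell))
    if not values:
        return {"column": column, "count": 0}
    return {
        "column": column,
        "count": len(values),
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
    }


def run(input_csv: Path, column: str, output_json: Path) -> dict:
    """Hypothetical entry point: read the data, write only the summary."""
    result = summarise(input_csv, column)
    output_json.write_text(json.dumps(result, indent=2))
    return result
```

A submitted script would call run() with whatever input and output locations the environment provides. Keeping the output limited to aggregates should also make it easier for the data provider to verify results before they are exported.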

Webinar: Introduction to SANE

We introduced SANE in a webinar, with a demonstration of the online environment, a showcase of a successful pilot project and a discussion on how SANE can help your research projects. 

Watch the webinar

Benefits for the researcher

One of the main advantages of SANE for researchers is that it simplifies collaboration with data providers and access to sensitive data. SURF Research Cloud is already widely used within the Dutch research community and SANE offers the same ease of use.

Another advantage is the uniform way of working with sensitive data. Once familiar with SANE, a researcher will be able to work with any other data provider in the same way in the future. Additionally, as a researcher you can in most cases use a grant to fund the computational resources, without additional costs for you or your research group. For information on funding, see the Small Compute applications (NWO) page.

Benefits for the data provider

SANE allows the data provider to maintain complete control while still allowing the researcher to study the data in a convenient manner. Researchers can analyse the data within the SANE environment, after the data provider has granted access.

Results of the analyses can only be exported to the researcher, outside the SANE environment, after verification by the data provider. With Blind SANE, the data provider can even prevent the researcher from seeing the data at all. Moreover, SANE removes the need for the in-house expertise and resources required to set up a similar infrastructure. National grants secured by researchers cover the cost of resources needed for SANE.

In our knowledge base you will find the instructions for data providers and researchers.

SANE user manual: How do I work with sensitive data

Roadmap

We are continuously working on improving SANE and adding more features. The following items are currently on the roadmap for SANE and are anticipated to be released by the end of 2024:

  • Linux virtual desktop for Tinker SANE
  • Integration with existing data portals to support (semi-)automatic importing of sensitive data
  • More options for software tools within Tinker SANE
  • Multi-factor authentication for Tinker SANE login
  • Community-driven DTAP (Development, Testing, Acceptance and Production)

Pilot projects

We have successfully run several pilot projects in the fields of social sciences and humanities. These are two examples of pilot projects:

The ‘FIRMBACKBONE’ project is an initiative of Utrecht University (UU) and the Vrije Universiteit Amsterdam (VU Amsterdam), funded by the Platform Digital Infrastructure-Social Sciences and Humanities (PDI-SSH) for the period 2020-2025.

The ‘YouthCohort’ project: YOUth is a prospective cohort study with repeated measurements at regular intervals. YOUth follows two cohorts: YOUth Baby & Child (pregnancy - 7 years) and YOUth Child & Adolescent (8 - 16 years).

Background 

The efforts to develop SANE started with the increasing demand for such an environment from the social sciences and humanities domains. The project started in 2022 for a 3-year period and was funded by a grant from PDI-SSH (Platform Digital Infrastructure Social Sciences & Humanities).

Collaborating partners

SANE is being developed by SURF and the following partners: