“I need to cheaply store my dataset of over 100 TB”
Secure, long-term storage with Data Archive
The Data Archive is the centralised location for data archiving and (long-term) storage. You can securely store research data there, even in volumes running into the petabytes. The archive provides quick access to SURF's computing facilities.
Long-term data storage
The Data Archive service is designed for long-term data storage. In this way, it differs from back-up systems, which are meant for recovering data that were accidentally lost. The archive is aimed specifically at the secure storage of data that are not actively used. For instance, researchers can opt to 'freeze' the data behind an article, or store raw data that need to remain accessible for future research. Data placed in the Data Archive are stored in a tape library at our data centre in Amsterdam.
You can transfer data to the Data Archive online. The Data Archive supports a whole range of data transfer protocols, such as (HPN)SCP, SFTP, rsync and GridFTP. These protocols are compatible with both Linux and Windows environments. You can log into the Data Archive via SSH and manage your data via the command line.
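As an illustration of such a transfer, the following is a minimal Python sketch that assembles an rsync command running over SSH. The host name, user name and paths are hypothetical placeholders rather than real Data Archive details, so replace them with the login details you received. The sketch only prints the command; the upload function is included to show how it could be run.

```python
import shlex
import subprocess

# Hypothetical values (assumptions): use the login name and host name you were given.
ARCHIVE_HOST = "archive.example.org"
ARCHIVE_USER = "username"


def build_rsync_command(local_path, remote_dir, user=ARCHIVE_USER, host=ARCHIVE_HOST):
    """Return the argument list for copying local_path to remote_dir over SSH."""
    return [
        "rsync",
        "-a",         # archive mode: recurse and preserve times and permissions
        "--partial",  # keep partially transferred files so a rerun can resume
        "-e", "ssh",  # use SSH as the transport
        local_path,
        f"{user}@{host}:{remote_dir}",
    ]


def upload(local_path, remote_dir):
    """Run the transfer; raises CalledProcessError if rsync exits with an error."""
    subprocess.run(build_rsync_command(local_path, remote_dir), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_rsync_command("results.tar", "project/")))
```

Passing the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.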
For each project, you can indicate how long the data should be stored and who is authorised to access it. That way, a research group or consortium can obtain access to the data. When it comes to data management, we generally adhere to the B2SAFE guidelines, though we may deviate from them in exceptional cases. In those cases, we can create a customised data policy with the aid of the iRODS data management system used by B2SAFE.
“Our own storage infrastructure will be jettisoned, can you store our 20 PB of data?”
The Data Archive is used mainly for research projects carried out by academic communities wishing to archive large volumes of data. A few examples:
- IMARES, the Dutch institute for applied marine ecological research, which is mapping fish stocks in the North Sea
- The CosmoGrid project, with a dataset of 100 terabytes (Dutch)
- Data storage for LOFAR
- Large Hadron Collider (LHC)
You can find the rates for this service in the SURF Services and Rates brochure (PDF).
Support and consultancy
Data Archive users can always count on us for support. For example, we can help you access the Data Archive and save your data to the right location. We can also advise you on making your data referable by linking it to persistent identifiers through the Data Persistent Identifier service.
If you have any questions or want to report a problem, please use the service desk portal, send an email to firstname.lastname@example.org, or phone +31-20-8001400. The helpdesk is available during office hours (9:00–17:00).
For more in-depth advice on matters such as the configuration of your data infrastructure, please contact one of our consultants.
If you use this service, you may also be interested in the following services:
Data Persistent Identifier
With persistent identifiers (PIDs), you can refer to data (e.g. in articles), as well as retrieve it. SURF provides the option to add PIDs to research data saved on the Data Archive. For more information, go to Data Persistent Identifier.
SURF Data Repository
SURF Data Repository enables you to make large data sets publicly available for the long term. This service is useful if you want to keep your research data available for other researchers after completion of your research.
As a Data Archive user, you can easily analyse or re-analyse your data at will. For this purpose, you can use one of SURF's wide-ranging compute services.