Portfolio Data Services
Secure, long-term storage of research data on tape
Complete array of services
SURFsara offers researchers the following data-related services:
- SURFdrive is an easy-to-use online storage facility for research universities and universities of applied sciences. This service is intended for storing and exchanging office data and, to a limited extent, research data, as well as other types of data.
- BeeHub is intended for storing and exchanging scientific data that has not yet been processed or analysed. This may involve extremely large data volumes.
- B2SAFE/Data replication makes it possible to save data to multiple locations in a secure fashion. Persistent identifiers are used so that the location of the data can always be determined.
- The SURFsara Data Archive is the central location for archiving and storing data, including for the long term. The archive offers quick access to the SURFsara computing facilities.
- The Data Persistent Identifier service uses persistent identifiers (PIDs) to keep your data findable, now and in the future. SURFsara offers this service in cooperation with the European Persistent Identifier Consortium (EPIC).
- Data Ingest is a service designed to transfer data from external hard disks to one of the SURFsara storage systems.
Knowing which service to use and when
Many researchers wonder which service is best suited to their data. There is no single answer to this question. Our advisers always provide a customised recommendation in order to find the best possible solution for you. We can, however, list a number of the factors we consider when making a recommendation.
Two of the services play a special role in this context:
- Data Persistent Identifier ensures your data is findable and referable through the use of persistent identifiers (PIDs). This service comes standard with B2SAFE. Even if your data is stored on another infrastructure, you can still use the PIDs.
- Data Ingest is a service designed to transfer data from external hard disks to one of the SURFsara storage systems. You can then store the data on one of the specified infrastructures.
In all other cases, the following factors apply:
1. How much data do you need to store?
Not every service offers unlimited storage space. The SURFdrive service, for example, is primarily suitable for office data and smaller volumes of research data. Usage of this service is restricted to a maximum of 100 gigabytes. The other services, BeeHub and the Data Archive in particular, offer the potential to store extremely large data sets. These services' storage capacity expands to meet your needs.
2. What do you intend to do with the data?
This is really the most important question. If you intend to actively process or analyse the data in question, the BeeHub and SURFdrive services are the most obvious choices. These services are also suitable for sharing data. For archiving and permanent storage of data, the Data Archive service is the best solution. For reference, a schematic of the data life cycle is provided below. SURFsara coordinates the tasks of the storage and computing services.
B2SAFE is also suitable for research projects in which researchers want to exchange data. Additionally, B2SAFE makes it possible to save data to multiple locations in a secure fashion. The use of PIDs, as described above, is standard for B2SAFE as well.
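The first two factors can be sketched as a small decision helper. This is only an illustration of the reasoning above, not SURFsara's actual recommendation procedure, which remains a customised judgement by an adviser. The `Purpose` categories and the function name `suggest_services` are assumptions made for this example; the 100 gigabyte limit and the service names come from the text.

```python
from __future__ import annotations

from enum import Enum

# Maximum SURFdrive usage stated in the text above.
SURFDRIVE_LIMIT_GB = 100


class Purpose(Enum):
    """Hypothetical categories for 'what do you intend to do with the data?'."""
    ACTIVE = "active"        # processing, analysis, sharing
    ARCHIVE = "archive"      # archiving and permanent storage
    REPLICATE = "replicate"  # project exchange with copies at multiple locations


def suggest_services(volume_gb: float, purpose: Purpose) -> list[str]:
    """Return candidate services for a data volume and an intended use."""
    if purpose is Purpose.ARCHIVE:
        return ["Data Archive"]
    if purpose is Purpose.REPLICATE:
        return ["B2SAFE"]
    # Active use: BeeHub can hold very large volumes,
    # SURFdrive only fits when the data stays within its limit.
    services = ["BeeHub"]
    if volume_gb <= SURFDRIVE_LIMIT_GB:
        services.append("SURFdrive")
    return services


if __name__ == "__main__":
    print(suggest_services(50, Purpose.ACTIVE))
    print(suggest_services(5000, Purpose.ARCHIVE))
```

The helper returns a list rather than a single name because several services can fit the same situation; the final choice still depends on factors such as access requirements.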
3. How do you need to access the data?
When convenient access is a priority, SURFdrive is the ideal environment for storing and exchanging files securely. Employees of SURF member institutions can simply log in using SURFconext. BeeHub is easy to access as well. For manipulating data (saving, deleting, moving to another folder), SURFdrive and BeeHub offer a user-friendly web interface.
The Data Archive is Internet-accessible. This archive supports diverse protocols for data transfer including (HPN)SCP, SFTP, rsync and GridFTP. These protocols can operate in both Linux and Windows environments.
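As an illustration of one of the listed protocols, the sketch below assembles an rsync command for copying a local directory to a remote system. The host name `archive.example.org`, the user name and the paths are placeholders invented for this example and are not actual SURFsara addresses; the function name `build_rsync_command` is likewise hypothetical. The rsync options used (`-a`, `-v`, `--partial`) are standard ones.

```python
import shlex


def build_rsync_command(local_path, user, host, remote_path):
    """Build an rsync argument list for uploading local_path to a remote host.

    -a         archive mode (recursive, preserves timestamps and permissions)
    -v         verbose output
    --partial  keep partially transferred files so a rerun can resume them
    """
    destination = f"{user}@{host}:{remote_path}"
    return ["rsync", "-av", "--partial", local_path, destination]


if __name__ == "__main__":
    # Placeholder values; replace them with the details you receive for your account.
    command = build_rsync_command(
        "results/", "jdoe", "archive.example.org", "project/results/"
    )
    # Print the command for inspection instead of running it.
    print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can later be passed to `subprocess.run` without shell quoting problems.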
Data life cycle
When choosing a specific storage infrastructure, the type and quantity of data are not the only concerns. The research phase corresponding to the data is at least as important. The following schematic illustrates this point:
This schematic shows the different stages typically undergone by scientific data. Data is generated by scientific instruments and subsequently processed and analysed in one of the computing infrastructures. The data is then archived or stored and shared with other researchers. Afterwards the data can be reused and the entire cycle may start again.
This schematic also indicates which storage infrastructures can be used in which stage:
- BeeHub and SURFdrive are infrastructures for data that is in active use. This may be the case in various stages: during processing, analysis and exchange of data.
- In the data retention phase, the central archive is the most important infrastructure. The B2SAFE data replication service is also widely used in this phase.
- Data Persistent Identifier (PIDs) can be applied in a variety of stages.
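The mapping in the list above can be written down as a simple lookup table. This is a minimal sketch: the stage keys are informal labels chosen for this example, and because the text does not say in exactly which stages PIDs apply, the Data Persistent Identifier service is kept as a separate constant rather than attached to specific stages.

```python
# Stage labels are informal names chosen for this example.
STAGE_SERVICES = {
    "processing": ["BeeHub", "SURFdrive"],
    "analysis": ["BeeHub", "SURFdrive"],
    "exchange": ["BeeHub", "SURFdrive"],
    "retention": ["Data Archive", "B2SAFE"],
}

# Applicable in a variety of stages; the text does not specify which ones.
CROSS_STAGE_SERVICE = "Data Persistent Identifier"


def services_for(stage):
    """Return the storage services listed for a life cycle stage."""
    key = stage.strip().lower()
    if key not in STAGE_SERVICES:
        raise ValueError(f"unknown stage: {stage!r}")
    # Return a copy so callers cannot modify the table by accident.
    return list(STAGE_SERVICES[key])


if __name__ == "__main__":
    for stage in STAGE_SERVICES:
        print(stage, "->", ", ".join(services_for(stage)))
```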
SURFsara maintains close collaborative ties with organisations in the Netherlands and Europe in the data management field. This collaboration is aimed at the development of new services and the exchange of expertise. SURFsara cooperates with Research Data Netherlands (RDNL), among others, in the area of data archiving. Together with EUDAT, a European collaboration, SURFsara is working towards solutions for the storage and management of research data at a European level. Lastly, SURFsara offers services relating to data identification in concert with the European Persistent Identifier Consortium (EPIC). Read more information about SURFsara's collaborations.
Advice and consultancy
Our data infrastructure specialists would be happy to help you select the storage solution that best suits your research project. Feel free to contact us at firstname.lastname@example.org.
In need of independent advice on topics such as designing your data infrastructure or the optimum usage of SURFsara's computing and storage facilities? Our consultancy service will be happy to help.