The challenge of finding the best solution
Solutions for data storage, analysis and visualization are continuously evolving. For most research projects, finding the right data analysis infrastructure is as ‘simple’ as requesting a large enough virtual machine in the SURF data centre. Other research projects, however, are more complicated: they need a specific framework and require an entire cluster instead of a single virtual machine.
Consider, for example, an IoT project in which sensor devices need to be connected to a platform for storing, analysing and visualizing the data. The number of devices can vary from a handful to hundreds, spread across a university campus, a factory floor, or an entire city. Such a data platform can be hosted in the public cloud (e.g. AWS, Azure), at SURF, or both.
How do you decide what solution, framework or cloud is best suited for your research project? We can help you answer this question and develop the custom cloud solution your project needs.
- We assist in finding a suitable solution to analyse, store and visualise your data.
- Our platform is scalable: if your dataset grows and requires more infrastructure, the solution scales to meet the new requirements.
- We offer infrastructure (compute, storage) as well as consultancy and optional customization.
- Our approach is ‘cloud-native’, which allows for maximum portability and can be deployed on SURF’s own private cloud, public clouds (e.g. AWS), or both.
- We apply a hybrid approach: the solution may involve a combination of public clouds and SURF’s own cloud infrastructure.
- We can create tailor-made solutions from existing services if needed.
- We can create an elastic solution that automatically scales with your compute needs, which can save money.
- We prefer to do all this in co-creation with you, so that you can take ownership of the service once development is complete.
An overview of typical projects where Custom Cloud Solutions can help:
“How do you connect various kinds of sensors located in a living lab, and make the collected data available for sharing, analysis and visualization?”
The Green Village provides a living lab environment where universities and businesses can develop, test and demonstrate their innovations without concerns such as privacy, and with close involvement of the public and government. For this project we delivered a scalable data streaming and data sharing platform. In this living lab, sensor data of various kinds are gathered for innovation and research. New sensors can be connected to the platform every day, and the data is stored safely for analysis. On the platform, data from different types of sensors can easily be combined, analysed and visualized.
Read more in the article Living lab for a sustainable world
Mijn Omgeving (Sensemakers)
“How do you measure and analyse the water quality in two major cities?”
In the ‘Mijn Omgeving’ citizen science initiative, 40 sensors were placed in the water in the Rotterdam and Amsterdam areas to measure water contamination. We provided an endpoint to which all the water sensors could send their data for storage. In addition, we provided a dashboard environment where the data from these sensors could be visualized and analysed.
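As an illustration, a sensor client for such an ingestion endpoint might package each reading as JSON and send it over HTTP. The URL, payload schema and sensor ID below are hypothetical, not the actual Mijn Omgeving interface:

```python
import json
from datetime import datetime, timezone

# Hypothetical ingestion endpoint; the real platform's URL and schema differ.
INGEST_URL = "https://sensordata.example.org/api/v1/measurements"

def build_payload(sensor_id: str, parameter: str, value: float) -> dict:
    """Package a single water-quality reading for the ingestion endpoint."""
    return {
        "sensor_id": sensor_id,
        "parameter": parameter,  # e.g. conductivity, turbidity
        "value": value,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

def post_measurement(payload: dict) -> bytes:
    """POST the reading as JSON (network call; shown for illustration only)."""
    import urllib.request
    req = urllib.request.Request(
        INGEST_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

payload = build_payload("rdam-017", "conductivity", 412.5)
print(payload["sensor_id"])
```

With a central endpoint like this, every sensor speaks the same simple protocol, and the dashboard can read all measurements from one store.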
“How do you analyse billions of Dutch tweets to find out who tweeted what, where and when?”
In the TwiXL project we designed a solution where Dutch tweets are collected and made available to researchers and students for large-scale analysis. On the TwiXL platform you can look up words and find out where, when and how often they are used, by whom, and which other words they frequently co-occur with. Over the years this Twitter archive has grown considerably, to several terabytes. The analysis infrastructure scales automatically with it, keeping the performance of the analysis constant.
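The co-occurrence lookup described above can be sketched on a toy corpus. The tweets below are invented examples; the real platform runs such analyses at terabyte scale on distributed infrastructure:

```python
from collections import Counter

# Toy corpus standing in for the tweet archive (invented examples).
tweets = [
    "lekker weer vandaag in amsterdam",
    "vandaag regen in rotterdam",
    "lekker vrij vandaag",
]

def cooccurrences(texts, target):
    """Count which words appear in the same tweet as `target`."""
    counts = Counter()
    for text in texts:
        words = set(text.split())
        if target in words:
            counts.update(words - {target})
    return counts

# Which words co-occur with "vandaag" (today)?
print(cooccurrences(tweets, "vandaag").most_common(3))
```

At archive scale the same counting is sharded across many workers, which is why the infrastructure must scale with the data.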
“How do you collect website text data from 30,000 start-up companies, analyse the data and identify which of these companies are developing products or services to limit CO2 emissions?”
In the Crunchbase project we designed a cost-efficient and scalable solution to collect website text data from 30,000 start-up companies. During collection the text data was automatically filtered and transformed, which reduced the data size and sped up the researcher’s analysis. For the data analysis, compute infrastructure was deployed in an automated fashion, which makes it easy to rescale when more compute power is needed and to reuse the setup for further Crunchbase analyses or for similar projects requiring similar infrastructure.
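The filter-during-collection step can be illustrated with a small sketch. The keyword list and sentence-level filtering here are assumptions for illustration, not the project’s actual pipeline:

```python
import re

# Hypothetical keyword filter applied during collection, so that only
# climate-related text is stored and the dataset stays small.
CO2_KEYWORDS = {"co2", "carbon", "emission", "emissions", "climate"}

def filter_page_text(raw_text: str) -> str:
    """Keep only sentences mentioning a CO2-related keyword."""
    sentences = re.split(r"(?<=[.!?])\s+", raw_text)
    kept = [
        s for s in sentences
        if any(k in s.lower() for k in CO2_KEYWORDS)
    ]
    return " ".join(kept)

sample = ("We build software for retailers. "
          "Our sensors cut carbon emissions in logistics. "
          "Founded in 2015 in Delft.")
print(filter_page_text(sample))
```

Discarding irrelevant text at collection time, rather than after storage, is what keeps both the storage footprint and the later analysis cost down.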