Machine learning with Apache Spark
Date: 13 FEB 2018
Apache Spark is one of the most popular computing frameworks for large-scale data processing. It also includes a machine learning library (MLlib) with distributed versions of many machine learning algorithms.
- 13 Feb 2018
- SURFsara, Amsterdam
In this workshop we give an introduction to Apache Spark and explain how to use it for distributed machine learning. For the hands-on sessions we will use PySpark, Spark's Python API, from a Jupyter notebook environment.
Please bring your own laptop (with an SSH client installed) for the hands-on sessions!
Prior knowledge needed:
- Experience with the Python programming language
- Basic knowledge of supervised machine learning methods
The training is organized by SURFsara as a PRACE Training Centre.