Machine learning with Apache Spark

Date: 13 Feb 2018

Apache Spark is one of the most popular computing frameworks for large-scale data processing. It also includes a machine learning library (MLlib) with distributed versions of many machine learning algorithms.

Date: 13 Feb 2018
Time: 09:00-17:30
Location: SURFsara, Amsterdam
Prior knowledge needed? Yes

In this workshop we give an introduction to Apache Spark and explain how to use it for distributed machine learning. For the hands-on sessions we will be using PySpark, Spark's Python API, from a Jupyter notebook environment.

Please bring your own laptop (with an SSH client installed) for the hands-on sessions!

Requirements:

  • Experience with the Python programming language
  • Basic knowledge of supervised machine learning methods

The training is organized by SURFsara as a PRACE Training Centre.

Latest modifications 20 Dec 2017