Case study: Easy access to processing power for computational recommendation models

Together with a group of master's students, university lecturer and researcher Flavius Frasincar from Erasmus University Rotterdam (EUR) is researching how recommendation systems for online financial news can be improved. Systems of this type automatically recommend news items to readers based on their reading profiles.


Complex models

"Earlier models were pretty simple," explains Flavius Frasincar. "They only looked at the words present in a news item and made recommendations to the reader on that basis. Now we are also including the meaning behind words, i.e. homonyms, synonyms and associations – all possible relationships between words." He gives the example of the ECB and Draghi: names that do not appear in a dictionary but are nevertheless closely associated with each other.
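The difference between the two approaches can be illustrated with a minimal sketch. It is not the research group's actual model: the cosine similarity measure, the function names and the small RELATED table are assumptions chosen purely for illustration. A news item is recommended when its similarity to the reader's profile reaches a threshold, and related terms can optionally be counted as shared.

```python
from collections import Counter
from math import sqrt

# Hypothetical table of associated terms. The semantic resources the
# researchers actually use are not described in the article.
RELATED = {"draghi": {"ecb"}, "ecb": {"draghi"}}


def bag_of_words(text):
    """Count how often each lowercase word occurs in a text."""
    return Counter(text.lower().split())


def expand(bag):
    """Return a copy of the bag in which related terms are added as well."""
    expanded = Counter(bag)
    for word, count in bag.items():
        for related in RELATED.get(word, ()):
            expanded[related] += count
    return expanded


def cosine(a, b):
    """Cosine similarity between two word-count bags (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(profile, item, threshold, use_relations=False):
    """Recommend the item if it is similar enough to the reader's profile."""
    p, i = bag_of_words(profile), bag_of_words(item)
    if use_relations:
        p, i = expand(p), expand(i)
    return cosine(p, i) >= threshold


profile = "ecb raises interest rates"
item = "draghi comments on inflation"
print(recommend(profile, item, threshold=0.2))                      # words only
print(recommend(profile, item, threshold=0.2, use_relations=True))  # with associations
```

In this toy example the profile and the news item share no literal words, so the word-only variant does not recommend the item, whereas the variant that knows about the ECB and Draghi association does.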

Computations not feasible on a PC

"All these relationships between words mean that we have a great many parameters for our computations. We also include various threshold values which determine whether or not the system recommends a news item." This requires a huge number of computations, and attempts to perform them on a PC failed. "After a couple of weeks the computer still hadn't finished. One of my students tried this, but it quickly became clear that we needed more processing power." The problem was not so much the size of the dataset, which at around 100 news items was not particularly large. Rather, computing more than thirty parameters for each threshold value, combined with the natural language processing involved, made the task computationally demanding, explains Frasincar.
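A rough back-of-the-envelope sketch shows why such a search can overwhelm a single PC even with a small dataset. The article does not describe the actual optimisation procedure, so everything here is an assumption: an exhaustive grid search, only two candidate values per parameter, ten threshold values, and one millisecond per evaluation.

```python
SECONDS_PER_DAY = 86_400


def grid_size(num_parameters, values_per_parameter, num_thresholds):
    """Number of evaluations in an exhaustive grid search (assumed procedure)."""
    return num_thresholds * values_per_parameter ** num_parameters


def estimated_days(evaluations, seconds_per_evaluation):
    """Sequential running time in days for a given cost per evaluation."""
    return evaluations * seconds_per_evaluation / SECONDS_PER_DAY


# Illustrative figures only: 30 parameters, 2 candidate values each,
# 10 threshold values, 1 ms per evaluation.
evaluations = grid_size(num_parameters=30, values_per_parameter=2, num_thresholds=10)
print(evaluations)
print(round(estimated_days(evaluations, seconds_per_evaluation=0.001)))
```

Under these assumed figures the search needs roughly ten billion evaluations, or about four months of sequential computing. Because the evaluations are independent of one another, this kind of workload can be spread across the many cores of a compute cluster.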

Fast access to processing power

SURF's compute cluster soon came to mind. "I was already using the Lisa cluster through an NWO project. But halfway through the year, our research group had already used up 98% of the allotted computing hours. So it's great that the EUR is now also purchasing computing time from SURF. If I need extra computing hours this can be arranged quickly and easily. All I have to do is email EUR's Research Support Office."


Getting started

"An introductory course in Cluster Computing was essential to enable my students to get started with the compute cluster," says Frasincar. "My students were familiar with Java, but they didn't know anything about Linux, the operating system that the compute cluster runs on. We also took full advantage of the support offered by SURF during the research project. That's when we noticed that we had quite a few specific questions. For example, we needed technical expertise to configure a new version of a Java virtual machine on the cluster. SURF gave us a great deal of help with this, and everything was sorted really quickly."