A Practical Federated Learning Framework for Small Numbers of Stakeholders

Duration

18 months

Status

Completed

MTC Team

Dr. Saikishore Kalloori, Christian Schneebeli, Metod Jazbec, Yan Girsberger, Roy Schubiger, Dr. Severin Klingler, Prof. Markus Gross

Collaborators

Dr. Cristina Kadar (NZZ), Yannick Suter (TX Group), Marcel Blattner (TX Group)

Media houses increasingly rely on machine learning models to personalize and target content for their audiences. While users can benefit greatly from, for example, improved content recommendations, the privacy of their data should always be guaranteed. Therefore, we work on privacy-preserving machine learning techniques, focusing on technologies that are relevant for the media industry.

Machine learning is driven by the amount and quality of the available data. Although data is abundant today, it is typically stored in silos and cannot be shared between companies. Federated learning has the potential to allow companies to build machine learning models collaboratively without sharing data at any point. In this project, we developed a prototype of a federated learning system for our media partners to create better article recommendation and user understanding algorithms.
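The idea of training jointly without sharing data can be illustrated with federated averaging, a widely used federated learning scheme. The following is a minimal sketch under stated assumptions, not the project's framework: the function names, the toy one-dimensional linear model, and the synthetic silos are all hypothetical. Each stakeholder trains on its own data and only model parameters are exchanged and averaged.

```python
import random

def local_update(weights, data, lr=0.5, epochs=5):
    """Run a few gradient-descent epochs on one stakeholder's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)

def federated_average(updates, sizes):
    """Average the local models, weighted by the size of each data silo."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def train_federated(silos, rounds=300):
    """Coordinator loop: only model parameters leave a silo, never raw data."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in silos]
        global_weights = federated_average(updates, [len(d) for d in silos])
    return global_weights

def make_silo(rng, n, lo, hi):
    """Hypothetical synthetic data following y = 2x + 1 plus small noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)) for x in xs]

rng = random.Random(0)
silos = [make_silo(rng, 50, 0.0, 1.0), make_silo(rng, 80, 0.5, 1.5)]
w, b = train_federated(silos)
print(f"federated model: y = {w:.2f} * x + {b:.2f}")
```

In this toy setup, both silos cover different input ranges, and the averaged model recovers the shared underlying relationship without either silo seeing the other's data points.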

A model trained on many different data sets should be more robust and provide better accuracy for tasks such as predicting user conversion or providing article recommendations. This project explored how to leverage federated machine learning to build shared models for the media industry. To this end, we developed a prototype system that allows the MTC and our partners to train machine learning models collaboratively. The framework is available for free as an open-source project for commercial and non-commercial use. The system has been deployed and tested with two of our partners, TX Group and NZZ.

Goals

Design a practical prototype system for federated learning.
Build machine learning models to predict user subscription propensity on news media websites.
Implement different privacy-preserving protocols to secure and protect users' data.
Quantify and track the contribution of different data sets to the performance of the final model.
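One simple way to approach the last goal is leave-one-out attribution: retrain without one data set and measure how much the test error changes. The sketch below is an illustration only; the source does not state which attribution method the project used. All names are hypothetical, and a pooled least-squares fit stands in for a full federated training run.

```python
def fit_line(points):
    """Closed-form least-squares fit of y = w * x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    w = cov / var_x if var_x > 0 else 0.0
    return w, mean_y - w * mean_x

def mse(model, points):
    """Mean squared error of a (w, b) model on a list of (x, y) points."""
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)

def leave_one_out_contributions(silos, test_set):
    """Contribution of a silo = test error without it minus test error with all.

    Positive values mean the silo helps; negative values mean it hurts.
    """
    all_points = [p for pts in silos.values() for p in pts]
    baseline = mse(fit_line(all_points), test_set)
    contributions = {}
    for name in silos:
        rest = [p for other, pts in silos.items() if other != name for p in pts]
        contributions[name] = mse(fit_line(rest), test_set) - baseline
    return contributions

# Hypothetical silos: A and B follow y = 2x + 1, C has a systematic offset.
silos = {
    "A": [(x, 2 * x + 1) for x in range(0, 5)],
    "B": [(x, 2 * x + 1) for x in range(5, 10)],
    "C": [(x, 2 * x + 6) for x in range(0, 5)],
}
test_set = [(x, 2 * x + 1) for x in range(10)]
contributions = leave_one_out_contributions(silos, test_set)
for name, value in contributions.items():
    print(name, round(value, 3))
```

Leave-one-out needs one extra training run per data set, which is affordable precisely because the number of stakeholders is small.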

Outcomes

Developed a research prototype of a federated learning framework for a small number of stakeholders that is easy for them to set up and maintain. The framework is running successfully on the infrastructure of NZZ and TX Group.
Conducted experiments on three different machine learning problems: binary classification, multi-label prediction, and ranking.
Performance of federated models improved by up to 20% compared to models trained on data from a single company.
The framework has been released as open-source and is available for commercial and non-commercial use.

Our findings show that it is technically feasible to set up a federated learning framework that allows for secure and private model building jointly with different stakeholders. Federated model performance improved by up to 20% compared to individual model performance in our experiments. Depending on the use case, a 20% improvement might provide a significant return on investment and make a federated learning setup an appealing choice.
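To make "secure and private model building" more concrete, the sketch below shows pairwise additive masking, one well-known building block for secure aggregation. It is an assumption chosen for illustration: the source does not name the protocols the project implemented, and every identifier here is hypothetical. Each pair of parties shares a random mask that one adds and the other subtracts, so the coordinator sees only masked vectors while the masks cancel in the sum.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo a fixed integer

def pairwise_masks(party_ids, length, seed=0):
    """Build one mask per party from a shared random vector per pair.

    A single seeded generator stands in for the per-pair secrets that real
    parties would agree on privately (for example via key exchange).
    """
    rng = random.Random(seed)
    masks = {pid: [0] * length for pid in party_ids}
    for idx, first in enumerate(party_ids):
        for second in party_ids[idx + 1:]:
            shared = [rng.randrange(MODULUS) for _ in range(length)]
            for k, s in enumerate(shared):
                masks[first][k] = (masks[first][k] + s) % MODULUS
                masks[second][k] = (masks[second][k] - s) % MODULUS
    return masks

def mask_update(update, mask):
    """What a party actually sends: its update plus its mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]

def aggregate(masked_updates):
    """Coordinator sums masked vectors; the pairwise masks cancel out."""
    length = len(masked_updates[0])
    return [sum(vec[k] for vec in masked_updates) % MODULUS for k in range(length)]

# Hypothetical integer-encoded model updates from three parties.
updates = {"A": [3, 10, 7], "B": [5, 1, 2], "C": [4, 4, 4]}
masks = pairwise_masks(list(updates), length=3)
masked = [mask_update(updates[p], masks[p]) for p in updates]
total = aggregate(masked)
print(total)
```

Note that with only two parties, each party can subtract its own update from the aggregate and recover the other's, so masking alone protects the updates only from the coordinator in that case.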

Publications

Public Open-Source Repositories

A Simple Federated Learning Framework for Small Number of Stakeholders

Our framework allows a small number of stakeholders to train various machine learning models in a federated way. Find the repository on GitHub

Federated Neural Collaborative Filtering

A federated learning approach to neural collaborative filtering of news articles. Find the repository on GitHub

Additional Project Resources

Resources for Industry Partners

Additional project resources for our industry partners are only available to registered users and can be found here.