Published on Thu May 10 2018

Exploiting Unintended Feature Leakage in Collaborative Learning

Luca Melis, Congzheng Song, Emiliano De Cristofaro, Vitaly Shmatikov

Collaborative machine learning and related techniques such as federated learning allow multiple participants, each with his own training dataset, to build a joint model. We demonstrate that these updates leak unintended information about participants' training data. We develop passive and active inference attacks to exploit this leakage.

Abstract

Collaborative machine learning and related techniques such as federated learning allow multiple participants, each with his own training dataset, to build a joint model by training locally and periodically exchanging model updates. We demonstrate that these updates leak unintended information about participants' training data and develop passive and active inference attacks to exploit this leakage. First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference). Then, we show how this adversary can infer properties that hold only for a subset of the training data and are independent of the properties that the joint model aims to capture. For example, he can infer when a specific person first appears in the photos used to train a binary gender classifier. We evaluate our attacks on a variety of tasks, datasets, and learning configurations, analyze their limitations, and discuss possible defenses.
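
The passive property-inference attack described in the abstract can be illustrated with a short, hedged sketch: the adversarial participant computes gradients on auxiliary batches that do and do not have the target property, trains a binary classifier on those gradient vectors, and uses it to score the model updates it observes during training. The tiny model, the synthetic auxiliary data, and the logistic-regression classifier below are toy assumptions for illustration, not the authors' code.

```python
# Hedged sketch of passive property inference from model updates.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

def flat_grad(model, loss_fn, x, y):
    """Flattened gradient of the loss on (x, y) w.r.t. all model parameters."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()]).detach().numpy()

model = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 2))  # stand-in for the joint model
loss_fn = nn.CrossEntropyLoss()
aux_with_prop = [(torch.randn(8, 32), torch.randint(0, 2, (8,))) for _ in range(50)]
aux_without_prop = [(torch.randn(8, 32) + 1.0, torch.randint(0, 2, (8,))) for _ in range(50)]

# Attacker-side training set: gradient vectors labeled by whether the batch had the property.
feats = [flat_grad(model, loss_fn, x, y) for x, y in aux_with_prop + aux_without_prop]
labels = [1] * len(aux_with_prop) + [0] * len(aux_without_prop)
prop_clf = LogisticRegression(max_iter=1000).fit(feats, labels)

# At attack time, an observed update (simulated here) is scored the same way.
observed = flat_grad(model, loss_fn, torch.randn(8, 32), torch.randint(0, 2, (8,)))
print("P(property present) =", prop_clf.predict_proba([observed])[0, 1])
```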

Tue Oct 27 2020
Machine Learning
FaceLeaks: Inference Attacks against Transfer Learning Models via Black-box Queries
Transfer learning is a useful machine learning framework that allows one to build task-specific models on top of a pre-trained teacher model. The teacher model may contain private data or interact with private inputs. We investigate whether one can leak or infer such private information without interacting with the teacher model directly.
Fri Nov 23 2018
Machine Learning
Dancing in the Dark: Private Multi-Party Machine Learning in an Untrusted Setting
TorMentor is an anonymous hidden service that allows data sources to contribute towards a globally-shared model. It provides provable privacy guarantees in an untrusted setting. We show that it protects data providers against ML attacks.
Wed Jan 06 2021
Machine Learning
IPLS : A Framework for Decentralized Federated Learning
IPLS is a fully decentralized learning framework partially based on the InterPlanetary File System (IPFS). By using IPLS and connecting to the corresponding private IPFS network, any party can initiate the training process of an ML model or join an ongoing training process.
Mon Jul 13 2020
Machine Learning
Quality Inference in Federated Learning with Secure Aggregation
Federated learning is the most popular collaborative learning method. Although individual data is not shared explicitly, recent studies have shown that federated learning models can still leak information. In this paper we focus on the quality of individual training datasets and show that such information can be inferred.
Fri Dec 04 2020
Machine Learning
Unleashing the Tiger: Inference Attacks on Split Learning
A malicious server can actively hijack the learning process of the distributed model and bring it into an insecure state. We demonstrate that our attack is able to overcome recently proposed defensive techniques aimed at enhancing the security of the Split Learning protocol.
Mon Oct 14 2019
Machine Learning
Eavesdrop the Composition Proportion of Training Labels in Federated Learning
Federated learning (FL) has recently emerged as a new form of collaborative machine learning. We present three attacks on the composition of training labels: the first, Class Sniffing, detects whether a certain label appears in the training data; the other two determine the quantity of each label.
Thu Feb 22 2018
Machine Learning
The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks
This paper describes a testing methodology for quantitatively assessing the risk that rare or unique training-data sequences are unintentionally memorized by machine-learning models. Such models are sometimes trained on sensitive data (e.g., the text of users' private messages).
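
The testing methodology centers on inserting "canary" sequences into the training data and measuring their exposure, i.e., how highly the trained model ranks the canary against random candidate sequences. Below is a hedged sketch of that measurement; `log_perplexity` and the toy scorer are stand-ins of my own, not an API from the paper's code.

```python
# Hedged sketch of the "exposure" metric for a memorized canary.
import math
import random

def exposure(canary, candidates, log_perplexity):
    """exposure = log2(#candidates) - log2(rank of the canary among them)."""
    canary_score = log_perplexity(canary)
    rank = 1 + sum(1 for c in candidates if log_perplexity(c) < canary_score)
    return math.log2(len(candidates)) - math.log2(rank)

# Toy scorer: a "model" that memorized the digits 123456 assigns them a low score.
def toy_log_perplexity(seq):
    return 0.1 if seq == "123456" else 5.0 + random.random()

candidates = ["".join(random.choices("0123456789", k=6)) for _ in range(2 ** 10)]
print("exposure of memorized canary:", exposure("123456", candidates, toy_log_perplexity))
```
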
Thu Sep 04 2014
Computer Vision
Very Deep Convolutional Networks for Large-Scale Image Recognition
Convolutional networks of increasing depth can achieve state-of-the-art results. The research was the basis of the team's ImageNet Challenge 2014 submission.
Wed Feb 17 2016
Machine Learning
Communication-Efficient Learning of Deep Networks from Decentralized Data
Modern mobile devices have access to a wealth of data suitable for learning models. This rich data is often privacy sensitive, large in quantity, or both. We present a practical method for the federated learning of deep networks.
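
A minimal sketch of the federated-averaging idea this paper describes: clients train local copies of the global model and the server averages their weights, weighted by local dataset size. The tiny model, toy client data, and hyperparameters are illustrative assumptions, not the paper's setup.

```python
# Minimal federated averaging sketch.
import copy
import torch
import torch.nn as nn

def local_update(global_model, data, epochs=1, lr=0.1):
    """Each client trains a copy of the global model on its own data."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict(), sum(len(x) for x, _ in data)

def fed_avg(global_model, client_datasets, rounds=5):
    """Server averages client weights, weighted by local dataset size."""
    for _ in range(rounds):
        results = [local_update(global_model, d) for d in client_datasets]
        total = sum(n for _, n in results)
        avg = {k: sum(sd[k] * (n / total) for sd, n in results)
               for k in results[0][0]}
        global_model.load_state_dict(avg)
    return global_model

clients = [[(torch.randn(16, 10), torch.randint(0, 2, (16,)))] for _ in range(4)]
model = fed_avg(nn.Sequential(nn.Linear(10, 2)), clients)
```
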
Fri Jul 01 2016
Machine Learning
Deep Learning with Differential Privacy
Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may contain sensitive information. The models should not expose private information in these datasets.
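
The core mechanism behind differentially private training of neural networks is per-example gradient clipping followed by calibrated Gaussian noise. The sketch below illustrates one such step under assumed values for the clipping norm, noise multiplier, and learning rate; it is not the paper's implementation.

```python
# Sketch of one DP-SGD style step: clip per-example gradients, add Gaussian noise.
import torch
import torch.nn as nn

def dp_sgd_step(model, loss_fn, batch, lr=0.1, clip=1.0, noise_mult=1.1):
    xs, ys = batch
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):                                  # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip / (norm + 1e-12), max=1.0)   # clip to norm <= clip
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    with torch.no_grad():
        for p, s in zip(model.parameters(), summed):
            noisy = s + torch.randn_like(s) * noise_mult * clip   # calibrated Gaussian noise
            p.add_(-lr * noisy / len(xs))                         # average and descend

model = nn.Sequential(nn.Linear(10, 2))
dp_sgd_step(model, nn.CrossEntropyLoss(), (torch.randn(8, 10), torch.randint(0, 2, (8,))))
```
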
Sat Feb 24 2018
Machine Learning
Scalable Private Learning with PATE
Private Aggregation of Teacher Ensembles, or PATE, transfers to a "student" model the knowledge of an ensemble of "teacher" models. PATE offers intuitive privacy provided by training teachers on disjoint data and strong privacy guaranteed by noisy aggregation of teachers' answers.
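
The "noisy aggregation of teachers' answers" can be illustrated with a short sketch: votes are counted per class, noise is added to the vote histogram, and the noisy argmax becomes the label given to the student. The noise scale and the simulated teacher predictions below are assumptions for the example, not the paper's settings.

```python
# Illustrative noisy-vote aggregation over an ensemble of teachers.
import numpy as np

def noisy_aggregate(teacher_preds, num_classes, scale=40.0, rng=np.random.default_rng(0)):
    """Return the class with the highest noisy vote count across teachers."""
    votes = np.bincount(teacher_preds, minlength=num_classes).astype(float)
    votes += rng.normal(0.0, scale, size=num_classes)   # noise on the histogram gives the privacy guarantee
    return int(np.argmax(votes))

# 250 teachers, 10 classes: the student queries the ensemble on an unlabeled input.
teacher_preds = np.random.default_rng(1).integers(0, 10, size=250)
print("student label:", noisy_aggregate(teacher_preds, num_classes=10))
```
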
Tue Oct 18 2016
Machine Learning
Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
Private Aggregation of Teacher Ensembles (PATE) combines multiple models trained with disjoint datasets. The student learns to predict an output chosen by noisy voting among all of the teachers. It applies to any model, including non-convex models like DNNs.
Tue May 05 2020
Machine Learning
Stealing Links from Graph Neural Networks
Graph data, such as chemical networks and social networks, may be deemed confidential or private because the data owner often spends substantial resources collecting it, or because it contains sensitive information.
Tue May 25 2021
Machine Learning
Honest-but-Curious Nets: Sensitive Attributes of Private Inputs can be Secretly Coded into the Entropy of Classifiers' Outputs
Deep classifiers can be trained to secretly encode a sensitive attribute of users' input data into their outputs at inference time. The attack works even if users have a white-box view of the classifier.
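
To make the summary concrete, here is a hedged sketch of the decoding side only: if a classifier's output entropy has been made to correlate with a sensitive attribute during training, the attribute can be read back from the soft predictions alone. The entropy threshold and the toy outputs are assumptions for the example, not the paper's procedure.

```python
# Decoding a sensitive attribute from the entropy of (assumed) manipulated outputs.
import numpy as np

def entropy(probs):
    probs = np.clip(probs, 1e-12, 1.0)
    return -(probs * np.log(probs)).sum(axis=-1)

def decode_attribute(softmax_outputs, threshold):
    """High-entropy predictions are decoded as attribute=1, low-entropy as 0."""
    return (entropy(softmax_outputs) > threshold).astype(int)

# Toy outputs: a confident prediction (low entropy) vs. a hedged one (high entropy).
outputs = np.array([[0.97, 0.01, 0.02],   # decoded as attribute 0
                    [0.40, 0.35, 0.25]])  # decoded as attribute 1
print(decode_attribute(outputs, threshold=0.5))
```
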
Sun Jul 04 2021
Machine Learning
Survey: Leakage and Privacy at Inference Time
Leakage of data from publicly available Machine Learning (ML) models is an area of growing significance. Commercial and government applications of ML can draw on multiple sources of data, potentially including users' and clients' sensitive data. We provide a comprehensive survey of contemporary advances on several fronts.
Fri Feb 26 2021
Machine Learning
FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout
Federated Learning (FL) has been gaining significant traction across different ML tasks, ranging from vision to keyboard predictions. FjORD alleviates the problem of client system heterogeneity by tailoring the model width to the client's capabilities.
Fri Jun 21 2019
Machine Learning
Deep Leakage from Gradients
Exchanging gradients is a widely used method in modern multi-node machine learning systems. We show that it is possible to obtain private training data from the publicly shared gradients. We name this leakage Deep Leakage from Gradients.
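
A hedged sketch of the gradient-matching idea behind this attack: dummy data and a soft dummy label are optimized so that the gradients they induce match the shared gradient, gradually recovering the private example. The tiny linear model, optimizer, and iteration count are illustrative assumptions.

```python
# Gradient-matching reconstruction sketch: optimize dummy data to match a shared gradient.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(20, 5)
loss_fn = nn.CrossEntropyLoss()

# "Victim" gradient shared during training (computed locally here for the demo).
x_true, y_true = torch.randn(1, 20), torch.tensor([3])
true_grads = torch.autograd.grad(loss_fn(model(x_true), y_true), model.parameters())

# Attacker optimizes a dummy example and a soft dummy label to match that gradient.
x_dummy = torch.randn(1, 20, requires_grad=True)
y_dummy = torch.randn(1, 5, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy])

for _ in range(20):
    def closure():
        opt.zero_grad()
        pred = model(x_dummy)
        loss = torch.sum(-F.softmax(y_dummy, dim=-1) * F.log_softmax(pred, dim=-1))
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        diff = sum(((g - t) ** 2).sum() for g, t in zip(grads, true_grads))
        diff.backward()   # gradient of the matching loss w.r.t. the dummy data/label
        return diff
    opt.step(closure)

print("recovered label:", int(y_dummy.argmax()), "true label:", int(y_true))
```
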
Wed Mar 10 2021
Computer Vision
A Study of Face Obfuscation in ImageNet
Face obfuscation (blurring, mosaicing, etc.) has been shown to be effective for privacy protection. However, object recognition research typically assumes access to complete, unobfuscated images. In this paper, we explore the effects of face obfuscation on the popular ImageNet challenge.