Published on Sun Jan 17 2021

Estimating informativeness of samples with Smooth Unique Information

Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto


Abstract

We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights. Though related, we show that these quantities have a qualitatively different behavior. We give efficient approximations of these quantities using a linearized network and demonstrate empirically that the approximation is accurate for real-world architectures, such as pre-trained ResNets. We apply these measures to several problems, such as dataset summarization, analysis of under-sampled classes, comparison of informativeness of different data sources, and detection of adversarial and corrupted examples. Our work generalizes existing frameworks but enjoys better computational properties for heavily over-parametrized models, which makes it possible to apply it to real-world networks.
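The approximation mentioned above relies on linearizing the network around its pretrained weights. As a rough illustration only (not the authors' implementation), the sketch below shows the first-order Taylor expansion f_lin(w, x) = f(w0, x) + J_w f(w0, x)(w - w0) in JAX; the toy MLP and parameter names are hypothetical.

```python
# Minimal sketch of a linearized network (first-order Taylor expansion around
# pretrained weights params0), as used in NTK-style approximations.
# Illustrative toy only, not the paper's code.
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Tiny two-layer MLP; `params` is a (W1, b1, W2, b2) tuple (hypothetical).
    W1, b1, W2, b2 = params
    h = jnp.tanh(x @ W1 + b1)
    return h @ W2 + b2

def linearize(f, params0):
    """Return f_lin(params, x) = f(params0, x) + J_params f(params0, x) @ (params - params0)."""
    def f_lin(params, x):
        delta = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
        y0, jvp_out = jax.jvp(lambda p: f(p, x), (params0,), (delta,))
        return y0 + jvp_out
    return f_lin

# Toy usage with random weights.
key = jax.random.PRNGKey(0)
k1, k2, kx = jax.random.split(key, 3)
params0 = (jax.random.normal(k1, (4, 8)), jnp.zeros(8),
           jax.random.normal(k2, (8, 1)), jnp.zeros(1))
f_lin = linearize(mlp, params0)
x = jax.random.normal(kx, (5, 4))
# At params0 the linearized and original networks agree exactly.
assert jnp.allclose(f_lin(params0, x), mlp(params0, x), atol=1e-5)
```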

Related Papers

Sun May 11 2014
Artificial Intelligence
Learning from networked examples
Machine learning algorithms are based on the assumption that training examples are drawn independently. This assumption does not hold when learning from a networked sample: two or more training examples may share common objects, and hence share the features of those objects.
Tue Oct 29 2019
Machine Learning
Distribution Density, Tails, and Outliers in Machine Learning: Metrics and Applications
We evaluate five methods for scoring the examples in a dataset by how well-represented they are. Despite being independent approaches, we find that all five are highly correlated, and that they can be combined to identify prototypical examples that match human expectations.
Tue Jun 10 2014
Machine Learning
Generative Adversarial Networks
We propose a new framework for estimating generative models via an adversarial process. We simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G. The training procedure for G is to maximize the probability of D making a mistake.
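As a hedged illustration of the adversarial objective just described (not the original paper's code), the losses below assume d_real = D(x) on real samples and d_fake = D(G(z)) on generated samples, with D outputting probabilities in (0, 1):

```python
# Sketch of the GAN objective: D learns to tell real from generated samples,
# G learns to make D classify its samples as real (i.e. to make D mistake them).
import jax.numpy as jnp

def discriminator_loss(d_real, d_fake):
    # D maximizes log D(x) + log(1 - D(G(z))), so we minimize the negative.
    return -jnp.mean(jnp.log(d_real) + jnp.log(1.0 - d_fake))

def generator_loss(d_fake):
    # Non-saturating form: G maximizes log D(G(z)).
    return -jnp.mean(jnp.log(d_fake))
```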
Tue Apr 13 2021
Artificial Intelligence
Simpler Certified Radius Maximization by Propagating Covariances
Wed Mar 17 2021
Machine Learning
Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition
Fri Jan 10 2020
Machine Learning
Choosing the Sample with Lowest Loss makes SGD Robust
The presence of outliers can significantly skew the parameters of machine learning models. We propose a simple variant of the vanilla SGD method: in each step, first choose a set of k samples, then from these perform the update with the one that has the smallest current loss.
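As a hedged sketch of that selection rule (assuming a linear model with squared loss; names and shapes are illustrative, not the paper's implementation):

```python
# Min-loss SGD step: from k candidate samples, update with the least-loss one.
import jax
import jax.numpy as jnp

def sample_loss(w, x, y):
    # Squared error of a linear model on a single example.
    return (jnp.dot(x, w) - y) ** 2

def min_loss_sgd_step(w, xs, ys, lr=0.1):
    """Given k candidate samples (xs, ys), update with the one of smallest loss."""
    losses = jax.vmap(sample_loss, in_axes=(None, 0, 0))(w, xs, ys)
    i = jnp.argmin(losses)                      # pick the least-loss candidate
    g = jax.grad(sample_loss)(w, xs[i], ys[i])  # gradient on that sample only
    return w - lr * g

# Toy usage with random data (hypothetical shapes).
key = jax.random.PRNGKey(0)
kx, kw = jax.random.split(key)
xs = jax.random.normal(kx, (8, 3))             # k = 8 candidate samples
ys = xs @ jnp.array([1.0, -2.0, 0.5])
w = jax.random.normal(kw, (3,))
w = min_loss_sgd_step(w, xs, ys)
```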