Published on Mon Oct 16 2017

Generalization in Deep Learning

Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio

Abstract

This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature. We also discuss approaches to provide non-vacuous generalization guarantees for deep learning. Based on theoretical observations, we propose new open problems and discuss the limitations of our results.

Related Papers

Mon Dec 07 2020
Machine Learning
Generalization bounds for deep learning
Generalization in deep learning has been the topic of much recent theoretical and empirical research. Here we introduce desiderata for techniques to predict generalization errors for deep learning models in supervised learning. Such predictions should 1) scale correctly with data complexity, 2) scale correctly with training set size, and 3) capture differences between architectures.
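As a rough illustration of desideratum 2, the Python sketch below (a hypothetical setup, with scikit-learn's logistic regression as a stand-in model and the weight norm over sqrt(n) as a toy predictor, neither taken from the paper) trains at several sample sizes and compares the predictor against the observed train-test accuracy gap.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical sanity check for desideratum 2: does a candidate predictor
# shrink with training set size n at the same rate as the observed gap?
X, y = make_classification(n_samples=20000, n_features=50, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=5000, random_state=0)
rng = np.random.default_rng(0)

for n in [500, 1000, 2000, 4000, 8000]:
    idx = rng.choice(len(X_pool), size=n, replace=False)
    clf = LogisticRegression(max_iter=1000).fit(X_pool[idx], y_pool[idx])
    observed_gap = clf.score(X_pool[idx], y_pool[idx]) - clf.score(X_test, y_test)
    predicted = np.linalg.norm(clf.coef_) / np.sqrt(n)  # toy stand-in predictor
    print(f"n={n:5d}  observed gap={observed_gap:+.4f}  predictor={predicted:.4f}")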

Tue Jun 27 2017
Machine Learning
Exploring Generalization in Deep Learning
We study what drives generalization in deep networks. We consider several recently suggested explanations, including norm-based control, sharpness, and robustness. We then investigate how these measures explain different observed phenomena.
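To make two of these explanation families concrete, here is a minimal numpy sketch (not the paper's code) for a toy two-layer ReLU network: a norm-based measure, taken here as the product of layer spectral norms, and a sharpness proxy, taken as the average loss increase under small random weight perturbations.

import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(20, 10)), rng.normal(size=(1, 20))   # toy trained weights
X, y = rng.normal(size=(100, 10)), rng.normal(size=(100, 1))   # toy data

def mse(W1, W2):
    # Two-layer ReLU network with squared-error loss.
    return np.mean((np.maximum(X @ W1.T, 0) @ W2.T - y) ** 2)

# Norm-based measure: product of the layers' spectral norms.
spectral_product = np.linalg.norm(W1, 2) * np.linalg.norm(W2, 2)

# Sharpness proxy: mean loss increase under random perturbations of scale eps.
eps, trials, base = 1e-2, 50, mse(W1, W2)
sharpness = np.mean([
    mse(W1 + eps * rng.normal(size=W1.shape),
        W2 + eps * rng.normal(size=W2.shape)) - base
    for _ in range(trials)
])
print(f"spectral-norm product: {spectral_product:.3f}  sharpness proxy: {sharpness:.5f}")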

Wed Feb 13 2019
Machine Learning
Uniform convergence may be unable to explain generalization in deep learning
Recent works have developed a variety of generalization bounds for deep learning. In practice, these bounds can increase with the training dataset size. We show that applying (two-sided) uniform convergence, even restricted to the set of low-test-error classifiers output by gradient descent, yields only a vacuous generalization guarantee larger than $1 - \epsilon$, where $\epsilon$ is the small test error of those classifiers.
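To see why such a guarantee is vacuous, recall the shape of a two-sided uniform convergence bound for a loss bounded in $[0,1]$ (a standard statement, sketched here for context rather than quoted from the paper): if

$$\sup_{h \in \mathcal{H}} \left| L_{\mathcal{D}}(h) - L_{S}(h) \right| \le \epsilon_{\mathrm{unif}},$$

then $L_{\mathcal{D}}(h) \le L_{S}(h) + \epsilon_{\mathrm{unif}}$ for every $h \in \mathcal{H}$. Since a $[0,1]$-bounded loss never exceeds $1$, a guarantee with $\epsilon_{\mathrm{unif}} > 1 - \epsilon$ says essentially nothing that is not already true of an arbitrary classifier.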

Wed Dec 04 2019
Machine Learning
Fantastic Generalization Measures and Where to Find Them
Generalization of deep networks has been of great interest in recent years. Most papers proposing generalization measures study only a small set of models. We present the first large-scale study of generalization in deep networks.
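Such a study typically asks how well each candidate measure rank-correlates with the observed generalization gap across a population of trained models; Jiang et al. report Kendall's rank correlation, among other criteria. A minimal sketch on stand-in data:

import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
gaps = rng.uniform(0.0, 0.3, size=200)  # observed generalization gaps (stand-in)
measures = {
    "informative": gaps + 0.05 * rng.normal(size=200),  # tracks the gap
    "noise":       rng.normal(size=200),                # tracks nothing
}
for name, m in measures.items():
    tau, p = kendalltau(m, gaps)
    print(f"{name:12s} Kendall tau = {tau:+.3f} (p = {p:.2g})")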

Thu Oct 22 2020
Machine Learning
In Search of Robust Measures of Generalization
One of the principal scientific challenges in deep learning is explaining generalization. A large volume of work aims to close the gap between theory and practice, primarily by developing bounds on generalization error, optimization error, and excess risk. Most of these bounds are difficult to evaluate empirically.
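For context, one standard way these three terms fit together is the decomposition of Bottou and Bousquet: writing $h^\star$ for the Bayes-optimal predictor, $h_{\mathcal{H}} = \arg\min_{h \in \mathcal{H}} L(h)$, $h_n$ for the empirical risk minimizer, and $\tilde{h}_n$ for the model the optimizer actually returns,

$$\underbrace{L(\tilde{h}_n) - L(h^\star)}_{\text{excess risk}} = \underbrace{L(h_{\mathcal{H}}) - L(h^\star)}_{\text{approximation}} + \underbrace{L(h_n) - L(h_{\mathcal{H}})}_{\text{estimation}} + \underbrace{L(\tilde{h}_n) - L(h_n)}_{\text{optimization}}.$$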

Thu Sep 27 2018
Machine Learning
An analytic theory of generalization dynamics and transfer learning in deep linear networks
Large, deep networks can generalize well, but existing theories are exceedingly loose. We develop an analytic theory of the nonlinear dynamics of generalization in deep linear networks. In particular, our theory provides analytic solutions to the training and testing of deep networks.
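As a numerical companion to this line of work (a toy sketch, not the paper's analytic solution), the following trains a two-layer linear network by full-batch gradient descent on synthetic data and prints the train/test error trajectory; from a small initialization the error typically sits on a plateau before dropping rapidly.

import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test = 30, 100, 1000
teacher = rng.normal(size=(d, 1)) / np.sqrt(d)          # ground-truth linear map
X_tr, X_te = rng.normal(size=(n_train, d)), rng.normal(size=(n_test, d))
y_tr = X_tr @ teacher + 0.1 * rng.normal(size=(n_train, 1))
y_te = X_te @ teacher

W1 = 0.01 * rng.normal(size=(d, d))                     # small initialization
W2 = 0.01 * rng.normal(size=(1, d))
lr = 0.01
for step in range(3001):
    err = X_tr @ W1.T @ W2.T - y_tr
    # Gradients of the squared-error loss (constant factors folded into lr).
    gW2 = (err.T @ (X_tr @ W1.T)) / n_train
    gW1 = (W2.T @ err.T @ X_tr) / n_train
    W2 -= lr * gW2
    W1 -= lr * gW1
    if step % 500 == 0:
        test_mse = np.mean((X_te @ W1.T @ W2.T - y_te) ** 2)
        print(f"step {step:4d}  train MSE {np.mean(err**2):.4f}  test MSE {test_mse:.4f}")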