Published on Tue Sep 04 2018

Towards Understanding Regularization in Batch Normalization

Ping Luo, Xinjiang Wang, Wenqi Shao, Zhanglin Peng

Batch Normalization (BN) improves both convergence and generalization in neural networks. This work provides a theoretical understanding of these phenomena.

Abstract

Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work provides a theoretical understanding of these phenomena. We analyze BN through a basic building block of neural networks, consisting of a kernel layer, a BN layer, and a nonlinear activation function. This basic network helps us understand the impact of BN in three respects. First, viewing BN as an implicit regularizer, we decompose it into population normalization (PN) and gamma decay, an explicit regularizer. Second, the learning dynamics of BN and this regularization show that training converges with a larger maximum and effective learning rate. Third, the generalization behavior of BN is explored using statistical mechanics. Experiments demonstrate that BN in convolutional neural networks shares the regularization traits predicted by the above analyses.
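To make the decomposition concrete, below is a minimal NumPy sketch of the basic block the abstract describes (kernel layer, BN layer, ReLU), together with a population-normalized counterpart and a stand-in gamma-decay penalty. The function names and the coefficient `lam` are illustrative assumptions, not the paper's notation, and the exact decay coefficient derived in the paper is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def basic_block(x, W, gamma, beta, eps=1e-5):
    """Kernel layer -> batch normalization -> ReLU, using mini-batch statistics."""
    h = x @ W                                     # kernel (fully connected) layer
    mu, var = h.mean(axis=0), h.var(axis=0)       # statistics of the current batch
    h_hat = (h - mu) / np.sqrt(var + eps)         # batch normalization
    return np.maximum(gamma * h_hat + beta, 0.0)  # scale, shift, ReLU

def pn_block(x, W, gamma, beta, pop_mu, pop_var, eps=1e-5):
    """Same block with population normalization (PN): fixed population statistics."""
    h = x @ W
    h_hat = (h - pop_mu) / np.sqrt(pop_var + eps)
    return np.maximum(gamma * h_hat + beta, 0.0)

def gamma_decay(gamma, lam=1e-2):
    """Explicit penalty on the BN scale parameter; lam is a placeholder for the
    coefficient the paper derives from batch size and activation statistics."""
    return lam * np.sum(gamma ** 2)

# Tiny usage example on random data.
x = rng.normal(size=(64, 32))
W = rng.normal(size=(32, 16)) / np.sqrt(32)
gamma, beta = np.ones(16), np.zeros(16)
y_bn = basic_block(x, W, gamma, beta)
y_pn = pn_block(x, W, gamma, beta, pop_mu=np.zeros(16), pop_var=np.ones(16))
reg = gamma_decay(gamma)
```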

Fri Jun 01 2018
Artificial Intelligence
Understanding Batch Normalization
Batch normalization (BN) is a technique to normalize activations in intermediate layers of deep neural networks. Its tendency to improve accuracy and speed up training has established BN as a favorite technique in deep learning.
Wed Sep 27 2017
Machine Learning
Riemannian approach to batch normalization
The space of weight vectors in the BN layer can be naturally interpreted as a Riemannian manifold, which is invariant to linear scaling of weights. Following the intrinsic geometry of this manifold provides a new learning rule that is more efficient and easier to analyze.
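A minimal numerical check of the scale invariance underlying this interpretation: rescaling the weight vector feeding a BN unit leaves the normalized output essentially unchanged. This only illustrates the invariance and is not the Riemannian learning rule proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(128, 16))   # a mini-batch of inputs
w = rng.normal(size=16)          # weight vector feeding a single BN unit
c = 7.3                          # arbitrary positive scaling factor

def bn(z, eps=1e-5):
    """Normalize a batch of pre-activations to zero mean and unit variance."""
    return (z - z.mean()) / np.sqrt(z.var() + eps)

# BN's output is (numerically) invariant to rescaling of w, which is why the
# weight space of a BN layer behaves like a scale-invariant manifold.
print(np.allclose(bn(x @ w), bn(x @ (c * w)), atol=1e-4))  # True
```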
Tue May 29 2018
Machine Learning
How Does Batch Normalization Help Optimization?
Batch Normalization (BatchNorm) is a widely adopted technique that enables faster and more stable training of deep neural networks. The exact reasons for BatchNorm's effectiveness are still poorly understood.
Tue Aug 18 2020
Machine Learning
Training Deep Neural Networks Without Batch Normalization
Batch normalization was developed to combat covariate shift inside networks. Empirically it is known to work, but there is a lack of theoretical understanding of its effectiveness. This work studies batch normalization in detail, while comparing it with other methods.
Wed Mar 06 2019
Machine Learning
Mean-field Analysis of Batch Normalization
Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization with higher learning rates and faster convergence. We show that it has a flattening effect on the loss landscape.
Tue Mar 02 2021
Machine Learning
Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization
Batch Normalization (BN) is a commonly used technique to accelerate and stabilize training of deep neural networks. We introduce an analytic framework based on convex duality to obtain exact convex representations of ReLU networks with BN.