Published on Tue Jul 09 2019

Characterizing Inter-Layer Functional Mappings of Deep Learning Models

Donald Waagen, Katie Rainey, Jamie Gantert, David Gray, Megan King, M. Shane Thompson, Jonathan Barton, Will Waldron, Samantha Livingston, Don Hulsey

Abstract

Deep learning architectures have demonstrated state-of-the-art performance for object classification and have become ubiquitous in commercial products. These methods are often applied without understanding (a) the difficulty of a classification task given the input data, and (b) how a specific deep learning architecture transforms that data. To answer (a) and (b), we illustrate the utility of a multivariate nonparametric estimator of class separation, the Henze-Penrose (HP) statistic, in the original as well as layer-induced representations. Given an N-class problem, our contribution defines the N-choose-2 combinations of HP statistics as a sample from a distribution of class-pair separations. This allows us to characterize the distributional change to class separation induced at each layer of the model. Fisher permutation tests are used to detect statistically significant changes within a model. By comparing the HP statistic distributions between layers, one can statistically characterize: layer adaptation during training, the contribution of each layer to the classification task, and the presence or absence of consistency between training and validation data. This is demonstrated for a simple deep neural network using CIFAR10 with random labels, CIFAR10, and MNIST datasets.
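The abstract's two ingredients, the Henze-Penrose estimate of class-pair separation and the permutation test between layers, are straightforward to prototype. The sketch below is a minimal illustration, not the authors' implementation: the function names are ours, SciPy's minimum_spanning_tree is one reasonable choice for the Friedman-Rafsky minimal-spanning-tree construction underlying the HP estimator, and the difference-of-means test statistic for the permutation test is an assumption.

import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def henze_penrose(x, y):
    """Friedman-Rafsky estimate of the Henze-Penrose separation.

    x: (m, d) array of feature vectors for one class.
    y: (n, d) array of feature vectors for another class.
    Returns a value near 0 when the samples overlap completely and
    near 1 when they are perfectly separable.
    """
    m, n = len(x), len(y)
    labels = np.concatenate([np.zeros(m), np.ones(n)])
    # Euclidean MST over the pooled sample (duplicate points with
    # zero distance would need a small jitter).
    dists = squareform(pdist(np.vstack([x, y])))
    mst = minimum_spanning_tree(dists).tocoo()
    # Cross-edges: MST edges joining a point of x to a point of y.
    cross = np.count_nonzero(labels[mst.row] != labels[mst.col])
    return 1.0 - cross * (m + n) / (2.0 * m * n)

def permutation_test(hp_a, hp_b, n_perm=10000, seed=0):
    """Permutation test on the difference of mean HP statistics.

    hp_a, hp_b: per-class-pair HP values from two layers.
    Returns an approximate p-value for the null hypothesis that the
    layers induce the same distribution of class-pair separations.
    """
    rng = np.random.default_rng(seed)
    hp_a = np.asarray(hp_a, dtype=float)
    hp_b = np.asarray(hp_b, dtype=float)
    observed = abs(hp_a.mean() - hp_b.mean())
    pooled = np.concatenate([hp_a, hp_b])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        hits += abs(pooled[:hp_a.size].mean() - pooled[hp_a.size:].mean()) >= observed
    return (hits + 1) / (n_perm + 1)

For an N-class problem, applying henze_penrose to each of the N-choose-2 class pairs in a given layer's activations yields the sample of separations described above; permutation_test then compares those samples between consecutive layers.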

Related Papers

Fri Jul 17 2020
Machine Learning
Adaptive Hierarchical Decomposition of Large Deep Networks
Deep learning has recently demonstrated its ability to rival the human brain for visual object recognition. This paper introduces a framework that automatically analyzes and configures a family of smaller deep networks. The resulting smaller networks are highly scalable and more practical to train.

Sun Jan 14 2018
Machine Learning
Fix your classifier: the marginal value of training the last weight layer
Neural networks are commonly used as models for classification for a wide variety of tasks. We argue that this classifier can be fixed, up to a global scale constant, with little or no loss of accuracy for most tasks.

Thu Nov 06 2014
Neural Networks
How transferable are features in deep neural networks?
Deep neural networks trained on natural images learn features similar to Gabor filters and color blobs. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively.

Mon Dec 16 2013
Neural Networks
Network In Network
The conventional convolutional layer uses linear filters followed by a nonlinear activation function to scan the input. Instead, we build micro neural networks with more complex structures to abstract the data within the receptive field. With enhanced local modeling via the micro network, we are able to utilize global average pooling over feature maps in the classification layer.

Wed May 23 2018
Computer Vision
Do Better ImageNet Models Transfer Better?
Transfer learning is a cornerstone of computer vision, yet little work has been done to evaluate the relationship between architecture and transfer. We find that, when networks are used as fixed feature extractors or fine-tuned, there is a strong correlation between ImageNet accuracy and transfer accuracy. On two small fine-grained image datasets, pretraining on ImageNet provides minimal benefits.

Mon Dec 30 2019
Machine Learning
Disentangling Trainability and Generalization in Deep Neural Networks
A longstanding goal in the theory of deep learning is to characterize the conditions under which a given neural network architecture will be trainable. We identify large regions of hyperparameter space for which networks can memorize the training set but completely fail to generalize.