Published on Thu Jul 01 2021

End-to-end Compression Towards Machine Vision: Network Architecture Design and Optimization

Shurun Wang, Zhao Wang, Shiqi Wang, Yan Ye

Research on visual signal compression has a long history. We propose an inverted bottleneck structure for end-to-end compression towards machine vision. The proposed scheme achieves significant BD-rate savings in terms of analysis performance.

Abstract

Research on visual signal compression has a long history. Fueled by deep learning, exciting progress has been made recently. Despite achieving better compression performance, existing end-to-end compression algorithms are still designed for better signal quality under rate-distortion optimization. In this paper, we show that the design and optimization of the network architecture can be further improved for compression towards machine vision. We propose an inverted bottleneck structure for end-to-end compression towards machine vision, which specifically accounts for efficient representation of the semantic information. Moreover, we explore the potential of optimization by incorporating the analytics accuracy into the optimization process, and further pursue optimality with generalized rate-accuracy optimization in an iterative manner. We use object detection as a showcase for end-to-end compression towards machine vision, and extensive experiments show that the proposed scheme achieves significant BD-rate savings in terms of analysis performance. Moreover, the scheme shows strong generalization capability towards other machine vision tasks, because it still enables signal-level reconstruction.
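The abstract reports BD-rate savings "in terms of analysis performance". As a rough illustration of what such a number means, the sketch below computes a Bjøntegaard-style average rate difference with task accuracy (for example, detection mAP) used as the quality axis instead of PSNR. This is not the authors' evaluation code: the function names, the restriction to exactly four rate-accuracy points per curve, and the sample numbers are all assumptions made for illustration.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial passing through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)  # xs must be distinct
        total += term
    return total


def _mean_log_rate(accs, rates, lo, hi):
    """Mean of the cubic log-rate curve over the accuracy interval [lo, hi]."""
    logs = [math.log(r) for r in rates]
    mid = (lo + hi) / 2.0
    f_lo, f_mid, f_hi = (_lagrange(accs, logs, a) for a in (lo, mid, hi))
    # Simpson's rule integrates a cubic exactly; dividing the integral
    # by (hi - lo) leaves this weighted average.
    return (f_lo + 4.0 * f_mid + f_hi) / 6.0


def bd_rate(anchor, test):
    """Average rate change (%) of `test` relative to `anchor` at equal accuracy.

    Each argument is a list of four (rate, accuracy) pairs (an assumption of
    this sketch). Negative values mean the test codec saves bits.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per curve")
    rates_a, accs_a = zip(*anchor)
    rates_t, accs_t = zip(*test)
    # Compare only over the accuracy range covered by both curves.
    lo = max(min(accs_a), min(accs_t))
    hi = min(max(accs_a), max(accs_t))
    if hi <= lo:
        raise ValueError("accuracy ranges do not overlap")
    diff = _mean_log_rate(accs_t, rates_t, lo, hi) - _mean_log_rate(accs_a, rates_a, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


# Hypothetical (bits-per-pixel, mAP) points, not taken from the paper.
anchor = [(0.2, 30.0), (0.4, 34.0), (0.8, 37.0), (1.6, 39.0)]
test = [(0.8 * r, a) for r, a in anchor]  # same accuracy at 20% fewer bits
print(round(bd_rate(anchor, test), 2))    # -20.0
```

Fitting in the log-rate domain makes the result a relative (percentage) difference, which is why a codec that needs 20% fewer bits at every accuracy level scores about -20.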

Fri May 28 2021
Computer Vision
What Is Considered Complete for Visual Recognition?
We hope our proposal can inspire the community to pursue the compression-recovery tradeoff rather than the accuracy-complexity tradeoff. Semantic annotations, when available, play the role of weak supervision.
Sat Mar 06 2021
Artificial Intelligence
End-to-end optimized image compression for multiple machine tasks
Thu Jul 18 2019
Computer Vision
Deep Perceptual Compression
Deep learned lossy compression techniques have been proposed in the recent literature. Most of these are optimized by using either MS-SSIM or MSE as a loss function. Neither of these correlates well with human perception. We propose the use of an alternative, deep perceptual metric.
Thu Aug 19 2021
Computer Vision
An Information Theory-inspired Strategy for Automatic Network Pruning
Deep convolutional networks often need to be compressed before deployment on resource-constrained devices. Most existing network pruning methods require laborious human effort and prohibitive computation resources. The principle behind our method is the information bottleneck.
Tue Nov 17 2020
Machine Learning
Analyzing and Mitigating Compression Defects in Deep Learning
With the proliferation of deep learning methods, many computer vision problems which were considered academic are now viable in the consumer setting. Despite this, there has been little study of the effect of JPEG compression on deep neural networks.
Wed Aug 02 2017
Computer Vision
An End-to-End Compression Framework Based on Convolutional Neural Networks
Two CNNs are seamlessly integrated into an end-to-end compression framework. The first CNN learns an optimal compact representation from an input image. The second CNN is used to reconstruct the decoded image with high quality at the decoding end. The proposed compression framework is compatible with existing image coding standards.
Thu Sep 04 2014
Computer Vision
Very Deep Convolutional Networks for Large-Scale Image Recognition
Convolutional networks of increasing depth can achieve state-of-the-art results. The research was the basis of the team's ImageNet Challenge 2014 submission.
Mon Dec 22 2014
Machine Learning
Adam: A Method for Stochastic Optimization
Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and has low memory requirements. It is well suited for problems that are large in terms of data and parameters.
Thu Jun 04 2015
Computer Vision
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks.
Thu Jan 09 2020
Computer Vision
An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond Feature and Signal
Video Coding for Machine (VCM) aims to bridge the gap between visual feature compression and classical video coding. VCM is committed to addressing the requirement of compact signal representation for both machine and human vision.
Thu May 01 2014
Computer Vision
Microsoft COCO: Common Objects in Context
The dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement.
Thu Nov 19 2015
Machine Learning
Density Modeling of Images using a Generalized Normalization Transformation
We introduce a parametric nonlinear transformation that is well-suited for Gaussianizing data from natural images. We show that samples of this model are visually similar to natural image patches. We demonstrate the use of the model as a prior probability density that can be used to remove additive noise.