Published on Thu Feb 15 2018

Image Transformer

Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Łukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran

Image generation has been successfully cast as an autoregressive sequence generation or transformation problem. By restricting the self-attention mechanism to attend to local neighborhoods we significantly increase the size of images the model can process in practice.

Abstract

Image generation has been successfully cast as an autoregressive sequence generation or transformation problem. Recent work has shown that self-attention is an effective way of modeling textual sequences. In this work, we generalize a recently proposed model architecture based on self-attention, the Transformer, to a sequence modeling formulation of image generation with a tractable likelihood. By restricting the self-attention mechanism to attend to local neighborhoods we significantly increase the size of images the model can process in practice, despite maintaining significantly larger receptive fields per layer than typical convolutional neural networks. While conceptually simple, our generative models significantly outperform the current state of the art in image generation on ImageNet, improving the best published negative log-likelihood on ImageNet from 3.83 to 3.77. We also present results on image super-resolution with a large magnification ratio, applying an encoder-decoder configuration of our architecture. In a human evaluation study, we find that images generated by our super-resolution model fool human observers three times more often than the previous state of the art.
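
The key mechanism described above, restricting each position's self-attention to a local neighborhood of already generated positions, can be illustrated with a short sketch. The following is a minimal NumPy illustration, not the authors' implementation: it treats the image as a flattened 1D sequence and combines a causal mask with a locality window. The function and parameter names (local_causal_self_attention, window) are illustrative assumptions, not taken from the paper.

    import numpy as np

    def local_causal_self_attention(x, wq, wk, wv, window=64):
        # x: (seq_len, d_model) flattened image representation (raster order).
        # wq, wk, wv: (d_model, d_head) projection matrices.
        # Each query position t attends only to key positions in
        # [max(0, t - window + 1), t], i.e. a local, strictly causal neighborhood.
        seq_len = x.shape[0]
        q, k, v = x @ wq, x @ wk, x @ wv                 # (seq_len, d_head) each
        d_head = q.shape[-1]
        scores = (q @ k.T) / np.sqrt(d_head)             # (seq_len, seq_len)

        pos = np.arange(seq_len)
        causal = pos[None, :] <= pos[:, None]            # key index <= query index
        local = (pos[:, None] - pos[None, :]) < window   # key inside the local window
        scores = np.where(causal & local, scores, -1e9)  # mask out everything else

        # Row-wise softmax over the allowed positions.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v                               # (seq_len, d_head)

    # Toy usage: an 8x8 RGB image flattened to a sequence of 192 intensities.
    rng = np.random.default_rng(0)
    seq_len, d_model, d_head = 8 * 8 * 3, 32, 16
    x = rng.normal(size=(seq_len, d_model))
    wq, wk, wv = [rng.normal(size=(d_model, d_head)) * 0.1 for _ in range(3)]
    out = local_causal_self_attention(x, wq, wk, wv, window=48)
    print(out.shape)  # (192, 16)

Because positions outside the window are masked out, memory and compute can in principle be limited to the window rather than the full sequence; the dense (seq_len x seq_len) score matrix above is kept only for clarity of the sketch.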

Mon Feb 16 2015
Neural Networks
DRAW: A Recurrent Neural Network For Image Generation
This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation. DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye with a sequential variational auto-encoding framework that allows for the iterative construction of complex images.

Mon Nov 09 2015
Machine Learning
Generating Images from Captions with Attention
We introduce a model that generates images from natural language descriptions. After training on Microsoft COCO, we compare our model with several baseline generative models. We demonstrate that our model produces higher quality samples than other approaches.
Sat Jun 16 2018
Computer Vision
The Neural Painter: Multi-Turn Image Generation
In this work we combine two research threads from Vision/Graphics and Natural Language Processing to formulate an image generation task. We introduce a framework that includes a novel training algorithm as well as model improvements built for the multi-turn setting. We demonstrate that this framework generates a sequence of images that match

Mon Jul 09 2018
Machine Learning
Pioneer Networks: Progressively Growing Generative Autoencoder
The ability to reconstruct input images is crucial in many real-world applications. Previous GAN-based approaches have not delivered satisfactory high-quality results. We show promising results in image synthesis and inference.

Fri Nov 27 2020
Computer Vision
Image Generators with Conditionally-Independent Pixel Synthesis
Existing image generator networks rely heavily on spatial convolutions and self-attention blocks in order to gradually synthesize images. We present a new architecture for image generators where the color value at each pixel is computed independently.
Mon May 21 2018
Machine Learning
Self-Attention Generative Adversarial Networks
SAGAN allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points.