Published on Thu Oct 31 2019

Inverse Graphics: Unsupervised Learning of 3D Shapes from Single Images

Talip Ucar


Abstract

Using generative models for Inverse Graphics is an active area of research. However, most works focus on developing models for supervised and semi-supervised methods. In this paper, we study the problem of unsupervised learning of 3D geometry from single images. Our approach is to use a generative model that produces 2D images as projections of a latent 3D voxel grid, which we train either as a variational auto-encoder or using adversarial methods. Our contributions are as follows: First, we show how to recover 3D shape and pose in good quality from general datasets such as MNIST and MNIST Fashion. Second, we compare the shapes learned using adversarial and variational methods; the adversarial approach yields denser 3D shapes. Third, we explore the idea of modelling the pose of an object as a uniform distribution to recover 3D shape from a single image. Our experiment with the CelebA dataset \cite{liu2015faceattributes} shows that we can recover the complete 3D shape from a single image when the object is symmetric along one or more axes, whilst results obtained using ModelNet40 \cite{wu20153d} show a potential side-effect, in which the model learns 3D shapes such that it can render the same image from any viewpoint. Fourth, we present a general end-to-end approach to learning 3D shapes from single images in a completely unsupervised fashion by modelling the factors of variation, such as azimuth, as independent latent variables. Our method makes no assumptions about the dataset and can work with synthetic as well as real images (i.e. unsupervised in the true sense). We present our results by training the model using the β-VAE objective \cite{ucar2019bridging} and a dataset combining all images from MNIST, MNIST Fashion, CelebA and six categories of ModelNet40. The model is able to learn 3D shapes and pose in good quality and leverages information learned across all datasets.
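The core mechanism described above, rendering a 2D image as a projection of a latent 3D voxel grid viewed from a sampled azimuth, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `rotate_voxels_z` and `project` are hypothetical helpers, the rotation uses nearest-neighbour lookup for brevity (an actual model would use differentiable trilinear resampling, as in PrGAN), and the grid size and toy occupancy values are arbitrary.

```python
import numpy as np

def rotate_voxels_z(vox, azimuth):
    # Rotate a cubic occupancy grid about the vertical axis by sampling
    # each output cell from its rotated source coordinate.
    # Nearest-neighbour lookup keeps the sketch short; a trainable model
    # would use differentiable trilinear interpolation instead.
    n = vox.shape[0]
    c = (n - 1) / 2.0
    idx = np.arange(n) - c
    x, y, z = np.meshgrid(idx, idx, idx, indexing="ij")
    cos_a, sin_a = np.cos(azimuth), np.sin(azimuth)
    xs = np.clip(np.round(cos_a * x - sin_a * y + c).astype(int), 0, n - 1)
    ys = np.clip(np.round(sin_a * x + cos_a * y + c).astype(int), 0, n - 1)
    zs = np.clip(np.round(z + c).astype(int), 0, n - 1)
    return vox[xs, ys, zs]

def project(vox):
    # "Soft" projection along the viewing axis: a pixel is bright if any
    # voxel along its ray is occupied, i.e. 1 - prod(1 - occupancy).
    return 1.0 - np.prod(1.0 - vox, axis=2)

# Toy occupancy grid: a solid 2x2x2 block inside an 8^3 grid,
# viewed from an azimuth of 45 degrees.
vox = np.zeros((8, 8, 8))
vox[3:5, 3:5, 3:5] = 1.0
img = project(rotate_voxels_z(vox, np.pi / 4))
print(img.shape)  # (8, 8)
```

In the full model, the voxel grid is the decoder's output and the azimuth is a latent variable (sampled uniformly in the experiments above), so gradients from the 2D reconstruction or adversarial loss must flow back through the projection into the 3D shape.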

Tue Jun 11 2019
Machine Learning
Inferring 3D Shapes from Image Collections using Adversarial Networks
Projective generative adversarial network (PrGAN) trains a deep generative model of 3D shapes. PrGAN uses a differentiable projection module to infer the underlying 3D shape distribution without access to any explicit 3D data or viewpoint information.
Mon Nov 16 2020
Machine Learning
Cycle-Consistent Generative Rendering for 2D-3D Modality Translation
For humans, visual understanding is inherently generative: given a 3D shape, we can postulate how it would look in the world. In the context of computer vision, this corresponds to a learnable module. We learn such a module while being conscious of the difficulties in obtaining large paired 2D-3D datasets.
Tue Oct 01 2019
Computer Vision
Unsupervised Generative 3D Shape Learning from Natural Images
This is the first method to learn a generative model of 3D shapes from natural images in a fully unsupervised way. We do not use any ground-truth 3D or 2D annotations, stereo video, or ego-motion during training.
Sun Dec 18 2016
Computer Vision
3D Shape Induction from 2D Views of Multiple Objects
In this paper we investigate the problem of inducing a distribution over three-dimensional structures given two-dimensional views of multiple objects. Our approach called "projective generative adversarial networks" (PrGANs) trains a deep generative model of 3D shapes.
Mon Nov 26 2018
Computer Vision
Unsupervised 3D Shape Learning from Image Collections in the Wild
We present a method to learn the 3D surface of objects directly from a collection of images. Previous work achieved this capability by exploiting additional manual annotation. Our method does not make use of any such annotation. Rather, it builds a generative model, a convolutional neural network, which, given a …
Mon Mar 29 2021
Computer Vision
Learning Generative Models of Textured 3D Meshes from Real-World Images
Recent advances in differentiable rendering have sparked an interest in learning generative models of textured 3D meshes from image collections. We show that the performance of our approach is on par with prior work that relies on ground-truth keypoints. We release our code on GitHub.