Published on Thu Mar 25 2021

Contrasting Contrastive Self-Supervised Representation Learning Models

Klemen Kotar, Gabriel Ilharco, Ludwig Schmidt, Kiana Ehsani, Roozbeh Mottaghi

Abstract

In the past few years, we have witnessed remarkable breakthroughs in self-supervised representation learning. Despite the success and adoption of representations learned through this paradigm, much is yet to be understood about how different training methods and datasets influence performance on downstream tasks. In this paper, we analyze contrastive approaches as one of the most successful and popular variants of self-supervised representation learning. We perform this analysis from the perspective of the training algorithms, pre-training datasets and end tasks. We examine over 700 training experiments including 30 encoders, 4 pre-training datasets and 20 diverse downstream tasks. Our experiments address various questions regarding the performance of self-supervised models compared to their supervised counterparts, current benchmarks used for evaluation, and the effect of the pre-training data on end task performance. We hope the insights and empirical evidence provided by this work will help future research in learning better visual representations.

Fri Jan 25 2019
Computer Vision
Revisiting Self-Supervised Visual Representation Learning
Unsupervised visual representation learning remains a largely unsolved problem in computer vision research. Among a big body of recently proposed approaches for unsupervised learning of visual representations, a class of self-supervised techniques achieves superior performance on many challenging benchmarks.
Thu Nov 26 2020
Computer Vision
How Well Do Self-Supervised Models Transfer?
Self-supervised visual representation learning has seen huge progress recently. We evaluate the transfer performance of 13 top models on 40 downstream tasks. We find ImageNet Top-1 accuracy to be highly correlated with transfer to many-shot recognition, but increasingly less so for few-shot, dense prediction.
Mon Aug 31 2020
Machine Learning
A Framework For Contrastive Self-Supervised Learning And Designing A New Approach
Contrastive self-supervised learning (CSL) is an approach to learn useful representations. We present a conceptual framework that characterizes CSL approaches. We show the utility of our framework by designing Yet Another DIM (YADIM).
Wed May 12 2021
Computer Vision
When Does Contrastive Visual Representation Learning Work?
Recent self-supervised representation learning techniques have largely closed the gap between supervised and unsupervised learning on ImageNet. The field still lacks widely accepted best practices for replicating this success on other datasets.
Tue Aug 03 2021
Computer Vision
Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning
This paper presents solo-learn, a library of self-supervised methods for visual representation learning. Implemented in Python using PyTorch and PyTorch Lightning, the library fits both research and industry needs.
Tue Oct 01 2019
Computer Vision
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Popular protocols are often too constrained (linear classification) or limited in diversity (ImageNet, CIFAR, Pascal-VOC).
Fri Nov 06 2020
Machine Learning
Underspecification Presents Challenges for Credibility in Modern Machine Learning
ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these behaviors. This ambiguity can lead to instability and poor model behavior in practice.
Sat Jun 13 2020
Machine Learning
Bootstrap your own latent: A new approach to self-supervised Learning
Bootstrap Your Own Latent (BYOL) is a new approach to self-supervised image representation learning. BYOL relies on two neural networks, referred to as online and target networks, that interact and learn from each other.
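The online/target interaction the BYOL summary describes hinges on the target network's weights being an exponential moving average (EMA) of the online network's weights, with no gradients flowing into the target. A minimal sketch of that update, using toy NumPy "encoders" (a single weight matrix each) rather than the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy encoders: one linear layer each. In BYOL the online and
# target networks share the same architecture but not the same weights.
d_in, d_out = 8, 4
online_w = rng.normal(size=(d_in, d_out))
target_w = online_w.copy()  # the target starts as a copy of the online network


def encode(w, x):
    """Encode a batch of inputs with the given weight matrix."""
    return x @ w


def ema_update(target, online, tau=0.99):
    """Target weights track the online weights as an exponential moving
    average; gradients never flow into the target network."""
    return tau * target + (1 - tau) * online


# One training iteration (the gradient math is elided): after the online
# weights take an optimization step, the target slowly follows them.
online_w += 0.1 * rng.normal(size=online_w.shape)  # stand-in for an SGD step
target_w = ema_update(target_w, online_w)
```

This is an illustrative sketch of the EMA mechanism only, not the paper's implementation; the prediction head and the symmetrized loss between augmented views are omitted.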
Wed Jun 17 2020
Machine Learning
Big Self-Supervised Models are Strong Semi-Supervised Learners
A key ingredient of our approach is the use of big (deep and wide) networks during pretraining and fine-tuning. We find that the fewer the labels, the more this approach (task-agnostic use of unlabeled data) benefits from a bigger network. This procedure achieves 73.9% ImageNet top-1 accuracy.
Wed Mar 21 2018
Computer Vision
Unsupervised Representation Learning by Predicting Image Rotations
Deep convolutional neural networks (ConvNets) have transformed the field of computer vision thanks to their unparalleled capacity to learn high level semantic image features. In order to successfully learn those features, they usually require massive amounts of manually labeled data, which is both expensive and impractical to scale. We propose to learn image features by training ConvNets to recognize the 2d rotation that is applied to the image that it gets as input.
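The rotation pretext task needs no manual labels: the training pairs can be generated on the fly from unlabeled images. A minimal sketch of that data-generation step (the ConvNet classifier itself is omitted; `make_rotation_batch` is an illustrative helper, not the paper's code):

```python
import numpy as np


def make_rotation_batch(images):
    """Given a batch of images with shape (N, H, W), return every image
    rotated by 0, 90, 180 and 270 degrees, together with the rotation
    class label (0-3). Training a ConvNet to predict these labels is the
    rotation pretext task: the labels come for free from the data itself."""
    rotated, labels = [], []
    for img in images:
        for k in range(4):  # k quarter-turns counter-clockwise
            rotated.append(np.rot90(img, k))
            labels.append(k)
    return np.stack(rotated), np.array(labels)


# Toy batch of two 4x4 "images".
batch = np.arange(32).reshape(2, 4, 4)
x, y = make_rotation_batch(batch)
# x has shape (8, 4, 4): four rotated copies per input image;
# y cycles through the rotation classes [0, 1, 2, 3] for each image.
```

Each original image yields four self-labeled training examples, which is why the pretext task scales to arbitrarily large unlabeled collections.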
Thu Aug 31 2017
Computer Vision
EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification
Sentinel-2 satellite images are openly and freely accessible, provided through the Copernicus Earth observation program. The resulting classification system opens a gate towards a number of Earth observation applications. We demonstrate how this classification system can be used for detecting land use and land cover changes.
Wed Apr 10 2019
Machine Learning
Large-Scale Long-Tailed Recognition in an Open World
Open Long-Tailed Recognition (OLTR) is an algorithm that maps an image to a feature space. It respects the closed-world classification while acknowledging the novelty of the open world. On three large-scale OLTR datasets, our method consistently outperforms the state-of-the-art.