Published on Tue Jul 20 2021

Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning

Timo Milbich, Karsten Roth, Samarth Sinha, Ludwig Schmidt, Marzyeh Ghassemi, Björn Ommer

Deep Metric Learning (DML) aims to find representations suitable for zero-shot transfer to unknown test distributions. We systematically construct train-test splits of progressively increasing difficulty and present the ooDML benchmark to characterize generalization under out-of-distribution shifts in DML.

Abstract

Deep Metric Learning (DML) aims to find representations suitable for zero-shot transfer to a priori unknown test distributions. However, common evaluation protocols only test a single, fixed data split in which train and test classes are assigned randomly. More realistic evaluations should consider a broad spectrum of distribution shifts with potentially varying degree and difficulty. In this work, we systematically construct train-test splits of increasing difficulty and present the ooDML benchmark to characterize generalization under out-of-distribution shifts in DML. ooDML is designed to probe the generalization performance on much more challenging, diverse train-to-test distribution shifts. Based on our new benchmark, we conduct a thorough empirical analysis of state-of-the-art DML methods. We find that while generalization tends to consistently degrade with difficulty, some methods are better at retaining performance as the distribution shift increases. Finally, we propose few-shot DML as an efficient way to consistently improve generalization in response to unknown test shifts presented in ooDML. Code available here: https://github.com/Confusezius/Characterizing_Generalization_in_DeepMetricLearning.
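To make the split-construction idea concrete, here is a minimal sketch of how progressively harder train-test splits could be built: start from a class split and greedily swap train and test classes so that a distance between the two feature distributions grows. The `split_distance` and `harder_splits` helpers and the simple class-mean distance are illustrative assumptions, not the authors' actual procedure (the paper quantifies the shift with measures such as FID on learned features).

```python
# Minimal sketch (not the authors' exact procedure): build train/test class splits
# of increasing difficulty by greedily swapping classes so that a distribution
# distance between train and test features grows. A simple class-mean distance
# stands in for the FID-style measure used to quantify the shift.
import numpy as np

def split_distance(feats, train_cls, test_cls):
    """Distance between the mean embeddings of the train and test classes."""
    mu_train = np.mean([feats[c].mean(0) for c in train_cls], axis=0)
    mu_test = np.mean([feats[c].mean(0) for c in test_cls], axis=0)
    return np.linalg.norm(mu_train - mu_test)

def harder_splits(feats, train_cls, test_cls, n_steps=5):
    """Yield progressively harder splits via greedy class swaps."""
    train_cls, test_cls = list(train_cls), list(test_cls)
    for _ in range(n_steps):
        best = None
        base = split_distance(feats, train_cls, test_cls)
        for i, ct in enumerate(train_cls):
            for j, cs in enumerate(test_cls):
                train_new = train_cls[:i] + [cs] + train_cls[i + 1:]
                test_new = test_cls[:j] + [ct] + test_cls[j + 1:]
                d = split_distance(feats, train_new, test_new)
                if d > base and (best is None or d > best[0]):
                    best = (d, train_new, test_new)
        if best is None:  # no swap increases the shift any further
            break
        _, train_cls, test_cls = best
        yield list(train_cls), list(test_cls)

# Toy usage: 8 classes with random 16-d "features", initially interleaved.
rng = np.random.default_rng(0)
feats = {c: rng.normal(size=(50, 16)) + c for c in range(8)}
for tr, te in harder_splits(feats, train_cls=range(0, 8, 2), test_cls=range(1, 8, 2)):
    print(sorted(tr), sorted(te), round(split_distance(feats, tr, te), 3))
```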

Tue Jun 30 2020
Machine Learning
Uniform Priors for Data-Efficient Transfer
The ability to perform few- or zero-shot adaptation to novel tasks is important for the scalability and deployment of machine learning models. We show that the most transferable features have high uniformity in the embedding space and propose a uniformity regularization scheme.
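As a rough illustration of what a uniformity regularizer can look like, the sketch below adds a Wang & Isola-style Gaussian-potential uniformity loss to an arbitrary task loss. The `uniformity_loss` helper, the 0.1 weight, and the placeholder task loss are assumptions for illustration, not necessarily the exact regularization scheme proposed in this paper.

```python
# Illustrative uniformity regularizer on the unit hypersphere
# (Gaussian-potential style); not necessarily the paper's exact scheme.
import torch
import torch.nn.functional as F

def uniformity_loss(z, t=2.0):
    """Log of the mean pairwise Gaussian potential of L2-normalized embeddings.
    Lower values correspond to more uniformly spread features."""
    z = F.normalize(z, dim=1)
    sq_dists = torch.pdist(z, p=2).pow(2)     # all pairwise squared distances
    return sq_dists.mul(-t).exp().mean().log()

# Usage: add the regularizer to whatever task loss is being optimized.
emb = torch.randn(128, 64, requires_grad=True)   # a batch of embeddings
task_loss = emb.norm(dim=1).mean()               # placeholder for the real task loss
loss = task_loss + 0.1 * uniformity_loss(emb)
loss.backward()
```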
Fri Nov 22 2019
Machine Learning
Instance Cross Entropy for Deep Metric Learning
Instance Cross Entropy (ICE) measures the difference between an estimated instance-level matching distribution and its ground-truth counterpart. ICE scales to arbitrarily large training sets, as it learns on mini-batches iteratively and is independent of training set size.
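A schematic way to read this: for each query, softmax over its similarities to the other instances in the mini-batch to get an estimated matching distribution, then take the cross entropy against the distribution implied by the class labels. The `instance_matching_loss` function and the temperature below are illustrative assumptions, not the paper's exact formulation.

```python
# Schematic instance-level matching with cross entropy over a mini-batch;
# an interpretation of the idea, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def instance_matching_loss(emb, labels, temperature=0.1):
    emb = F.normalize(emb, dim=1)
    sim = emb @ emb.t() / temperature                 # cosine similarities
    mask = torch.eye(len(emb), dtype=torch.bool)
    sim = sim.masked_fill(mask, -1e9)                 # exclude self-matches
    log_p = F.log_softmax(sim, dim=1)                 # estimated matching distribution
    target = (labels[:, None] == labels[None, :]).float()
    target = target.masked_fill(mask, 0.0)
    target = target / target.sum(1, keepdim=True).clamp(min=1)  # ground-truth distribution
    return -(target * log_p).sum(1).mean()            # cross entropy per query

emb = torch.randn(32, 128, requires_grad=True)
labels = torch.randint(0, 8, (32,))
loss = instance_matching_loss(emb, labels)
loss.backward()
```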
Wed Jun 09 2021
Machine Learning
It Takes Two to Tango: Mixup for Deep Metric Learning
Metric learning involves learning a discriminative representation such that embeddings of similar classes are encouraged to be close. State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies. Applying mixup in this setting is challenging because the loss functions used in metric learning are not additive over individual examples.
Sun Feb 14 2021
Artificial Intelligence
Exploring Adversarial Robustness of Deep Metric Learning
Deep Metric Learning (DML) involves learning a distance metric between pairs of samples. We tackle the primary challenge that metric losses depend on the other samples in a mini-batch. We demonstrate 5- to 76-fold increases in adversarial accuracy and outperform an existing DML model.
Thu Dec 03 2020
Computer Vision
Meta-Generating Deep Attentive Metric for Few-shot Classification
Learning to generate a task-aware base learner is a promising direction for the few-shot learning (FSL) problem. Existing methods mainly focus on generating an embedding model used with a fixed metric (e.g., cosine distance) for nearest-neighbour classification, or on directly generating a linear classifier.
Thu Dec 26 2019
Machine Learning
Variational Metric Scaling for Metric-Based Meta-Learning
Metric-based meta-learning has attracted a lot of attention due to its effectiveness and efficiency in few-shot learning. Recent studies show that metric scaling plays a crucial role in the performance of metric-based meta-learning algorithms.
Fri Feb 26 2021
Computer Vision
Learning Transferable Visual Models From Natural Language Supervision
A new way to train computer vision systems: the pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn representations from scratch on a dataset of 400 million (image, text) pairs.
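The core of that pre-training task is a batch-level contrastive objective: a symmetric cross entropy that matches each image to its caption over a matrix of image-text similarities. The sketch below illustrates this objective with random features standing in for the image and text encoders; the `clip_style_loss` name and the temperature value are assumptions for illustration.

```python
# Sketch of the batch-level contrastive objective: match each image to its
# caption (and vice versa) with a symmetric cross entropy over the similarity
# matrix. Random features stand in for the actual image and text encoders.
import torch
import torch.nn.functional as F

def clip_style_loss(image_feats, text_feats, temperature=0.07):
    img = F.normalize(image_feats, dim=1)
    txt = F.normalize(text_feats, dim=1)
    logits = img @ txt.t() / temperature           # [batch, batch] similarity matrix
    targets = torch.arange(len(img))               # the i-th image matches the i-th caption
    loss_i = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return 0.5 * (loss_i + loss_t)

image_feats = torch.randn(256, 512, requires_grad=True)  # placeholder encoder outputs
text_feats = torch.randn(256, 512, requires_grad=True)
loss = clip_style_loss(image_feats, text_feats)
loss.backward()
```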
Thu May 28 2020
NLP
Language Models are Few-Shot Learners
GPT-3 is an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model. It can perform tasks without gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model.
Tue Dec 24 2019
Computer Vision
Big Transfer (BiT): General Visual Representation Learning
Big Transfer (BiT) achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on VTAB. BiT performs well across a surprisingly wide range of data regimes.
Thu Oct 22 2020
Computer Vision
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks. ViT requires substantially fewer computational resources to train.
Thu Oct 11 2018
NLP
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT is designed to pre-train deep bidirectional representations from unlabeled text. It can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Wed Jun 17 2020
Computer Vision
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
Unsupervised image representations have significantly reduced the gap with supervised pretraining, notably with the recent achievements of contrastive learning methods. These contrastive methods typically work online and rely on a large number of explicit pairwise feature comparisons. In this paper, we propose SwAV, an online algorithm that takes advantage of contrastive methods without requiring explicit pairwise comparisons.