Published on Mon Mar 15 2021

How to distribute data across tasks for meta-learning?

Alexandru Cioba, Michael Bromberg, Qian Wang, Ritwik Niyogi, Georgios Batzolis, Jezabel Garcia, Da-shan Shiu, Alberto Bernacchia

Meta-learning models transfer knowledge acquired from previous tasks to learn new ones. They are trained on benchmarks with a fixed number of data points per task. Since labelling of data is expensive, finding the optimal allocation of labels across training tasks may reduce costs.

Abstract

Meta-learning models transfer the knowledge acquired from previous tasks to quickly learn new ones. They are trained on benchmarks with a fixed number of data points per task. This number is usually arbitrary and it is unknown how it affects performance at testing. Since labelling of data is expensive, finding the optimal allocation of labels across training tasks may reduce costs. Given a fixed budget of labels, should we use a small number of highly labelled tasks, or many tasks with few labels each? Should we allocate more labels to some tasks and fewer to others? We show that: 1) If tasks are homogeneous, there is a uniform optimal allocation, whereby all tasks get the same amount of data; 2) At fixed budget, there is a trade-off between the number of tasks and the number of data points per task, with a unique and constant optimum; 3) When trained separately, harder tasks should get more data, at the cost of a smaller number of tasks; 4) When training on a mixture of easy and hard tasks, more data should be allocated to easy tasks. Interestingly, neuroscience experiments have shown that human visual skills also transfer better from easy tasks. We prove these results mathematically on mixed linear regression, and we show empirically that the same results hold for few-shot image classification on CIFAR-FS and mini-ImageNet. Our results provide guidance for allocating labels across tasks when collecting data for meta-learning.
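To make the fixed-budget trade-off concrete, here is a minimal NumPy sketch on toy mixed linear regression in the spirit of the setting described above. It is not the paper's protocol: the task distribution, the simple prior-averaging "meta-learner", the budget of 240 labels and all other constants are illustrative assumptions. It only shows how one would sweep the split of a fixed label budget between the number of tasks and the labels per task.

```python
# Illustrative sketch only (not the paper's method): sweep how a fixed label
# budget is split between the number of training tasks and the labels per task,
# on toy mixed linear regression. The "meta-learner" averages per-task
# least-squares estimates into a prior, then adapts to new tasks with a
# ridge penalty that shrinks towards that prior.
import numpy as np

rng = np.random.default_rng(0)
DIM, BUDGET, NOISE = 10, 240, 0.5   # assumed toy constants, not from the paper

def sample_task():
    # One task = one weight vector drawn around a shared centre.
    return rng.normal(loc=1.0, scale=0.3, size=DIM)

def sample_data(w, k):
    X = rng.normal(size=(k, DIM))
    y = X @ w + NOISE * rng.normal(size=k)
    return X, y

def meta_train(n_tasks, k):
    # Prior = average of per-task least-squares estimates.
    estimates = [np.linalg.lstsq(*sample_data(sample_task(), k), rcond=None)[0]
                 for _ in range(n_tasks)]
    return np.mean(estimates, axis=0)

def test_error(prior, k_test=5, n_test=500, lam=1.0):
    # Adapt to unseen tasks with ridge regression shrunk towards the prior.
    errs = []
    for _ in range(n_test):
        w = sample_task()
        X, y = sample_data(w, k_test)
        w_hat = np.linalg.solve(X.T @ X + lam * np.eye(DIM), X.T @ y + lam * prior)
        errs.append(np.mean((w_hat - w) ** 2))
    return np.mean(errs)

for k in [2, 4, 8, 16, 48]:        # labels per training task
    n_tasks = BUDGET // k          # tasks affordable under the fixed budget
    print(f"k={k:2d}  tasks={n_tasks:3d}  test MSE={test_error(meta_train(n_tasks, k)):.3f}")
```

In this toy setup, too few labels per task give poorly estimated per-task weights, while too few tasks make the averaged prior noisy; the sweep exposes the shape of the trade-off the abstract describes.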

Wed Jun 16 2021
Machine Learning
Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
Multi-task learning (MTL) aims to improve the generalization of several related tasks by learning them jointly. Modern meta-learning allows unseen tasks with limited labels during the test phase, in the hope of fast adaptation over them. We believe this work could help bridge the gap between these two learning paradigms.
Fri Jul 02 2021
Computer Vision
Disentangling Transfer and Interference in Multi-Domain Learning
The benefits of transfer in multi-domain learning have not been adequately studied. Learning multiple domains could be beneficial, or these domains could interfere with each other given limited network capacity. We propose new metrics disentangling interference and transfer and set up experimental protocols.
Thu Sep 19 2019
Machine Learning
Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML
An important research direction in machine learning has centered around developing meta-learning algorithms to tackle few-shot learning. An especially successful algorithm has been Model-Agnostic Meta-Learning (MAML). The ANIL (Almost No Inner Loop) algorithm is a simplification of MAML.
Sat Jul 27 2019
Machine Learning
Uncertainty in Model-Agnostic Meta-Learning using Variational Inference
The proposed algorithm employs a gradient-based variational inference to infer the posterior of model parameters to a new task. The algorithm can be applied to any model architecture and can be implemented in various machine learning paradigms, including regression and classification.
Mon Dec 09 2019
Artificial Intelligence
Meta-Learning without Memorization
The ability to learn new concepts with small amounts of data is a critical aspect of intelligence that has proven challenging for deep learning methods. In some domains, this makes meta-learning entirely inapplicable. We address this challenge by designing a meta-regularization objective that places precedence on data-driven adaptation.
Mon Sep 09 2019
Artificial Intelligence
Meta-learnt priors slow down catastrophic forgetting in neural networks
Current training regimes for deep learning usually involve exposure to a single task / dataset at a time. Here we show that catastrophic forgetting can be mitigated in a meta-learning context, by exposing a neural network to multiple tasks in a sequential manner during training.
Sat Apr 11 2020
Machine Learning
Meta-Learning in Neural Networks: A Survey
The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself.
Mon Jun 13 2016
Machine Learning
Matching Networks for One Shot Learning
The standard supervised deep learning paradigm does not offer a satisfactory solution for learning new concepts rapidly from little data. We employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories.
Fri Nov 10 2017
Machine Learning
Few-Shot Learning with Graph Neural Networks
We define a graph neural network architecture that generalizes several of the recently proposed few-shot learning models. Besides providing improved numerical performance, our framework is easily extended to semi-supervised or active learning, demonstrating the ability of graph-based models to operate well on 'relational' tasks.
Thu Mar 09 2017
Neural Networks
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
The goal of meta-learning is to train a model on a variety of learning tasks. The parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance.
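For readers unfamiliar with the mechanics, below is a minimal functional PyTorch sketch of the idea summarized above: one inner gradient step per task on its support set, and an outer update that differentiates through the adapted parameters on the query set. The tiny network, sinusoid-regression task family, and learning rates are placeholder assumptions, not the original implementation.

```python
# Minimal MAML-style sketch (illustrative, not the reference implementation).
import torch
import torch.nn.functional as F

def forward(params, x):
    # Tiny two-layer net written functionally so adapted parameters can be passed in.
    w1, b1, w2, b2 = params
    return torch.relu(x @ w1 + b1) @ w2 + b2

def maml_step(params, tasks, inner_lr=0.01, outer_lr=0.001):
    # One meta-update over a batch of (support_x, support_y, query_x, query_y) tasks.
    meta_loss = 0.0
    for sx, sy, qx, qy in tasks:
        support_loss = F.mse_loss(forward(params, sx), sy)
        grads = torch.autograd.grad(support_loss, params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(params, grads)]   # inner step
        meta_loss = meta_loss + F.mse_loss(forward(adapted, qx), qy)  # query loss
    meta_grads = torch.autograd.grad(meta_loss, params)               # through the adaptation
    return [(p - outer_lr * g).detach().requires_grad_(True)
            for p, g in zip(params, meta_grads)]

def sample_sinusoid_task(k=10):
    # Classic toy task family: regress y = amp * sin(x + phase).
    amp, phase = 0.1 + 4.9 * torch.rand(1), 3.14 * torch.rand(1)
    x = 10 * torch.rand(2 * k, 1) - 5
    y = amp * torch.sin(x + phase)
    return x[:k], y[:k], x[k:], y[k:]

torch.manual_seed(0)
params = [0.1 * torch.randn(1, 40), torch.zeros(40),
          0.1 * torch.randn(40, 1), torch.zeros(1)]
params = [p.requires_grad_(True) for p in params]
for _ in range(100):
    params = maml_step(params, [sample_sinusoid_task() for _ in range(4)])
```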
Mon Dec 22 2014
Machine Learning
Adam: A Method for Stochastic Optimization
Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and has little memory requirements. It is well suited for problems that are large in terms of data and parameters.
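As a quick illustration of why the method is described as straightforward to implement with little memory overhead, here is a minimal NumPy sketch of the Adam update rule for a single parameter, using the defaults recommended in the paper (step size 1e-3, betas 0.9 and 0.999, epsilon 1e-8); the toy quadratic in the usage example is an arbitrary choice for illustration.

```python
# Minimal sketch of the Adam update rule: exponential moving averages of the
# gradient and its square, bias-corrected, used to scale the step.
import numpy as np

class Adam:
    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m = self.v = 0.0   # first and second moment estimates
        self.t = 0              # timestep for bias correction

    def step(self, param, grad):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad          # first moment
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2     # second moment
        m_hat = self.m / (1 - self.b1 ** self.t)                  # bias correction
        v_hat = self.v / (1 - self.b2 ** self.t)
        return param - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

# Toy usage: minimize f(x) = (x - 3)^2 starting from x = 0.
opt, x = Adam(lr=0.1), 0.0
for _ in range(200):
    x = opt.step(x, 2 * (x - 3))
print(round(x, 2))   # close to 3
```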
Wed Dec 04 2019
Machine Learning
Deep Double Descent: Where Bigger Models and More Data Hurt
We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon. As we increase model size, performance first gets worse and then gets better. We unify the above phenomena by defining a new complexity measure.