Published on Tue Dec 10 2019

Zero-shot Text Classification With Generative Language Models

Raul Puri, Bryan Catanzaro

Abstract

This work investigates the use of natural language to enable zero-shot model adaptation to new tasks. We use text and metadata from social commenting platforms as a source for a simple pretraining task. We then provide the language model with natural language descriptions of classification tasks as input and train it to generate the correct answer in natural language via a language modeling objective. This allows the model to generalize to new classification tasks without the need for multiple multitask classification heads. We show the zero-shot performance of these generative language models, trained with weak supervision, on six benchmark text classification datasets from the torchtext library. Despite no access to training data, we achieve up to a 45% absolute improvement in classification accuracy over random or majority class baselines. These results show that natural language can serve as simple and powerful descriptors for task adaptation. We believe this points the way to new metalearning strategies for text problems.
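At inference time, an approach like the one described above can be queried by giving a generative language model a natural-language description of the task followed by the input text, and comparing how likely each candidate answer is as a continuation. The following is a minimal sketch of that inference step only, assuming an off-the-shelf GPT-2 from the Hugging Face transformers library and an illustrative prompt format; it is not the paper's Reddit-pretrained model or its actual prompt schema.

# Sketch: zero-shot classification by prompting a generative LM with a
# natural-language task description and scoring each candidate answer.
# Model choice (vanilla GPT-2) and prompt wording are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def label_log_likelihood(prompt: str, answer: str) -> float:
    """Summed token log-probabilities of `answer` continuing `prompt`."""
    prompt_ids = tokenizer.encode(prompt, return_tensors="pt")
    answer_ids = tokenizer.encode(" " + answer, return_tensors="pt")
    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Position i predicts token i + 1, so drop the last position.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    answer_positions = range(prompt_ids.shape[1] - 1, input_ids.shape[1] - 1)
    return sum(log_probs[pos, input_ids[0, pos + 1]].item() for pos in answer_positions)

def zero_shot_classify(task_description: str, text: str, labels: list[str]) -> str:
    prompt = f"{task_description}\nText: {text}\nAnswer:"
    return max(labels, key=lambda lab: label_log_likelihood(prompt, lab))

print(zero_shot_classify(
    "Classify the sentiment of the text as positive or negative.",
    "The movie was a delightful surprise from start to finish.",
    ["positive", "negative"],
))

Scoring a fixed set of candidate labels rather than sampling free-form text side-steps the need to parse generated output; the paper itself trains the model to generate the answer in natural language, so this scoring variant is a simplification for illustration.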

Tue Dec 22 2020
Machine Learning
Few-Shot Text Generation with Pattern-Exploiting Training
We adapt Pattern-Exploiting Training (PET), a recently proposed few-shot approach, for finetuning generative language models on text generation tasks. On several text summarization and headline generation tasks, our proposed variant of PET gives consistent improvements over a strong baseline.
Sun Apr 18 2021
NLP
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Large-scale language models such as GPT-3 are excellent few-shot learners. We propose a data augmentation technique that uses the language model to generate new training examples, together with soft labels predicted by the model. We perform data augmentation experiments on diverse classification tasks.
Tue Oct 01 2019
Machine Learning
Latent-Variable Generative Models for Data-Efficient Text Classification
Generative classifiers offer potential advantages over their discriminative counterparts. We introduce discrete latent variables into the generative story and explore several graphical model configurations. We empirically characterize the performance of our models on six text classification datasets.
Sat Apr 10 2021
NLP
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Large pre-trained language models (LMs) such as GPT-3 have acquired a surprising ability to perform zero-shot learning, but their language modeling objective remains misaligned with the zero-shot target. To address this weakness, we propose meta-tuning, which directly optimizes the zero-shot learning objective by fine-tuning on a collection of datasets. We focus on classification tasks and construct the meta-dataset by aggregating 43 existing datasets.
Fri Aug 06 2021
NLP
Towards Zero-shot Language Modeling
Can we construct a neural model that is inductively biased towards learning human languages? Motivated by this question, we aim at constructing an informative prior over neural weights. We infer this distribution from a sample of typologically diverse training languages via Laplace approximation.
Thu Aug 22 2019
NLP
When Low Resource NLP Meets Unsupervised Language Model: Meta-pretraining Then Meta-learning for Few-shot Text Classification
This paper addresses few-shot text classification by combining meta-learning with unsupervised language model pretraining. We show that our simple approach achieves state-of-the-art performance on a well-studied sentiment classification dataset, suggesting that pretraining could be a promising solution for few-shot text classification.