Published on Thu Jun 17 2021

Towards Understanding Deep Learning from Noisy Labels with Small-Loss Criterion

Xian-Jin Gui, Wei Wang, Zhang-Hao Tian

Abstract

Deep neural networks need large amounts of labeled data to achieve good performance. In real-world applications, labels are usually collected from non-experts (e.g., via crowdsourcing) to save cost and are thus noisy. In the past few years, deep learning methods for dealing with noisy labels have been developed, many of which are based on the small-loss criterion. However, there are few theoretical analyses explaining why these methods can learn well from noisy labels. In this paper, we theoretically explain why the widely-used small-loss criterion works. Based on the explanation, we reformalize the vanilla small-loss criterion to better tackle noisy labels. The experimental results verify our theoretical explanation and also demonstrate the effectiveness of the reformalization.
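In practice, the small-loss criterion treats the samples with the smallest training loss in each mini-batch as likely clean and updates the network only on them. Below is a minimal PyTorch-style sketch of one such training step; the function, the `keep_ratio` schedule, and the model/optimizer handling are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def small_loss_step(model, optimizer, x, y, keep_ratio):
    """One training step with the small-loss criterion: treat the
    keep_ratio fraction of lowest-loss samples in the batch as clean."""
    logits = model(x)
    losses = F.cross_entropy(logits, y, reduction="none")  # per-sample losses
    k = max(1, int(keep_ratio * x.size(0)))
    _, clean_idx = torch.topk(-losses, k)  # indices of the k smallest losses
    loss = losses[clean_idx].mean()        # update only on likely-clean samples
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Typically `keep_ratio` is annealed from 1.0 down toward one minus the estimated noise rate, so the network first fits mostly-clean patterns before the selection tightens.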

Related Papers

Fri Aug 14 2020
Machine Learning
Which Strategies Matter for Noisy Label Classification? Insight into Loss and Uncertainty
Label noise is a critical factor that degrades the generalization performance of deep neural networks. We design a new training method that emphasizes clean and informative samples, while minimizing the influence of noise using both loss and uncertainty. The method significantly outperforms other state-of-the-art methods.
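A hedged sketch of the general loss-plus-uncertainty idea (not the authors' exact method; the scoring function and the weighting `alpha` are illustrative assumptions): score each sample by combining its current loss with the variability of its predictions over recent epochs, then keep the best-scoring fraction.

```python
import numpy as np

def select_clean(losses, pred_history, keep_ratio=0.7, alpha=0.5):
    """Rank samples by current loss plus prediction uncertainty (illustrative).

    losses:       (N,) per-sample losses at the current epoch
    pred_history: (T, N, C) softmax outputs over the last T epochs
    """
    uncertainty = pred_history.std(axis=0).mean(axis=1)  # (N,) epoch-to-epoch spread
    score = alpha * losses + (1 - alpha) * uncertainty   # lower = more trustworthy
    k = int(keep_ratio * len(losses))
    return np.argsort(score)[:k]  # indices of the samples kept for training
```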
Thu Jul 16 2020
Machine Learning
Learning from Noisy Labels with Deep Neural Networks: A Survey
Deep learning has achieved remarkable success in numerous domains with the help of large amounts of data. However, the quality of data labels is a concern because high-quality labels are lacking in many real-world scenarios. As noisy labels severely degrade the generalization performance of deep neural networks, learning from noisy labels is becoming an increasingly important task.
Wed Dec 02 2020
Machine Learning
SemiNLL: A Framework of Noisy-Label Learning by Semi-Supervised Learning
Deep learning with noisy labels is a challenging task. SemiNLL is a versatile framework that combines sample selection (SS) strategies and semi-supervised learning (SSL) models in an end-to-end manner.
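Assuming "SS" stands for sample selection and "SSL" for semi-supervised learning, the recipe is: keep the labels of the selected likely-clean samples, discard the labels of the rest, and hand both sets to any SSL learner. A schematic sketch with hypothetical helper functions:

```python
def seminll_epoch(model, dataset, select_fn, ssl_train_fn):
    """One epoch of a sample-selection + semi-supervised pipeline (illustrative).

    select_fn:    splits the noisy dataset into likely-clean / likely-noisy parts
    ssl_train_fn: any semi-supervised learner (e.g., a MixMatch-style routine)
    """
    clean, noisy = select_fn(model, dataset)
    labeled = [(x, y) for x, y in clean]  # keep these labels
    unlabeled = [x for x, _ in noisy]     # discard the suspect labels
    ssl_train_fn(model, labeled, unlabeled)
```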
Mon May 13 2019
Machine Learning
Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels
Noisy labels are ubiquitous in real-world datasets, which poses a challenge for robustly training deep neural networks, since DNNs usually have enough capacity to memorize the noisy labels. We show that the test accuracy is a quadratic function of the noise ratio, which explains the experimental findings.
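A hedged reconstruction of that quadratic relationship (our reading, not a quote from the paper): under symmetric label noise with ratio ε over C classes, a network that fully fits the noisy label distribution predicts the true class with probability 1 − ε and each particular wrong class with probability ε/(C − 1); evaluated against test labels corrupted the same way, the expected accuracy is

```latex
\mathrm{TestAcc}(\varepsilon)
  = \underbrace{(1-\varepsilon)^2}_{\text{both labels correct}}
  + \underbrace{(C-1)\cdot\frac{\varepsilon}{C-1}\cdot\frac{\varepsilon}{C-1}}_{\text{both wrong and matching}}
  = (1-\varepsilon)^2 + \frac{\varepsilon^2}{C-1},
```

which is quadratic in the noise ratio ε.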
Sat Dec 05 2020
Machine Learning
A Survey on Deep Learning with Noisy Labels: How to train your model when you cannot trust on the annotations?
Deep learning models have shown significant improvements in different domains, but an open issue is their tendency to memorize noisy labels during training. Noisy labels are commonly present in datasets automatically collected from the internet. Several approaches have been proposed to improve the training of deep learning models in their presence.
Tue May 09 2017
Machine Learning
Learning Deep Networks from Noisy Labels with Dropout Regularization
Large datasets often contain unreliable labels, and classifiers trained on mislabeled data exhibit poor performance. We present a simple, effective technique for accounting for label noise.
Wed Oct 25 2017
Machine Learning
mixup: Beyond Empirical Risk Minimization
Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels.
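A minimal sketch of the mixup step in PyTorch (the Beta(α, α) sampling follows the paper; batch handling and names are illustrative):

```python
import numpy as np
import torch

def mixup_batch(x, y_onehot, alpha=0.2):
    """Mix a batch with a randomly permuted copy of itself."""
    lam = np.random.beta(alpha, alpha)   # mixing coefficient from Beta(alpha, alpha)
    perm = torch.randperm(x.size(0))     # random pairing within the batch
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix                  # train against the soft targets y_mix
```

Training then minimizes cross-entropy against the soft targets `y_mix` instead of the original hard labels.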
Thu Nov 10 2016
Machine Learning
Understanding deep learning requires rethinking generalization
Conventional wisdom attributes small generalization error to properties of the model family or to the regularization techniques used during training. We show how these traditional approaches fail to explain why large neural networks generalize well in practice.
Mon May 06 2019
Machine Learning
MixMatch: A Holistic Approach to Semi-Supervised Learning
MixMatch is a new algorithm that works by guessing low-entropy labels and mixing labeled and unlabeled data. It obtains state-of-the-art results by a large margin across many datasets and labeled data amounts.
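A sketch of the label-guessing step (averaging over multiple augmentations and temperature sharpening, as described in the paper; function and argument names are illustrative):

```python
import torch

def guess_labels(model, unlabeled_augmentations, T=0.5):
    """Average predictions over K augmentations of a batch, then sharpen."""
    with torch.no_grad():
        p = torch.stack([torch.softmax(model(xa), dim=1)
                         for xa in unlabeled_augmentations]).mean(dim=0)
    p = p ** (1.0 / T)                     # temperature sharpening lowers entropy
    return p / p.sum(dim=1, keepdim=True)  # renormalize to a distribution
```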
Thu Apr 25 2019
Computer Vision
Unsupervised Label Noise Modeling and Loss Correction
Convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there is a mixture of correct and mislabelled targets, these networks tend to fit the former before the latter. This suggests using a two-component mixture model as an unsupervised generative model of sample loss values during training, allowing online estimation of the probability that a sample is mislabelled.
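A sketch of that idea using scikit-learn's two-component Gaussian mixture as a stand-in (the paper itself fits a beta mixture to the losses; here the "clean" component is identified as the one with the smaller mean loss):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def clean_probability(losses):
    """Fit a two-component mixture to per-sample losses; return each sample's
    posterior probability of belonging to the low-loss (clean) component."""
    losses = np.asarray(losses).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2).fit(losses)
    clean = gmm.means_.argmin()  # the component with the smaller mean loss
    return gmm.predict_proba(losses)[:, clean]
```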
Wed Dec 11 2019
Machine Learning
Image Classification with Deep Learning in the Presence of Noisy Labels: A Survey
Deep neural networks are known to be relatively robust to label noise; however, their tendency to overfit makes them vulnerable to memorizing even random noise. One group of algorithms aims to estimate the noise structure and use this information to avoid the adverse effects of noisy labels.
Thu Dec 14 2017
Computer Vision
MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels
Recent deep networks are capable of memorizing the entire training data even when the labels are completely random. To overcome overfitting on corrupted labels, we propose learning another neural network, called MentorNet, to supervise the training of the base network.
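As a rough, hypothetical stand-in for the idea (the real MentorNet uses richer input features and a different architecture), an auxiliary network can map per-sample loss statistics to weights in (0, 1) that rescale the base network's loss:

```python
import torch
import torch.nn as nn

class TinyMentor(nn.Module):
    """Illustrative stand-in: maps per-sample loss features to sample weights."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))

    def forward(self, loss, loss_moving_avg):
        feats = torch.stack([loss, loss_moving_avg], dim=1)  # (B, 2)
        return torch.sigmoid(self.net(feats)).squeeze(1)     # per-sample weights

# Usage: weighted_loss = (mentor(losses, avg_losses) * losses).mean()
```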