Published on Mon Jun 15 2020

Robust Variational Autoencoder for Tabular Data with Beta Divergence

Haleh Akrami, Sergul Aydore, Richard M. Leahy, Anand A. Joshi

The training data itself can contain outliers, which can disproportionately affect the training process of VAEs. We propose a variational autoencoder for tabular data sets with categorical and continuous features that is robust to outliers in the training data.

Abstract

We propose a robust variational autoencoder with beta divergence for tabular data (RTVAE) with mixed categorical and continuous features. Variational autoencoders (VAEs) and their variants are popular frameworks for anomaly detection problems. The primary assumption is that we can learn representations of normal patterns via VAEs and that any deviation from them can indicate anomalies. However, the training data itself can contain outliers. Sources of outliers in training data include the data collection process itself (random noise) and malicious attackers (data poisoning) who aim to degrade the performance of the machine learning model. In either case, these outliers can disproportionately affect the training process of VAEs and may lead to wrong conclusions about what constitutes normal behavior. In this work, we derive a novel form of the variational autoencoder for tabular data sets with categorical and continuous features that is robust to outliers in the training data. Our results on an anomaly detection application for network traffic datasets demonstrate the effectiveness of our approach.
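The key change relative to a standard VAE lies in the reconstruction term of the ELBO: the per-feature negative log-likelihood is replaced by an empirical beta cross-entropy, which downweights points the decoder assigns very low probability (i.e., likely outliers). Below is a minimal PyTorch sketch of such a loss for one categorical and one continuous feature; the function names and the softmax/Gaussian decoder heads are illustrative assumptions, not the authors' released implementation.

import math
import torch
import torch.nn.functional as F

def beta_recon_categorical(logits, target, beta):
    # Empirical beta cross-entropy for a categorical feature.
    # logits: (batch, num_classes); target: (batch,) integer class labels.
    # As beta -> 0 this recovers the usual cross-entropy up to a constant.
    probs = F.softmax(logits, dim=-1)
    p_target = probs.gather(1, target.unsqueeze(1)).squeeze(1)
    term1 = -((beta + 1.0) / beta) * (p_target.pow(beta) - 1.0)
    term2 = probs.pow(beta + 1.0).sum(dim=-1)  # sum_k p_k^(1+beta)
    return term1 + term2

def beta_recon_gaussian(x, mu, log_var, beta):
    # Empirical beta cross-entropy for a continuous (Gaussian) feature;
    # term2 is the closed-form integral of N(.; mu, var)^(1+beta) over x.
    var = log_var.exp()
    log_density = -0.5 * (math.log(2 * math.pi) + log_var + (x - mu) ** 2 / var)
    term1 = -((beta + 1.0) / beta) * (torch.exp(beta * log_density) - 1.0)
    term2 = (2 * math.pi * var).pow(-beta / 2) / math.sqrt(1.0 + beta)
    return term1 + term2

def robust_elbo(logits, target, x, mu, log_var, z_mu, z_log_var, beta=0.1):
    # Robust objective: beta reconstruction terms plus the standard
    # KL(q(z|x) || N(0, I)) regularizer on the latent code.
    recon = beta_recon_categorical(logits, target, beta).sum()
    recon = recon + beta_recon_gaussian(x, mu, log_var, beta).sum()
    kl = -0.5 * torch.sum(1 + z_log_var - z_mu.pow(2) - z_log_var.exp())
    return recon + kl

Choosing beta close to 0 interpolates back toward the standard VAE objective, while larger values trade statistical efficiency for robustness to outliers.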

Thu May 23 2019
Machine Learning
Robust Variational Autoencoder
Robustness of autoencoders to outliers is critical for generating a reliable representation of particular data types. Our robust VAE is based on beta-divergence rather than the standard Kullback-Leibler (KL) divergence.
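For reference, the beta divergence (also called the density power divergence) between a data density g and a model density f is commonly written as follows; this is the standard textbook form and may differ cosmetically from the paper's notation. In the limit beta -> 0 it reduces to the KL divergence, so the robust objective smoothly interpolates back to the standard one:

D_{\beta}(g \,\|\, f) = \frac{1}{\beta}\int g(x)^{1+\beta}\,dx \;-\; \frac{1+\beta}{\beta}\int g(x)\,f(x)^{\beta}\,dx \;+\; \int f(x)^{1+\beta}\,dx, \qquad \lim_{\beta \to 0} D_{\beta}(g \,\|\, f) = \mathrm{KL}(g \,\|\, f).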
Thu Jul 16 2020
Machine Learning
Detecting Out-of-distribution Samples via Variational Auto-encoder with Reliable Uncertainty Estimation
In unsupervised learning, variational auto-encoders (VAEs) are an influential class of deep generative models with rich representational power. However, VAEs can assign higher likelihood to out-of-distribution (OOD) inputs than to in-distribution (ID) inputs.
Thu Jun 11 2020
Machine Learning
A Generalised Linear Model Framework for β-Variational Autoencoders based on Exponential Dispersion Families
We introduce a theoretical framework that is based on a connection between β-VAE and generalized linear models (GLM). The equality between the activation function of a β-VAE and the inverse link function of a GLM enables us to provide a systematic generalization of the loss analysis.
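As a concrete instance of that equality (an illustrative example, not the paper's notation): for a Bernoulli observation model, the canonical GLM link is the logit, whose inverse is exactly the sigmoid used as the output activation of a Bernoulli VAE decoder:

g(\mu) = \log\frac{\mu}{1-\mu}, \qquad g^{-1}(\eta) = \frac{1}{1+e^{-\eta}} = \sigma(\eta).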
Mon Dec 07 2020
Machine Learning
Autoencoding Variational Autoencoder
This paper shows that a (nominally trained) VAE does not necessarily amortize inference for typical samples that it is capable of generating. The method can be used to train a VAE model from scratch or given an already trained VAE.
Wed Feb 06 2019
Machine Learning
BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling
The Bidirectional-Inference Variational Autoencoder (BIVA) is characterized by a skip-connected generative model and a bidirectional stochastic inference path. We show that BIVA reaches state-of-the-art test likelihoods.
Fri Jun 16 2017
Machine Learning
Hidden Talents of the Variational Autoencoder
Variational autoencoders (VAEs) represent a popular, flexible form of deep generative model that can be stochastically fit to samples from a given random process. As a highly non-convex model, it remains unclear exactly how minima of the underlying energy relate…