Published on Mon Oct 12 2020

Explaining Neural Matrix Factorization with Gradient Rollback

Carolin Lawrence, Timo Sztyler, Mathias Niepert

Gradient rollback is a general approach to influence estimation for neural models in which each parameter update step during gradient descent touches only a small number of parameters, even if the overall number of parameters is large. We show that gradient rollback is highly efficient at both training and test time.

Abstract

Explaining the predictions of neural black-box models is an important problem, especially when such models are used in applications where user trust is crucial. Estimating the influence of training examples on a learned neural model's behavior allows us to identify training examples most responsible for a given prediction and, therefore, to faithfully explain the output of a black-box model. The most generally applicable existing method is based on influence functions, which scale poorly for larger sample sizes and models. We propose gradient rollback, a general approach for influence estimation, applicable to neural models where each parameter update step during gradient descent touches a smaller number of parameters, even if the overall number of parameters is large. Neural matrix factorization models trained with gradient descent are part of this model class. These models are popular and have found a wide range of applications in industry. Especially knowledge graph embedding methods, which belong to this class, are used extensively. We show that gradient rollback is highly efficient at both training and test time. Moreover, we show theoretically that the difference between gradient rollback's influence approximation and the true influence on a model's behavior is smaller than known bounds on the stability of stochastic gradient descent. This establishes that gradient rollback is robustly estimating example influence. We also conduct experiments which show that gradient rollback provides faithful explanations for knowledge base completion and recommender datasets.
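The core idea the abstract describes, recording each training example's parameter updates during gradient descent and "rolling them back" to estimate that example's influence on a prediction, can be sketched in a few lines. This is a hedged, illustrative reconstruction, not the authors' code: the function names (`train_with_rollback_log`, `influence`) and the plain-SGD setting are assumptions for exposition.

```python
import numpy as np

def train_with_rollback_log(examples, theta, grad_fn, lr=0.1, epochs=1):
    """SGD training loop that also logs, per training example, the
    cumulative parameter update that example contributed."""
    log = {i: np.zeros_like(theta) for i in range(len(examples))}
    for _ in range(epochs):
        for i, z in enumerate(examples):
            update = -lr * grad_fn(theta, z)
            theta = theta + update
            log[i] += update  # accumulate this example's contribution
    return theta, log

def influence(theta, log, i, score_fn, x):
    """Approximate the influence of training example i on score_fn at x
    by subtracting (rolling back) its logged updates and measuring the
    resulting change in the score."""
    return score_fn(theta, x) - score_fn(theta - log[i], x)
```

Because each SGD step for a matrix factorization model touches only the embeddings of the entities involved in that example, the per-example log stays sparse in practice, which is what makes the approach efficient at the scale of knowledge graph embedding models.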

Tue Jun 08 2021
Machine Learning
Multi-output Gaussian Processes for Uncertainty-aware Recommender Systems
Recommender systems are often designed based on a collaborative filtering approach, where user preferences are predicted by modelling interactions between users and items. We propose a novel approach combining the representation learning paradigm of collaborative filtering with multi-output Gaussian processes in a joint framework to generate uncertainty-aware
Tue Nov 20 2018
Artificial Intelligence
Explaining Latent Factor Models for Recommendation with Influence Functions
Latent factor models (LFMs) such as matrix factorization achieve state-of-the-art performance among various Collaborative Filtering (CF) approaches for recommendation. Despite the high recommendation accuracy of LFMs, a critical issue to be resolved is the lack of
Mon Jun 03 2019
Machine Learning
The FacT: Taming Latent Factor Models for Explainability with Factorization Trees
Latent factor models have achieved great success in personalized recommendations, but they are notoriously difficult to explain. In this work, we integrate regression trees to guide the learning of latent factor models for recommendation. We use the learnt tree structure to explain the resulting latent factors.
Thu Mar 07 2019
Machine Learning
The Variational Predictive Natural Gradient
Variational inference transforms posterior inference into parametric optimization. It enables the use of latent variable models where otherwise it would be impractical. Traditional natural gradients based on the variational approximation fail to correct for correlations when the approximation is not the exact posterior.
Tue Oct 17 2017
Machine Learning
On the challenges of learning with inference networks on sparse, high-dimensional data
We identify a crucial problem when modeling large, sparse, high-dimensional datasets -- underfitting. We propose methods to tackle it via iterative optimization. The proposed techniques drastically improve the ability of these powerful models to fit sparse data.
Fri Apr 07 2017
Machine Learning
TransNets: Learning to Transform for Recommendation
Deep learning methods have been shown to improve the performance of recommender systems over traditional methods. We show that (unsurprisingly) much of the predictive value of review text comes from reviews of the target user for the target item.
Thu Sep 03 2015
Machine Learning
Train faster, generalize better: Stability of stochastic gradient descent
We show that parametric models trained by a stochastic gradient method (SGM) with few iterations have vanishing generalization error. We prove our results by arguing that SGM is algorithmically stable in the sense of Bousquet and Elisseeff.
Tue Mar 14 2017
Machine Learning
Understanding Black-box Predictions via Influence Functions
Influence functions are a classic technique from robust statistics. They can be used to trace a model's prediction through the learning algorithm and back to its training data. They are useful for understanding model behavior, debugging models, detecting dataset errors.
Mon Dec 22 2014
Machine Learning
Adam: A Method for Stochastic Optimization
Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and has little memory requirements. It is well suited for problems that are large in terms of data and parameters.
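The update rule summarized in this abstract is standard and can be written out directly. The sketch below uses the paper's default hyperparameters (step size 0.001, decay rates 0.9 and 0.999); packaging it as a single-step function is an illustrative choice, not the paper's reference implementation.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. m and v are the running first- and second-moment
    estimates; t is the 1-based step count used for bias correction."""
    m = b1 * m + (1 - b1) * grad        # update biased first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2   # update biased second-moment estimate
    m_hat = m / (1 - b1 ** t)           # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)           # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

The memory cost is two extra vectors (m and v) of the same shape as the parameters, which is the "little memory requirements" property the abstract mentions.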
Tue May 30 2017
Artificial Intelligence
Knowledge Base Completion: Baselines Strike Back
The accuracy of almost all models published on the FB15k can be outperformed by an appropriately tuned baseline. Our findings cast doubt on the claim that the performance improvements of recent models are due to architectural changes as opposed to hyper-parameter tuning.
Fri Feb 28 2020
Machine Learning
A Survey on Knowledge Graph-Based Recommender Systems
In recent years, generating recommendations with the knowledge graph as side information has attracted considerable interest. Such an approach can not only alleviate the abovementioned issues for a more accurate recommendation, but also provide explanations for recommended items.
Tue Nov 24 2015
Machine Learning
The Limitations of Deep Learning in Adversarial Settings
Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing them to misclassify.