Published on Mon Jun 18 2012

Distributed Tree Kernels

Fabio Massimo Zanzotto, Lorenzo Dell'Arciprete

Distributed tree kernels (DTKs) are a novel method to reduce the time and space complexity of tree kernels. DTKs are faster, correlate with tree kernels, and obtain statistically similar performance in two natural language processing tasks.

Abstract

In this paper, we propose distributed tree kernels (DTKs) as a novel method to reduce the time and space complexity of tree kernels. Using a linear-complexity algorithm to compute vectors for trees, we embed feature spaces of tree fragments in low-dimensional spaces where the kernel computation is performed directly with dot products. We show that DTKs are faster, correlate with tree kernels, and obtain statistically similar performance in two natural language processing tasks.
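The idea can be pictured as follows: each node label receives a fixed random vector, vectors are composed bottom-up into one vector per tree, and the dot product of two tree vectors approximates the number of matching tree fragments. The Python sketch below is only an illustration of this idea, assuming Gaussian random label vectors and circular convolution as the composition function; it is not the authors' exact linear-time construction, and the class name DistributedTreeEmbedder is hypothetical.

```python
import numpy as np

def circ_conv(a, b):
    # Circular convolution via FFT: one plausible vector composition function.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

class DistributedTreeEmbedder:
    """Maps trees to low-dimensional vectors so that dot products approximate
    a subtree-matching kernel (a simplified stand-in for a DTK)."""

    def __init__(self, dim=2048, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.label_vecs = {}

    def _label(self, label):
        # Fixed random (near-orthogonal) unit vector per node label.
        if label not in self.label_vecs:
            v = self.rng.standard_normal(self.dim)
            self.label_vecs[label] = v / np.linalg.norm(v)
        return self.label_vecs[label]

    def embed(self, tree):
        # tree = (label, [children]); returns (subtree_vec, tree_vec), where
        # subtree_vec encodes the complete subtree rooted at this node and
        # tree_vec sums subtree_vec over all nodes in one bottom-up pass.
        label, children = tree
        subtree_vec = self._label(label)
        tree_vec = np.zeros(self.dim)
        for child in children:
            child_sub, child_tree = self.embed(child)
            subtree_vec = circ_conv(subtree_vec, child_sub)
            tree_vec += child_tree
        tree_vec += subtree_vec
        return subtree_vec, tree_vec

# Usage: dot products of tree vectors grow with the number of shared subtrees.
emb = DistributedTreeEmbedder()
t1 = ("S", [("NP", [("D", []), ("N", [])]),
            ("VP", [("V", []), ("NP", [("D", []), ("N", [])])])])
t2 = ("S", [("NP", [("D", []), ("N", [])]), ("VP", [("V", [])])])
print(emb.embed(t1)[1] @ emb.embed(t2)[1])
```

Note that plain circular convolution is commutative and therefore ignores child order; a more careful choice of composition function is needed to distinguish differently ordered children.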

Tue May 17 2016
Machine Learning
Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing
Correspondence analysis (CA) is equivalent to defining a Gini index with appropriately scaled one-hot encoding. Because CA requires excessive memory when applied to numerous categories, it has not been used for natural language processing. We address this problem by introducing delayed evaluation to randomized singular value decomposition.
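For context, the memory bottleneck comes from factorizing a very large scaled co-occurrence matrix. A standard randomized SVD (in the style of Halko et al.) is sketched below; the paper's delayed-evaluation variant, which avoids materializing the large matrix explicitly, is not reproduced here.

```python
import numpy as np

def randomized_svd(A, rank, n_oversample=10, n_power_iter=2, seed=0):
    # Project A onto a random low-dimensional subspace, orthonormalize,
    # then take an exact SVD of the resulting small matrix.
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((A.shape[1], rank + n_oversample))
    Y = A @ omega
    for _ in range(n_power_iter):      # power iterations sharpen the subspace
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)
    B = Q.T @ A                        # small (rank + oversample) x n matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank, :]

# Usage on a toy matrix standing in for a scaled co-occurrence matrix.
A = np.random.default_rng(1).standard_normal((500, 300))
U, s, Vt = randomized_svd(A, rank=20)
print(s[:5])
```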
Fri Feb 01 2019
Machine Learning
Tree-Sliced Variants of Wasserstein Distances
Optimal transport theory defines a powerful set of tools to compare probability distributions. We propose the tree-sliced Wasserstein distance. We also propose a positive definite kernel based on it and test it against other baselines.
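On a tree metric, the Wasserstein distance admits a simple closed form: the sum over edges of the edge length times the absolute difference between the two distributions' masses in the subtree below that edge; the tree-sliced variant averages this quantity over several random trees, and a positive definite kernel can then be built from it (for instance of the form exp(-t * distance)). Below is a minimal sketch of that closed form, using a hypothetical parent-array tree representation.

```python
def tree_wasserstein(parent, edge_len, mu, nu):
    """Wasserstein-1 distance between distributions mu and nu supported on the
    nodes of a rooted tree. parent[v] is v's parent (root has parent -1),
    edge_len[v] is the length of the edge from v to its parent, and mu/nu map
    node -> probability mass. Nodes are assumed to be numbered so that every
    child has a larger index than its parent."""
    n = len(parent)
    sub_mu = [mu.get(v, 0.0) for v in range(n)]
    sub_nu = [nu.get(v, 0.0) for v in range(n)]
    dist = 0.0
    for v in range(n - 1, 0, -1):            # bottom-up accumulation
        dist += edge_len[v] * abs(sub_mu[v] - sub_nu[v])
        sub_mu[parent[v]] += sub_mu[v]       # push subtree mass upward
        sub_nu[parent[v]] += sub_nu[v]
    return dist

# Tiny example: a root (node 0) with two children (nodes 1 and 2).
parent = [-1, 0, 0]
edge_len = [0.0, 1.0, 2.0]
print(tree_wasserstein(parent, edge_len, {1: 1.0}, {2: 1.0}))  # 3.0
```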
Thu Jan 31 2013
NLP
PyPLN: a Distributed Platform for Natural Language Processing
PyPLN leverages a vast array of open-source NLP and text-processing tools. The current (beta) release supports English and Portuguese, with support for other languages planned for future releases.
Mon Aug 10 2015
Machine Learning
Learning Structural Kernels for Natural Language Processing
Structural kernels are a flexible learning paradigm that has been widely used in Natural Language Processing. Previous approaches mostly rely on setting default values for kernel hyperparameters or on grid search. In contrast, Bayesian methods allow efficient model selection by maximizing the evidence on the training data through gradient-based methods.
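The recipe is the standard Gaussian process one: the kernel hyperparameters enter the marginal likelihood (evidence) of the training data, which is then maximized with a gradient-based optimizer. The sketch below uses an RBF kernel on feature vectors purely as a stand-in for a structural kernel on trees; it is not the paper's model.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_evidence(log_params, X, y):
    # Negative log marginal likelihood of GP regression:
    #   -log p(y|X) = 0.5*y^T K^-1 y + 0.5*log|K| + 0.5*n*log(2*pi)
    lengthscale, signal_var, noise_var = np.exp(log_params)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = signal_var * np.exp(-0.5 * sq / lengthscale**2) + noise_var * np.eye(len(y))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(y) * np.log(2 * np.pi)

# Toy data; hyperparameters are optimized in log space to keep them positive.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)
res = minimize(neg_log_evidence, np.zeros(3), args=(X, y), method="L-BFGS-B")
print("lengthscale, signal, noise:", np.exp(res.x))
```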
Sun Oct 20 2019
Artificial Intelligence
Rational Kernels: A survey
Many kinds of data are naturally amenable to being treated as sequences, but using such data with statistical learning techniques can often prove cumbersome. The framework of rational kernels partly addresses this problem by providing an elegant representation.
Wed Apr 10 2019
Machine Learning
The Weight Function in the Subtree Kernel is Decisive
Tree data are ubiquitous because they model a large variety of situations. The performance of the subtree kernel improves when the weight assigned to leaves vanishes, which motivates the definition of a new weight function.
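Concretely, the subtree kernel counts the complete subtrees shared by two trees, with each match scaled by a weight that depends on the subtree; giving zero weight to bare leaves corresponds to the finding above. Below is a rough sketch with a hypothetical depth-based weight function, not the paper's exact formulation.

```python
from collections import Counter

def collect_subtrees(tree, bag):
    # tree = (label, [children]). Records every complete subtree as a
    # (canonical string, depth) key and returns this node's key and depth.
    label, children = tree
    child_info = [collect_subtrees(c, bag) for c in children]
    key = "(" + " ".join([label] + [k for k, _ in child_info]) + ")"
    depth = 1 + max((d for _, d in child_info), default=0)
    bag[(key, depth)] += 1
    return key, depth

def subtree_kernel(t1, t2, weight):
    # K(t1, t2) = sum over shared complete subtrees of weight(depth) * counts.
    b1, b2 = Counter(), Counter()
    collect_subtrees(t1, b1)
    collect_subtrees(t2, b2)
    return sum(weight(d) * c * b2[(k, d)] for (k, d), c in b1.items())

# A weight that vanishes on leaves (depth 1), as suggested by the result above.
weight = lambda depth: 0.0 if depth == 1 else 1.0
t1 = ("S", [("NP", [("N", [])]), ("VP", [("V", [])])])
t2 = ("S", [("NP", [("N", [])]), ("VP", [("V", []), ("NP", [("N", [])])])])
print(subtree_kernel(t1, t2, weight))  # counts shared non-leaf subtrees only
```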