Published on Mon Aug 23 2021

Genetic Programming for Manifold Learning: Preserving Local Topology

Andrew Lensen, Bing Xue, Mengjie Zhang

Abstract

Manifold learning methods are an invaluable tool in today's world of increasingly huge datasets. Manifold learning algorithms can discover a much lower-dimensional representation (embedding) of a high-dimensional dataset through non-linear transformations that preserve the most important structure of the original data. State-of-the-art manifold learning methods directly optimise an embedding without mapping between the original space and the discovered embedded space. This makes interpretability - a key requirement in exploratory data analysis - nearly impossible. Recently, genetic programming has emerged as a very promising approach to manifold learning by evolving functional mappings from the original space to an embedding. However, genetic programming-based manifold learning has struggled to match the performance of other approaches. In this work, we propose a new approach to using genetic programming for manifold learning, which preserves local topology. This is expected to significantly improve performance on tasks where local neighbourhood structure (topology) is paramount. We compare our proposed approach with various baseline manifold learning methods and find that it often outperforms other methods, including a clear improvement over previous genetic programming approaches. These results are particularly promising, given the potential interpretability and reusability of the evolved mappings.

Sun Jan 05 2020
Neural Networks
Multi-Objective Genetic Programming for Manifold Learning: Balancing Quality and Dimensionality
Manifold learning techniques have become increasingly valuable as data continues to grow. But state-of-the-art manifold learning algorithms are opaque in how they perform this transformation. We propose a multi-objective approach that automatically balances the competing objectives of manifold quality and dimensionality.
Fri Feb 08 2019
Neural Networks
Can Genetic Programming Do Manifold Learning Too?
GP-MaL is a genetic programming approach to manifold learning that evolves functional mappings from a high-dimensional space to a lower-dimensional space. We show that GP-MaL is competitive with existing manifold learning algorithms, while producing models that can be used on unseen data.
Thu Jun 03 2021
Machine Learning
A Discussion On the Validity of Manifold Learning
Dimensionality reduction (DR) and manifold learning (ManL) have been applied extensively in many machine learning tasks. Whether DR and ManL models can generate valid learning results remains unclear. We identify a fundamental problem of these methods.
Tue Oct 22 2019
Neural Networks
Genetic Programming for Evolving Similarity Functions for Clustering: Representations and Analysis
Clustering is a difficult and widely-studied data mining task. Nearly all algorithms use a similarity measure such as a distance metric to decide which instances to assign to the same cluster. We propose a new approach to automatically evolving similarity functions for a given clustering algorithm by using genetic programming.
Sun Oct 10 2010
Neural Networks
Multi-Objective Genetic Programming Projection Pursuit for Exploratory Data Modeling
Feature extraction is a crucial process which aims to find a suitable data representation that increases the performance of machine learning algorithms. According to the curse of dimensionality, the number of samples needed for a classification task increases exponentially as the number of dimensions (variables, features) increases.
Fri Aug 16 2019
Neural Networks
Evolutionary Computation, Optimization and Learning Algorithms for Data Science
A large number of engineering, science and computational problems have yet to be solved in a computationally efficient way. We focus on data science as a crucial area, specifically on the curse of dimensionality (CoD), which is due to the large amount of data.
Fri Feb 09 2018
Machine Learning
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology.
Sun Jun 24 2012
Machine Learning
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data representation. Different representations can entangle and hide more or less the different explanatory factors of variation behind the data. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations.
Wed Mar 30 2016
Computer Vision
Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
A new approach to approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy. The proposal is fully graph-based, without any need for additional search structures.
Mon Jan 02 2012
Machine Learning
Scikit-learn: Machine Learning in Python
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms. Emphasis is put on ease of use, performance, documentation, and API consistency.