Published on Mon Jun 01 2020

Semi-Supervised Hierarchical Drug Embedding in Hyperbolic Space

Ke Yu, Shyam Visweswaran, Kayhan Batmanghelich


Abstract

Learning accurate drug representations is essential for tasks such as computational drug repositioning and prediction of drug side-effects. A drug hierarchy is a valuable source that encodes human knowledge of drug relations in a tree-like structure where drugs that act on the same organs, treat the same disease, or bind to the same biological target are grouped together. However, its utility in learning drug representations has not yet been explored, and currently described drug representations cannot place novel molecules in a drug hierarchy. Here, we develop a semi-supervised drug embedding that incorporates two sources of information: (1) underlying chemical grammar that is inferred from molecular structures of drugs and drug-like molecules (unsupervised), and (2) hierarchical relations that are encoded in an expert-crafted hierarchy of approved drugs (supervised). We use the Variational Auto-Encoder (VAE) framework to encode the chemical structures of molecules and use knowledge-based drug-drug similarity to induce the clustering of drugs in hyperbolic space, which is well suited to encoding hierarchical concepts. Both quantitative and qualitative results show that the learned drug embedding accurately reproduces chemical structures and induces the hierarchical relations among drugs. Furthermore, our approach can infer the pharmacological properties of novel molecules by retrieving similar drugs from the embedding space. We demonstrate that the learned drug embedding can be used to find new uses for existing drugs and to discover drug side-effects, and we show that it significantly outperforms baselines in both tasks.
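As a concrete illustration of the hyperbolic embedding idea, here is a minimal Python sketch of the geodesic distance in the Poincaré ball (a standard model of hyperbolic space) together with a toy objective that pulls knowledge-similar drugs together. The function names and the loss form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between points u, v inside the unit Poincare ball."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_dist / (denom + eps))

def similarity_loss(emb, pairs, sims):
    """Toy clustering objective (an assumption, not the paper's loss):
    drug pairs with high knowledge-based similarity are penalized for
    being far apart in hyperbolic distance."""
    total = sum(s * poincare_distance(emb[i], emb[j])
                for (i, j), s in zip(pairs, sims))
    return total / len(pairs)
```

In the Poincaré ball, distances grow rapidly near the boundary, so tree-like data can be embedded with low distortion: general concepts sit near the origin and specific ones near the boundary.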

Thu Apr 30 2020
Machine Learning
Modeling Pharmacological Effects with Multi-Relation Unsupervised Graph Embedding
Drug repositioning (or drug repurposing) is a fundamental problem for the identification of new opportunities for the use of already approved or failed drugs. We present a method based on a multi-relation unsupervised graph embedding model that learns latent representations for drugs and diseases.
Tue Sep 29 2020
Machine Learning
ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery
Generative models based on neural networks are increasingly used to design virtual libraries of drug-like compounds. Heuristics and scores such as the Tanimoto coefficient, synthetic accessibility, binding activity, and QED drug-likeness can be incorporated.
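Of the scores listed here, the Tanimoto coefficient is simply the Jaccard index over fingerprint bits. A minimal sketch, assuming binary fingerprints are given as Python sets of on-bit indices (an illustrative encoding, not this paper's code):

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints,
    each given as a set of on-bit indices."""
    inter = len(fp_a & fp_b)
    union = len(fp_a) + len(fp_b) - inter
    return inter / union if union else 0.0

# Example: two toy fingerprints sharing 2 of 4 distinct on-bits
print(tanimoto({0, 2, 5}, {2, 5, 7}))  # 2 / 4 = 0.5
```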
Sat Mar 20 2021
Machine Learning
Using Molecular Embeddings in QSAR Modeling: Does it Make a Difference?
Fri Mar 19 2021
Machine Learning
Predicting Drug-Drug Interactions from Heterogeneous Data: An Embedding Approach
Sat Apr 28 2018
Artificial Intelligence
Drug Similarity Integration Through Attentive Multi-view Graph Auto-Encoders
Drug similarity has been studied to support downstream clinical tasks such as inferring novel properties of drugs. It is challenging to integrate this heterogeneous, noisy, nonlinearly related information to learn accurate similarity measures. There is a trade-off between accuracy and interpretability.
Thu Mar 11 2021
Machine Learning
Scaffold Embeddings: Learning the Structure Spanned by Chemical Fragments, Scaffolds and Compounds
Molecules have seemed a natural fit for deep learning's ability to handle complex structure through representation learning, given enough data. However, this typically continuous representation is not natural for understanding chemical space as a domain and is particular to samples and their differences. We focus on exploring a natural structure
Thu Nov 02 2017
Machine Learning
Neural Discrete Representation Learning
VQ-VAE differs from VAEs in two key ways. The encoder network outputs discrete, rather than continuous, codes. The prior is learnt rather than static. The model can generate high quality images, videos, and speech.
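A minimal sketch of the discrete-code idea this summary refers to: each encoder output is snapped to its nearest codebook vector (nearest-neighbor quantization). This covers only the core quantization step of VQ-VAE; the straight-through gradient and the codebook/commitment losses are omitted.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map encoder outputs z_e (N, D) to their nearest codebook entries (K, D).
    Returns the discrete code indices and the quantized vectors z_q."""
    # Pairwise squared distances between each output and each codebook entry
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = dists.argmin(axis=1)       # shape (N,): the discrete codes
    return codes, codebook[codes]      # z_q has the same shape as z_e
```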
Fri Dec 20 2013
Machine Learning
Auto-Encoding Variational Bayes
We introduce a stochastic variational inference and learning algorithm that scales to large datasets. We show that a reparameterization of the variational lower bound yields a lower-bound estimator that can be optimized with standard stochastic gradient methods.
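The reparameterization mentioned here rewrites sampling z ~ N(mu, sigma^2) as a deterministic function of independent noise, so a Monte Carlo estimate of the lower bound becomes differentiable in the variational parameters. A minimal sketch:

```python
import numpy as np

def reparameterize(mu, log_var, rng=None):
    """Sample z ~ N(mu, diag(sigma^2)) as z = mu + sigma * eps, eps ~ N(0, I),
    so gradients can flow through mu and log_var."""
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```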
Mon Dec 22 2014
Machine Learning
Adam: A Method for Stochastic Optimization
Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and requires little memory. It is well suited for problems that are large in terms of data and parameters.
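The update itself is compact. A minimal sketch of one Adam step, following the exponential moment estimates and bias correction described in the paper (default hyperparameters as recommended there):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (t starts at 1): biased first/second moment
    estimates with bias correction, as in Kingma & Ba (2014)."""
    m = b1 * m + (1 - b1) * grad          # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)             # bias-corrected estimates
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```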
Thu Nov 19 2015
Machine Learning
Generating Sentences from a Continuous Space
The standard recurrent neural network language model (RNNLM) generates sentences one word at a time and does not work from an explicit global sentence representation. In this work, we introduce and study an RNN-based variational autoencoder generative model that incorporates distributed latent representations of entire sentences.
Tue Apr 02 2019
Machine Learning
Analyzing Learned Molecular Representations for Property Prediction
Advancements in neural machinery have led to a wide range of algorithmic solutions for molecular property prediction. Two classes of models in particular have yielded promising results. We benchmark models extensively on 19 public and 16 proprietary industrial datasets spanning a wide variety of chemical endpoints.
Wed Jan 16 2019
Machine Learning
Lagging Inference Networks and Posterior Collapse in Variational Autoencoders
The variational autoencoder (VAE) is a popular combination of a deep latent variable model and an accompanying variational learning technique. VAE training often results in a degenerate local optimum known as "posterior collapse", where the model learns to ignore the latent variable and the approximate posterior mimics the prior.
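Collapse is straightforward to detect: when the approximate posterior mimics the prior, the KL term of the ELBO vanishes. A minimal diagnostic sketch for a diagonal Gaussian posterior against a standard normal prior, using the closed-form KL evaluated per latent dimension:

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Per-dimension KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior.
    Values near zero across a batch signal posterior collapse."""
    return 0.5 * (mu ** 2 + np.exp(log_var) - log_var - 1.0)
```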