Published on Thu Jul 12 2018

Topic Diffusion Discovery based on Sparseness-constrained Non-negative Matrix Factorization

Yihuang Kang, Keng-Pei Lin, I-Ling Cheng

Researchers have been overwhelmed by the ever-increasing volume of articles produced by different research communities. We consider a novel topic diffusion discovery technique that incorporates sparseness-constrained Non-negative Matrix Factorization. This approach can extract more prominent topics from large article databases.

Abstract

Due to the recent explosion of text data, researchers have been overwhelmed by an ever-increasing volume of articles produced by different research communities. Various scholarly search websites, citation recommendation engines, and research databases have been created to simplify text search tasks. However, it is still difficult for researchers to identify potential research topics without intensively reviewing the tremendous number of articles published by journals, conferences, meetings, and workshops. In this paper, we consider a novel topic diffusion discovery technique that incorporates sparseness-constrained Non-negative Matrix Factorization with generalized Jensen-Shannon divergence to help understand term-topic evolutions and identify topic diffusions. Our experimental results show that this approach can extract more prominent topics from large article databases, visualize relationships between terms of interest and abstract topics, and further help researchers understand whether given terms/topics have been widely explored or whether new topics are emerging from the literature.

Thu Mar 25 2021
NLP
Term-community-based topic detection with variable resolution
Network-based procedures for topic detection in huge text collections offer an intuitive alternative to probabilistic topic models. We demonstrate the application of our method with a widely used corpus of general news articles and show the results of detailed social-sciences expert evaluations of detected topics.
Wed Mar 21 2018
Machine Learning
Scalable Generalized Dynamic Topic Models
Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters.
Fri Mar 15 2013
Machine Learning
Topic Discovery through Data Dependent and Random Projections
We present algorithms for topic modeling based on the geometry of word-frequency patterns. This perspective gains significance under the so called separability condition. We present several experiments on synthetic and real-world datasets to demonstrate the merits of our scheme.
Thu Sep 29 2016
Machine Learning
Topic Browsing for Research Papers with Hierarchical Latent Tree Analysis
An online catalog of research papers has been automatically categorized by a topic model. The catalog contains 7719 papers from the proceedings of two artificial intelligence conferences from 2000 to 2015.
Mon Jun 12 2017
Machine Learning
Topic supervised non-negative matrix factorization
Topic models have been extensively used to organize and interpret the contents of large, unstructured corpora of text documents. Although topic models often perform well on traditional training vs. test set evaluations, it is often the case that the results of a topic model do not align with human interpretation.
Fri Oct 23 2020
Machine Learning
Topic Space Trajectories: A case study on machine learning literature
Topic space trajectories are a structure that allows for the comprehensible tracking of research topics. We demonstrate how these trajectories can be interpreted based on eight different analysis approaches. We show the applicability of our approach on a publication corpus spanning 50 years of machine learning research from 32 publication venues.