Published on Fri Jun 16 2017

Deriving Compact Laws Based on Algebraic Formulation of a Data Set

Wenqing Xu, Mark Stalzer

In various subjects, there exist compact and consistent relationships between input and output parameters. Discovering these relationships in a data set is of great interest in many fields, such as physics, chemistry, and finance. In this paper, we develop an innovative approach to discovering compact laws.

Abstract

In various subjects, there exist compact and consistent relationships between input and output parameters. Discovering these relationships, or compact laws, in a data set is of great interest in many fields, such as physics, chemistry, and finance. While data discovery has made great progress in practice thanks to the success of machine learning in recent years, the development of analytical approaches for finding the theory behind the data has been relatively slow. In this paper, we develop an innovative approach to discovering compact laws from a data set. By proposing a novel algebraic equation formulation, we convert the problem of deriving meaning from data into formulating a linear algebra model and searching for relationships that fit the data. A rigorous proof is presented to validate the approach. The algebraic formulation allows the search of equation candidates in an explicit mathematical manner. Searching algorithms are also proposed for finding the governing equations with improved efficiency. For a certain type of compact theory, our approach assures convergence, and the discovery is computationally efficient and mathematically precise.
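To make the idea concrete, here is a minimal sketch of the general strategy the abstract describes: represent candidate laws as linear combinations of algebraic terms built from the input variables, then solve a linear algebra problem to find the coefficients that fit the data. This is an illustrative toy, not the authors' actual algorithm; the term library, the synthetic law, and the pruning threshold are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data generated by a hidden compact law: y = 2*x1 - 3*x1*x2
x1 = rng.uniform(-1, 1, 200)
x2 = rng.uniform(-1, 1, 200)
y = 2.0 * x1 - 3.0 * x1 * x2

# Candidate term library: columns are monomials up to degree 2 in (x1, x2).
# This plays the role of the "algebraic formulation" of equation candidates.
library = np.column_stack([
    np.ones_like(x1),  # 1
    x1,                # x1
    x2,                # x2
    x1 * x2,           # x1*x2
    x1**2,             # x1^2
    x2**2,             # x2^2
])
term_names = ["1", "x1", "x2", "x1*x2", "x1^2", "x2^2"]

# Least-squares fit over the library; tiny coefficients are pruned so the
# recovered relationship stays compact.
coeffs, *_ = np.linalg.lstsq(library, y, rcond=None)
coeffs[np.abs(coeffs) < 1e-8] = 0.0

law = " + ".join(f"{c:.2f}*{n}" for c, n in zip(coeffs, term_names) if c != 0)
print("recovered law: y =", law)
```

On noise-free data the fit recovers exactly the two nonzero terms of the hidden law; the paper's contribution lies in organizing and searching such candidate formulations systematically rather than fixing one library by hand.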

Thu Jan 10 2013
Machine Learning
Discovering Multiple Constraints that are Frequently Approximately Satisfied
High-dimensional data sets can be modelled by assuming that there are many different linear constraints. The probability of a data vector under the model is then proportional to the product of the probabilities of its constraint violations.
Tue Dec 08 2015
Artificial Intelligence
Learning Discrete Bayesian Networks from Continuous Data
Learning Bayesian networks from raw data can help provide insights into the relationships between variables. Many Bayesian network structure learning algorithms assume all random variables are discrete. The choice of discretization policy has a significant impact on the accuracy, speed, and interpretability of models.
Wed Jan 24 2018
Artificial Intelligence
Intrinsic Dimension of Geometric Data Sets
The curse of dimensionality is a phenomenon frequently observed in machine learning (ML) and knowledge discovery (KD). There is a large body of literature investigating its origin and impact, using methods from mathematics and computer science. The present work provides a comprehensive study of the intrinsic geometry of a data set.
Fri Dec 11 2020
Machine Learning
Learning physically consistent mathematical models from data using group sparsity
High noise levels, sensor-induced correlations, and strong inter-system variability can render data-driven models nonsensical or physically inconsistent. We propose a statistical learning framework based on group-sparse regression that can be used to enforce conservation laws and ensure model equivalence.
Wed Aug 02 2017
Artificial Intelligence
Hidden Physics Models: Machine Learning of Nonlinear Partial Differential Equations
"Hidden physics models" are essentially data-efficient learning machines. They use time dependent and nonlinear partial differential equations to extract patterns from high-dimensional data. The proposed methodology may be applied to the problem of system identification.
Sat Dec 05 2020
Machine Learning
Data-based Discovery of Governing Equations
We propose a Data-based Physics Discovery (DPD) framework for automatic discovery of governing equations from observed data. The framework can utilize available prior physical models and domain expert feedback. We demonstrate the performance of the proposed framework on a real-world application in the aerospace industry.