Effective Pre-processing of Genetic Programming for Solving Symbolic Regression in Equation Extraction

Research output: Contribution to journal › Article

Abstract

Estimating a form of equation that explains data is very useful to understand various physical, chemical, social, and biological phenomena. One effective approach for finding the form of an equation is to solve the symbolic regression problem using genetic programming (GP). However, this approach requires a long computation time because of the explosion of the number of combinations of candidate functions that are used as elements to construct equations. In the present paper, a novel method to effectively eliminate unnecessary functions from an initial set of functions using a deep neural network was proposed to reduce the number of computations of GP. Moreover, a method was proposed to improve the accuracy of the classification using eigenvalues when classifying whether functions are required for symbolic regression. Experiment results showed that the proposed method can successfully classify functions with over 90% of the data created in the present study.
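As a rough illustration of the combinatorial explosion mentioned in the abstract (this is not the authors' method, only a back-of-the-envelope sketch), the snippet below counts how many expression trees of bounded depth can be built from a function set. The function name `count_trees`, the split into unary and binary functions, and the example set sizes are all assumptions made for illustration.

```python
def count_trees(terminals: int, unary: int, binary: int, max_depth: int) -> int:
    """Count distinct expression trees with depth <= max_depth.

    A tree is a terminal, a unary function applied to one subtree, or a
    binary function applied to two subtrees, so the count follows
    N(0) = T and N(d) = T + U * N(d-1) + B * N(d-1)**2.
    """
    count = terminals  # depth 0: only terminals (variables, constants)
    for _ in range(max_depth):
        count = terminals + unary * count + binary * count * count
    return count


if __name__ == "__main__":
    # Hypothetical function sets: a full set and one pruned before the GP run.
    full = count_trees(terminals=2, unary=4, binary=4, max_depth=3)
    pruned = count_trees(terminals=2, unary=1, binary=2, max_depth=3)
    print(f"full set:   {full} candidate trees")
    print(f"pruned set: {pruned} candidate trees")
    print(f"reduction factor: {full / pruned:.1f}x")
```

Because the count grows roughly with the square of the previous level for each added depth, removing even a few functions from the initial set shrinks the space a GP search has to explore by orders of magnitude.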
Original language: English
Pages (from-to): 89-103
Journal: Communications in Computer and Information Science
Volume: 1040
Publication status: Published - 2019

Fingerprint

Symbolic Regression
Genetic programming
Preprocessing
Processing
Explosion
Explosions
Eliminate
Classify
Neural Networks
Eigenvalue
Experiment
Experiments

Cite this

@article{ff2cdcca612c4643928984684160c5f8,
title = "Effective Pre-processing of Genetic Programming for Solving Symbolic Regression in Equation Extraction",
abstract = "Estimating a form of equation that explains data is very useful to understand various physical, chemical, social, and biological phenomena. One effective approach for finding the form of an equation is to solve the symbolic regression problem using genetic programming (GP). However, this approach requires a long computation time because of the explosion of the number of combinations of candidate functions that are used as elements to construct equations. In the present paper, a novel method to effectively eliminate unnecessary functions from an initial set of functions using a deep neural network was proposed to reduce the number of computations of GP. Moreover, a method was proposed to improve the accuracy of the classification using eigenvalues when classifying whether functions are required for symbolic regression. Experiment results showed that the proposed method can successfully classify functions with over 90{\%} of the data created in the present study.",
author = "Kenji Ono",
year = "2019",
language = "English",
volume = "1040",
pages = "89--103",
journal = "Communications in Computer and Information Science",
issn = "1865-0929",
publisher = "Springer Verlag",
}

TY - JOUR

T1 - Effective Pre-processing of Genetic Programming for Solving Symbolic Regression in Equation Extraction

AU - Ono, Kenji

PY - 2019

Y1 - 2019

AB - Estimating a form of equation that explains data is very useful to understand various physical, chemical, social, and biological phenomena. One effective approach for finding the form of an equation is to solve the symbolic regression problem using genetic programming (GP). However, this approach requires a long computation time because of the explosion of the number of combinations of candidate functions that are used as elements to construct equations. In the present paper, a novel method to effectively eliminate unnecessary functions from an initial set of functions using a deep neural network was proposed to reduce the number of computations of GP. Moreover, a method was proposed to improve the accuracy of the classification using eigenvalues when classifying whether functions are required for symbolic regression. Experiment results showed that the proposed method can successfully classify functions with over 90% of the data created in the present study.

M3 - Article

VL - 1040

SP - 89

EP - 103

JO - Communications in Computer and Information Science

JF - Communications in Computer and Information Science

SN - 1865-0929

ER -