CBRG: A Novel Algorithm for Handling Missing Data Using Bayesian Ridge Regression and Feature Selection Based on Gain Ratio

Samih M. Mostafa, Abdelrahman S. Eladimy, Safwat Hamad, Hirofumi Amano

    Research output: Contribution to journal › Article › peer-review

    8 Citations (Scopus)

    Abstract

    Existing imputation methods may bias predictions and inflate or deflate statistical influence, which leads to improper estimates. The performance of several missing-value imputation approaches depends on the size of the dataset and the number of missing values it contains. In this work, the authors propose a novel algorithm for handling missing data and compare it against some common imputation approaches. The proposed algorithm imputes missing values in cumulative order, using gain ratio (GR) feature selection to choose the candidate feature to be imputed and Bayesian Ridge Regression (BRR) to build the predictive model. Each imputed feature is then used to impute the missing values in the next selected candidate feature. The algorithm was evaluated on eight different datasets after generating different proportions of missing values under the missingness mechanisms. Imputation performance was measured in terms of imputation time, mean absolute error (MAE), coefficient of determination (R²), and root-mean-square error (RMSE). The results indicate that the proposed algorithm is efficient across the tested datasets, missing-value proportions, and missingness mechanisms.
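
The cumulative imputation loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the gain ratio is computed, so the sketch assumes a discrete target column, equal-width binning of each feature, and processing of candidates from highest to lowest gain ratio. The function names (`entropy`, `gain_ratio`, `impute_cumulative`, `imputation_scores`) and the `bins` parameter are hypothetical. `BayesianRidge` is scikit-learn's estimator, used here with its default settings.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge


def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def gain_ratio(feature, target, bins=4):
    """Gain ratio of a numeric feature with respect to a discrete target.

    Assumption: the feature is cut into equal-width bins and only rows where
    it is observed are used; the paper may compute this differently.
    """
    seen = ~np.isnan(feature)
    x, y = feature[seen], target[seen]
    edges = np.linspace(x.min(), x.max(), bins + 1)[1:-1]
    binned = np.digitize(x, edges)
    split_info = entropy(binned)
    if split_info == 0.0:
        return 0.0
    conditional = sum(
        (binned == b).mean() * entropy(y[binned == b]) for b in np.unique(binned)
    )
    return (entropy(y) - conditional) / split_info


def impute_cumulative(X, target, bins=4):
    """Impute NaNs column by column, reusing each imputed column as a predictor."""
    X = X.astype(float).copy()
    missing = np.isnan(X)
    has_nan = missing.any(axis=0)
    predictors = [j for j in range(X.shape[1]) if not has_nan[j]]
    if not predictors:
        raise ValueError("this sketch needs at least one complete column")
    candidates = [j for j in range(X.shape[1]) if has_nan[j]]
    # Assumption: candidates are handled from highest to lowest gain ratio.
    candidates.sort(key=lambda j: gain_ratio(X[:, j], target, bins), reverse=True)
    for j in candidates:
        rows = missing[:, j]
        model = BayesianRidge().fit(X[~rows][:, predictors], X[~rows, j])
        X[rows, j] = model.predict(X[rows][:, predictors])
        predictors.append(j)  # the imputed column now helps impute the next one
    return X


def imputation_scores(true_values, imputed_values):
    """MAE, RMSE and R^2 between the held-out true values and their imputations."""
    err = imputed_values - true_values
    ss_tot = ((true_values - true_values.mean()) ** 2).sum()
    return {
        "mae": float(np.abs(err).mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
        "r2": float(1.0 - (err ** 2).sum() / ss_tot),
    }
```

To evaluate such a sketch in the way the abstract describes, one would take a complete dataset, mask a chosen proportion of entries, run `impute_cumulative`, and pass the masked true values and their imputations to `imputation_scores`. Timing the call would give the imputation-time measure.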

    Original language: English
    Article number: 9277540
    Pages (from-to): 216969-216985
    Number of pages: 17
    Journal: IEEE Access
    Volume: 8
    DOI
    Publication status: Published - 2020

    All Science Journal Classification (ASJC) codes

    • General Computer Science
    • General Materials Science
    • General Engineering
