Cumulative Bayesian ridge for handling missing data

Tracking #: 646-1626

Authors:
Name, ORCID
Samih M. Mostafa, https://orcid.org/0000-0001-9234-5898
Abdelrahman Saleem, https://orcid.org/0000-0003-3254-0872
Safwat Hamad


Submission Type: 

Research Paper

Abstract: 

Traditional approaches to handling missing values may lead to biased estimates. They may also reduce or magnify statistical influence, which can result in invalid conclusions. The performance of a missing-value imputation algorithm may depend on the proportion of missing values in the dataset and on the dataset's dimensionality. In this paper, the authors propose a new algorithm for handling missing data and compare it against several established imputation methods. The proposed algorithm is based on the Bayesian Ridge technique, applied in a cumulative order, with gain ratio feature selection used to choose the next candidate feature to be imputed. Each imputed feature is then included in the Bayesian Ridge equation used to impute missing values in the next candidate feature. The aim is to identify the imputation method that gives high imputation accuracy with low imputation time. Finally, the authors apply the proposed algorithm to eight datasets with various proportions of missing values generated under different missingness mechanisms. The empirical study indicates that the proposed algorithm is effective across the missingness mechanisms and missing-data percentages tested.
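
The cumulative idea in the abstract can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: a ridge regression with a fixed penalty stands in for Bayesian Ridge (it matches the Bayesian Ridge posterior mean when the noise and weight precisions are held fixed, whereas a full Bayesian Ridge estimates them from the data); the gain ratio ranking is assumed to be computed elsewhere and passed in as the hypothetical `order` argument; missing entries are assumed to be marked with `None`; and the initial predictors are assumed to be the fully observed columns. The names `solve`, `fit_ridge`, and `cumulative_impute` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ridge(X, y, penalty):
    """Fit ridge regression on centered data; return (weights, intercept).

    Assumption: a fixed penalty replaces the precisions that Bayesian
    Ridge would estimate from the data.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc^T Xc + penalty * I) w = Xc^T yc
    A = [[sum(r[i] * r[j] for r in Xc) + (penalty if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    w = solve(A, b)
    intercept = y_mean - sum(wi * mi for wi, mi in zip(w, x_mean))
    return w, intercept


def cumulative_impute(rows, order, penalty=1e-3):
    """Impute None entries one column at a time, in the given order.

    `order` is a hypothetical precomputed ranking of the incomplete
    columns (for example, by gain ratio). After a column is imputed it
    joins the predictors used for every later column.
    """
    data = [list(r) for r in rows]  # work on a copy
    n_cols = len(data[0])
    # Assumption: start from the columns that have no missing values.
    predictors = [j for j in range(n_cols)
                  if j not in order and all(r[j] is not None for r in data)]
    for target in order:
        known = [r for r in data if r[target] is not None]
        X = [[r[j] for j in predictors] for r in known]
        y = [r[target] for r in known]
        w, b0 = fit_ridge(X, y, penalty)
        for r in data:
            if r[target] is None:
                r[target] = b0 + sum(wi * r[j] for wi, j in zip(w, predictors))
        predictors.append(target)  # the cumulative step
    return data


# Toy data: column 1 = 2 * column 0 + 1, column 2 = column 0 + column 1.
rows = [
    [1.0, 3.0, 4.0],
    [2.0, 5.0, 7.0],
    [3.0, None, 10.0],
    [4.0, 9.0, 13.0],
    [5.0, 11.0, None],
]
filled = cumulative_impute(rows, order=[1, 2])
```

In this toy data the second imputed column is predicted from two perfectly collinear predictors, so the penalty term is what keeps the linear system solvable; the imputed values land close to the values the generating rules would give.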

Manuscript: 

Previous Version: 

Tags: 

  • Reviewed

Data repository URLs: 

Date of Submission: 

Sunday, July 5, 2020

Date of Decision: 

Tuesday, July 7, 2020


Nanopublication URLs:

Decision: 

Reject (Pre-Screening)