Cumulative Bayesian ridge for handling missing data

Tracking #: 632-1612

Samih M. Mostafa
Abdelrahman Saleem
Safwat Hamad

Submission Type: 

Research Paper


Traditional approaches to handling missing values may lead to biased estimates and may either reduce or inflate statistical power. Each of these distortions may lead to invalid conclusions. The performance of missing-value imputation algorithms may vary across datasets and may depend on the proportion of missing values in the dataset and on the dimensions of the dataset. In this paper, the authors propose a new algorithm for handling missing data and carry out an unbiased comparison of several established practical imputation methods. The proposed algorithm is based on the Bayesian Ridge technique and works in a cumulative order, using gain ratio feature selection at its core to select the candidate feature to be imputed; each imputed feature is then incorporated into the Bayesian Ridge equation to impute the missing values in the next selected feature. An imputation method is considered better if it gives higher imputation accuracy and requires less imputation time. On examining eight datasets with several proportions of missing values generated under the three missingness mechanisms, it was observed that performance differs depending on the missingness mechanism, the dataset size, and the missing proportion.
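
The cumulative scheme described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the gain ratio is computed against a class label after equal-width binning, features are imputed in descending gain-ratio order, and the regression uses a fixed penalty (the posterior-mean weights of a Bayesian ridge model with fixed precisions) instead of estimating the precisions from the data. All function names (`gain_ratio`, `ridge_fit`, `cumulative_impute`) are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels, bins=3):
    """Gain ratio of a numeric feature w.r.t. class labels (observed rows only).
    Assumption: equal-width binning into `bins` intervals."""
    pairs = [(v, l) for v, l in zip(values, labels) if v is not None]
    lo = min(v for v, _ in pairs)
    hi = max(v for v, _ in pairs)
    width = (hi - lo) / bins or 1.0
    binned = [(min(int((v - lo) / width), bins - 1), l) for v, l in pairs]
    groups = {}
    for b, l in binned:
        groups.setdefault(b, []).append(l)
    n = len(binned)
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy([l for _, l in binned]) - cond
    split = entropy([b for b, _ in binned])
    return gain / split if split > 0 else 0.0

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * m for a, m in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ridge_fit(X, y, penalty=1e-3):
    """Weights [intercept, w1, ...] of a ridge fit with a FIXED penalty.
    Assumption: stands in for Bayesian ridge with fixed precisions;
    the intercept is left unpenalized."""
    Xb = [[1.0] + list(r) for r in X]
    d = len(Xb[0])
    A = [[sum(r[i] * r[j] for r in Xb) + (penalty if i == j and i > 0 else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(d)]
    return solve(A, b)

def cumulative_impute(rows, labels):
    """Impute None entries feature by feature; each imputed feature
    becomes a predictor for the features imputed after it."""
    rows = [list(r) for r in rows]  # work on a copy
    d = len(rows[0])
    used = [j for j in range(d) if all(r[j] is not None for r in rows)]
    pending = [j for j in range(d) if j not in used]
    # Assumption: candidates are taken in descending gain-ratio order.
    pending.sort(key=lambda j: gain_ratio([r[j] for r in rows], labels),
                 reverse=True)
    for j in pending:
        obs = [r for r in rows if r[j] is not None]
        w = ridge_fit([[r[k] for k in used] for r in obs], [r[j] for r in obs])
        for r in rows:
            if r[j] is None:
                r[j] = w[0] + sum(wk * r[k] for wk, k in zip(w[1:], used))
        used.append(j)  # cumulative step: reuse the imputed feature
    return rows
```

Calling `cumulative_impute(rows, labels)` on a list of numeric rows, with `None` marking missing entries and `labels` holding one class label per row, returns a filled copy and leaves the input unchanged.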


Supplementary Files (optional): 


  • Reviewed

Data repository URLs: 

Date of Submission: 

Friday, May 1, 2020

Date of Decision: 

Monday, May 4, 2020


Reject (Pre-Screening)