A Comprehensive Review of Evolutionary Sampling Techniques for Addressing Data Quality Problems in Imbalanced Data Classification

Tracking #: 893-1873


Responsible editor: 

Robert Hoehndorf

Submission Type: 

Survey Paper

Abstract: 

With the rapid expansion of data, particularly in the form of data banks, numerous challenges have arisen, among which imbalanced data has become increasingly prominent. Three main approaches are generally used to address imbalanced data: data-level approaches, algorithm-level approaches, and hybrids of the two. The data-level approach, also known as sampling, is widely adopted because it does not depend on a specific classifier. Evolutionary computation, which has proven effective in various optimization tasks, has become a popular method in the sampling process; the resulting methods are referred to as evolutionary sampling techniques. Imbalanced data issues are also often related to data quality problems, such as noise and class overlapping. However, to the best of our knowledge, no survey has focused on evolutionary sampling techniques, particularly for handling noise and class overlapping problems. Hence, this paper presents a systematic literature review, offering a comprehensive discussion of evolutionary sampling techniques that address noise and class overlapping problems. This survey identifies key challenges and opportunities, guiding future advancements in handling imbalanced data with evolutionary sampling techniques.

Tags: 

  • Under Review

Date of Submission: 

Monday, December 30, 2024

