- Title
Imbalance factor: a simple new scale for measuring inter-class imbalance extent in classification problems.
- Authors
Pirizadeh, Mohsen; Farahani, Hadi; Kheradpisheh, Saeed Reza
- Abstract
Learning from datasets with unequal class frequencies is one of the most challenging tasks in machine learning. Efforts have been made to tackle the class imbalance problem with solutions at the data and algorithmic levels. To categorize these solutions by the level of class imbalance they address, and to obtain meaningful and consistent interpretations from experiments, it is essential to be able to quantify the extent of dataset imbalance. A competent scale for summarizing the severity of inter-class imbalance must meet at least the following three conditions: (1) it can be computed for both binary and multi-class datasets, (2) its output lies within a definite and fixed range of values, and (3) it correlates with the performance of different classifiers. Nevertheless, none of the scales introduced so far satisfies all of these requirements. In this study, we propose an informative scale called the imbalance factor (IF), based on information theory, which quantifies the extent of dataset imbalance as a single value in the range [0, 1], independent of the number of classes. In addition, IF offers various limiting cases with different growth rates, depending on its order α. This property is critical because it helps resolve cases in which distinct distributions would otherwise receive the same imbalance extent. Finally, empirical experiments indicate that, with an average correlation of 0.766 with classification accuracies over 15 real datasets, IF is remarkably more sensitive to changes in class imbalance than previous scales.
- Subjects
INFORMATION theory; CLASSIFICATION; SKEWNESS (Probability theory)
- Publication
Knowledge & Information Systems, 2023, Vol 65, Issue 10, p4157
- ISSN
0219-1377
- Publication type
Article
- DOI
10.1007/s10115-023-01881-y
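
The abstract describes IF only in outline: information-theoretic, bounded in [0, 1], applicable to any number of classes, and parameterized by an order α. The record does not give the formula, so the Python sketch below is not the authors' definition. It assumes one plausible construction with the same stated properties: one minus the Rényi entropy of order α of the class distribution, normalized by its maximum, log K. The names `renyi_entropy` and `imbalance_score` are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def renyi_entropy(probs, alpha):
    """Renyi entropy (natural log) of a discrete distribution.

    alpha == 1 is handled as the Shannon-entropy limit.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    probs = [p for p in probs if p > 0]  # zero-probability terms contribute nothing
    if math.isclose(alpha, 1.0):
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def imbalance_score(labels, alpha=1.0, n_classes=None):
    """ASSUMED illustrative scale, not the paper's IF.

    Returns 1 - H_alpha / log(K): 0 for a perfectly balanced dataset,
    approaching 1 as a single class dominates.
    """
    counts = Counter(labels)
    k = n_classes if n_classes is not None else len(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    score = 1.0 - renyi_entropy(probs, alpha) / math.log(k)
    # Clamp to guard against floating-point rounding just outside [0, 1].
    return min(1.0, max(0.0, score))


if __name__ == "__main__":
    balanced = [0, 1] * 50
    skewed = [0] * 90 + [1] * 10
    three_class = ["a"] * 70 + ["b"] * 20 + ["c"] * 10
    for name, data in [("balanced", balanced), ("skewed", skewed), ("three_class", three_class)]:
        for alpha in (0.5, 1.0, 2.0):
            print(name, alpha, round(imbalance_score(data, alpha), 3))
```

Under this assumed construction, changing α changes how quickly the score grows as the distribution becomes more skewed, which is the kind of behavior the abstract attributes to the order of IF. The published definition may differ in form and normalization.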