- Title
Online active learning method for multi-class imbalanced data stream.
- Authors
Li, Ang; Han, Meng; Mu, Dongliang; Gao, Zhihui; Liu, Shujuan
- Abstract
Data stream classification is an important research direction in data mining. However, multi-class imbalance, concept drift, and a variable class imbalance ratio in data streams can greatly degrade the performance of classification models, and the high cost of labeling samples remains a persistent research concern. To address these problems, an online active learning method for multi-class imbalanced data streams (OALM-MI) is proposed. First, a comprehensive sample weighting method based on cross-entropy and margin values weights each incoming sample according to its classification difficulty and importance, with the aim of improving the classifier's ability to learn from important samples. Second, a comprehensive weighting and updating strategy for the ensemble classifiers combines mean square error, improved square error, recall, and the classifier weights from the previous sliding window of samples. Third, an adaptive window is used to detect and handle concept drift, so that the model adapts better to changes in the data stream during learning. Finally, a margin matrix label request strategy based on the class imbalance ratio decides which samples to label according to their imbalance ratio and classification difficulty, which provides more learning opportunities for minority class samples and important samples. Comprehensive experiments were conducted on 12 synthetic data streams and six real data streams against seven state-of-the-art algorithms, and the results showed that OALM-MI achieved the highest performance in terms of recall, precision, F1-score, Kappa, and G-mean.
- Subjects
ACTIVE learning; ONLINE education; LEARNING ability; CROSS-entropy method; DATA mining
- Publication
Knowledge & Information Systems, 2024, Vol 66, Issue 4, p2355
- ISSN
0219-1377
- Publication type
Article
- DOI
10.1007/s10115-023-02027-w
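
The abstract names cross-entropy and margin values as the basis of the sample weighting step but does not give formulas. As a rough illustration only, the sketch below computes the standard per-sample cross-entropy and a common definition of margin (true-class probability minus the best rival probability) from a classifier's predicted probabilities. The function `sample_weight` and the way it combines the two quantities are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, true_label):
    """Per-sample cross-entropy: -log of the probability given to the true class."""
    eps = 1e-12  # guard against log(0)
    return -math.log(max(probs[true_label], eps))


def margin(probs, true_label):
    """True-class probability minus the highest probability among the other classes."""
    rival = max(p for k, p in enumerate(probs) if k != true_label)
    return probs[true_label] - rival


def sample_weight(probs, true_label):
    """Hypothetical combination (NOT the paper's formula): harder samples weigh more."""
    # Margin lies in [-1, 1]; (1 - margin) / 2 maps it to [0, 1], where 1 is hardest.
    difficulty = (1.0 - margin(probs, true_label)) / 2.0
    return 1.0 + cross_entropy(probs, true_label) * difficulty


# A confidently correct prediction versus a misclassified sample (true class 0).
easy = [0.90, 0.05, 0.05]
hard = [0.20, 0.50, 0.30]
print(sample_weight(easy, 0), sample_weight(hard, 0))
```

Under these assumptions a misclassified sample with a negative margin receives a larger weight than a confidently correct one, which matches the abstract's stated goal of emphasizing difficult samples.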