- Title
Reducing Training Spaces in Cluster-Based Data Balancing Ensemble Learning.
- Authors
Santoso, Judhi; Yulianti, Lenny Putri; Surendro, Kridanto; Trisetyarso, Agung
- Abstract
Ensemble learning combines the predictions of several learners to achieve better performance. The most widely used approach to building an ensemble is random subspace generation; well-known methods that use random subspaces include bagging, boosting, clustering, and, more recently, clustering balancing. Clustering balancing has the potential to create a more accurate and diverse ensemble by generating a large pool of diverse subsets of data. However, training learners on all the balanced clusters produced risks creating similarly trained learners, because oversampling can produce similar balanced clusters. This not only wastes computational time but also biases the ensemble towards the output of those learners. One strategy for handling this issue is to select the optimum set of balanced clusters with minimum similarity. Many similarity indices have been introduced to compare clusters from perturbed datasets, but comparisons of these measures for selecting the optimum clusters are lacking. This study addresses the issue with two aims: (i) to analyze and identify the best similarity index and (ii) to design a new approach to reducing the similarity of training spaces in cluster-based data balancing ensemble learning. The proposed approach, using nine similarity indices, was evaluated on 20 datasets from the UCI repository in terms of ensemble size and accuracy. The results show that the Normalized Mutual Information, Jaccard, and Van Dongen indices could serve as alternatives for measuring similarity in training spaces in ensemble learning. Normalized Mutual Information employs a probabilistic approach, while the other two use a deterministic approach.
- Subjects
INSTITUTIONAL repositories; POSSIBILITY; FORECASTING; MEASUREMENT; DESIGN
- Publication
International Journal on Electrical Engineering & Informatics, 2024, Vol 16, Issue 2, p211
- ISSN
2085-6830
- Publication type
Article
- DOI
10.15676/ijeei.2024.16.2.5
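The three indices highlighted in the abstract can be illustrated with a minimal sketch that compares two cluster labelings of the same items. This is not the authors' implementation: the function names are hypothetical, the Jaccard index is taken in its pair-counting form, the Van Dongen measure is normalized by 2n, and Normalized Mutual Information is normalized by the arithmetic mean of the two entropies. The record does not state which variants the paper uses, so all of these choices are assumptions.

```python
from collections import Counter
from math import comb, log


def contingency(a, b):
    """Count items that fall in cluster i of `a` and cluster j of `b`."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equally long")
    return Counter(zip(a, b))


def jaccard_index(a, b):
    """Pair-counting Jaccard similarity (1.0 means identical partitions)."""
    both = sum(comb(n, 2) for n in contingency(a, b).values())
    pairs_a = sum(comb(n, 2) for n in Counter(a).values())
    pairs_b = sum(comb(n, 2) for n in Counter(b).values())
    union = pairs_a + pairs_b - both
    # No co-clustered pair in either partition: both are all singletons.
    return both / union if union else 1.0


def van_dongen_distance(a, b):
    """Van Dongen measure divided by 2n (0.0 means identical partitions)."""
    best_for_a, best_for_b = {}, {}
    for (i, j), n in contingency(a, b).items():
        best_for_a[i] = max(best_for_a.get(i, 0), n)
        best_for_b[j] = max(best_for_b.get(j, 0), n)
    total = 2 * len(a)
    return (total - sum(best_for_a.values()) - sum(best_for_b.values())) / total


def normalized_mutual_information(a, b):
    """Mutual information divided by the mean entropy of the two partitions."""
    n = len(a)
    table = contingency(a, b)
    size_a, size_b = Counter(a), Counter(b)

    def entropy(sizes):
        return -sum(c / n * log(c / n) for c in sizes.values())

    mutual = sum(
        c / n * log(c * n / (size_a[i] * size_b[j]))
        for (i, j), c in table.items()
    )
    mean_entropy = (entropy(size_a) + entropy(size_b)) / 2
    # Both partitions consist of a single cluster: treat them as identical.
    return mutual / mean_entropy if mean_entropy > 0 else 1.0
```

Note that the Jaccard and NMI functions are similarities while the Van Dongen function is a distance, so a selection step that minimizes similarity between balanced clusters would prefer low Jaccard and NMI values but high Van Dongen values.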