- Title
Improved Oversampling and Random Forest Algorithm for Imbalanced Data (针对不平衡数据的过采样和随机森林改进算法)
- Authors
张家伟; 郭林明; 杨晓梅
- Abstract
To address the low recognition rate for minority-class samples caused by imbalanced data, an improved algorithm based on weighted oversampling and random forest is proposed to reduce the influence of imbalanced data on the classifier. In the data preprocessing step, weighted oversampling based on the Synthetic Minority Oversampling Technique (SMOTE) is applied to reduce the imbalance ratio of the data. Each minority-class sample is weighted by its Euclidean distance to the other minority-class samples, and the number of new samples generated from it varies with that weight. To improve the random forest, the Kappa coefficient is used to evaluate the classification performance of each decision tree, and a corresponding weight is assigned to each tree, so that better-performing trees carry more weight in the final vote. Experiments on KEEL datasets show that, compared with the unimproved algorithm, the proposed algorithm improves both the classification accuracy for minority-class samples and the overall classification performance on imbalanced datasets.
- Publication
Journal of Computer Engineering & Applications, 2020, Vol 56, Issue 11, p39
- ISSN
1002-8331
- Publication type
Article
- DOI
10.3778/j.issn.1002-8331.1908-0338
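
The Kappa-weighted voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cohen_kappa` and `weighted_vote`, the toy validation labels, and the choice to clip negative Kappa values to zero are all assumptions made for illustration. The abstract does not state how a Kappa value is mapped to a tree weight, so the Kappa value itself is used here.

```python
from collections import Counter, defaultdict


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement between labels and predictions.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance from the two label distributions.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 0.0  # assumption: treat the single-label degenerate case as no skill
    return (p_o - p_e) / (1 - p_e)


def weighted_vote(tree_predictions, tree_weights):
    """Return the label whose supporting trees have the largest total weight."""
    scores = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Hypothetical validation labels (1 = minority class) and per-tree predictions.
y_val = [1, 1, 0, 0, 0, 0]
tree_outputs = [
    [1, 1, 0, 0, 0, 0],  # tree that recognizes the minority class
    [0, 0, 0, 0, 0, 0],  # tree that always predicts the majority class
    [1, 0, 0, 0, 0, 1],  # tree that is only slightly better than chance
]

# Assumption: weight = kappa, clipped at zero so poor trees cannot vote negatively.
weights = [max(cohen_kappa(y_val, out), 0.0) for out in tree_outputs]

# For a new sample, the trees predict 1, 0 and 0 respectively.
print(weights)
print(weighted_vote([1, 0, 0], weights))
```

In this toy case an unweighted majority vote would return the majority class 0, while the Kappa-weighted vote follows the one tree that actually separates the minority class. Accuracy alone would not show this, since the always-majority tree still scores 4 out of 6 correct on the validation labels.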