- Title
A Chinese Short Text Classification Algorithm Based on Stacking-BERT Ensemble Learning (基于Stacking-Bert集成学习的中文短文本分类算法)
- Authors
郑承宇; 王新; 王婷; 尹甜甜; 邓亚萍
- Abstract
Static word vector representations such as word2vec and GloVe capture text semantics incompletely, and current mainstream neural network models for text classification tend to perform well only on specific problems, adapting poorly to new scenarios and generalizing weakly. To address these problems, a Chinese short text classification method based on a multi-base-model framework, named Stacking-BERT, was proposed. First, the model used the BERT pre-trained language model to represent text word vectors and output a deep feature-information vector for the text. Then, neural network models such as TextCNN, DPCNN, TextRNN, and TextRCNN were used to construct heterogeneous base classifiers, and the different feature information expressed by the text vectors was combined through Stacking ensemble learning to improve the generalization ability of the model. Finally, a support vector machine was used as the meta-classifier for training and prediction. In comparative experiments with text classification algorithms such as word2vec-CNN, word2vec-BiLSTM, BERT-TextCNN, BERT-DPCNN, BERT-RNN, and BERT-RCNN on three publicly available Chinese datasets, the Stacking-BERT ensemble model achieved the highest accuracy, precision, recall, and F1 score, showing that it can effectively improve the classification performance of Chinese short texts.
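The stacking scheme the abstract describes can be sketched as follows: each base learner produces out-of-fold predictions on the training data, and those predictions become the meta-features on which a second-level classifier is trained. The paper's actual base learners are BERT-based TextCNN/DPCNN/TextRNN/TextRCNN networks and its meta-classifier is an SVM; the stand-ins below (a one-feature threshold rule and a majority vote) are illustrative assumptions only, chosen so the stacking mechanics run without external libraries.

```python
# Minimal sketch of Stacking ensemble learning: base learners yield
# out-of-fold predictions that become meta-features for a second-level
# classifier. The threshold rule and majority vote are stand-ins for
# the paper's BERT-based networks and SVM meta-classifier.

def kfold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    fold_size, rem = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = fold_size + (1 if i < rem else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def threshold_learner(feature_idx):
    """Stand-in base learner: midpoint cut between the class means."""
    def fit(X, y):
        pos = [x[feature_idx] for x, lab in zip(X, y) if lab == 1]
        neg = [x[feature_idx] for x, lab in zip(X, y) if lab == 0]
        cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return lambda x: 1.0 if x[feature_idx] > cut else 0.0
    return fit

def stack_features(X, y, base_learners, k=3):
    """Build out-of-fold meta-features: one column per base learner."""
    n = len(X)
    meta = [[0.0] * len(base_learners) for _ in range(n)]
    for fold in kfold_indices(n, k):
        held = set(fold)
        tr = [i for i in range(n) if i not in held]
        for j, learner in enumerate(base_learners):
            model = learner([X[i] for i in tr], [y[i] for i in tr])
            for i in fold:  # predict only on the held-out fold
                meta[i][j] = model(X[i])
    return meta

def meta_predict(row):
    """Stand-in meta-classifier: majority vote over base outputs."""
    return 1 if sum(row) >= len(row) / 2 else 0

# Toy 2-feature dataset: class 1 clusters high, class 0 clusters low.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25],
     [0.8, 0.9], [0.9, 0.85], [0.75, 0.8]]
y = [0, 0, 0, 1, 1, 1]

meta = stack_features(X, y, [threshold_learner(0), threshold_learner(1)])
print([meta_predict(row) for row in meta])  # → [0, 0, 0, 1, 1, 1]
```

The out-of-fold step is what keeps the meta-classifier honest: each base learner's prediction for a sample comes from a model that never saw that sample during training, so the meta-features are not contaminated by base-learner overfitting.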
- Publication
Science Technology & Engineering, 2022, Vol 22, Issue 10, p4033
- ISSN
1671-1815
- Publication type
Article