- Title
A topic identification method for review texts based on density Canopy.
- Authors
刘 滨; 詹世源; 刘 宇; 雷晓雨; 杨雨宽; 陈伯轩; 刘格格; 高 歆; 皇甫佳悦; 陈 莉
- Abstract
The existing method that combines Sentence-BERT and LDA (SBERT-LDA) takes the LDA topic number as the k value of the K-means algorithm, which results in poor interpretability and low topic consistency. To solve this problem, a Sentence-BERT and LDA optimization method based on density Canopy (SBERT-LDA-DC) is proposed, which uses density Canopy to improve the K-means algorithm. The experimental results indicate that, on the consistency index, this method outperforms similar methods that cluster feature vectors with K-means or K-means++. Compared with the SBERT-LDA method, the consistency index improves by 22.9% on a dataset of 1 852 drama comments. The proposed SBERT-LDA-DC method is effective: it gives product and service providers a new way to understand user opinions and improve their own products or services, and it has strong practical application value.
- Subjects
Natural language processing
- Publication
Journal of Hebei University of Science & Technology, 2023, Vol 44, Issue 5, p493
- ISSN
1008-1542
- Publication type
Article
- DOI
10.7535/hbkd.2023yx05008