We found a match
Your institution may have access to this item. Find your institution then sign in to continue.
- Title
Landmark-Based Domain Adaptation and Selective Pseudo-Labeling for Heterogeneous Defect Prediction.
- Authors
Chen, Yidan; Chen, Haowen
- Abstract
Cross -project defect prediction (CPDP) is a promising technical means to solve the problem of insufficient training data in software defect prediction. As a special case of CPDP, heterogeneous defect prediction (HDP) has received increasing attention in recent years due to its ability to cope with different metric sets in projects. Existing studies have proven that using mixed-project data is a potential way to improve HDP performance, but there remain several challenges, including the negative impact of noise modules and the insufficient utilization of unlabeled modules. To this end, we propose a landmark-based domain adaptation and selective pseudo-labeling (LDASP) approach for mixed-project HDP. Specifically, we propose a novel landmark-based domain adaptation algorithm considering marginal and conditional distribution alignment and a class-wise locality structure to reduce the heterogeneity between both projects while reweighting modules to alleviate the negative impact brought by noise ones. Moreover, we design a progressive pseudo-label selection strategy exploring the underlying discriminative information of unlabeled target data to further improve the prediction effect. Extensive experiments are conducted based on 530 heterogeneous prediction combinations that are built from 27 projects using four datasets. The experimental results show that (1) our approach improves the F1-score and AUC over the baselines by 9.8–20.2% and 4.8–14.4%, respectively and (2) each component of LDASP (i.e., the landmark weights and selective pseudo-labeling strategy) can promote the HDP performance effectively.
- Subjects
MARGINAL distributions; FORECASTING; SECURE Sockets Layer (Computer network protocol); PROBLEM solving
- Publication
Electronics (2079-9292), 2024, Vol 13, Issue 2, p456
- ISSN
2079-9292
- Publication type
Article
- DOI
10.3390/electronics13020456