Abstract: In software maintenance, bug reports play an important role in ensuring the correctness of software packages. Unfortunately, many software projects receive a significant number of duplicate bug reports. Processing these duplicates is time-consuming and raises the cost of software maintenance. In this research, we propose a detection scheme based on BM25 weighting and cluster shrinkage (BM25-CS) to enhance detection performance. The effectiveness of this method is verified in an empirical study with three open-source projects: SVN, Argo UML, and Apache. The experimental results show that our method outperforms other detection schemes by about 6-10% in all cases.
Keywords: Bug Reports, Duplication Detection, BM25 Weighting, Cluster Shrinkage
DOI: 10.17148/IJARCCE.2018.71116