Abstract: In software maintenance, bug reports play an important role in ensuring the correctness of software packages. Unfortunately, many software projects receive a significant number of duplicate bug reports. Processing these duplicates is time-consuming and raises the cost of software maintenance. In this research, we propose a detection scheme based on BM25 weighting and cluster shrinkage (BM25-CS) to improve duplicate detection performance. The effectiveness of this method is verified in an empirical study of three open-source projects: SVN, Argo UML, and Apache. The experimental results show that our method outperforms other detection schemes by about 6-10% in all cases.

Keywords: Bug Reports, Duplication Detection, BM25 Weighting, Cluster Shrinkage

DOI: 10.17148/IJARCCE.2018.71116

