Abstract: The Zambia Revenue Authority (ZRA) generates large volumes of data that require complex mechanisms to extract useful tax information. The purpose of this study was to develop a data mining model for detecting fraud in ZRA tax and taxpayer data. The study focused on two areas: (1) a baseline study that established the extent of the challenges in detecting taxpayer fraud, and (2) the development of an automated fraud detection tool based on the results of the baseline study. The baseline study showed that the methodologies, processes, architectures, and technologies then used to transform raw data into meaningful and useful information were tedious and time-consuming. To detect fraud, ZRA depended on random audits, informants, and undercover operations. We then developed a model that implements two outlier algorithms for fraud detection, namely Continuous Monitoring of Distance-Based and Distance-Based Outlier Queries. We used both algorithms to analyse domestic tax payments and to detect underpayments and overpayments according to business rules; such payments are marked as outliers. The results generated by our tool were more accurate, and the tool took less time to detect underpayments and overpayments as outliers, than the older methods.

Keywords: business intelligence, data mining, fraud detection, outlier algorithm.
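
As an illustration of the distance-based outlier idea mentioned in the abstract, the following minimal Python sketch flags a payment as an outlier when fewer than a chosen number of other payments lie within a given distance of it. It is not the tool developed in the study: the function name `distance_based_outliers`, the parameters `radius` and `min_neighbors`, and the sample ratios of tax paid to tax due are all hypothetical and chosen only for demonstration.

```python
from typing import List


def distance_based_outliers(values: List[float], radius: float,
                            min_neighbors: int) -> List[int]:
    """Return indices of values that have fewer than `min_neighbors`
    other values within `radius` (a simple distance-based outlier rule)."""
    outliers = []
    for i, v in enumerate(values):
        # Count the other points that lie within `radius` of this point.
        neighbors = sum(
            1 for j, w in enumerate(values)
            if j != i and abs(v - w) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Hypothetical ratios of tax paid to tax due (1.0 means paid exactly what was due).
ratios = [1.00, 0.98, 1.02, 0.99, 1.01, 0.40, 1.75]

# Assumed thresholds for illustration only.
flagged = distance_based_outliers(ratios, radius=0.05, min_neighbors=2)
print(flagged)  # indices of the suspected underpayment and overpayment
```

In this sketch, the ratio 0.40 would correspond to an underpayment and 1.75 to an overpayment; both have no neighbours within the assumed radius and are therefore flagged. A real deployment would derive the distance measure and thresholds from the business rules rather than from fixed constants.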