Abstract: In the contemporary era, characterized by the rapid expansion of the internet, social media, and mobile communication, substantial amounts of new data, sometimes called ‘big data’, are generated every day. Machine learning (ML), an important AI technique, can extract significant information from big data, and the financial industry continues to invest in ML models to make better use of it. A particularly rich source of big data is the ‘pay-as-you-go’ market, such as peer-to-peer lending platforms, where most data is generated by borrowers. Low credit scores have been a critical problem for many individuals and small businesses in emerging markets facing financial exclusion. Traditional financial institutions rely heavily on fixed, well-structured information, which excludes many creditworthy applicants from financial products. Peer-to-peer lenders often lower the entry barrier by adopting models built on new data sources in the short term, with process efficiency in mind. However, a significant percentage of applicants with no records on the platforms still cannot access credit. The home-grown online lenders who best incorporate big data and machine learning are well positioned to succeed.

In this work, a bootstrapping ensemble voting model was developed that combines traditional credit scoring statistics with new data sources, machine learning, and ensemble techniques, and it is shown to address this problem well. Future research could explore more discriminative local data sources by clustering the online lending market and applying attention mechanisms. Despite recent progress, credit scoring in peer-to-peer lending remains an open topic, and exploratory research is a rewarding direction. New localised lending patterns, data sources, and variables for credibility scoring on different platforms or markets deserve more attention, in both theory and application.
However, the problem still exists. A growing volume of general and unstructured big data has the potential to yield actionable insights but requires extensible AI-powered platform solutions to aggregate, normalize, transform, and apply the data efficiently. In emerging markets, AI-powered credit scoring has traditionally been a luxury enjoyed only by wealthy groups and a limited number of well-known companies, which limits its wider application to the majority of people in need. Substantial investments and over-engineered solutions keep small financial institutions from entering. In most cases, the data themselves are neither valid nor informative, and they lack transparency in terms of matching or separating. In addition, validation and explanation are very hard to obtain.
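
As a rough illustration of what a bootstrapping ensemble voting model involves, the sketch below trains several simple classifiers on bootstrap resamples of a toy data set and combines them by majority vote. It is not the model developed in this work: the base learner (a one-feature decision stump), the two features (a traditional debt-to-income ratio and an alternative on-time bill payment ratio), the data values, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Return the (feature, threshold, polarity) with the fewest training errors.

    polarity is the label predicted when the feature value is >= threshold.
    """
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for polarity in (0, 1):
                errors = sum(
                    (polarity if r[f] >= t else 1 - polarity) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    f, t, polarity = stump
    return polarity if row[f] >= t else 1 - polarity


def fit_bagged_voters(rows, labels, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_vote(models, row):
    """Majority vote over the ensemble; 1 = default, 0 = repaid."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: [debt_to_income, on_time_bill_payment_ratio]
rows = [
    [0.10, 0.95], [0.20, 0.90], [0.25, 0.85], [0.30, 0.80], [0.35, 0.90],
    [0.60, 0.40], [0.65, 0.30], [0.70, 0.35], [0.80, 0.20], [0.90, 0.10],
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

models = fit_bagged_voters(rows, labels)
print(predict_vote(models, [0.15, 0.90]))  # low-risk hypothetical applicant
print(predict_vote(models, [0.95, 0.05]))  # high-risk hypothetical applicant
```

An odd number of voters avoids ties in the binary case. In practice the stumps would be replaced by stronger base learners, and the vote could be weighted by each model's validation performance.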

Keywords: Machine Learning for Credit Scoring, AI in Financial Inclusion, Big Data Credit Assessment, Alternative Credit Scoring Models, Predictive Analytics in Finance, Non-Traditional Data Sources, AI Credit Risk Modeling, Financial Behavior Analysis, Digital Lending Algorithms, Fair and Explainable AI in Credit, Credit Scoring for the Unbanked, Behavioral Credit Scoring, AI-Driven Risk Assessment, Open Banking Credit Models, Data-Driven Lending Solutions.


DOI: 10.17148/IJARCCE.2023.121226
