TY - JOUR
T1 - Applying hybrid machine learning algorithms to assess customer risk-adjusted revenue in the financial industry
AU - Machado, Marcos R.
AU - Karray, Salma
N1 - Funding Information:
The author acknowledges the support of the Natural Sciences and Engineering Research Council of Canada (NSERC) funding reference number RGPIN-2020-05156.
Publisher Copyright:
© 2022 Elsevier B.V.
PY - 2022/11/1
Y1 - 2022/11/1
N2 - A Peer-to-Peer (P2P) service is a decentralized platform that directly connects individuals, buyers (lenders) and sellers (investors) without the intermediation of a third party. In the P2P lending market, customer cash flows are undeniably linked to their financial risk of default. Thus, forecasting customers’ Risk-Adjusted Revenue (RAR) value is one of the most critical issues in financial decision-making. With the emergence of big data, traditional forecasting methods cannot provide the high predictive power needed for such metrics. We propose a hybrid method by integrating the use of supervised and unsupervised Machine Learning (ML) algorithms to enhance the accuracy of predicting customer-adjusted risk metrics. Using a real P2P dataset from the Lending Club, containing over two million cases, we forecast customers’ risk-adjusted revenue by applying ML algorithms for the first time. These include individual methods such as gradient boosting and decision trees, and hybrid frameworks that group customers using a clustering algorithm (k-Means or Density-Based Spatial Clustering of Applications with Noise (DBSCAN)) prior to implementing the individual methods. We compare the efficiency (processing time and accuracy) of this hybrid approach with the performance of individual regressor-based models to predict RAR. Our results indicate high predictive power for many individual ML algorithms (R² score over 90%). Further, in most cases, hybrid models outperform the individual ones in both predictive performance and processing time. Finally, the feature importance analysis in the best predictive frameworks helps identify the most influential factors in predicting customers’ RAR in the P2P lending market.
KW - Customer value prediction
KW - Hybrid frameworks
KW - Machine learning
KW - P2P
KW - Risk-adjusted revenue
UR - http://www.scopus.com/inward/record.url?scp=85140805072&partnerID=8YFLogxK
U2 - 10.1016/j.elerap.2022.101202
DO - 10.1016/j.elerap.2022.101202
M3 - Article
AN - SCOPUS:85140805072
SN - 1567-4223
VL - 56
JO - Electronic Commerce Research and Applications
JF - Electronic Commerce Research and Applications
M1 - 101202
ER -