Unleashing the power of text for credit default prediction: Comparing human-generated and AI-generated texts

Zongxiao Wu, Yizhe Dong, Yaoyiran Li, Baofeng Shi

Research output: Contribution to conference › Paper › peer-review

Abstract / Description of output

Despite the increasing use of generative Large Language Models (LLMs) across the financial sector, their potential applications in lending decision-making remain largely unexplored. This study examines the potential benefits of integrating LLMs, specifically ChatGPT and BERT, into the lending decision-making process, with a focus on leveraging textual information for default prediction. We use ChatGPT to analyze and interpret textual information written by loan officers and to generate human-like textual loan assessments. We then compare these AI-generated assessments with the original assessments and observe noticeable differences between the two in length, semantic similarity and linguistic patterns. Using deep learning techniques, we find that, alongside conventional structured data, the inclusion of unstructured text data, particularly ChatGPT-generated text, can significantly improve credit default predictions. We also find that ChatGPT's analysis of the borrower's delinquency factors contributes the most to the model's predictive performance. Overall, our results suggest that integrating state-of-the-art LLMs and unstructured data into the credit risk assessment process can lead to more accurate predictions of defaults.
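The comparison of human-written and AI-generated assessments by length and similarity can be illustrated with a minimal sketch. The abstract does not say how semantic similarity was measured, so this example substitutes a simple bag-of-words cosine similarity, which captures word overlap rather than meaning. The function names and both sample texts are invented for illustration and do not come from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors (0.0 to 1.0).

    Assumption: this lexical measure is a stand-in for whatever semantic
    similarity measure the study actually used.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_assessments(human_text, ai_text):
    """Return token counts for both texts and their lexical similarity."""
    return {
        "human_length": len(tokenize(human_text)),
        "ai_length": len(tokenize(ai_text)),
        "cosine_similarity": cosine_similarity(human_text, ai_text),
    }


# Hypothetical sample texts, invented for this sketch.
human = "Borrower runs a small shop. Income is stable. Repayment was late twice."
ai = ("The borrower operates a small shop with stable income, "
      "although repayment was late on two occasions.")

result = compare_assessments(human, ai)
print(result)
```

A sentence-embedding model would be the more natural choice for genuine semantic similarity; the word-count version is used here only to keep the sketch dependency-free.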
Original language: English
Publication status: Published - 17 Apr 2023

Keywords / Materials (for Non-textual outputs)

  • Generative AI
  • ChatGPT
  • Natural Language Processing
  • Text mining
  • Default prediction

