A ‘divide and conquer’ reject inference approach leveraging graph-based semi-supervised learning

Zongxiao Wu, Yizhe Dong*, Yaoyiran Li, Yaorong Liu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Many credit scoring studies suffer from potential sample selection bias due to their exclusive focus on accepted applicants. To address this issue, previous works have proposed reject inference (RI) strategies to first estimate the repayment ability of rejected applicants and then incorporate these estimates as additional supervision signals to refine their credit scoring models. However, existing RI methods often fail to effectively account for the local characteristics and default patterns inherent in diverse applicant groups. Our study introduces a novel ‘divide and conquer’ graph-based RI framework, named SAIL, that effectively captures inter-individual differences relevant to credit scoring. This framework comprises 1) Spectral clustering for the categorisation of accepted and rejected applicants, 2) isolation forests for identifying Anomalies among rejected cases, 3) Iterative relabelling mechanisms incorporating the Label spreading and self-learning algorithms for relabelling rejected samples, and 4) binary classification for the relabelled dataset. Using a unique loan dataset, we find that our proposed framework significantly enhances the efficacy of credit scoring models and outperforms other popular RI techniques in predicting defaults. Furthermore, our ablation studies confirm the crucial role of each component of our framework in enhancing prediction accuracy. Our work provides a comprehensive and adaptive RI framework for financial institutions to improve their loan decision-making and risk management.
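
To make the relabelling step (component 3) more concrete, the following is a minimal, standard-library Python sketch of the textbook label spreading iteration followed by a confidence-thresholded pseudo-labelling pass. It is not the authors' implementation: the function names, the RBF affinity, the toy two-feature applicants, and the values of `alpha`, `gamma`, and `threshold` are all assumptions chosen for illustration, and the clustering, anomaly detection, and final classification steps are omitted.

```python
import math


def rbf_affinity(points, gamma=1.0):
    """Symmetric RBF affinity matrix with a zero diagonal (assumed kernel)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-gamma * d2)
    return w


def label_spreading(points, labels, alpha=0.8, gamma=1.0, n_iter=200):
    """Spread labels over the affinity graph.

    labels: 0 (repaid), 1 (default), or None (rejected, unlabelled).
    Returns one [score_repaid, score_default] row per point, summing to 1.
    """
    n = len(points)
    w = rbf_affinity(points, gamma)
    deg = [sum(row) for row in w]
    # Symmetrically normalised affinity S = D^{-1/2} W D^{-1/2}.
    s = [
        [w[i][j] / math.sqrt(deg[i] * deg[j]) if w[i][j] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # One-hot initial labels; unlabelled rows start as all zeros.
    y = [[1.0 if lab == k else 0.0 for k in (0, 1)] for lab in labels]
    f = [row[:] for row in y]
    for _ in range(n_iter):
        # F <- alpha * S * F + (1 - alpha) * Y
        f = [
            [alpha * sum(s[i][j] * f[j][k] for j in range(n))
             + (1 - alpha) * y[i][k] for k in (0, 1)]
            for i in range(n)
        ]
    scores = []
    for row in f:
        total = sum(row)
        scores.append([v / total for v in row] if total > 0 else [0.5, 0.5])
    return scores


def relabel_confident(labels, scores, threshold=0.9):
    """Adopt a pseudo-label only where the spread score is confident."""
    new_labels = list(labels)
    for i, (lab, row) in enumerate(zip(labels, scores)):
        if lab is None and max(row) >= threshold:
            new_labels[i] = row.index(max(row))
    return new_labels


# Hypothetical toy data: four accepted applicants with known outcomes,
# followed by two rejected applicants whose outcomes are unknown.
points = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9),
          (0.1, 0.2), (2.9, 3.2)]
labels = [0, 0, 1, 1, None, None]

scores = label_spreading(points, labels)
relabelled = relabel_confident(labels, scores)
print(relabelled)
```

In an iterative scheme of the kind the abstract describes, rejected samples that remain `None` after one pass would presumably be revisited in later rounds once more pseudo-labels are available; how the paper schedules those rounds is not stated in the abstract.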
Original language: English
Article number: 114106
Pages (from-to): 1-30
Number of pages: 30
Journal: Annals of Operations Research
Early online date: 10 May 2025
Publication status: E-pub ahead of print - 10 May 2025

Keywords / Materials (for Non-textual outputs)

  • reject inference
  • credit scoring
  • graph theory
  • semi-supervised machine learning
