Enhancing credit scoring with alternative data

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

Hundreds of millions of people in low-income economies do not have a credit or bank account because they have insufficient credit history for a credit score to be ascribed to them. In this paper, we evaluate the predictive accuracy of models using alternative data that may be used instead of credit history to predict the credit risk of a new account. Without alternative data, the type of data typically available is demographic data. We show that a model that contains email usage and psychometric variables, as well as demographic variables, can give greater predictive accuracy than a model that uses demographic data only, and that the predictive accuracy is sufficiently high for the demographic and email data to be used when conventional credit history data is unavailable. The same applies if only psychometric data is included together with demographic data. However, we show that different randomly selected training:test sample splits give a wide range of predictive accuracies. In the second part of the paper, using two datasets that include only email usage as a predictor, we compare the predictive performance of a wide range of machine learning and statistical classifiers. We find that some classifiers applied to these alternative predictors give sufficiently accurate predictions for these variables to be used when no other data is available.
Original language: English
Journal: Expert Systems with Applications
Early online date: 25 Jul 2020
Publication status: Published - Jan 2021

Keywords / Materials (for Non-textual outputs)

  • credit scoring
  • alternative data
  • banking risk

