Enhancing two-stage modelling methodology for loss given default with support vector machines

Xiao Yao, Jonathan Crook, Galina Andreeva

Research output: Contribution to journal › Article › peer-review


We propose incorporating the least squares support vector machine technique into a two-stage modelling framework to predict recovery rates on credit cards from a UK retail bank. The two-stage model requires a classification step that discriminates the cases with recovery rate equal to 0 or 1 and a regression step that estimates recovery rates for the cases with recovery rates in (0, 1). The two-stage model with a support vector machine classifier is found to be advantageous on an out-of-time sample compared with other methods, suggesting that a support vector machine is preferred to logistic regression as the classification technique. We further examine predictive performance on a subset where the recovery rate is bounded in (0, 1), and the empirical evidence demonstrates that support vector regression yields a significant but modest improvement over other statistical regression models. When modelling on the whole sample, support vector regression does not present any advantage over other techniques within the two-stage modelling framework. We suggest that the choice of regression model is less influential in predicting recovery rates than the choice of classification method in the first step of two-stage models.
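
The two-stage structure described above can be illustrated with a minimal Python sketch. The abstract does not state how the outputs of the two stages are combined, so the probability-weighted expectation used here is an assumption (one common convention), not necessarily the authors' method. The function names `partition_by_recovery` and `combine_two_stage` are hypothetical, and the stage-one probabilities and stage-two estimate are taken as given rather than produced by a fitted support vector machine.

```python
from typing import Dict, List, Sequence


def partition_by_recovery(rates: Sequence[float]) -> Dict[str, List[int]]:
    """Split case indices into the three groups the two-stage model separates.

    Cases with recovery rate exactly 0 or exactly 1 are handled by the
    classification step; cases strictly inside (0, 1) go to the regression step.
    """
    groups: Dict[str, List[int]] = {"zero": [], "one": [], "interior": []}
    for i, r in enumerate(rates):
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"recovery rate out of [0, 1] at index {i}: {r}")
        if r == 0.0:
            groups["zero"].append(i)
        elif r == 1.0:
            groups["one"].append(i)
        else:
            groups["interior"].append(i)
    return groups


def combine_two_stage(p_zero: float, p_one: float, interior_estimate: float) -> float:
    """Combine stage outputs into one predicted recovery rate.

    ASSUMPTION: the prediction is the expectation
        0 * P(RR = 0) + 1 * P(RR = 1) + P(0 < RR < 1) * interior_estimate,
    where p_zero and p_one come from the stage-one classifier and
    interior_estimate comes from the stage-two regression.
    """
    if min(p_zero, p_one) < 0.0 or p_zero + p_one > 1.0 + 1e-12:
        raise ValueError("p_zero and p_one must be probabilities summing to at most 1")
    if not 0.0 < interior_estimate < 1.0:
        raise ValueError("interior_estimate must lie strictly inside (0, 1)")
    p_interior = max(0.0, 1.0 - p_zero - p_one)
    return p_one + p_interior * interior_estimate


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    print(partition_by_recovery([0.0, 0.35, 1.0, 0.8, 0.0]))
    print(combine_two_stage(p_zero=0.5, p_one=0.2, interior_estimate=0.4))
```

Under this convention the classifier's probabilities weight every prediction, while the regression estimate is scaled by the interior probability only, which is one way to read the finding that the choice of classifier matters more than the choice of regression model.
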
Original language: English
Pages (from-to): 679-689
Journal: European Journal of Operational Research
Issue number: 2
Early online date: 17 May 2017
Publication status: Published - 1 Dec 2017


  • risk analysis
  • loss given default modelling
  • two-stage model
  • support vector machine

