Abstract / Description of output
This paper develops a method for modelling binary response data in a regression model with highly unbalanced class sizes. When the class sizes are highly unbalanced and the minority class represents a rare event, conventional regression analysis, i.e. logistic regression, can underestimate the probability of the rare event. To overcome this drawback, we introduce a flexible skewed link function (Calabrese and Osmetti, 2013), based on the quantile function of the generalized extreme value (GEV) distribution, in a generalized additive model (GAM). We call the proposed model the generalized extreme value additive (GEVA) regression model and suggest a modified version of the local scoring algorithm to estimate it. We apply the proposed model to a data set on Italian small and medium enterprises (SMEs) to estimate their default probability. The proposed model performs better than the logistic (linear or additive) model in terms of predictive accuracy.
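The abstract does not state the link function explicitly. As a minimal sketch, assuming the standard GEV distribution (location 0, scale 1) with shape parameter `tau`, the link would be the GEV quantile function and its inverse the GEV cumulative distribution function. The names `gev_link` and `gev_inverse_link` are hypothetical and chosen for illustration only; in the additive model the predictor `eta` would be a sum of smooth functions of the covariates.

```python
import math


def gev_inverse_link(eta: float, tau: float) -> float:
    """Map a predictor eta to a probability using the standard GEV cdf.

    Assumed form: pi = exp(-(1 + tau * eta) ** (-1 / tau)), for 1 + tau * eta > 0.
    """
    if tau == 0.0:
        # The limit tau -> 0 is the Gumbel cdf.
        return math.exp(-math.exp(-eta))
    z = 1.0 + tau * eta
    if z <= 0.0:
        # Outside the support: the cdf is 0 below the lower bound (tau > 0)
        # and 1 above the upper bound (tau < 0).
        return 0.0 if tau > 0.0 else 1.0
    return math.exp(-(z ** (-1.0 / tau)))


def gev_link(p: float, tau: float) -> float:
    """Map a probability p in (0, 1) to the predictor scale (GEV quantile function).

    Assumed form: eta = ((-ln p) ** (-tau) - 1) / tau.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if tau == 0.0:
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-tau) - 1.0) / tau
```

Unlike the logit link, this response curve is asymmetric: under the assumed form, a predictor of zero maps to a probability of exp(-1) rather than 0.5, and the shape parameter controls how quickly the curve approaches 0 and 1.
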
Original language | English |
---|---|
Journal | Journal of Forecasting |
Publication status | Published - 14 Feb 2015 |