One shortcoming of existing credit scoring models is the assumption that all borrowers are at risk, i.e. that every borrower must default in the long run. Under this assumption, the models overestimate the probability of default for “good” borrowers. To overcome this problem, a long-term survival mixture model is proposed as an extension of the ordinary survival model. It assumes that only a fraction of the borrowers are at risk (this fraction is forced to be 100% under the ordinary survival model) and that the remainder are “risk-free”. The term “risk-free borrower” does not necessarily refer to a debtor who never defaults, but to one who does not default for a sufficiently long period. In this setting, the probability of being at risk is modelled via logistic regression, and the time to default (for the at-risk group) is modelled within the survival analysis framework with a baseline hazard following a Weibull distribution. Using German credit data, the proposed survival mixture model is compared with the Cox proportional hazards model, the Weibull survival model and the logit model by means of the C-statistic, which is the estimated area under the ROC (Receiver Operating Characteristic) curve. The survival mixture model shows better, or at least comparable, performance. A simulation study is carried out to investigate the applicability of the mixture model in various situations, and the performance of the estimators is found to be generally acceptable. The survival mixture model not only estimates the regression coefficients in the hazard function but also predicts the probability of being at risk. It thus provides additional information about borrowers’ default risk, which helps lending institutions to manage credit risk better.
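
As a rough illustration of the mixture structure described above, the sketch below computes a population survival probability as (1 − p) + p · S_u(t), where p is a logistic at-risk probability and S_u is a Weibull proportional hazards survival function for the at-risk group. The function names, the argument names (`gamma`, `beta`, `shape`, `scale`) and the specific Weibull parametrisation are assumptions made for this example; the thesis may parametrise the model differently.

```python
import math


def at_risk_probability(x, gamma):
    """Logistic model for P(at risk | x); gamma[0] is the intercept (assumed layout)."""
    eta = gamma[0] + sum(g * xi for g, xi in zip(gamma[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))


def weibull_ph_survival(t, x, beta, shape, scale):
    """Survival of the at-risk group: S_u(t | x) = exp(-(t / scale)**shape * exp(x'beta)).

    This proportional hazards form with a Weibull baseline is an assumed parametrisation.
    """
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return math.exp(-((t / scale) ** shape) * math.exp(linear_predictor))


def mixture_survival(t, x, gamma, beta, shape, scale):
    """Population survival: risk-free borrowers never default, at-risk ones follow S_u."""
    p = at_risk_probability(x, gamma)
    return (1.0 - p) + p * weibull_ph_survival(t, x, beta, shape, scale)
```

Under this form the survival curve levels off at 1 − p rather than falling to zero, and it reduces to the ordinary Weibull survival model when p equals 1, which matches the abstract's remark that the at-risk fraction is forced to 100% in the ordinary model.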

*Contents*

1. Introduction

1.1. Background

1.2. Motivations

1.3. Study Objectives

1.4. Outline

2. Literature Review

2.1. Classification Models

2.1.1. Comparisons of Classification Models

2.1.2. Special Features

2.2. Survival Model

2.2.1. Descriptive Methods of Time-to-event

2.2.2. Survival Analysis with Covariates

2.2.3. Comparison with Logistic Regression

3. Survival Mixture Model

3.1. Difference with Ordinary Survival Model

3.2. Estimation of Risk-free Proportion

3.3. Statistical Test for Boundary Hypothesis

4. Estimation Methods

4.1. Newton-Raphson Iteration

4.2. Estimation of Asymptotic Variances

5. Consumer Credit Data

6. Empirical Study

6.1. Analysis without Covariates

6.2. Model Building

6.3. Models Comparison in terms of C Statistic

6.4. Application of Models

7. Simulation

8. Concluding Remarks

Appendix

Author: Mo, Shek Fung

Source: City University of Hong Kong
