Friday, August 1, 2025

PD Model Development in Python:

Python code and data link:

https://drive.google.com/drive/folders/1kC621QtmjG3C_2ok-I53fPYqDRf9r8RK?usp=sharing

A. Data Preparation

  • Import & Clean Data: Read factor data with ratings and defaults; handle missing values.

  • Target Variable Setup: Calculate yearly default rates and define the target (Default flag).

  • Data Split: Train-test split for robust model validation.
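The data preparation steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the code from the linked folder: the column names (`year`, `leverage`, `default_flag`), the median imputation, and the 70/30 stratified split are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data: one row per obligor-year with a single illustrative factor.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "year": rng.integers(2018, 2023, size=n),        # years 2018..2022
    "leverage": rng.normal(0.5, 0.2, size=n),        # assumed factor name
    "default_flag": rng.binomial(1, 0.05, size=n),   # 1 = default, 0 = non-default
})

# Simulate missing values, then impute with the median (one of several options).
df.loc[rng.choice(n, size=50, replace=False), "leverage"] = np.nan
df["leverage"] = df["leverage"].fillna(df["leverage"].median())

# Yearly default rate = share of defaulted rows within each year.
yearly_dr = df.groupby("year")["default_flag"].mean()

# Stratify on the target so train and test keep a similar default rate.
train, test = train_test_split(
    df, test_size=0.3, stratify=df["default_flag"], random_state=42
)

print(yearly_dr.round(3))
print(len(train), len(test))
```

Stratifying on the default flag matters because defaults are rare; an unstratified split can leave the test set with too few defaults to validate against.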

B. Statistical Screening of Independent Variables

  • Stability Check (PSI): Ensure variable stability over time.

  • Discriminatory Power (KS-Stat): Select variables that distinguish well between default and non-default.

  • Predictive Power (IV & WoE): Retain only those with high predictive value.

  • Stationarity (ADF Test): Remove non-stationary series.

  • Multicollinearity (VIF): Drop variables with a high variance inflation factor, i.e. those largely explained by the other predictors.

  • Partial Correlation: Remove redundant/confounding variables.
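Three of the screening statistics above (PSI, WoE/IV, and KS) can be sketched with NumPy alone. The helper names `psi`, `woe_iv`, and `ks_stat`, the decile binning, and the synthetic data are assumptions for illustration. WoE is computed here as ln(%good / %bad), which is one common convention; some texts use the inverse ratio.

```python
import numpy as np

def _decile_index(x, reference, bins):
    # Interior quantile edges of the reference sample; indices run 0..bins-1.
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1)[1:-1])
    return np.searchsorted(edges, x, side="right")

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` against the `expected` baseline."""
    e = np.bincount(_decile_index(expected, expected, bins), minlength=bins) / len(expected)
    a = np.bincount(_decile_index(actual, expected, bins), minlength=bins) / len(actual)
    e, a = np.clip(e, eps, None), np.clip(a, eps, None)   # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

def woe_iv(x, y, bins=10, eps=1e-6):
    """Weight of Evidence per bin and Information Value for one variable."""
    idx = _decile_index(x, x, bins)
    bad = np.bincount(idx, weights=y, minlength=bins)
    good = np.bincount(idx, weights=1 - y, minlength=bins)
    p_bad = np.clip(bad / bad.sum(), eps, None)
    p_good = np.clip(good / good.sum(), eps, None)
    woe = np.log(p_good / p_bad)
    return woe, float(np.sum((p_good - p_bad) * woe))

def ks_stat(score, y):
    """Max gap between cumulative bad and good distributions (assumes no tied scores)."""
    y_sorted = y[np.argsort(score)]
    cum_bad = np.cumsum(y_sorted) / y_sorted.sum()
    cum_good = np.cumsum(1 - y_sorted) / (1 - y_sorted).sum()
    return float(np.max(np.abs(cum_bad - cum_good)))

# Synthetic example: `x` drives defaults, `noise` does not.
rng = np.random.default_rng(42)
n = 5000
x, noise = rng.normal(size=n), rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.5 + 1.0 * x))))

psi_stable = psi(x[:2500], x[2500:])          # same distribution
psi_shift = psi(x[:2500], x[2500:] + 1.0)     # mean shifted by one std dev
_, iv_x = woe_iv(x, y)
_, iv_noise = woe_iv(noise, y)
ks_x, ks_noise = ks_stat(x, y), ks_stat(noise, y)
print(psi_stable, psi_shift, iv_x, iv_noise, ks_x, ks_noise)
```

Cut-offs for "stable" or "predictive" vary by institution. A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a material shift, but the thresholds used in the linked code may differ.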

C. Logistic Regression & Model Construction

  • Stepwise Logistic Regression: Build the core model by stepwise selection, retaining only variables with p-values below 0.01.

  • PD Estimation: Generate scores and PD predictions with monotonicity checks.

  • Diagnostics: Autocorrelation (Durbin-Watson) and Heteroskedasticity (Breusch-Pagan) tests ensure statistical robustness.
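A rough sketch of p-value-driven selection, PD estimation, and the Durbin-Watson statistic is shown below. To keep it self-contained, it fits the logit with a hand-written Newton-Raphson routine (`fit_logit`, a hypothetical helper) rather than a statistics package, and it performs forward selection only, which is a simplification of full stepwise selection. The feature names `x0`..`x3` and the data are synthetic, and the Breusch-Pagan test is not covered here.

```python
import numpy as np
from scipy import stats

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logit(X, y, n_iter=50, tol=1e-8):
    """Newton-Raphson logistic fit. X must include an intercept column.
    Returns (coefficients, two-sided Wald p-values)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        hessian = X.T @ (X * (p * (1 - p))[:, None])      # X' W X
        step = np.linalg.solve(hessian, X.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = sigmoid(X @ beta)
    cov = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    z = beta / np.sqrt(np.diag(cov))
    return beta, 2.0 * stats.norm.sf(np.abs(z))

def forward_stepwise(X, y, names, alpha=0.01):
    """Repeatedly add the candidate with the smallest p-value below alpha."""
    ones = np.ones((X.shape[0], 1))
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        best_j, best_p = None, alpha
        for j in remaining:
            _, pvals = fit_logit(np.hstack([ones, X[:, selected + [j]]]), y)
            if pvals[-1] < best_p:                         # p-value of the candidate
                best_j, best_p = j, pvals[-1]
        if best_j is None:
            break
        selected.append(best_j)
        remaining.remove(best_j)
    return [names[j] for j in selected]

def durbin_watson(resid):
    """Values near 2 suggest no first-order autocorrelation in the residuals."""
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))

# Synthetic data: only x0 and x1 affect the default probability.
rng = np.random.default_rng(0)
n, names = 3000, ["x0", "x1", "x2", "x3"]
X = rng.normal(size=(n, 4))
y = rng.binomial(1, sigmoid(-2.0 + 1.2 * X[:, 0] - 0.9 * X[:, 1]))

selected = forward_stepwise(X, y, names)
X_final = np.hstack([np.ones((n, 1)), X[:, [names.index(s) for s in selected]]])
beta, pvals = fit_logit(X_final, y)
pd_hat = sigmoid(X_final @ beta)                           # estimated PDs
dw = durbin_watson(y - pd_hat)
print(selected, beta.round(3), round(dw, 3))
```

In practice a library such as statsmodels would normally supply the fit, the p-values, and both diagnostic tests; the manual fit above only makes the selection logic explicit.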

D. Model Testing

  • Rating Assignment: Cluster PD outputs into buckets using K-means for interpretability.

  • Validation Tests:

    • Jeffreys Test and KS-Stat – Compare predicted PDs with observed default rates in each rating bucket (Jeffreys) and the score distributions of defaulters and non-defaulters (KS).
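The rating assignment and Jeffreys test can be sketched as below, assuming five rating grades, clustering on raw PD values, and simulated PDs and outcomes (all choices made for this example). The Jeffreys p-value uses the Beta(d + 0.5, n - d + 0.5) posterior of the default rate in each grade; a small value suggests the grade's predicted PD understates the observed defaults.

```python
import numpy as np
from scipy.stats import beta
from sklearn.cluster import KMeans

# Hypothetical model output: PDs and simulated default outcomes
# (calibrated by construction, since defaults are drawn from the PDs).
rng = np.random.default_rng(1)
pd_hat = rng.beta(1.0, 15.0, size=3000)
defaults = rng.binomial(1, pd_hat)

# Cluster the one-dimensional PDs into 5 buckets.
n_grades = 5
km = KMeans(n_clusters=n_grades, n_init=10, random_state=0)
km.fit(pd_hat.reshape(-1, 1))

# K-means labels are arbitrary; relabel so grade 1 has the lowest centre.
order = np.argsort(km.cluster_centers_.ravel())
rank = np.empty_like(order)
rank[order] = np.arange(1, n_grades + 1)
grade = rank[km.labels_]

results = []
for g in range(1, n_grades + 1):
    mask = grade == g
    n_obs, n_def = int(mask.sum()), int(defaults[mask].sum())
    pd_grade = float(pd_hat[mask].mean())            # predicted PD for the grade
    # Jeffreys test: posterior probability that the true PD is <= predicted PD.
    p_value = float(beta.cdf(pd_grade, n_def + 0.5, n_obs - n_def + 0.5))
    results.append((g, n_obs, n_def, pd_grade, p_value))

for g, n_obs, n_def, pd_grade, p_value in results:
    print(f"grade {g}: n={n_obs}, defaults={n_def}, PD={pd_grade:.4f}, p={p_value:.3f}")
```

Because K-means on a single variable produces contiguous intervals, the average PD rises from grade 1 to grade 5, which gives a simple monotonicity check on the buckets.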

E. Final Model Validation

  • Accuracy Checks: AUC-ROC, F1, Recall, Precision, Log Loss across Train/Test sets.

  • Cross-Validation: K-Fold CV for model generalization.

  • Regularization Checks:

    • Lasso Regression – Identifies non-contributing features.

    • Ridge Regression – Tests coefficient stability.

  • Model Comparison: Combine and review coefficients from Logit, Lasso, and Ridge models.
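The validation steps above might look like the following with scikit-learn. The data are synthetic, the penalty strength `C=0.01` is an arbitrary choice for illustration, and the "logit" model uses a very weak penalty (`C=1e6`) as a stand-in for an unpenalized fit. Only AUC and log loss are shown; F1, recall, and precision follow the same pattern once a classification threshold is chosen.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data: columns 0 and 1 are informative, columns 2 and 3 are noise.
rng = np.random.default_rng(7)
X = rng.normal(size=(3000, 4))
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + 1.2 * X[:, 0] - 0.9 * X[:, 1]))))
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    "logit": LogisticRegression(C=1e6, max_iter=1000),                 # near-unpenalized
    "lasso": LogisticRegression(penalty="l1", C=0.01, solver="liblinear"),
    "ridge": LogisticRegression(C=0.01, max_iter=1000),                # default L2 penalty
}

# Fit each model and collect coefficients side by side.
coefs = {name: m.fit(X_tr, y_tr).coef_.ravel() for name, m in models.items()}

# Accuracy checks on train and test for the core logit model.
metrics = {}
for label, (Xs, ys) in {"train": (X_tr, y_tr), "test": (X_te, y_te)}.items():
    proba = models["logit"].predict_proba(Xs)[:, 1]
    metrics[label] = {"auc": roc_auc_score(ys, proba), "log_loss": log_loss(ys, proba)}

# 5-fold cross-validated AUC on the training set.
cv_auc = cross_val_score(
    LogisticRegression(C=1e6, max_iter=1000), X_tr, y_tr, cv=5, scoring="roc_auc"
)

for name, c in coefs.items():
    print(name, c.round(3))
print(metrics, cv_auc.mean().round(3))
```

In this setup the L1 penalty sets the noise coefficients to zero while the L2 penalty only shrinks all coefficients toward zero, which is why Lasso is used to flag non-contributing features and Ridge to gauge coefficient stability.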

This pipeline balances statistical rigor with regulatory expectations, providing a ready-to-explain model for auditors, regulators, and internal committees. It’s a great base for both Basel and IFRS9/CECL-aligned PD model builds.


R3 chase - Pursuit
