For a detailed description of the theory and an Excel workbook example, refer to the attached link:
https://drive.google.com/drive/folders/1vAcH8ge2KEFBPxxJt06cPgIr5XVKhyOr?usp=sharing
Risk Return Randomness Retrospective Reassessment
Huber M-estimation is a robust regression technique used to limit the influence of outliers on model parameters. It is used, for example, to fit CCF (Credit Conversion Factor) models for EAD (Exposure at Default).
The kernel change point detection method detects changes in the distribution of the data, not just changes in the mean or variance.
The kernel method is used to map the data into a high-dimensional feature space, where changes are more easily detectable. This approach uses the Maximum Mean Discrepancy (MMD) to measure the difference between the distributions of segments of the time series.
Steps:
1- Data and kernel function: consider a univariate time series {x1, x2, …, xn}. Start by choosing a kernel function k(x, y) to measure similarity between points.
2- Construction of the kernel matrix: the kernel matrix K is constructed, where each element Kij = k(xi, xj).
For the linear kernel, this is Kij = xi · xj (i.e. K = XXᵀ).
3- Maximum Mean Discrepancy (MMD):
MMD measures how different two groups of data are by comparing the average pairwise similarity within each group and between the groups; in other words, it compares two distributions to see whether they differ.
MMD is used to measure the difference between the distributions before and after a candidate change point t.
For each candidate change point t:
MMD²(t) = (1/t²) Σ_{i,j=1..t} Kij + (1/(n−t)²) Σ_{i,j=t+1..n} Kij − (2/(t(n−t))) Σ_{i=1..t, j=t+1..n} Kij
In the above equation:
- The first term measures the similarity within the first segment.
- The second term measures the similarity within the second segment.
- The third term measures the similarity between the two segments.
4- To detect the change point, the MMD values are computed for all possible change points t, and the one that maximizes the MMD value is chosen:
t* = argmax over t of MMD²(t)
Excel Example :https://docs.google.com/spreadsheets/d/1IdC-ss1VjaL2QVQdABNwuIPfRphDtlZi/edit?usp=sharing&ouid=115594792889982302405&rtpof=true&sd=true
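The four steps above can be sketched in Python. This is a minimal illustration assuming a linear kernel and a univariate series; the toy data with a mean shift at index 50 is made up for the example:

```python
import numpy as np

def linear_kernel_matrix(x):
    """Gram matrix K[i, j] = x_i * x_j for a univariate series (linear kernel)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    return x @ x.T

def mmd_squared(K, t):
    """Biased MMD^2 between segments [0, t) and [t, n) using the kernel matrix K."""
    within_1 = K[:t, :t].mean()    # first term: similarity within segment 1
    within_2 = K[t:, t:].mean()    # second term: similarity within segment 2
    between = K[:t, t:].mean()     # third term: similarity between the segments
    return within_1 + within_2 - 2.0 * between

def detect_change_point(x, min_seg=2):
    """Return the candidate t that maximizes MMD^2 over all possible split points."""
    K = linear_kernel_matrix(x)
    candidates = range(min_seg, len(x) - min_seg + 1)
    scores = [mmd_squared(K, t) for t in candidates]
    return list(candidates)[int(np.argmax(scores))]

# toy series with a mean shift from 0 to 5 at index 50
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(5.0, 1.0, 50)])
t_hat = detect_change_point(x)
```

With the linear kernel, MMD²(t) reduces to the squared difference of segment means, so the maximizer lands at (or very near) the true mean-shift point.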
The BCR Approach (Modeling Low Default Portfolio):
Benjamin, Cathcart and Ryan (BCR) proposed adjustments to the Pluto & Tasche (also called confidence-based) approach that is widely applied by banks.
While Pluto & Tasche propose a methodology for generating grade-level PDs by the most prudent estimation, the BCR approach concentrates on estimating a portfolio-wide PD that is then applied to grade-level estimates, resulting in capital requirements.
- Independent case: the conservative portfolio estimate in the BCR setting is given by the solution of (1).
- Dependent case: assume there is a single normally distributed risk factor Y to which all assets are correlated, and that each asset has correlation √ρ with the risk factor Y.
For a given realization y of the common factor Y, the conditional probability of default is given by (2). Estimating the probability of default is equivalent to finding p such that (2) holds.
- Multi-period case: the conditional probability of default given a realization of the systematic factor over t years, as in the Vasicek model, is given by (3).
Estimation Method:
Steps of Execution:
1- Draw N samples from N(λ,Σ) where λ is a zero vector with the same length as the time period and Σ is the correlation matrix as in the Vasicek model.
2- Equation (4)
3- Find p such that f(p) is close to zero using the following bisection iteration:
- Set the number of iterations: n = ⌈log2((p_high − p_low)/δ)⌉, where [p_low, p_high] is the interval p is believed to lie in and δ is the accepted error.
- For each of the n iterations, compute the midpoint p_mid of the interval and check whether f(p_mid) > 0 or f(p_mid) < 0. In the first case, set the lower bound equal to the midpoint; in the second case, set the upper bound equal to the midpoint.
- When the n iterations are done, set the estimated probability of default to the final midpoint.
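The bisection iteration above can be sketched in Python. Since equation (4) is not reproduced here, the sketch plugs in the independent-case condition (a binomial confidence bound in the spirit of the most prudent estimation) as an illustrative f; the portfolio size n = 100, d = 0 observed defaults, and confidence level γ = 0.9 are assumed values for the example:

```python
import math

def bisect_pd(f, p_low=1e-6, p_high=0.5, delta=1e-6):
    """Bisection for p with f(p) ~ 0, following the iteration count in the text:
    n = log2((p_high - p_low) / delta)."""
    n_iter = math.ceil(math.log2((p_high - p_low) / delta))
    for _ in range(n_iter):
        p_mid = 0.5 * (p_low + p_high)
        if f(p_mid) > 0:
            p_low = p_mid    # f(p_mid) > 0: raise the lower bound
        else:
            p_high = p_mid   # f(p_mid) < 0: lower the upper bound
    return 0.5 * (p_low + p_high)

def f_independent(p, n=100, d=0, gamma=0.9):
    """Illustrative independent-case condition:
    P(defaults <= d | n, p) - (1 - gamma), decreasing in p."""
    cdf = sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(d + 1))
    return cdf - (1.0 - gamma)

pd_hat = bisect_pd(f_independent)  # conservative portfolio PD at 90% confidence
```

For n = 100 obligors and zero observed defaults, this solves (1 − p)^100 = 0.1, so the bisection should return roughly p ≈ 0.0228.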
Excel (macro-enabled workbook) and VBA code:
https://drive.google.com/drive/folders/18tIKLLg8MfJ2MYDjLPAWHspEbApVIpGz?usp=sharing
PCA: Eigenvalues & Eigenvectors:
Eigenvalue: a scalar associated with a given linear transformation of a vector space, having the property that there is some nonzero vector which, when multiplied by the scalar, equals the vector obtained by applying the transformation to that vector; in particular, a root of the characteristic equation of a matrix.
Eigenvector (or characteristic vector) of a linear transformation: a non-zero vector whose direction does not change when that linear transformation is applied to it.
Let A be an n×n matrix. The number x is an eigenvalue of A if there exists a non-zero vector v such that
Av = xv
In this case, vector v is called an eigenvector of A corresponding to the eigenvalue x.
Rewrite the condition Av = xv as
(A − xI)v = 0 (E.1)
where I is the n×n identity matrix.
For a non-zero vector v to satisfy this equation, A − xI must not be invertible; if it were invertible, the only solution would be v = 0.
The characteristic polynomial:
p(x) = det(A − xI) (E.2)
The roots of this equation give us the eigenvalues x.
Substituting each eigenvalue x into E.1 gives the corresponding eigenvectors.
- Eigenvalues represent the variance of the data along the eigenvector directions, whereas the variance components of the covariance matrix represent the spread along the axes.
- All the eigenvectors of a symmetric matrix (such as a covariance matrix) are perpendicular, i.e. at right angles to each other.
- The eigenvector of the covariance matrix with the highest eigenvalue is the principal component of the data set.
- The largest eigenvector of the covariance matrix always points into the direction of the largest variance of the data, and the magnitude of this vector equals the corresponding eigenvalue.
- The second largest eigenvector is always orthogonal to the largest eigenvector, and points into the direction of the second largest spread of the data.
- If the covariance matrix of our data is a diagonal matrix, so that the covariances are zero, then the variances are equal to the eigenvalues.
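The properties listed above can be checked numerically with NumPy's eigendecomposition of a covariance matrix (the 2-D toy data below is made up for the illustration):

```python
import numpy as np

# 2-D toy data where the second variable is correlated with the first
rng = np.random.default_rng(1)
x1 = rng.normal(0.0, 2.0, 500)
x2 = 0.8 * x1 + rng.normal(0.0, 0.5, 500)
X = np.column_stack([x1, x2])

cov = np.cov(X, rowvar=False)            # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices, ascending order

pc1 = eigvecs[:, np.argmax(eigvals)]     # principal component: largest-eigenvalue eigenvector

# eigenvectors of a symmetric (covariance) matrix are orthogonal
assert np.isclose(eigvecs[:, 0] @ eigvecs[:, 1], 0.0)
# the eigenvalues carry the variance: they sum to the total variance (trace)
assert np.isclose(eigvals.sum(), np.trace(cov))
```

Here `pc1` points along the direction of largest spread (roughly the x2 ≈ 0.8·x1 line), and its eigenvalue equals the variance of the data along that direction.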
Refer for Details:
https://drive.google.com/drive/folders/19IEg3V008AkBwT5UGrsXCL8N6It7BGc_?usp=sharing
Python Generic Code (Probability of Default Model Development, Validation and Testing):
In the link below:
1- PD_Factors.csv: CSV with factors and required data
2- PD_Model_Generic_Python (Doc and PDF): Python generic code (as in steps)
3- PD_Estimate_Steps_Python: steps in a Word document
The error of a model decomposes into noise, bias, and variance: expected squared error = Bias² + Variance + Irreducible noise.
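This decomposition can be verified numerically. The sketch below repeatedly fits a polynomial to noisy draws of an assumed true function (sin(2πx), noise σ = 0.3, and the degree-3 fit are all made-up choices for the illustration) and compares bias² + variance + noise against the mean squared error on fresh data:

```python
import numpy as np

rng = np.random.default_rng(3)
true_f = lambda x: np.sin(2 * np.pi * x)   # assumed true function
sigma = 0.3                                 # assumed noise level
x = np.linspace(0, 1, 30)

# fit a degree-3 polynomial to many independent noisy training draws
preds = []
for _ in range(2000):
    y = true_f(x) + rng.normal(0, sigma, x.size)
    coeffs = np.polyfit(x, y, deg=3)
    preds.append(np.polyval(coeffs, x))
preds = np.array(preds)

bias_sq = ((preds.mean(axis=0) - true_f(x)) ** 2).mean()  # squared bias of the average fit
variance = preds.var(axis=0).mean()                       # variance across refits
noise = sigma ** 2                                        # irreducible noise
total = bias_sq + variance + noise

# empirical squared error against fresh noisy targets
errors = []
for p in preds[:500]:
    y_new = true_f(x) + rng.normal(0, sigma, x.size)
    errors.append(((p - y_new) ** 2).mean())
mse = float(np.mean(errors))
```

The empirical MSE should match bias² + variance + noise up to sampling error.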
Huber M-estimation uses a loss function (1) that transitions from squared error to absolute error at a threshold δ:
L_δ(r) = r²/2 if |r| ≤ δ; L_δ(r) = δ(|r| − δ/2) otherwise (1)
δ is chosen based on the expected distribution of the residuals, e.g. δ = m × stdev of the errors:
- If |ri| ≤ m × stdev, the weight is wi = 1
- If |ri| > m × stdev, the weight is wi = m × stdev / |ri|
Steps:
- Specify the model Y (CCF) = β0 + β1X + ϵ and fit it by Ordinary Least Squares (OLS):
β̂ = (XᵀX)⁻¹XᵀY
- Compute the residuals ri = yi − ŷi.
- Use the Huber loss function (1) to calculate the weights wi.
- Using the weights, refit the regression (weighted least squares):
β̂ = (XᵀWX)⁻¹XᵀWY
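The steps above can be sketched as an iteratively reweighted least squares loop in Python. This is a minimal sketch: the tuning constant m = 1.345, the iteration count, and the toy CCF-style data are assumptions made for the illustration:

```python
import numpy as np

def huber_regression(x, y, m=1.345, n_iter=20):
    """Iteratively reweighted least squares with Huber weights,
    using delta = m * stdev(residuals) as the threshold."""
    X = np.column_stack([np.ones(len(x)), np.asarray(x, float)])  # intercept + slope
    y = np.asarray(y, float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]                   # step 1: OLS fit
    for _ in range(n_iter):
        r = y - X @ beta                                          # step 2: residuals
        delta = m * r.std()
        w = np.where(np.abs(r) <= delta, 1.0, delta / np.abs(r))  # step 3: Huber weights
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)          # step 4: weighted refit
    return beta

# toy CCF-style data with one gross outlier
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, 50)
y = 0.2 + 0.6 * x + rng.normal(0.0, 0.02, 50)
y[0] += 5.0                                                       # contaminate one point
beta0_hat, beta1_hat = huber_regression(x, y)
```

Because the outlier's weight shrinks in proportion to its residual, the refit recovers coefficients close to the true β0 = 0.2 and β1 = 0.6, while a plain OLS fit would be pulled toward the contaminated point.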