The expected error of a model can be decomposed into noise, bias, and variance.
There is a tradeoff between a model's ability to minimize bias and variance.
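For squared-error loss, the decomposition is commonly written as below. This is a sketch under the usual textbook assumptions, which these notes do not state: the data follow y = f(x) + ε with zero-mean noise of variance σ², and the expectation is taken over training sets and noise. The symbols f, f-hat, and σ are notation introduced here for illustration.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

Note that the bias enters squared, and that the noise term does not depend on the model, so no choice of model can remove it.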
Overfitting: high variance. The model performs well on the training set but poorly on the test set. A low training error does not imply good expected performance on unseen data.
Underfitting: high bias. The model performs poorly on both the training data and the test data.
Noise: the model is neither overfitting nor underfitting, and the high mean squared error (MSE) is simply due to the amount of noise in the dataset.
Error due to bias: the actual value minus the average predicted value, where the average is taken over repeated realizations of the model.
Characteristics of a high-bias model:
1. High training error.
2. Validation error is similar in magnitude to the training error.
Error due to variance:
The variability of a model's prediction for a given data point. Imagine repeating the entire model-building process multiple times: the variance is how much the predictions for that point vary between the different realizations of the model.
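The idea of repeating the model-building process can be sketched as a small simulation. Everything here is an assumption made for illustration: the ground-truth function `true_f`, the noise level, the fixed input grid, and the choice of polynomial fits with NumPy. Each repeat draws a fresh noisy training set, refits the model, and records its prediction at one point `x0`.

```python
import numpy as np

def true_f(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return np.sin(np.pi * x)

def bias_variance_at_point(degree, x0=0.5, n_train=30, n_repeats=500,
                           noise_sd=0.3, seed=0):
    """Estimate bias and variance of a polynomial fit at the point x0."""
    rng = np.random.default_rng(seed)
    # Fixed inputs for every repeat (a simplifying assumption).
    x = np.linspace(-1.0, 1.0, n_train)
    preds = np.empty(n_repeats)
    for i in range(n_repeats):
        # New realization of the training set: same inputs, fresh noise.
        y = true_f(x) + rng.normal(0.0, noise_sd, n_train)
        # Rebuild the model from scratch and predict at x0.
        coeffs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coeffs, x0)
    # Bias: actual value minus the average prediction.
    bias = true_f(x0) - preds.mean()
    # Variance: spread of the predictions across realizations.
    variance = preds.var()
    return bias, variance

for degree in (1, 8):
    bias, variance = bias_variance_at_point(degree)
    print(f"degree {degree}: bias = {bias:+.3f}, variance = {variance:.4f}")
```

Under these assumptions the straight line (degree 1) should show the larger bias and the degree-8 polynomial the larger variance, which is the tradeoff in miniature.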
Characteristics of a high-variance model:
1. Low training error.
2. Very high validation error.
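The two symptom lists can be checked by comparing training and validation error for a simple and a flexible model. This is a minimal sketch, not a general diagnostic: the data generator `true_f`, the noise level, the sample sizes, and the polynomial degrees are all hypothetical choices.

```python
import numpy as np

def true_f(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return np.sin(np.pi * x)

def train_val_mse(degree, n_train=20, n_val=200, noise_sd=0.3, seed=0):
    """Return (training MSE, validation MSE) for a polynomial fit."""
    rng = np.random.default_rng(seed)
    # Small training set on a fixed grid, larger held-out validation set.
    x_train = np.linspace(-1.0, 1.0, n_train)
    y_train = true_f(x_train) + rng.normal(0.0, noise_sd, n_train)
    x_val = rng.uniform(-1.0, 1.0, n_val)
    y_val = true_f(x_val) + rng.normal(0.0, noise_sd, n_val)
    # Fit on the training data only, then score on both sets.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    val_mse = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    return train_mse, val_mse

for degree in (1, 12):
    train_mse, val_mse = train_val_mse(degree)
    print(f"degree {degree}: train MSE = {train_mse:.3f}, "
          f"validation MSE = {val_mse:.3f}")
```

A high-bias fit tends to show training and validation errors that are both high and close together, while a high-variance fit tends to show a low training error and a noticeably higher validation error.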