Overfitting and Underfitting
- Overfitting refers to cases where a model performs well on the training set but shows poor performance on the validation set.
- Underfitting occurs when there isn't a big difference between training and validation set performance, but both are poor (both cases are illustrated in the sketch below).
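To make the two cases concrete, here is a minimal sketch using scikit-learn on synthetic data; the dataset, model, and polynomial degrees are illustrative assumptions of mine, not the original post's setup.

```python
# Illustrative sketch: comparing train vs. validation scores for a too-simple
# model (underfitting) and a too-complex model (overfitting) on synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 15):  # degree 1 tends to underfit, degree 15 tends to overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree={degree:2d}  "
          f"train R^2={model.score(X_train, y_train):.2f}  "
          f"val R^2={model.score(X_val, y_val):.2f}")
```

The underfit model scores poorly on both sets, while the overfit one scores much higher on the training set than on the validation set.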
Learning Curves
- Learning curve: Graph showing the model's learning progress
- The learning curve at left is a typical graph for an underfitted model.
- While the performance gap between training and validation sets gradually narrows, the overall performance remains low.
- This is why an underfitted model is also said to have high bias.
- Underfitting occurs when a model is not complex enough to capture all the patterns present in the training data.
- One of the most popular ways to address underfitting is to use a more complex model or to relax weight regularization.
- The learning curve at top right is a typical graph for an overfitted model.
- There is a large gap between the performance measured in the training set and validation set.
- That is why we also say that an overfitted model has high variance.
- One of the main causes of overfitting is when the training set doesn't include samples with sufficiently diverse patterns.
- Because the training set lacks samples with diverse patterns, the model fails to generalize properly to the validation set.
- In such cases, you can improve validation set performance by collecting more training samples.
- However, there may be cases where collecting more training samples is not feasible due to practical limitations.
- In these situations, we can restrict the weights to prevent the model from overfitting to the training set.
- This is also referred to as reducing the model's complexity.
- The learning curve at bottom right shows a balance between overfitting and underfitting (a sketch for plotting such curves follows this list).
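As a hedged illustration of how such learning curves can be drawn (assuming scikit-learn and matplotlib; the estimator and synthetic dataset are placeholders rather than the post's own setup):

```python
# Illustrative sketch: plotting a learning curve to check whether the
# train/validation gap closes as more training samples are used.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

train_sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 8), cv=5, scoring="accuracy",
)

plt.plot(train_sizes, train_scores.mean(axis=1), "o-", label="training accuracy")
plt.plot(train_sizes, val_scores.mean(axis=1), "o-", label="validation accuracy")
plt.xlabel("number of training samples")
plt.ylabel("accuracy")
plt.legend()
plt.show()
```

A small gap with low accuracy on both curves suggests underfitting (high bias); a large, persistent gap suggests overfitting (high variance).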
Epoch-Loss graph for analyzing overfitting & underfitting
We can also use the loss over epochs to analyze overfitting and underfitting, as sketched after the list below.
- The right graph shows the loss for validation and training sets.
- While the training set loss decreases as epochs progress, the validation set loss actually increases after passing the optimal point marked by the red dotted line.
- This is because, if we keep training on the training set past the optimal point, the model fits the training samples ever more tightly, which marks the start of overfitting.
- Conversely, before the optimal point, the losses of both training and validation sets decrease while maintaining a similar gap.
- If training is stopped in this region, it results in an underfitted model.
- The left graph uses accuracy instead of loss on the vertical axis; the interpretation is the same as for the right graph, only the curves are flipped vertically.
- Sometimes, model complexity is used instead of epochs on the x-axis.
- Model complexity refers to the number of learnable weights in a model, and a model with higher complexity is created when the number of layers or units increases.
- While it might seem that making a model more complex would always be better, this isn't actually the case.
- For example, if a model is shaped to fit only the training set well, it will only perform well on the training set.
- This is precisely what happens in the case of overfitting.
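Below is a minimal sketch of monitoring the epoch-loss curves and stopping near the optimal point. It assumes a Keras/TensorFlow workflow with early stopping, which the post itself does not specify; the model and data are placeholders.

```python
# Illustrative sketch: plot training vs. validation loss per epoch and stop
# training once the validation loss stops improving (the "optimal point").
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop when validation loss has not improved for 10 epochs and keep the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)

history = model.fit(X, y, validation_split=0.2, epochs=200,
                    callbacks=[early_stop], verbose=0)

plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```

Restoring the best weights approximates stopping right at the optimal point described above, before the validation loss starts to rise.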
Bias-variance Tradeoff
- The relationship between an underfitted model (high bias) and an overfitted model (high variance) is called the bias-variance tradeoff.
- The term 'tradeoff' is used because gaining one requires sacrificing the other.
- In other words, the bias-variance tradeoff means that reducing bias (improving training set performance) increases variance (widening the performance gap with the validation set), and conversely, reducing variance (narrowing the performance gap with the validation set) increases bias (lowering training set performance).
- Therefore, we need to select an appropriate middle ground to prevent either variance or bias from becoming too large.
- This act is referred to as selecting an appropriate bias-variance tradeoff; the sketch below illustrates it by sweeping model complexity.
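A hedged sketch of this tradeoff, sweeping a complexity parameter with scikit-learn's validation_curve (the decision tree and its depth range are illustrative choices, not from the post):

```python
# Illustrative sketch: sweep model complexity (tree depth) and compare
# training vs. validation accuracy to visualize the bias-variance tradeoff.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

depths = np.arange(1, 16)
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    param_name="max_depth", param_range=depths, cv=5, scoring="accuracy",
)

plt.plot(depths, train_scores.mean(axis=1), "o-", label="training accuracy")
plt.plot(depths, val_scores.mean(axis=1), "o-", label="validation accuracy")
plt.xlabel("max_depth (model complexity)")
plt.ylabel("accuracy")
plt.legend()
plt.show()
```

Shallow trees keep both curves low (high bias); very deep trees push training accuracy up while validation accuracy plateaus or drops (high variance). The appropriate middle ground sits in between.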
Reference
Success is not an overnight phenomenon. It's the result of consistent effort and perseverance. - Max Holloway