
[Kaggle Study] 4. Overfitting, Underfitting, Variance and Bias

dongsunseng 2024. 10. 29. 17:02

Overfitting and Underfitting

  • Overfitting refers to cases where a model performs well on the training set but shows poor performance on the validation set.
  • Underfitting occurs when there isn't a big difference between training and validation set performance, but both show poor performance (see the diagnostic sketch below).
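As a concrete illustration, here is a minimal sketch (not from the original post; the dataset, model, and numbers are illustrative assumptions) of telling the two cases apart by comparing training- and validation-set accuracy:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for a real train/validation split.
X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# A depth-1 tree tends to underfit; an unlimited-depth tree tends to overfit.
for depth in (1, None):
    model = DecisionTreeClassifier(max_depth=depth, random_state=42)
    model.fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train acc={model.score(X_train, y_train):.3f}, "
          f"val acc={model.score(X_val, y_val):.3f}")
    # Both scores low            -> underfitting (high bias)
    # Large train/val score gap  -> overfitting (high variance)
```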

Learning Curves

  • Learning curve: Graph showing the model's learning progress
  • The learning curve on the left is a typical graph of an underfitted model.
    • While the performance gap between training and validation sets gradually narrows, the overall performance remains low. 
    • This is why an underfitted model is also said to have high bias.
    • Underfitting occurs when a model is not complex enough to capture all the patterns present in the training data.
    • One of the most popular methods to solve underfitting is to use a more complex model or to relax weight regularization.
  • The learning curve at the top right is a typical graph of an overfitted model.
    • There is a large gap between the performance measured in the training set and validation set. 
    • That is why we also say that an overfitted model has high variance.
    • One of the main causes of overfitting is when the training set doesn't include samples with sufficiently diverse patterns.
    • Because the training set lacks samples with diverse patterns, the model fails to generalize properly to the validation set. 
    • In such cases, you can improve validation set performance by collecting more training samples. 
    • However, there may be cases where collecting more training samples is not feasible due to practical limitations.
    • In these situations, we can restrict the weights to prevent the model from overfitting to the training set.
    • This is also referred to as reducing the model's complexity.
  • The learning curve at the bottom right shows a balance between overfitting and underfitting; a sketch for drawing such curves follows this list. 
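Below is a minimal sketch, assuming scikit-learn's learning_curve utility and a synthetic dataset (both are illustrative choices, not the post's own code), of how such a learning curve can be drawn; the shapes of the two curves signal bias or variance problems:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)

# Cross-validated training/validation scores at increasing training-set sizes.
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 8), cv=5)

plt.plot(sizes, train_scores.mean(axis=1), "o-", label="training score")
plt.plot(sizes, val_scores.mean(axis=1), "o-", label="validation score")
plt.xlabel("training set size")
plt.ylabel("accuracy")
plt.legend()
plt.show()
# Both curves low and close together      -> underfitting (high bias)
# Large persistent gap between the curves -> overfitting (high variance)
```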

Epoch-Loss graph for analyzing overfitting & underfitting

We can also track the loss over epochs to analyze overfitting and underfitting; a minimal plotting sketch follows the list below. 

  • The right graph shows the loss for validation and training sets.
  • While the training set loss decreases as epochs progress, the validation set loss actually increases after passing the optimal point marked by the red dotted line.
  • This is because if we continue training the model with the training set after the optimal point, the model fits more tightly to the training set samples, indicating the start of overfitting. 
  • Conversely, before the optimal point, the losses of both training and validation sets decrease while maintaining a similar gap.
  • If training is stopped in this region, it results in an underfitted model. 
  • The left graph uses accuracy instead of loss on the vertical axis: compared to the right graph, the interpretation remains the same although the graph is flipped.
  • Sometimes, we use model complexity instead of epochs on the x-axis. 
    • Model complexity refers to the number of learnable weights in a model, and a model with higher complexity is created when the number of layers or units increases.
    • While it might seem that making a model more complex would always be better, this isn't actually the case.
    • For example, if a model is shaped to fit only the training set well, it will only perform well on the training set.
    • This is precisely what happens in the case of overfitting.
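A minimal Keras-style sketch of the epoch-loss idea follows (the data and architecture are illustrative assumptions, not the post's own code): record the per-epoch training and validation loss and stop training near the optimal point with early stopping.

```python
import numpy as np
import tensorflow as tf

# Toy data standing in for a real training set.
X = np.random.rand(1000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# EarlyStopping halts training once val_loss stops improving,
# i.e. near the "optimal point" before overfitting sets in.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

history = model.fit(X, y, validation_split=0.2, epochs=100,
                    callbacks=[early_stop], verbose=0)

# history.history["loss"] and history.history["val_loss"] hold the
# per-epoch curves described above, ready to plot against the epoch number.
```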

Bias-variance Tradeoff

  • The relationship between an underfitted model (bias) and an overfitted model (variance) is called the bias-variance tradeoff.
  • The term 'tradeoff' is used because gaining one requires sacrificing the other. 
  • In other words, the bias-variance tradeoff means that reducing bias (improving training set performance) increases variance (widening the performance gap with the validation set), and conversely, reducing variance (decreasing the performance gap with the validation set) increases bias (lowering training set performance). 
  • Therefore, we need to select an appropriate middle ground to prevent either variance or bias from becoming too large.
  • This act is referred to as selecting an appropriate bias-variance tradeoff.
  • However, the bias-variance tradeoff is likely not a major concern for modern machine learning models (it was mainly a concern in the pre-deep-learning era).
  • With the tremendous advancement in neural networks, having high bias doesn't necessarily mean lower variance, or vice versa.
  • In other words, there's a significant possibility of models having both high bias and high variance, or both low bias and low variance.
  • Therefore, if bias is high, you can typically address it by training for longer periods or building larger models. If variance is high, you can use methods like increasing the training dataset size or implementing regularization techniques (see the sketch after this list).
    • Additionally, applying different neural network architectures is likely to be effective in addressing both bias and variance aspects.
  • Due to modern technological advances, it's better to understand that we no longer need to worry about the tradeoff between bias and variance - we now have tools that can address one problem without affecting the other.
  • Furthermore, training a bigger network rarely hurts as long as it is regularized appropriately; the main cost of a network that is too big is simply extra computation time.
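As a rough sketch of these remedies (the architectures and hyperparameters below are assumptions for illustration, not the post's own code), a bigger network addresses high bias, while L2 weight regularization and dropout address high variance:

```python
import tensorflow as tf

def bigger_model(input_dim):
    # High-bias remedy: a larger network (more layers/units), trained longer.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

def regularized_model(input_dim):
    # High-variance remedy: L2 weight penalty plus dropout
    # (in addition to collecting more training data when possible).
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(64, activation="relu",
                              kernel_regularizer=tf.keras.regularizers.l2(0.01)),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
```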


Success is not an overnight phenomenon. It's the result of consistent effort and perseverance.

- Max Holloway -