- Early stopping is a technique used while training neural networks to prevent the model from overfitting.
- In short, it stops the training process before the model starts to overfit.
- We monitor the model's performance on a validation set during training and stop when the validation performance starts to degrade, which indicates that the model is beginning to overfit the training data.
- That is, the goal is to stop training at the point where the model performs best (see the code sketch after this list).
- Disadvantage of this technique:
- We have two big tasks when training ML models:
- 1. Optimize the cost function J, using algorithms such as gradient descent.
- 2. Make sure the model doesn't overfit.
- We can do this by collecting more data, applying regularization, etc.
- It is much better to focus on one task at a time (Andrew Ng calls this principle "orthogonalization"), and this is possible these days thanks to advanced ML techniques and algorithms.
- However, early stopping couples these two big tasks: the single decision of when to stop affects both optimization and regularization at once.
- We can no longer work on each task independently.
- Advantage of this technique:
- Other regularization techniques such as L2 regularization require a hyperparameter, most notably lambda.
- This means we have to pay the computational cost of searching for the best hyperparameter value (see the lambda-search sketch below).
- With early stopping, however, we get the regularization effect from a single training run: we simply train the model and keep the weights at the point where it performs best on the validation set.
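As a concrete illustration of the loop described above, here is a minimal sketch of early stopping in plain NumPy on a synthetic regression problem. The data, learning rate, and `patience` value (how many epochs to wait for an improvement before stopping) are illustrative assumptions, not values from the lecture.

```python
import numpy as np

# Synthetic regression data, split into a training set and a validation set.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = X @ true_w + rng.normal(scale=0.5, size=200)
X_train, y_train, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

def mse(w, X, y):
    return np.mean((X @ w - y) ** 2)

w = np.zeros(5)
lr = 0.01
patience = 10               # assumed: epochs to wait for an improvement before giving up
best_val, best_w = np.inf, w.copy()
epochs_without_improvement = 0

for epoch in range(1000):
    # Task 1: one gradient-descent step on the *training* loss.
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= lr * grad

    # Task 2: monitor performance on the held-out *validation* set.
    val_loss = mse(w, X_val, y_val)
    if val_loss < best_val:
        best_val, best_w = val_loss, w.copy()   # remember the best weights so far
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"Early stopping at epoch {epoch}")
            break

w = best_w   # restore the weights where validation performance peaked
print(f"best validation MSE: {best_val:.4f}")
```

The `patience` counter and the final restore of `best_w` mirror what deep learning frameworks offer built in, e.g. Keras's `EarlyStopping(monitor='val_loss', patience=..., restore_best_weights=True)` callback.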
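For contrast with the advantage above, here is a sketch of the hyperparameter search that L2 regularization requires: every candidate lambda costs a full training run (here a closed-form ridge solve), whereas the early-stopping loop above needs only one run. The candidate grid is an illustrative assumption.

```python
import numpy as np

# Same synthetic data setup as in the early-stopping sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = X @ true_w + rng.normal(scale=0.5, size=200)
X_train, y_train, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

best_lam, best_val = None, np.inf
for lam in [0.001, 0.01, 0.1, 1.0, 10.0]:   # each candidate costs one training run
    # Closed-form L2-regularized (ridge) solution: (X^T X + lam*I) w = X^T y
    w = np.linalg.solve(X_train.T @ X_train + lam * np.eye(5), X_train.T @ y_train)
    val_mse = np.mean((X_val @ w - y_val) ** 2)
    if val_mse < best_val:
        best_lam, best_val = lam, val_mse

print(f"best lambda: {best_lam}, validation MSE: {best_val:.4f}")
```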
You can find further information about this issue (the disadvantage of early stopping) in my previous post:
[Kaggle Study] 7. About Structuring ML Projects (1)
Reference
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization, DeepLearning.AI, Coursera (www.coursera.org)
My success isn't the result of arrogance, it's the result of belief.
- Conor McGregor -