This post heavily relies on Andrew Ng's lecture.
Hyperparameter Tuning?
- The process of adjusting hyperparameters to optimize the model for better performance
- There are certain hyperparameters that have higher tuning priority than others.
- In other words, some hyperparameters work well with commonly known values and need no tuning, while others require tuning to find the values that work best.
- There are various tuning methods such as:
- Manual Search
- Grid Search
- Random Search
- Bayesian Optimization
- Non-Probabilistic
- Evolutionary Optimization
- Gradient-based Optimization
- Early Stopping
- and more
- All methods except Manual Search are called 'Automated Hyperparameter Selection'
- This image shows that sampling hyperparameter values at random is a much better way to test them than laying them out on a grid.
- If $\alpha$ is a hyperparameter that needs a lot of tuning, the grid method only lets us try a fixed, small number of values: a 5 × 5 grid over two hyperparameters costs 25 training runs but tests only 5 distinct values of $\alpha$.
- With random sampling, the same 25 runs try 25 distinct values of $\alpha$, so we cover a much wider variety of values.
- 'Coarse to fine' search is the process of narrowing down to a smaller range (the square in the image) once we find several values that work well, and then sampling more densely within that range (see the sketch after this list).
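Below is a minimal sketch of this comparison in Python. The toy `validation_score` function stands in for a real train-and-evaluate run, and the hyperparameter names, ranges, and budget of 25 trials are illustrative assumptions, not values from the lecture:

```python
import random

# Toy stand-in for "train a model and return a validation score";
# it peaks near alpha = 0.01 and beta = 0.9 (assumed for illustration).
def validation_score(alpha, beta):
    return -1e4 * (alpha - 0.01) ** 2 - (beta - 0.9) ** 2

budget = 25  # total training runs we can afford

# Grid search: a 5 x 5 grid costs 25 runs but tests only 5 distinct alphas.
alphas = [0.001, 0.005, 0.01, 0.05, 0.1]
betas = [0.8, 0.85, 0.9, 0.95, 0.99]
grid_trials = [(a, b) for a in alphas for b in betas]

# Random search: the same 25 runs test 25 distinct values of each hyperparameter.
random.seed(0)
random_trials = [
    (random.uniform(0.001, 0.1), random.uniform(0.8, 0.99))
    for _ in range(budget)
]

def best(trials):
    return max(trials, key=lambda t: validation_score(*t))

a, b = best(random_trials)

# Coarse to fine: zoom into a smaller square around the best point found
# so far and sample more densely inside it.
fine_trials = [
    (random.uniform(0.5 * a, 1.5 * a), random.uniform(b - 0.02, b + 0.02))
    for _ in range(budget)
]

print("grid best:  ", best(grid_trials))
print("coarse best:", (a, b))
print("fine best:  ", best(fine_trials))
```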
Manual Search
- Also called 'rules of thumb', which refers to setting hyperparameter values based on experience or intuition.
- In reality, it is convenient to follow commonly known values, since they usually perform well and only require minor code adjustments (see the sketch below).
- However, the downside is that it's difficult to compare performance across different hyperparameter combinations.
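For instance, a manual starting point often just hard-codes widely used default values and then adjusts them one at a time. A minimal sketch, where the values are commonly cited starting points rather than recommendations from the lecture:

```python
# Commonly cited starting points (assumptions, not values from the lecture);
# adjust one at a time and re-check validation performance.
config = {
    "learning_rate": 3e-4,  # a popular default for the Adam optimizer
    "batch_size": 64,
    "dropout": 0.5,
    "num_epochs": 20,
}

# Logging every attempt keeps manual runs at least somewhat comparable,
# which is otherwise the main weakness of this approach.
history = []  # list of (settings, validation score) pairs
```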
Grid Search
- Tries every combination from a predefined list of candidate values for each hyperparameter, as sketched below.
- As noted above, this caps the number of distinct values tested per hyperparameter, which is why random sampling often explores the search space more effectively.
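A minimal grid-search sketch in plain Python; `train_and_evaluate` is a hypothetical stand-in for a real training loop, and the candidate values are illustrative:

```python
from itertools import product

def train_and_evaluate(lr, bs):
    """Stand-in for a real training loop; returns a fake validation score."""
    return -abs(lr - 0.01) - abs(bs - 64) / 1000

# Illustrative candidate values: a 3 x 3 grid costs 9 training runs
# but tests only 3 distinct values per hyperparameter.
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [32, 64, 128],
}

results = {
    (lr, bs): train_and_evaluate(lr, bs)
    for lr, bs in product(param_grid["learning_rate"], param_grid["batch_size"])
}

print("best combination:", max(results, key=results.get))
```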
Reference
- Andrew Ng's lecture on hyperparameter tuning
Excellence is not a destination; it is a continuous journey that never ends.
- Conor McGregor -