동선생

[Kaggle Study] 15. Why Use Convolutional Layer?

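This post relies heavily on Andrew Ng's lecture: Convolutional Neural Networks (offered by DeepLearning.AI, www.coursera.org). In the fourth course of the Deep Learning Specialization, you will understand how computer vision has evolved and become familiar with exciting applications such as autonomous driving, face recognition, and reading radiology images.. Advantages of Convolutional Layers Over Fully Connected Layers: Following the above example, if we use a fully connected layer instead of a convolutional layer, we would have to connect 3,072 neurons to 4,704 neurons. That is, the number..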

Kaggle 2024.11.20
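
To make the comparison in the convolutional-layer entry above concrete, here is a rough parameter count. The 32x32x3 input, the 5x5 filter size, and the 6 filters are assumptions, chosen only because they reproduce the 3,072 and 4,704 neuron counts quoted in the excerpt; the post itself may use different settings.

```python
# Assumed shapes (not stated in the excerpt): 32x32x3 input, six 5x5 filters,
# stride 1, no padding. They reproduce the 3,072 and 4,704 figures.
in_h, in_w, in_c = 32, 32, 3
k, n_filters = 5, 6

out_h, out_w = in_h - k + 1, in_w - k + 1   # 28 x 28 with no padding
n_in = in_h * in_w * in_c                   # 3,072 input neurons
n_out = out_h * out_w * n_filters           # 4,704 output neurons

# Fully connected: one weight per (input, output) pair, plus one bias per output.
fc_params = n_in * n_out + n_out

# Convolutional: each filter has k*k*in_c shared weights plus one bias.
conv_params = (k * k * in_c + 1) * n_filters

print(f"inputs={n_in:,} outputs={n_out:,}")
print(f"fully connected params: {fc_params:,}")
print(f"convolutional params:   {conv_params:,}")
```

Under these assumptions the fully connected layer needs roughly 14 million parameters while the convolutional layer needs a few hundred, because each filter's weights are shared across all spatial positions.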

[Kaggle Study] #2 Porto Seguro's Safe Driver Prediction

Second competition following Yuhan Lee's Kaggle curriculum. Binary classification competition using tabular data. First Kernel: Data preparation and exploration kernel. Insights / Summary: 1. The info method (e.g. train.info()) reports the data type, non-null count, and number of rows for each variable of the dataframe. 2. Storing metadata for each variable might help data management. Code example: data = [] fo..

Kaggle 2024.11.19
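
The idea of storing per-variable metadata, mentioned in the entry above, might look roughly like the sketch below. It is not the kernel's actual code (the excerpt is cut off); the toy dataframe, the column names, the `_bin` / `_cat` suffix convention, and the `build_metadata` helper are all hypothetical.

```python
import pandas as pd

# Toy stand-in for the training set; column names are hypothetical.
train = pd.DataFrame({
    "id": [1, 2, 3],
    "target": [0, 1, 0],
    "feat_bin": [0, 1, 1],
    "feat_cat": [2, 0, 1],
    "feat_num": [0.5, 1.2, 3.3],
})

def build_metadata(df):
    """Collect role, measurement level, and dtype for every column."""
    rows = []
    for col in df.columns:
        # Role: what the column is used for.
        if col == "target":
            role = "target"
        elif col == "id":
            role = "id"
        else:
            role = "input"
        # Level: assumed naming convention (_bin, _cat) plus dtype fallback.
        if col == "target" or col.endswith("_bin"):
            level = "binary"
        elif col == "id" or col.endswith("_cat"):
            level = "nominal"
        elif pd.api.types.is_float_dtype(df[col]):
            level = "interval"
        else:
            level = "ordinal"
        rows.append({"varname": col, "role": role,
                     "level": level, "dtype": str(df[col].dtype)})
    return pd.DataFrame(rows).set_index("varname")

meta = build_metadata(train)
print(meta)

# The metadata table makes it easy to select groups of variables later.
nominal_inputs = meta[(meta["level"] == "nominal") & (meta["role"] == "input")].index.tolist()
print(nominal_inputs)
```

Once such a table exists, selecting "all nominal input variables" or "all variables to drop" becomes a one-line filter instead of a hand-maintained list.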

[NLP] 1. Natural Language Processing Basics

This post relies heavily on the book 'Natural Language Processing with PyTorch': https://books.google.co.kr/books?id=AIgxEAAAQBAJ&printsec=copyright&redir_esc=y#v=onepage&q&f=false NLP? A set of techniques that solve practical problems using statistical methods to understand text, regardless of linguistic knowledge. The 'understanding' of text is mainly achieved by converting text into computable re..

NLP 2024.11.18

[Kaggle Extra Study] 17. Multiclass Classification Threshold Optimization

Multiclass classification can be divided into two categories: ordinal classification and nominal classification. For nominal classification problems, you can think of a multiclass classification algorithm that outputs probability distributions like [0.7, 0.1, 0.2] for distinguishing between car, human, and tree. For ordinal classification, you can think of a problem that categorizes a child's c..

Kaggle Extra 2024.11.17
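
As a small illustration of the two cases in the entry above: a nominal prediction can be read off the probability vector with argmax, while an ordinal prediction is often obtained by cutting a continuous score at thresholds. The threshold values and scores below are hypothetical, and the cut-point approach is an assumption about what the truncated post goes on to describe.

```python
import numpy as np

# Nominal case: pick the class with the highest predicted probability.
classes = ["car", "human", "tree"]
probs = np.array([0.7, 0.1, 0.2])
nominal_pred = classes[int(np.argmax(probs))]
print(nominal_pred)

# Ordinal case (assumed approach): cut a continuous score into ordered
# classes 0..3 using thresholds. These cut points are hypothetical and are
# the values one would tune during threshold optimization.
thresholds = np.array([0.5, 1.5, 2.5])
scores = np.array([0.2, 1.4, 2.6])
ordinal_pred = np.digitize(scores, thresholds)
print(ordinal_pred)
```

Shifting a threshold changes which ordered class a borderline score falls into, which is why the cut points are worth optimizing against the competition metric.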

[Kaggle Study] 14. Hyperparameter Tuning

Hyperparameter Tuning? The process of adjusting hyperparameters to optimize the model for better performance. Some hyperparameters have a higher tuning priority than others. In other words, some hyperparameters work well with commonly known values without tuning, while others require tuning to determine which values work best. There are various tuning methods such as: Manual Search, G..

Kaggle 2024.11.16

[Kaggle Study] 13. Normalization

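This post relies heavily on Andrew Ng's lecture: Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization (offered by DeepLearning.AI, www.coursera.org). In the second course of the Deep Learning Specialization, you will open the deep learning black box to systematically understand the processes that improve performance and produce good results. Deep learning .. Normalization: Normalization is one of the techniques that speed up the training process. One point to be careful about is that the test set must be normalized with the same mean and variance values as the training set; otherwise the test results will not be correct. The unnormalized cost function and the Norma..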

Kaggle 2024.11.15
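
The point in the normalization entry above, that the test set must reuse the training set's mean and variance, can be sketched as follows. The synthetic data, the `normalize` helper, and the small `eps` term are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data standing in for real features (assumption).
X_train = rng.normal(loc=5.0, scale=3.0, size=(100, 2))
X_test = rng.normal(loc=5.0, scale=3.0, size=(20, 2))

# Statistics are computed on the training set only.
mu = X_train.mean(axis=0)
sigma2 = X_train.var(axis=0)
eps = 1e-8  # guards against division by zero for constant features

def normalize(X):
    """Apply the training-set mean and variance to any split."""
    return (X - mu) / np.sqrt(sigma2 + eps)

X_train_norm = normalize(X_train)
X_test_norm = normalize(X_test)   # same mu and sigma2, not recomputed

print(X_train_norm.mean(axis=0), X_train_norm.std(axis=0))
print(X_test_norm.mean(axis=0), X_test_norm.std(axis=0))
```

The normalized training features have mean close to 0 and standard deviation close to 1; the test features will be near those values but not exactly, which is expected because they were scaled with the training statistics.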

[Kaggle Study] 12. Early Stopping

Early stopping is a technique used while training neural networks to prevent the model from overfitting. It stops the training process before the model starts to overfit. We monitor the model's performance on a validation set during training and stop when that performance starts to degrade, which indicates that the model is beginning to overfit the..

Kaggle 2024.11.15
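
The monitoring loop described in the early-stopping entry above can be sketched without any framework. The `patience` parameter, the `early_stopping` helper, and the list of validation losses are assumptions for illustration; in real training the loss would come from evaluating the model after each epoch.

```python
def early_stopping(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    # Never triggered: training ran to the last epoch.
    return best_epoch, len(val_losses) - 1

# Hypothetical validation losses: improving, then degrading.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
best_epoch, stop_epoch = early_stopping(val_losses, patience=2)
print(best_epoch, stop_epoch)
```

In practice one would also save the model weights at `best_epoch` and restore them after stopping, so the final model is the one with the lowest validation loss.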

[Kaggle Study] 11. Data Augmentation

Deep learning fundamentally requires a large amount of data for effective training. To mitigate the chronic problem of overfitting, it also needs sufficient high-quality data. However, increasing the amount of data requires significant cost and time, and in some cases it can be difficult even to collect or process the data. To address this issue, various Data Augmentation te..

Kaggle 2024.11.15
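
As a rough illustration of the data-augmentation entry above, two common image augmentations can be written with NumPy alone. Horizontal flip and random crop are chosen here as assumptions; the truncated excerpt does not say which techniques the post covers, and the random image is a stand-in for real data.

```python
import numpy as np

rng = np.random.default_rng(42)
# Random 32x32 RGB image standing in for a real training sample (assumption).
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

def horizontal_flip(img):
    """Mirror the image left to right (reverse the width axis)."""
    return img[:, ::-1, :]

def random_crop(img, size, rng):
    """Cut a random size x size window out of the image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size, :]

flipped = horizontal_flip(image)
cropped = random_crop(image, 28, rng)
print(flipped.shape, cropped.shape)
```

Each call produces a new, slightly different training sample from the same source image, which is how augmentation enlarges a dataset without collecting more data.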

[Kaggle Study] 10. About Structuring ML Projects (4) - End-to-end learning

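This post is a summary of Andrew Ng's Coursera lecture: Structuring Machine Learning Projects (offered by DeepLearning.AI, www.coursera.org). In the third course of the Deep Learning Specialization, you will learn how to build successful machine learning projects and practice decision-making as a machine learning project leader. .. End-to-end Deep Learning. #1 What is End-to-end Deep Learning? Deep learning and machine learning algorithms typically go through multiple processing stages; end-to-end deep learning replaces those stages with a single neural network. In the example above, the features of the audio data..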

Kaggle 2024.11.14

[Kaggle Study] 9. About Structuring ML Projects (3) - Transfer learning & Multi-task learning

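This post is a summary of Andrew Ng's Coursera lecture: Structuring Machine Learning Projects (offered by DeepLearning.AI, www.coursera.org). In the third course of the Deep Learning Specialization, you will learn how to build successful machine learning projects and practice decision-making as a machine learning project leader. .. This post continues from the previous post: [Kaggle Study] 8. About Structuring ML Projects (2). This post is a summary of Andrew Ng's Coursera lecture: Structuring Machine Learning Projects (offered by DeepLearning.AI). In the third course of the Deep Learning Specialization, successful machine..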

Kaggle 2024.11.14

[Kaggle Study] 8. About Structuring ML Projects (2) - Error Analysis & Incorrectly labeled / Mismatch data

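This post is a summary of Andrew Ng's Coursera lecture: Structuring Machine Learning Projects (offered by DeepLearning.AI, www.coursera.org). In the third course of the Deep Learning Specialization, you will learn how to build successful machine learning projects and practice decision-making as a machine learning project leader. .. This post continues from the previous post: [Kaggle Study] 7. About Structuring ML Projects (1). This post is a summary of Andrew Ng's Coursera lecture: Structuring Machine Learning Projects (offered by DeepLearning.AI). In the third course of the Deep Learning Specialization, successful machine lear..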

Kaggle 2024.11.13

[Kaggle Study] 7. About Structuring ML Projects (1)

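This post is a summary of Andrew Ng's Coursera lecture: Structuring Machine Learning Projects (offered by DeepLearning.AI, www.coursera.org). In the third course of the Deep Learning Specialization, you will learn how to build successful machine learning projects and practice decision-making as a machine learning project leader. .. Introduction: As the example above shows, there are countless actions you can take when a machine learning model performs below the desired level. When running a machine learning project, you need criteria for deciding which part to focus on improving among those many options; only then can a project with a fixed deadline achieve the desired results. Orthogonalizatio..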

Kaggle 2024.11.12

[Kaggle Extra Study] 16. Handling Categorical Variables

What makes handling categorical variables important? Categorical variables, such as colors, cities, or education levels, cannot be used directly in most machine learning models, since these models operate only on numbers. Categorical variable encoding solves this problem by converting these text-based categories into numerical formats that machines can process. The challenge lies in choosing the right enc..

Kaggle Extra 2024.11.11
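
To illustrate the categorical-variable entry above, here is a minimal sketch of two common encodings with pandas: one-hot encoding for an unordered variable and an explicit integer mapping for an ordered one. The toy data and the `education_order` mapping are hypothetical, and the post may recommend other encoders.

```python
import pandas as pd

# Toy data; values are made up for illustration.
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "education": ["high school", "bachelor", "master", "bachelor"],
})

# One-hot encoding: one 0/1 column per category, suitable when the
# categories have no natural order (e.g. colors).
one_hot = pd.get_dummies(df["color"], prefix="color", dtype=int)
print(one_hot)

# Ordinal encoding: map categories to integers that respect their order
# (e.g. education levels). The mapping itself is an assumption.
education_order = {"high school": 0, "bachelor": 1, "master": 2}
df["education_encoded"] = df["education"].map(education_order)
print(df)
```

One-hot encoding avoids implying an order between colors, at the cost of one extra column per category; the integer mapping keeps a single column but only makes sense when the order is meaningful.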

[Kaggle Extra Study] 15. GBM vs. XGBoost

In this post, I want to talk about the difference between GBM (Gradient Boosting Machine) and XGBoost. Reading the previous post will help your understanding: [Kaggle Extra Study] 14. Tree-based Ensemble Models. Tree-based Ensemble Models? LightGBM, XGBoost, CatBoost, and Random Forest are all tree-based ensemble models. These algorithms are popular on Kaggle because of their high prediction p..

Kaggle Extra 2024.11.10

[Kaggle Extra Study] 14. Tree-based Ensemble Models

Tree-based Ensemble Models? LightGBM, XGBoost, CatBoost, and Random Forest are all tree-based ensemble models. These algorithms are popular on Kaggle because of their high prediction performance, fast training speed, and ability to handle various data types. They perform particularly well on structured data. "Tree-based": These algorithms all use decision trees as their basic buil..

Kaggle Extra 2024.11.10
