[Kaggle Study] #15 2017 Kaggle Machine Learning & Data Science Survey

dongsunseng 2024. 12. 5. 00:57

Fourteenth (and last) course following Youhan Lee's curriculum. Not a competition.

First Kernel: Novice to Grandmaster

  • The biggest problem we might face is fake or bogus responses. 
  • Since it is a survey, not everyone answers with proper credentials, so I assume there will be a lot of outliers; one way to flag them is sketched below. 
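
For reference, a minimal sketch of one way to flag implausible answers, assuming the 2017 survey file multipleChoiceResponses.csv and its CompensationAmount column (stored as strings with thousands separators); the IQR rule here is my own choice, not the kernel's:

import pandas as pd

df = pd.read_csv('multipleChoiceResponses.csv', encoding='ISO-8859-1', low_memory=False)
# Compensation is stored as text with commas; coerce it to numbers
comp = pd.to_numeric(df['CompensationAmount'].astype(str).str.replace(',', ''),
                     errors='coerce').dropna()
# Keep values within 1.5 * IQR of the quartiles; the rest are candidate outliers
q1, q3 = comp.quantile([0.25, 0.75])
iqr = q3 - q1
plausible = comp[(comp >= q1 - 1.5 * iqr) & (comp <= q3 + 1.5 * iqr)]
print(f"dropped {len(comp) - len(plausible)} outlier responses")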

Second Kernel: What do Kagglers say about Data Science ?

  • An EDA kernel that also attempts some prediction with modeling techniques.

Insight / Summary:

1. Dimensionality reduction and 2D-plotting

  • The best-known and most widely used dimensionality reduction technique is PCA. 
  • The problem with PCA is that it works best for numerical / continuous variables, which is not the case here.
  • A similar technique, Multiple Correspondence Analysis (MCA), achieves dimensionality reduction for categorical data.
  • Simply put, it's a technique that uses chi-squared independence tests to create a distance between row points, which is then stored in a matrix.
  • Each eigenvalue of this matrix has an inertia (similar to explained variance in PCA), and the process for obtaining the 2D visualization is the same.
### Not working on Kaggle servers (no module 'prince') ###
# import numpy as np
# import prince
#
# np.random.seed(42)
# # Fit a 2-component MCA with Benzécri-corrected inertia rates
# mca = prince.MCA(data_viz, n_components=2, use_benzecri_rates=True)
# # 2D scatter of the row points, colored by compensation
# mca.plot_rows(show_points=True, show_labels=False,
#               color_by='CompensationAmount', ellipse_fill=True)
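
Since prince is unavailable there, here is a rough workaround sketch, assuming data_viz is a DataFrame of categorical columns: one-hot encode it and run truncated SVD on the indicator matrix, which approximates MCA up to its chi-squared weighting (the scikit-learn route is my substitution, not the kernel's code):

import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD

np.random.seed(42)
# Expand each categorical column into 0/1 indicator columns
indicator = pd.get_dummies(data_viz.astype(str))
# Project the rows onto the first two components, as MCA would
svd = TruncatedSVD(n_components=2, random_state=42)
coords = svd.fit_transform(indicator)
# Rough analogue of the per-component inertia
print(svd.explained_variance_ratio_)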

Third Kernel: PLOTLY TUTORIAL - 1

  • A plotting tutorial that analyzes the survey responses with Plotly charts.
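
To give a flavor of what the kernel does, a minimal offline Plotly sketch in the style of 2017-era kernels; the file name, encoding, and Country column come from the survey dataset, while the specific chart is my own example:

import pandas as pd
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot

init_notebook_mode(connected=True)
df = pd.read_csv('multipleChoiceResponses.csv', encoding='ISO-8859-1', low_memory=False)

# Horizontal bar chart of the 15 most common respondent countries
counts = df['Country'].value_counts().head(15)
trace = go.Bar(x=counts.values, y=counts.index, orientation='h')
layout = go.Layout(title='Top 15 respondent countries',
                   yaxis=dict(autorange='reversed'))
iplot(go.Figure(data=[trace], layout=layout))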

The first step is to establish that something is possible; then probability will occur.
- Elon Musk -