[Kaggle Extra Study] 8. Imputation Techniques for Time Series Data

dongsunseng 2024. 10. 27. 21:04

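Time-series data calls for a different approach to imputation, because the observations are ordered in time and the neighbors of a missing value carry information about it.
pandas provides the fillna() method, along with ffill(), bfill(), and interpolate(), for imputing missing values in such problems.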
Basic Imputation Techniques:
- 'ffill' or 'pad': Replace NaN values with last observed value
- 'bfill' or 'backfill': Replace NaN values with next observed value
- Linear Interpolation method
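
As a rough illustration of how the three techniques differ, the sketch below applies each of them to a small made-up daily series. The dates and values are invented for illustration and are not taken from the city_day dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with a two-day gap (values invented for illustration)
idx = pd.date_range("2024-01-01", periods=5, freq="D")
s = pd.Series([10.0, np.nan, np.nan, 40.0, 50.0], index=idx)

comparison = pd.DataFrame({
    "original": s,
    "ffill": s.ffill(),         # carry the last observed value forward -> 10, 10
    "bfill": s.bfill(),         # pull the next observed value backward -> 40, 40
    "linear": s.interpolate(),  # straight line between the neighbors   -> 20, 30
})
print(comparison)
```

Forward fill and back fill both flatten the gap to a constant, while linear interpolation follows the trend between the two observed neighbors.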

1. Imputing using 'ffill' or 'pad'

Code Example:

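# city_day is assumed to be a pandas DataFrame loaded earlier
# fillna(method='ffill') is deprecated since pandas 2.1; ffill() is the equivalent call
city_day.ffill(inplace=True)
city_day['Xylene'].iloc[50:65]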
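
Two properties of forward fill are easy to miss: a NaN at the very start has no earlier value to copy, and a long gap is filled with the same stale value throughout. The sketch below shows both on a made-up series, using the `limit` parameter to cap how far a value is carried. The data is hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up series: one leading NaN and a three-value gap
s = pd.Series([np.nan, 1.0, np.nan, np.nan, np.nan, 5.0])

filled = s.ffill()         # the leading NaN has no earlier value, so it stays NaN
capped = s.ffill(limit=1)  # fill at most one consecutive NaN after each observation

print(filled.tolist())  # [nan, 1.0, 1.0, 1.0, 1.0, 5.0]
print(capped.tolist())  # [nan, 1.0, 1.0, nan, nan, 5.0]
```

Capping the fill is one possible way to avoid propagating an old reading across a long outage; whether that is appropriate depends on the data.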

2. Imputing using 'bfill' or 'backfill'

Code Example:

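# city_day is assumed to be a pandas DataFrame loaded earlier
# fillna(method='bfill') is deprecated since pandas 2.1; bfill() is the equivalent call
city_day.bfill(inplace=True)
city_day['AQI'].iloc[20:30]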
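
Back fill mirrors forward fill: a NaN at the very end has no later value to copy. Note also that if the forward-fill and back-fill snippets are run in sequence on the same DataFrame with inplace=True, the back fill only has leading NaNs left to fill. The sketch below, on a made-up series, shows back fill alone and a forward fill followed by a back fill.

```python
import numpy as np
import pandas as pd

# Made-up series with a leading, an interior, and a trailing NaN
s = pd.Series([np.nan, 2.0, np.nan, 4.0, np.nan])

back = s.bfill()          # the trailing NaN has no later value, so it stays NaN
both = s.ffill().bfill()  # forward fill first, then back fill what is left at the start

print(back.tolist())  # [2.0, 2.0, 4.0, 4.0, nan]
print(both.tolist())  # [2.0, 2.0, 2.0, 4.0, 4.0]
```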

3. Linear Interpolation method

Time-series values often vary considerably over time.
Backfill and forward fill simply copy a neighboring value, which flattens any trend across the gap, so they are often not the best way to address missing values.
A more suitable alternative is interpolation, where the gap is filled with values that increase or decrease steadily between the observed points.
Linear interpolation is an imputation technique that assumes a linear relationship between data points and uses the non-missing values on either side of a gap to compute a value for each missing data point.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html
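
To make the arithmetic concrete, the sketch below fills a gap by hand with the straight-line formula and compares the result with pandas. The series is made up, and the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Made-up series: observed values 2.0 and 10.0 with three missing points between them
s = pd.Series([2.0, np.nan, np.nan, np.nan, 10.0])

# Manual linear interpolation between the two observed neighbors:
# y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
x0, y0 = 0, 2.0
x1, y1 = 4, 10.0
manual = [y0 + (y1 - y0) * (x - x0) / (x1 - x0) for x in range(x0, x1 + 1)]

auto = s.interpolate(method="linear")

print(manual)         # [2.0, 4.0, 6.0, 8.0, 10.0]
print(auto.tolist())  # [2.0, 4.0, 6.0, 8.0, 10.0]
```

According to the pandas documentation, method="linear" ignores the index and treats the values as equally spaced. For a DatetimeIndex with irregular gaps, method="time" is the documented option that takes the length of each interval into account.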

# Interpolate using the linear method
# city_day1 is assumed to be a copy of the data that still contains NaNs
city_day1.interpolate(method="linear", limit_direction="both", inplace=True)
city_day1['Xylene'].iloc[50:65]
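
The limit_direction="both" argument matters at the edges of the series. A minimal sketch on made-up data, comparing it with the default direction:

```python
import numpy as np
import pandas as pd

# Made-up series with a leading, an interior, and a trailing NaN
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

default = s.interpolate()                     # default direction is forward
both = s.interpolate(limit_direction="both")  # also fill the gap before the first observation

print(default.tolist())
print(both.tolist())
```

With the default direction the leading NaN is left untouched, while "both" fills it as well. At either edge there is only one neighbor, so linear interpolation repeats the nearest observed value rather than extrapolating a trend.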

Reference

A Guide to Handling Missing values in Python

www.kaggle.com
