CIBMTR - Equity in post-HCT Survival Predictions #8 Finding the best target transformation

dongsunseng 2025. 2. 5. 23:55

Annotated notes on the Kaggle discussion "Finding the best target transformation":

https://www.kaggle.com/competitions/equity-post-HCT-survival-predictions/discussion/550835

 


Finding the best target transformation

  • The competition task can be interpreted as predicting the order of death of the patients.
  • Who dies first? Who dies second? … Who dies last, and who survives?
  • With a suitable target transformation, we can apply standard regression algorithms that optimize MSE or similar metrics.
  • The original target is distributed in such a way that most patients who die have an efs_time between 0 and 15, whereas most survivors have an efs_time between 15 and 160.
  • This distribution is an impediment for regression models.
  • We need predictions which have high discriminative power for the patients who die, but we don't need to distinguish between survivors.
  • We can achieve this result by stretching the range of the patients who die and compressing the range of the survivors.
  • The diagram visualizes how a typical target transformation stretches and compresses the ranges:
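To make the stretching and compressing concrete (this is separate from the diagram in the original discussion), here is a minimal sketch in plain Python. The function name `stretch_and_compress` and the `survivor_band` parameter are hypothetical, and this is not one of the four transformations compared in the discussion. It assumes `efs == 1` marks an event and `efs == 0` marks an event-free (censored) patient, and it treats every event-free patient as outliving every patient with an event, which is a simplification.

```python
def stretch_and_compress(efs_time, efs, survivor_band=0.1):
    """Illustrative target transformation (hypothetical, not from the discussion).

    Patients with an event (efs == 1) are ranked by efs_time and their ranks
    are spread evenly over (0, 1]. Event-free patients (efs == 0) keep their
    order but are squeezed into the narrow band (1, 1 + survivor_band].
    A smaller target means an earlier event.
    """
    died = [i for i, e in enumerate(efs) if e == 1]
    alive = [i for i, e in enumerate(efs) if e == 0]
    target = [0.0] * len(efs_time)

    # Stretch: the ranks of the event times fill the whole interval (0, 1].
    for rank, i in enumerate(sorted(died, key=lambda k: efs_time[k]), start=1):
        target[i] = rank / len(died)

    # Compress: survivors share a narrow band just above 1.
    for rank, i in enumerate(sorted(alive, key=lambda k: efs_time[k]), start=1):
        target[i] = 1.0 + survivor_band * rank / len(alive)

    return target


# Toy data: three early events, three long-term event-free patients.
efs_time = [3.0, 7.5, 12.0, 40.0, 90.0, 150.0]
efs = [1, 1, 1, 0, 0, 0]
print(stretch_and_compress(efs_time, efs))
```

On the toy data, the three events are spaced evenly up to 1, while the three event-free patients all land between 1 and 1.1, so a regressor trained with an MSE objective is mostly penalized for misordering the events. In practice one would likely vectorize this with NumPy or pandas.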

  • In the public notebooks of this competition, we can find various target transformations, and most of them are similar.
  • For comparison, I've taken three target transformations from public notebooks, added a fourth one, and fed them all to XGBRegressor with an MSE objective.
  • The cross-validation scores confirm that the orange part of the histogram must be stretched and the blue part must be condensed:

  • A comparison with other model types shows that target-transformed MSE models (pink) are competitive with Cox proportional hazards models (blue).
  • My AFT models (green) perhaps need more hyperparameter tuning.
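Since the task is framed as predicting the order of events, the scores being compared reward predictions that rank patients correctly. Below is a minimal sketch of a plain Harrell-style concordance index for right-censored data. It is an assumption-laden illustration, not the competition's official scoring code: the function name `concordance_index` is chosen here, and `predicted` is assumed to be larger for patients expected to survive longer (as with a transformed target where smaller means an earlier event).

```python
def concordance_index(efs_time, efs, predicted):
    """Harrell-style C-index for right-censored data (illustrative sketch).

    A pair (i, j) is comparable when patient i had an event (efs[i] == 1)
    and efs_time[i] < efs_time[j]. The pair is concordant when the model
    also ranks i before j, i.e. predicted[i] < predicted[j].
    Tied predictions count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(efs_time)
    for i in range(n):
        if efs[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if efs_time[i] < efs_time[j]:
                comparable += 1
                if predicted[i] < predicted[j]:
                    concordant += 1.0
                elif predicted[i] == predicted[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


efs_time = [3.0, 7.5, 12.0, 40.0, 90.0, 150.0]
efs = [1, 1, 1, 0, 0, 0]
perfect = [0.1, 0.2, 0.3, 1.0, 1.0, 1.0]  # survivors deliberately not distinguished
print(concordance_index(efs_time, efs, perfect))
```

Note that the `perfect` predictions give all survivors the same value and still order every comparable pair correctly, because pairs of two event-free patients are never compared. This matches the point that the model does not need to distinguish between survivors.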

  • NN starter code annotation here: 

  • Note to self: look into the Nelson-Aalen estimator.
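For reference, the Nelson-Aalen estimator gives the cumulative hazard H(t) as the sum of d_i / n_i over all event times t_i ≤ t, where d_i is the number of events at t_i and n_i is the number of patients still at risk just before t_i. A dependency-free sketch follows; the function name `nelson_aalen` is chosen here for illustration, and a tested implementation is available as `NelsonAalenFitter` in the lifelines library.

```python
from collections import Counter


def nelson_aalen(efs_time, efs):
    """Return {event_time: cumulative hazard} with H(t) = sum of d_i / n_i.

    efs == 1 marks an event, efs == 0 marks a censored observation.
    Illustrative sketch only.
    """
    events_at = Counter(t for t, e in zip(efs_time, efs) if e == 1)
    leaving_at = Counter(efs_time)  # events and censorings both leave the risk set

    at_risk = len(efs_time)
    cumulative = 0.0
    hazard = {}
    for t in sorted(leaving_at):
        if events_at[t]:
            cumulative += events_at[t] / at_risk
            hazard[t] = cumulative
        at_risk -= leaving_at[t]
    return hazard


# Four patients: events at t=1, 3, 4 and one censoring at t=2.
print(nelson_aalen([1, 2, 3, 4], [1, 0, 1, 1]))
# → {1: 0.25, 3: 0.75, 4: 1.75}
```

The censored patient at t=2 contributes no increment but does shrink the risk set, which is why the step at t=3 is 1/2 rather than 1/3.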

The source code is in the notebook "ESP EDA which makes sense".

 

ESP EDA which makes sense ⭐️⭐️⭐️⭐️⭐️


My annotation:


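Don't be discouraged just because you haven't bloomed yet. Don't compare yourself to your friends, either.
It simply isn't your season yet.
<From the book '모든 꽃이 봄에 피지는 않는다' (Not Every Flower Blooms in Spring)>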