Annotation of this discussion: https://www.kaggle.com/competitions/equity-post-HCT-survival-predictions/discussion/550152
CIBMTR - Equity in post-HCT Survival Predictions
Improve prediction of transplant survival rates equitably for allogeneic HCT patients
Deep understanding of (C-index) evaluation measure for better model
I will try to explain the C-index evaluation measure used in this competition so that we can train our models well. Because 75% of the data is not included in the test data, understanding the measure is very important.
Let's start with three patient groups:
- Group A
- Group B
- Group C
For each patient, we predict a risk score (a higher score means a higher risk of an early event).
Step 1: Understanding Concordance Index
The Concordance Index (C-index) evaluates how well the model's risk scores rank patients by survival time.
Let's understand it with sample data:
Group A has 3 patients with actual survival times and predicted risk scores:
Comparable pairs:
- (P1, P2): P2 has a shorter survival time and a higher risk score → Concordant ✅
- (P1, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
- (P2, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
Total pairs = 3
Total concordant pairs = 3
C-index for Group A = Concordant pairs / Total pairs = 3/3 = 1.0
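The pair counting above can be sketched in Python. The survival times and risk scores below are hypothetical values chosen only to match the ordering described for P1, P2, and P3; they are not figures from the original post. The sketch also assumes that every patient's event was observed (no censoring), so every pair with different survival times is comparable. The half credit for tied risk scores is a common convention, included here as an assumption.

```python
from itertools import combinations

def c_index(times, risks):
    """Fraction of comparable pairs ranked correctly (no censoring assumed)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied survival times: not comparable in this sketch
        comparable += 1
        # Work out which patient of the pair had the shorter survival time.
        short, long_ = (i, j) if times[i] < times[j] else (j, i)
        if risks[short] > risks[long_]:
            concordant += 1.0   # shorter survival, higher risk: concordant
        elif risks[short] == risks[long_]:
            concordant += 0.5   # tied risk scores: half credit (assumed convention)
    return concordant / comparable

# Hypothetical Group A: P2 has the shortest survival, P3 the longest.
times_a = [10, 5, 15]      # survival times for P1, P2, P3 (illustrative)
risks_a = [0.5, 0.8, 0.2]  # predicted risk scores (higher = higher risk)
print(c_index(times_a, risks_a))  # 1.0
```

Swapping the risk scores of P1 and P2 would make the (P1, P2) pair discordant and drop the result to 2/3.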
Step 2: Calculate C-index for All Groups
Repeat the process for all groups.
For now we can assume:
- Group A: C-index = 1.0
- Group B: C-index = 0.8
- Group C: C-index = 0.6
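As a rough sketch of "repeat the process for all groups", the snippet below computes one C-index per group. The records are hypothetical and were constructed only so that the results reproduce the assumed values 1.0, 0.8, and 0.6; the simplified `c_index` helper assumes no censoring and no ties.

```python
from collections import defaultdict
from itertools import combinations

def c_index(times, risks):
    """Share of pairs where the shorter-lived patient has the higher risk.
    Assumes no censoring and no ties (illustration only)."""
    pairs = list(combinations(range(len(times)), 2))
    concordant = sum(
        (times[i] < times[j]) == (risks[i] > risks[j]) for i, j in pairs
    )
    return concordant / len(pairs)

# Hypothetical records: (group, survival_time, risk_score)
records = [
    ("A", 10, 0.5), ("A", 5, 0.8), ("A", 15, 0.2),
    ("B", 1, 4), ("B", 2, 5), ("B", 3, 2), ("B", 4, 3), ("B", 5, 1),
    ("C", 1, 3), ("C", 2, 5), ("C", 3, 1), ("C", 4, 4), ("C", 5, 2),
]

# Collect survival times and risk scores separately for each group.
by_group = defaultdict(lambda: ([], []))
for group, time, risk in records:
    by_group[group][0].append(time)
    by_group[group][1].append(risk)

group_scores = {g: c_index(t, r) for g, (t, r) in by_group.items()}
print(group_scores)  # {'A': 1.0, 'B': 0.8, 'C': 0.6}
```

Group B has 2 discordant pairs out of 10 and Group C has 4 out of 10, which is where 0.8 and 0.6 come from in this made-up data.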
Step 3: Stratified Concordance Index
The Stratified Concordance Index combines the C-index scores of all groups and focuses on the following:
- Average performance across groups (mean of C-indices).
- Consistency across groups (low standard deviation of C-indices).
Formula:
Stratified C-index = Mean(C-index scores) - Standard Deviation(C-index scores)
- Calculate the mean:
Mean = (1.0 + 0.8 + 0.6)/3 = 0.8
- Calculate the standard deviation:
Standard Deviation = sqrt(((1.0-0.8)^2 + (0.8-0.8)^2 + (0.6-0.8)^2)/3) ≈ 0.16
- Stratified C-index:
Stratified C-index = 0.8 - 0.16 = 0.64
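The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes the population standard deviation (dividing by the number of groups), which is what gives 0.16 in the worked example; the function name `stratified_c_index` is only illustrative.

```python
from statistics import mean, pstdev

def stratified_c_index(group_scores):
    """Mean of per-group C-indices minus their population standard deviation."""
    return mean(group_scores) - pstdev(group_scores)

scores = [1.0, 0.8, 0.6]  # Groups A, B, C from the example
print(round(mean(scores), 2))                # 0.8
print(round(pstdev(scores), 2))              # 0.16
print(round(stratified_c_index(scores), 2))  # 0.64
```

With the sample standard deviation (dividing by N - 1) the spread would be 0.2 and the result 0.6, so it is worth checking which variant the scoring code uses.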
Step 4: Interpret the Results
A high Stratified C-index means:
- The model predicts well overall (high mean C-index).
- The model predicts equitably across racial groups (low standard deviation).
Finally we can say:
- Group A predictions are perfect (C-index = 1.0).
- Group B is decent (C-index = 0.8).
- Group C struggles (C-index = 0.6).
The Stratified C-index of 0.64 shows that, while predictions are good overall (mean C-index 0.8), the model is less consistent across groups, and the score is reduced by that inconsistency.