My submission was in second place some hours ago and is now at 184th.
The system doesn’t select my best submission.
I think the issue is the poverty-rate target, not the metric itself. Many participants overfit to the public survey, so even a small error on the remaining surveys makes their scores jump.
My submission initially ranked approximately 100th on the public leaderboard and has since improved to 22nd. While this movement is encouraging, I recognize the limitations of leaderboard-based evaluation in this competition.
The official rules clearly state that only one of the three test-set surveys is used to populate the public leaderboard, while the remaining two—whose scores are withheld—are ultimately used to determine final rankings. As noted in the competition guidelines, “this may result in the leaderboard serving as a poor indicator of where you will eventually place in the competition.”
This design underscores that robust generalization—not leaderboard optimization—was the true objective. Analysis of the training data revealed that one survey ID exhibited stronger predictive signal, and among the test surveys, one closely mirrored the training set distribution. The other two, however, displayed a notable distributional shift—potentially reflecting differences in framing, such as poverty-reduction versus asset-based measurement approaches. Successfully modeling across these divergent distributions was therefore the central challenge.
Given this context, I believe participants would benefit from greater transparency during the competition phase. Specifically, I would appreciate visibility into how a submission scores on all three test surveys, not just the one used for the public leaderboard. In my own experimentation, I applied methodologically rigorous techniques aligned with World Bank practices, including household-level imputation generating one hundred estimates per HHI. These approaches yielded worse performance on the public leaderboard. Without insight into how they performed on the withheld surveys, I ultimately opted for a more conservative, middle-ground strategy, prioritizing leaderboard stability over methodological ambition.
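For readers unfamiliar with the imputation approach mentioned above, here is a minimal sketch of the idea: draw many noisy estimates for a household and pool them, in the spirit of multiple imputation. All names and the noise model are my own illustrative assumptions, not the actual pipeline used in my submissions.

```python
import random
import statistics

random.seed(0)

def impute_estimates(hh_values, n_draws=100):
    """Hypothetical sketch: draw n_draws noisy consumption estimates
    for one household, then pool them (multiple-imputation style)."""
    base = statistics.mean(hh_values)  # stand-in point estimate for the household
    # Simulate imputation uncertainty with Gaussian noise (assumed scale 0.1)
    draws = [base + random.gauss(0, 0.1) for _ in range(n_draws)]
    # Pooled estimate and its spread across the imputed draws
    return statistics.mean(draws), statistics.stdev(draws)

est, spread = impute_estimates([1.0, 1.2, 0.9])
```

The pooled mean is what would feed the final prediction; the spread gives a rough sense of imputation uncertainty per household.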
Greater access to multi-submission feedback would empower participants to make more informed modeling decisions and better align their efforts with the competition’s stated goal: building solutions that generalize across heterogeneous survey populations.