Concern about the bonus prize

Hello,

Now that I have reached 1st place on the public leaderboard (at the time of writing), I feel I can legitimately raise a concern.

Both my cross-validation schemes and my submissions suggest that a non-negligible amount of luck is involved, depending on the actual poverty share distribution of each survey. The effect is large enough that even the most logical and robust prediction approaches can sometimes be outperformed by less effective techniques.

This issue would likely have been mitigated by a larger private test set (for example, at least five surveys). However, with only two surveys, I am concerned that luck is likely to play a major role, both in the private leaderboard and especially in the bonus prize, which rewards only the very best poverty predictions.

Perhaps it would have been preferable to split the three test sets into six or ten surveys (randomly or otherwise), in order to create more diverse poverty share distributions and reduce the impact of luck on the competition.
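
To illustrate the intuition, here is a toy simulation. It is only a sketch under made-up assumptions, not competition data: model A is truly better than model B by a small hypothetical gap, and each survey adds independent Gaussian noise to each model's error. The function name and all numbers are invented for illustration.

```python
import random


def prob_worse_model_wins(n_surveys, n_trials=20000, gap=0.01, noise=0.03, seed=0):
    """Estimate how often the truly worse model ranks first on n_surveys.

    All values are hypothetical: model A has a true mean error of 0.10,
    model B is worse by `gap`, and every survey adds independent Gaussian
    noise with standard deviation `noise` to each model's error.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        # Average each model's error over the private test surveys.
        err_a = sum(rng.gauss(0.10, noise) for _ in range(n_surveys)) / n_surveys
        err_b = sum(rng.gauss(0.10 + gap, noise) for _ in range(n_surveys)) / n_surveys
        if err_b < err_a:
            wins += 1
    return wins / n_trials


# Compare a 2-survey private test set with 6-survey and 10-survey splits.
for n in (2, 6, 10):
    print(n, round(prob_worse_model_wins(n), 3))
```

Under these assumptions, the chance that the worse model comes out on top shrinks as the number of surveys grows, because the noise in the averaged error decreases with more surveys. Real surveys would not behave this simply (errors of competing models are likely correlated), so this only shows the direction of the effect.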

What are your thoughts on this?

Best regards,
Vincent

The generalizability of predictive modeling is determined by far more than CV splits.

Let me put it more clearly: in this competition, I’m convinced that rankings on the private leaderboard will not reflect how well solutions actually generalize.

Even the best solutions can be outperformed by less logical solutions, depending on the survey. I have proof of that. That is why I’m concerned that the private leaderboard has only 2 surveys…

What are your cv scores on household consumption?