I was wondering how we can select which of all our submissions will be considered for the final evaluation? I don’t think the score on the small sample of the hidden test set is very indicative of performance on the full dataset.
But what if we don’t want the best-scoring submission to be our final submission? There is a real possibility of overfitting to this small test set, and we might prefer to select a more robust submission instead. Is that an option? @kwetstone
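To illustrate my concern about the small sample, here is a rough sketch (the split size and accuracy are made-up numbers, not the actual competition values) of how much an accuracy estimate can swing when it is measured on only a few hundred rows:

```python
# Rough sketch: how noisy is accuracy measured on a small hidden subset?
# Assumes a hypothetical model with 87% true accuracy and a 500-row split
# (both numbers are illustrative, not the actual competition sizes).
import numpy as np

rng = np.random.default_rng(0)
true_acc = 0.87
n_public = 500      # small visible sample of the hidden test set
n_trials = 10_000

# Simulate the measured accuracy on many resampled 500-row subsets
measured = rng.binomial(n_public, true_acc, size=n_trials) / n_public
print(f"true accuracy: {true_acc:.3f}")
print(f"measured on {n_public} rows: mean {measured.mean():.3f}, "
      f"std {measured.std():.4f}, 95% range "
      f"[{np.percentile(measured, 2.5):.3f}, {np.percentile(measured, 97.5):.3f}]")
```

Under those assumptions the measured score swings by roughly ±3 percentage points, which is larger than the gaps we typically see between top submissions.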
Hi @aiva00, the submission that is automatically selected will be the one that performs best on the prize-determining leaderboard (not the public leaderboard).
True black-box testing relies on genuinely hidden test data to ensure meaningful and robust solutions. Allowing all submissions to count toward the leaderboard might encourage brute-force optimization over rigorous validation. For instance, if the data is noisy and the theoretical accuracy ceiling is 88%, the leaderboard might saturate around 86-87%. In such cases, the winner might simply be the participant with the most submissions, exploiting training stochasticity, which is arguably a flawed setup, in my humble opinion.
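A quick Monte Carlo sketch of that effect (the split size, accuracy, and submission budgets below are hypothetical, not the competition's): every simulated model has the same true accuracy, yet the best observed score climbs purely with the number of submissions.

```python
# Rough sketch of the "most submissions wins" effect: all submissions below have
# the SAME true accuracy (86%); only evaluation noise on a finite test split differs.
# Split size, accuracy, and budgets are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(42)
true_acc = 0.86
n_test = 1_000                      # hypothetical prize-determining split size
submission_budgets = [1, 5, 20, 100]

for k in submission_budgets:
    # Best observed score over k statistically identical submissions,
    # averaged over many simulated participants
    best_scores = rng.binomial(n_test, true_acc, size=(5_000, k)).max(axis=1) / n_test
    print(f"{k:>3} submissions -> expected best observed score: {best_scores.mean():.4f}")
```

Under these assumptions the participant with 100 submissions "beats" the one with a single submission by a couple of percentage points without having a better model at all.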