Submission and ranking

Exactly what data set predictions do we need to include in our submission in phase 1 (Aug 31, 2022 - Sep 29, 2022) and phase 2 (Sep 30, 2022 - Oct 31, 2022)? Test set, validation set, training set, or all? (This is not a “duh” question. I’d appreciate it if the answer can be made as precise as possible.)

During phase 1, the leaderboard shows the ranking based on the validation set. During phase 2, will the leaderboard show the ranking based on the test set, or is a leaderboard ranking simply irrelevant during phase 2?

Hi @miguru,

During both Phase 1 and Phase 2, you should submit your predictions for both the validation and test sets. The submission format can be found here. Your submission must contain the same rows and columns, in the same order, as the submission format file in order to be accepted by the scorer. There is also a simple benchmark model example here, which includes a section that formats a test submission and might help.
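If it helps, here is a rough sketch of a check you could run before uploading, to confirm that your file has the same columns and rows, in the same order, as the submission format file. It assumes both files are CSV with one identifier column. The function name `check_submission` and the column names `sample_id` and `label` are made up for illustration and are not the competition's actual names, and this is not the scorer's own validation logic.

```python
import csv
import io


def check_submission(format_csv: str, submission_csv: str, id_column: str) -> list[str]:
    """Compare a submission against the format file (both given as CSV text).

    Returns a list of problems; an empty list means the layout matches.
    Hypothetical helper: the real scorer may apply different or stricter checks.
    """
    format_rows = list(csv.reader(io.StringIO(format_csv)))
    submission_rows = list(csv.reader(io.StringIO(submission_csv)))
    if not format_rows or not submission_rows:
        return ["one of the files is empty"]

    # The header rows must match exactly: same column names, same order.
    format_header, submission_header = format_rows[0], submission_rows[0]
    if format_header != submission_header:
        return ["columns differ in name or order"]
    if id_column not in format_header:
        return [f"id column {id_column!r} not found"]

    # The identifier column must list the same rows in the same order.
    idx = format_header.index(id_column)
    format_ids = [row[idx] for row in format_rows[1:] if row]
    submission_ids = [row[idx] for row in submission_rows[1:] if row]
    if format_ids != submission_ids:
        return ["rows differ in identity or order"]
    return []


# Example with made-up data: the second submission has its rows swapped.
fmt = "sample_id,label\na,0\nb,0\n"
print(check_submission(fmt, "sample_id,label\na,0.2\nb,0.9\n", "sample_id"))
print(check_submission(fmt, "sample_id,label\nb,0.9\na,0.2\n", "sample_id"))
```

To run it on real files, read each one into a string first (for example with `pathlib.Path(...).read_text()`) and pass in whatever the identifier column is actually called in the format file.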

The leaderboard only ever shows the ranking based on the validation set, so you are correct that it is irrelevant during Phase 2.

Good luck, and let us know if you have any more questions!

Isha


If you submit an entry during Phase 2 that was trained only on the training set and without using the validation set, will your resulting score be directly comparable to those scores published during Phase 1?

Hi @rickyjames,

Yes, that is correct. Once Phase 2 begins, other participants may use the validation labels in their submissions, which means that the leaderboard may not be a reliable indicator of performance during that phase. Once the validation labels are released, you can also calculate your own leaderboard score.

Good luck!