Submission and ranking

mlguru · September 21, 2022, 9:22pm

Exactly what data set predictions do we need to include in our submission in phase 1 (Aug 31, 2022 - Sep 29, 2022) and phase 2 (Sep 30, 2022 - Oct 31, 2022)? Test set, validation set, training set, or all? (This is not a “duh” question. I’d appreciate it if the answer can be made as precise as possible.)

During phase 1, the leader board shows the ranking based on the validation set. During phase 2, will the leader board show the ranking based on the test set, or is a leader board ranking simply irrelevant during phase 2?

ishashah · September 22, 2022, 3:46pm

Hi @mlguru,

During both Phase 1 and Phase 2, you should submit your predictions for both the validation and test sets. The submission format can be found here. Your submission must contain the same rows and columns, in the same order, as this submission format file in order to be accepted by the scorer. There is a simple benchmark model example here that also contains a section that formats a test submission, which might help.
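As a rough illustration of the "same rows and columns in the same order" requirement, here is a minimal sketch of a local pre-check. It is not the official scorer. The column names (`sample_id`, `target`), the sample ids, and the assumption that the first column holds the row id are all hypothetical, since the real format file is not shown in this thread.

```python
import csv
import io


def check_submission(format_file, submission_file):
    """Compare a submission's layout against the format file.

    Returns a list of problems; an empty list means the header, the
    row count, and the order of the ids in the first column all match.
    Assumes (hypothetically) that the first column is the row id.
    """
    fmt = list(csv.reader(format_file))
    sub = list(csv.reader(submission_file))
    if not fmt or not sub:
        return ["one of the files is empty"]

    problems = []
    # Columns must match exactly, including their order.
    if fmt[0] != sub[0]:
        problems.append(f"header mismatch: expected {fmt[0]}, got {sub[0]}")
    # Validation and test rows must all be present.
    if len(fmt) != len(sub):
        problems.append(
            f"row count mismatch: expected {len(fmt) - 1}, got {len(sub) - 1}"
        )
    # Rows must appear in the same order as in the format file.
    for n, (f_row, s_row) in enumerate(zip(fmt[1:], sub[1:]), start=1):
        if f_row[:1] != s_row[:1]:
            problems.append(f"row {n}: expected id {f_row[:1]}, got {s_row[:1]}")
            break  # report only the first out-of-order row
    return problems


# Made-up example data: V* rows stand in for validation, T* for test.
format_csv = "sample_id,target\nV0001,0.0\nT0001,0.0\n"
good_csv = "sample_id,target\nV0001,0.12\nT0001,0.87\n"
swapped_csv = "sample_id,target\nT0001,0.87\nV0001,0.12\n"

good_problems = check_submission(io.StringIO(format_csv), io.StringIO(good_csv))
swapped_problems = check_submission(io.StringIO(format_csv), io.StringIO(swapped_csv))
print(good_problems)
print(swapped_problems)
```

With real files you would pass open file handles (for example `open(path, newline="")`) instead of the `io.StringIO` objects used here.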

The leaderboard only ever shows the ranking based on the validation set, so you are correct that it is irrelevant during Phase 2.

Good luck, and let us know if you have any more questions!

Isha

rickyjames · September 23, 2022, 12:25am

If you submit an entry during Phase 2 that was trained only on the training set and without using the validation set, will your resulting score be directly comparable to those scores published during Phase 1?

ishashah · September 23, 2022, 5:52pm

Hi @rickyjames,

Yes, that is correct. Once Phase 2 begins, other participants may use the validation labels in their submissions, which means that the leaderboard may not be a reliable indicator of performance during that phase. Once the validation labels are released, you can also calculate your own leaderboard score.
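For anyone who wants to try scoring locally, a minimal sketch might look like the one below. The competition metric is not stated in this thread, so binary log loss is used purely as a stand-in, and the ids, the dictionary layout, and the function names are hypothetical. Swap in the metric from the competition's own documentation.

```python
import math


def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary log loss. A stand-in metric, not necessarily the official one."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def score_validation(predictions, val_labels, metric=log_loss):
    """Score only the validation rows of a combined submission.

    predictions: dict mapping row id -> predicted probability
                 (validation and test rows together).
    val_labels:  dict mapping validation row id -> 0/1 label.
    Test rows are ignored because their labels are not available.
    """
    missing = [i for i in val_labels if i not in predictions]
    if missing:
        raise ValueError(f"no prediction for validation ids: {missing}")
    ids = sorted(val_labels)
    return metric([val_labels[i] for i in ids], [predictions[i] for i in ids])


# Made-up example: two validation rows (V*) and one test row (T*).
predictions = {"V0001": 0.5, "V0002": 0.5, "T0001": 0.9}
val_labels = {"V0001": 1, "V0002": 0}

score = score_validation(predictions, val_labels)
print(score)
```

Only the validation ids are looked up, so the test predictions in the same file have no effect on this number.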

Good luck!

mlguru · October 27, 2022, 9:31pm

Hello @ishashah,

Just to clarify. The final ranking and awarding for the competition is based solely on the test data portion of the submission file, and not at all on the validation data portion of the submission file, right? In other words, if I write 0.5 for all the validation portion, I could still win if my test portion has great predictions, right?

Thanks,
Henry

ishashah · October 28, 2022, 1:21pm

Hi @mlguru yes, that is correct. Good luck!
