Post Processing

devnikhilmishra · March 24, 2022, 4:11pm

Is post-processing of model predictions allowed? A metric like log loss in particular leaves a lot of room for post-processing, so I just wanted to confirm whether it breaks any rules.

jayqi · March 25, 2022, 1:47pm

Hi @devnikhilmishra, post-processing is allowed but must be compliant with the rule about processing test set observations independently. From the rules page:

Unless otherwise specified on the Competition Website, for the purposes of quantitative evaluation of Submissions, Participants agree to process each test data sample independently without the use of information from other cases in the test set. By default, this precludes using information gathered across multiple test samples during training, for instance through pseudo labeling. Eligible Submissions and models must be able to run inference on new test data automatically, without retraining the model.

So, for example, if you are calibrating your predictions for the test data, you can use a calibration function that is fitted on the training and validation data. You should not use a calibration function that is fitted on the test data.
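
As a rough illustration of that distinction (a sketch, not an official example from the organizers): the snippet below fits a single temperature parameter on labeled validation predictions by grid search over log loss, then applies it to each test prediction one at a time. The helper names (`scale`, `fit_temperature`), the grid, and all of the numbers are made up for illustration, and temperature scaling is just one possible calibration method.

```python
import math

def log_loss(y_true, probs, eps=1e-15):
    """Mean binary log loss, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def scale(p, temperature, eps=1e-15):
    """Temperature-scale one probability. Depends only on that one prediction."""
    p = min(max(p, eps), 1 - eps)
    logit = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-logit / temperature))

def fit_temperature(y_val, p_val, grid=None):
    """Pick the temperature that minimizes log loss on labeled held-out data."""
    grid = grid or [0.25 + 0.05 * i for i in range(96)]  # 0.25 up to 5.0
    return min(grid, key=lambda t: log_loss(y_val, [scale(p, t) for p in p_val]))

# Hypothetical validation labels and overconfident model outputs.
y_val = [1, 0, 1, 0, 1, 0, 0, 1]
p_val = [0.99, 0.01, 0.98, 0.97, 0.02, 0.03, 0.01, 0.99]

# Fitted once, on validation data only. No test data is involved here.
temperature = fit_temperature(y_val, p_val)

# Hypothetical test predictions, each transformed independently of the others.
p_test = [0.995, 0.40, 0.02]
calibrated = [scale(p, temperature) for p in p_test]
```

Because `temperature` is frozen before any test sample is seen, the same function can run on new test data without refitting. Fitting `temperature` on `p_test` itself, or normalizing each prediction using statistics of the whole test set, would be the kind of thing the rule excludes.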

Loki_K · April 1, 2022, 4:06am

So we can’t use techniques like pseudo-labeling?

jayqi · April 1, 2022, 2:49pm

@Loki_K Per the rules that I quoted above, you are not allowed to use pseudo-labeling that involves training on any test set data. Your model should not be fit on any test set data.
