Different results on personal test set and competition test set

I’ve split the labelled data into train/validation/test sets. I can produce models that generalise well to the test set I’ve created, but they perform worse on the (unlabelled) competition data. The increase in loss from my test results to the competition results is ~40% (i.e. from about 0.4 to about 0.56).
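For reference, here’s roughly how I’m doing the split — a minimal sketch, assuming a `train.csv` file and a `label` column (both names are placeholders, not the actual competition files):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

labelled = pd.read_csv("train.csv")  # hypothetical file name

# Hold out 20% for test, then 20% of the remainder for validation,
# stratifying on the label so class proportions match across splits.
train_val, test = train_test_split(
    labelled, test_size=0.2, stratify=labelled["label"], random_state=42
)
train, val = train_test_split(
    train_val, test_size=0.2, stratify=train_val["label"], random_state=42
)
```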

Is anyone else having this same issue? Is there a fundamental difference in the labelled and unlabelled data (e.g. taken from different geographical locations) that I’m missing?

Are you using the unverified examples? Use those only with extreme care; whatever automated process labelled them is not reliable.

Turns out there was a bug in my code - I thought I was using only verified examples, but the unverified ones had slipped in somehow! I’ve re-trained my model now and the losses are far more similar, plus the predicted class distribution on the competition data is much closer to the distribution in the original dataset.
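In case anyone hits the same thing, this is roughly what the fix looked like — a sketch only, assuming a `verified` flag column and a `label` column (both hypothetical names):

```python
import pandas as pd

labelled = pd.read_csv("train.csv")  # hypothetical file name

# The bug: unverified rows were slipping back in during preprocessing.
# The fix: filter on the verification flag after all other steps.
verified_only = labelled[labelled["verified"] == 1]

# Sanity check: the label distribution of the verified training data
# should roughly match the predicted class distribution on the
# competition data after re-training.
print(verified_only["label"].value_counts(normalize=True))
```

Printing the normalised class distribution like this is what showed me the two datasets had drifted apart in the first place.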

Thanks for pointing this out!

It seems to me that I am still facing this issue. After reaching a loss of about 0.5, my cross-validation and LB scores diverge (e.g. CV = 0.43, LB = 0.68).