We would like to encourage other teams to share their results, so that we can all have fruitful discussions and learn from each other. To that end, here is the confusion matrix for an image model trained on just one fold of our local validation set.
This model achieves a loss of 0.28 on our local validation set and a loss of 0.6272 on the leaderboard.
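For anyone who wants to produce comparable numbers on their own fold, here is a minimal sketch of both computations. It assumes the competition loss is multiclass log loss (the exact metric is not stated above) and uses made-up toy labels and probabilities; the function names are just for illustration.

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) on over-confident wrong predictions.
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

# Toy validation fold (hypothetical data): 4 images, 3 classes.
y_true = [0, 1, 2, 1]
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
]
# Predicted class is the one with the highest probability.
y_pred = [max(range(len(row)), key=row.__getitem__) for row in probs]

loss = multiclass_log_loss(y_true, probs)
matrix = confusion_matrix(y_true, y_pred, n_classes=3)
print(loss)
for row in matrix:
    print(row)
```

Running the same two functions on every fold, not just one, gives a better idea of how noisy a single-fold number is.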
Also, we are still open to finding new teammates, so if you are interested, drop us a line!
How come there is such a huge difference between the local (CV) loss and the leaderboard loss?
I am facing the same issue, but your difference still seems huge.
For me it is 0.44 in local CV and 0.67 on the leaderboard. I would like to team up to explore more and see if we can get something new.
My best single fold score was ~0.42 for both CV and LB.
Yeah, a good validation set is representative of the data the model will see in the future, which in our case is the test set. Rachel Thomas has a good article on items to consider when creating validation sets.
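One common way a validation set ends up unrepresentative is when closely related images (same scene, same subject, same sequence) land in both the training and validation parts while the test set contains only unseen ones. Here is a rough sketch of a split that holds out whole groups instead of individual images. It assumes each image comes with some group identifier, which may not exist in this competition; `group_ids` and `group_holdout_split` are hypothetical names.

```python
import random

def group_holdout_split(group_ids, holdout_fraction=0.2, seed=0):
    """Split sample indices so that no group appears on both sides."""
    groups = sorted(set(group_ids))
    rng = random.Random(seed)
    rng.shuffle(groups)
    # Hold out whole groups, at least one.
    n_holdout = max(1, round(len(groups) * holdout_fraction))
    holdout = set(groups[:n_holdout])
    train_idx = [i for i, g in enumerate(group_ids) if g not in holdout]
    valid_idx = [i for i, g in enumerate(group_ids) if g in holdout]
    return train_idx, valid_idx

# Toy example (hypothetical data): 10 images from 5 groups.
group_ids = ["a", "a", "b", "b", "c", "c", "d", "d", "e", "e"]
train_idx, valid_idx = group_holdout_split(group_ids)

train_groups = {group_ids[i] for i in train_idx}
valid_groups = {group_ids[i] for i in valid_idx}
print(sorted(train_groups), sorted(valid_groups))
```

If the CV score gets worse but moves closer to the leaderboard after a split like this, that is a hint the earlier validation set was leaking information.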