CV/LB correlation

Hello everyone.

I’m seeing a discrepancy between my CV and LB scores.
For now the difference is -0.08 on the leaderboard.

Has anyone managed to find a good validation set?

I’ve seen a gap between performance on my validation set and the test set too. While my top-10 test accuracy was below 90%, my validation top-10 accuracy was consistently 3-5% higher. The two seem to be converging now that I’m above 90%. I wouldn’t worry too much: I’ve made quite a few submissions, and a big increase in validation accuracy has always led to a proportional increase in test accuracy. For reference, my validation set is just a small percentage of the training set, selected at random, with each class represented in proportion to its share of the training data.
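
In case it helps, here is a minimal sketch of that kind of class-proportional random holdout. It is only an illustration under my own assumptions, not the exact code used for the submissions: the function name stratified_holdout, the 10% fraction, the seed and the toy labels are all made up for the example. It uses only the Python standard library; scikit-learn's train_test_split with the stratify argument should give a similar split.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, val_fraction=0.1, seed=0):
    """Split sample indices into (train_idx, val_idx), holding out roughly
    val_fraction of every class. Hypothetical helper for illustration."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        # Keep at least one validation sample for any class with 2+ samples.
        if len(indices) > 1:
            n_val = max(1, n_val)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels (made-up data): three classes with 100, 50 and 20 samples.
labels = ["a"] * 100 + ["b"] * 50 + ["c"] * 20
train_idx, val_idx = stratified_holdout(labels, val_fraction=0.1, seed=42)
```

Fixing the seed keeps the validation set identical between experiments, so changes in validation accuracy come from the model rather than from a different split.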