I have one submission left and am obviously no threat to the leaderboard (my LB score is 0.13). Does anyone want to offer hints or thoughts on creating a local metric that correlates well with the online score?
I understand that I’ve probably overfit, but I don’t understand how I’d address this since I can’t come up with a local metric that correlates well with the online score. Any thoughts?
If my experience is unusual – if you’ve seen good correlation – I’d love to hear that, too, since ‘you made a mistake in your code’ is something that I certainly understand!
FWIW, I am using PyTorch Metric Learning’s `AccuracyCalculator` (with `SubCenterArcFaceLoss`) that I use as a metric against my validation set every `n` epochs. When I score the specific sample scenarios my MAP is lower, at around .7, but still much, much higher than the LB.
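
For reference, here is a minimal sketch of a hand-rolled local MAP@k check that could be compared against the library number. It assumes the leaderboard uses a MAP@k-style metric with exactly one correct label per query, which is a guess and not something confirmed in this thread. The names `average_precision_at_k` and `map_at_k`, and the default `k=5`, are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence


def average_precision_at_k(true_label: Hashable,
                           predictions: Sequence[Hashable],
                           k: int = 5) -> float:
    """AP@k for a single query that has exactly one correct label (assumption).

    Returns 1/rank if the true label appears in the first k predictions,
    otherwise 0.0.
    """
    for rank, pred in enumerate(predictions[:k], start=1):
        if pred == true_label:
            return 1.0 / rank
    return 0.0


def map_at_k(true_labels: Sequence[Hashable],
             predicted_lists: Sequence[Sequence[Hashable]],
             k: int = 5) -> float:
    """Mean of AP@k over all queries."""
    if len(true_labels) != len(predicted_lists):
        raise ValueError("true_labels and predicted_lists must have equal length")
    if not true_labels:
        raise ValueError("need at least one query")
    scores = [
        average_precision_at_k(t, p, k)
        for t, p in zip(true_labels, predicted_lists)
    ]
    return sum(scores) / len(scores)


# Toy usage with made-up labels: hit at rank 1, hit at rank 2, miss.
y_true = ["a", "b", "c"]
y_pred = [["a", "x"], ["x", "b"], ["x", "y"]]
local_score = map_at_k(y_true, y_pred, k=5)
```

Scoring the same validation predictions with both this and the library metric would at least show whether the two local numbers agree. It would not say anything about whether the validation split resembles the hidden test set.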