Test set / scenarios approximating submission MAP?

My local MAP differs pretty radically from my submission score. Has anyone figured out a technique or specific data set that approximates the submission results a little more accurately?

With only 3 submissions per week, it’s very challenging to make progress without a reasonably good local approximator.