What is the private score?

What is the difference between score and private score and why it both so differ?

The public score is when your model is only evaluated on a part of the test data. The private score is when your model is evaluated on all data. The reason for having these two scores is to avoid people to create an algorithm that overfits on the leaderboard by taking the score as feedback. As an example, someone got a perfect public score on a binary classification problem for the Data Science Bowl in 2017 with 1 million prize pool, after 16 (!!!) submissions: https://www.kaggle.com/c/data-science-bowl-2017/discussion/27800

The reason why there’s such a large difference is unknown to me as well. Probably, the private and public part of the test set has been decided/sampled at random, and some differences between both distributions have emerged… Moreover, there’s of course always some variance on your model’s performance, based on which data is used to evaluate.

1 Like