Sprint #3 Prescreening (For Prize Eligibility!)

Hello all - This is a quick overview of the DP Prescreen process for everyone joining us in Sprint #3. (This post will look familiar if you were with us in previous sprints, with a few updates!)

The goal of the challenge is to develop an algorithm that provides good utility on the temporal map problem… while preserving privacy. The leaderboard metric only scores utility, which means the easiest way to get a high score is just to take the ground truth data and submit it right back again with no privacy protection. That is fair game for the open arena leaderboard, but it's not really the objective of the challenge. We also need to ensure your approach satisfies differential privacy. As a first step, we ask that you submit a write-up explaining your algorithm and proving that it satisfies differential privacy. So! For your convenience:

  • Resources explaining differential privacy are here
  • Directions for prescreening are here
  • Prescreen submissions are accepted here
  • And a sample submission is in the competitor’s pack, here
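For anyone new to the idea, here is a rough sketch of the difference between resubmitting the ground truth and releasing a privatized version of it. It uses the textbook Laplace mechanism on a small list of counts. Everything in it is an assumption for illustration only: the function names, the toy counts, and especially the `sensitivity=1.0` default, which only holds if each individual can change a single count by at most 1. It is not the challenge's reference solution, and a real submission would need its own sensitivity analysis and a more careful noise sampler than Python's `random` module.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_counts(true_counts, epsilon, sensitivity=1.0, seed=None):
    """Add Laplace noise with scale sensitivity/epsilon to each count.

    `sensitivity` is an ASSUMPTION: the most that one individual can change
    the whole list of counts (in L1 norm). It must be justified in a proof.
    `seed` exists only to make this demo reproducible; do not fix the seed
    in a real privacy-preserving release.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [count + laplace_noise(scale, rng) for count in true_counts]


# Hypothetical toy data, not drawn from the challenge data set.
ground_truth = [120, 45, 0, 7]

# "Submit it right back again": perfect utility, no privacy protection.
no_privacy = list(ground_truth)

# Noisy release: some utility is traded for a privacy guarantee.
private = privatize_counts(ground_truth, epsilon=1.0, seed=0)
```

A smaller `epsilon` means a larger noise scale, so stronger privacy and lower utility. The write-up you submit for prescreening is where you would argue why your chosen sensitivity and privacy budget are correct for your algorithm.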

SME Panel Prescreening happens every Wednesday morning. If you submit by Tuesday night, we’ll take a look Wednesday and get you feedback promptly. Prescreening is a high-level check for obvious/significant mistakes. We also do a DP Validation review during final scoring, which involves a much closer read of the proof as well as a source code review. Naturally, only valid differentially private solutions are eligible for final prizes.

Prescreening is important for fun and profit! Once you’ve passed prescreening, you’ll be cordially invited to submit to a second, prescreened-only leaderboard. All submissions in the prescreened arena rank above all submissions on the open arena board. The easiest way to win a $1K progressive prize on April 26th is to make sure you’ve passed prescreening by April 20th, and then enter the prescreened arena and submit to the leaderboard there. Note that you must also have entered the prescreened arena before the end of the development phase on May 10th to be eligible for an invitation to final scoring.

One exciting change for this sprint: The live prescreened leaderboard will be calculated based on a separate ground truth data set that shares the same schema as the provided “public” data but is drawn from a different year. This is different from previous sprints, which used the same public ground truth data set for prescreened scoring. Using a separate data set in this sprint will give a clearer sense of your solution’s performance on non-public data, and it will discourage solutions from hard-coding values that may change from year to year.

If you’re new to differential privacy, it’s a good idea to submit to prescreening early, and then (optionally) resubmit each time you make a significant algorithm change, to confirm you’re continuing to satisfy differential privacy and to better ensure you’ll be eligible for final prizes. We’ll give you feedback and help you understand any oversights or violations.

Feel free to reply here with any questions that we can help address. Good luck!