First off, everyone who has submitted to DP Prescreening has now passed. Congratulations! If you haven’t submitted to prescreening yet, our last review will be held at 9am ET, next Wednesday 11/4. Make sure to get your algorithm write-up in before then!
Second, we’re getting closer to the Nov 9th deadline for invitations to final scoring. To be invited to final scoring and be eligible for the remaining $25,000 in prizes for Sprint #1, you need to do the following:
(1) Have passed prescreening (11 teams have done this!)
(2) Have submitted your solution to the “prescreened arena” (7 teams have done this!)
(3) Get a reasonably competitive score (better than the naive baseline solution) on the prescreened leaderboard (7 teams have done this!)
Note! If you’ve been following along on the open arena, but aren’t sure you’ll be able to make it through the remaining steps for the Sprint #1 final scoring by Nov 9th, remember that this is only the first sprint of a three-sprint match. Stay tuned! You’ll have another opportunity to jump into the competition this winter, and even a third opportunity next spring! Participating in Sprint #1 isn’t required to join in later sprints.
Once you’ve completed the above steps and are invited to final scoring, you’ll have until November 15th to prepare your final scoring submission. The final scoring will include a source code review and a much more thorough differential privacy verification (please be kind to your reviewers and comment your code!).
With your final scoring submission, we’ll ask you to provide:
(1) Your executable code (just like you submit to the prescreen leaderboard)
(2) Your source code (commented and organized (but not so drastically last-minute refactored that you accidentally break it immediately before submission (this week, for instance, is a good refactoring week)))
(3) A Code Guide which shows where exactly in the source code to find each step of the privatization algorithm
…and also tells us which category each file/component of the code falls into:
- Pre-processing (processes each individual’s records independently of all others, so it incurs 0 sensitivity cost),
- Privatization (processes the whole data set, incurs sensitivity cost, and adds appropriate noise to account for it),
- Post-processing (only touches privatized data and has no contact with raw input data).
(4) And finally, a clear and complete write-up of your differentially private algorithm and privacy proof. This should be a bit longer and more thorough than your prescreen submission if you’re doing something more than simple counts… it needs to clearly define what you’re doing at each step, so we can double-check your logic. Make sure all of your notation is clearly defined. If you’re referencing a research paper, add a link to an accessible PDF, and include relevant excerpts in your write-up.
If we have questions about steps in your approach, we’ll reach out to you (you’ll receive an email from DrivenData asking for clarification), and we’ll need you to respond in order to maintain prize eligibility. Rather than anxiously watching your inbox over Thanksgiving, it’s best to be clear and thorough in your write-up the first time around. That may even help you catch any small mistakes yourself, before we get to them.
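To make the three Code Guide categories above concrete, here is a minimal sketch of how a simple counting solution might separate pre-processing, privatization, and post-processing. It is only an illustration, not a reference solution: the function names, the `(individual_id, category)` record layout, the per-individual cap, and the fixed public category list are all assumptions made for this example. It also assumes neighboring data sets differ by adding or removing one individual, so that capping each person at `max_per_individual` records gives the counts an L1 sensitivity of `max_per_individual`. The noise comes from Python's standard `random` module and ordinary floating point, which is fine for a sketch but would not hold up under a thorough privacy review.

```python
import random
from collections import Counter, defaultdict


def preprocess(records, max_per_individual):
    """Pre-processing: cap each individual's records, independently of everyone else."""
    by_individual = defaultdict(list)
    for individual_id, category in records:
        by_individual[individual_id].append(category)
    # Truncating one person's list never looks at anyone else's data.
    return {pid: cats[:max_per_individual] for pid, cats in by_individual.items()}


def laplace_noise(scale, rng):
    """The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(capped, public_categories, epsilon, max_per_individual, rng):
    """Privatization: aggregate the whole data set, then add noise scaled to sensitivity."""
    counts = Counter(cat for cats in capped.values() for cat in cats)
    # One individual changes the counts by at most max_per_individual in total (L1).
    scale = max_per_individual / epsilon
    # Iterate over a fixed public list, not over the categories seen in the data.
    return {cat: counts[cat] + laplace_noise(scale, rng) for cat in public_categories}


def postprocess(noisy_counts):
    """Post-processing: only touches privatized values, never the raw records."""
    return {cat: max(0, round(value)) for cat, value in noisy_counts.items()}


# Hypothetical toy data: (individual_id, category) pairs.
records = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y")]
capped = preprocess(records, max_per_individual=2)
noisy = privatize(capped, ["x", "y", "z"], epsilon=1.0,
                  max_per_individual=2, rng=random.Random())
print(postprocess(noisy))
```

Keeping each category in its own function like this is one way to make the Code Guide easy to write: a reviewer can check that only `privatize` reads the aggregated raw data, and that `postprocess` receives nothing but the noisy output.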