Implementation of the bias penalty

odaniel1 · October 6, 2020, 6:48pm

In the calculation section of the problem description, the bias penalty is described as:

2C. Add a bias penalty (BP) of 0.25 if the sum of the raw, unzeroed privatized counts is more than 500 off from the ground truth for row i.

The diagram agrees with this: in the example they take counts [0, 2, 28, 20] and [26, 0, 2, 22], calculate the sum of each to be 50, and conclude that the absolute difference is 0.

In the implementation of the metric provided in the problem repo, it appears that the code instead takes the sum of the term-wise absolute differences, the relevant line being:

bias_mask = np.abs(actual - predicted).sum() > self.allowable_raw_bias
(I couldn’t link to GitHub, but it’s line 35 in runtime/scripts/metric.py)

So in the original example above this would be the sum:
|0 - 26| + |2 - 0| + |28 - 2| + |20 - 22| = 26 + 2 + 26 + 2 = 56
In this example the result is still well below 500, but the difference has significant ramifications when applied to the real data.

Which of these two metrics is the intended measure? If it is the implemented (L_1) measure, then this puts a very tight constraint on the accuracy required: e.g. with 178 incident types, each count can deviate by less than 500/178 ≈ 2.8 on average.
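
To make the difference concrete, here is a minimal sketch of the two readings applied to the example counts from the diagram. The variable names are my own and are not taken from metric.py; only the second expression mirrors the line quoted above.

```python
import numpy as np

# Example counts from the diagram in the problem description.
actual = np.array([0, 2, 28, 20])
predicted = np.array([26, 0, 2, 22])

# Reading 1 (problem description): compare the row totals.
diff_of_sums = int(abs(actual.sum() - predicted.sum()))

# Reading 2 (quoted implementation): sum of term-wise absolute
# differences, i.e. the L_1 distance between the two rows.
sum_of_abs_diffs = int(np.abs(actual - predicted).sum())

# Average per-count deviation allowed under reading 2, assuming the
# threshold of 500 and 178 incident types mentioned in this post.
allowable_raw_bias = 500
n_incident_types = 178
per_count_budget = allowable_raw_bias / n_incident_types

print(diff_of_sums)                # 0
print(sum_of_abs_diffs)            # 56
print(round(per_count_budget, 1))  # 2.8
```

The two readings only agree when every term-wise difference has the same sign; otherwise the L_1 version is strictly larger.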

bull · October 7, 2020, 1:25am

Thanks for the keen eye @odaniel1! That is a bug in the implementation, we’ve fixed it in the runtime repo here:

It has also been fixed on the DrivenData platform and the submissions have been rescored. Looks like some slight changes to submission-level scores, but the leaderboard is unchanged.

Let us know if you have any other issues!

odaniel1 · October 7, 2020, 5:53am

Cheers @bull.

One smaller thing I spotted: the description of the metric on the Submissions page (sub-section Primary Evaluation Metric), as well as in the hover-text on the leaderboard page (hover over Best Public) state the metric to be:

PieChartJSD = ∑[1−max(0,𝖩𝖲𝖣i+𝖡𝖯i+𝖬𝖯𝖯i)]

This should be

PieChartJSD = ∑max(0, 1 − (𝖩𝖲𝖣i+𝖡𝖯i+𝖬𝖯𝖯i))

or an alternative formulation. The important difference is that 𝖩𝖲𝖣i+𝖡𝖯i+𝖬𝖯𝖯i ≥ 0 by definition, so the max in the first formula never has any effect and the per-row score is not actually clipped to [0,1].
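
A quick sketch of why the placement of the max matters, reading the intended form as max(0, 1 − (JSD + BP + MPP)) for a single row. The function names and the JSD and MPP values are hypothetical, chosen only so that the penalties push the total above 1; BP = 0.25 is the value from the problem description.

```python
def row_score_as_documented(jsd, bp, mpp):
    # 1 - max(0, total): the max is a no-op because total >= 0,
    # so the result can fall below zero.
    return 1 - max(0, jsd + bp + mpp)


def row_score_clipped(jsd, bp, mpp):
    # max(0, 1 - total): the row score is floored at zero.
    return max(0, 1 - (jsd + bp + mpp))


# Hypothetical row: JSD and MPP values are illustrative only.
jsd, bp, mpp = 0.5, 0.25, 0.5

print(row_score_as_documented(jsd, bp, mpp))  # -0.25
print(row_score_clipped(jsd, bp, mpp))        # 0
```

For rows where the total stays at or below 1 the two forms give the same value, so the discrepancy only shows up on heavily penalized rows.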

isms · October 12, 2020, 6:21pm

This has been updated too, thanks @odaniel1
