Does knowledge extracted from public data violate privacy?

cuongk14 · October 30, 2020, 7:39pm

When we processed the true (non-private) 2019 public dataset, we found some interesting general patterns that might hold for future data (2020), for example heuristic rules, sparsity patterns, etc. We would like to use these observations in our DP algorithm design.
Would you accept that this strategy still satisfies the definition of differential privacy, even though we touch the ground-truth data to learn these patterns?

We are aware that if we build too many hard rules into our algorithm design, our model might be subject to overfitting and thus likely will not perform well on future data. However, we just want to make sure that we are allowed to extract general (not specific) knowledge from the 2019 data and incorporate it into our algorithm design.

Thanks

Christine_Task · November 2, 2020, 8:31pm

Yep, you can use absolutely anything you want from the 2019 data (and only the 2019 data) to inform your algorithm design without violating differential privacy. The 2019 data we gave you during the development phase is considered to be “Previously Publicly Released Anonymous Data”, which means that using it does not cause any (new) privacy loss. This is a common real-life scenario, where organizations release simply anonymous data for many years before considering a switch to formally private data. In general, using previously released, publicly available data to inform the fit/behavior of your algorithm will improve its performance, and understanding your target data context is a good idea. Overfitting is an issue for accuracy, not privacy.
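To make the point concrete, here is a minimal Python sketch of one way a sparsity pattern learned from public data could feed into a DP release. It is only an illustration under stated assumptions, not a method prescribed by the challenge: the function names (`public_support`, `dp_histogram`) and the toy counts are hypothetical, the Laplace mechanism is a standard choice swapped in for the example, and the sensitivity of 1 assumes each individual contributes at most one record to one cell, which may not hold for the actual challenge data.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def public_support(public_counts):
    """Cells that were nonzero in the PUBLIC (e.g. 2019) histogram."""
    return {cell for cell, count in public_counts.items() if count > 0}


def dp_histogram(private_counts, support, all_cells, epsilon,
                 sensitivity=1.0, rng=None):
    """Release a noisy histogram, using a publicly derived support.

    ASSUMPTION: `sensitivity` is the L1 sensitivity of the full histogram
    (1.0 if each person contributes at most one record to one cell).
    """
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    released = {}
    for cell in all_cells:
        if cell in support:
            # Private data is touched only here, always through the
            # Laplace mechanism.
            released[cell] = private_counts.get(cell, 0) + laplace_noise(scale, rng)
        else:
            # Structural zero chosen from public data alone: this output
            # is the same no matter what the private data contains.
            released[cell] = 0.0
    return released


# Hypothetical toy data: public 2019 counts and private 2020 counts.
public_2019 = {"a": 5, "b": 0, "c": 2}
private_2020 = {"a": 7, "b": 3, "c": 0}

support = public_support(public_2019)
noisy = dp_histogram(private_2020, support, ["a", "b", "c"], epsilon=1.0)
```

Because `support` is computed from the public data only, it is fixed before the private data is ever read, so the privacy cost is just that of the Laplace mechanism. The cost of a wrong guess shows up as error instead: cell `"b"` is released as 0 even though the private count is 3, which is the accuracy (not privacy) risk of overfitting to the public data.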
