When we processed the 2019 ground-truth (non-private) public dataset, we found some interesting general patterns that might also hold for the future (2020) data, for example certain heuristic rules or sparsity patterns. We would like to incorporate these observations into our DP algorithm design.
Does this strategy still satisfy the differential privacy definition, given that we touched the ground-truth data to learn these patterns?
We are aware that if we hard-code too many such rules into our algorithm, the model may overfit and is unlikely to perform well on future data. However, we just want to confirm that we are allowed to extract general (not record-specific) knowledge from the 2019 data and incorporate it into our algorithm design.
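To make the question concrete, here is a minimal sketch of the kind of design we have in mind. Everything in it (the category support, the counting query, the epsilon value) is a hypothetical illustration, not our actual algorithm: structure learned from the public 2019 data is fixed before the private data is touched, and only the final counting step accesses the private 2020 records through a Laplace mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_counts(private_values, support, epsilon):
    """Release noisy counts over a fixed, data-independent support.

    `support` (e.g. categories observed to be common in the public
    2019 data) is chosen without looking at the private data, so the
    privacy analysis only concerns this function's access to
    `private_values`. Under add/remove adjacency, one record changes
    at most one count by 1 (L1 sensitivity 1), so Laplace noise with
    scale 1/epsilon gives epsilon-DP for the whole vector of counts.
    """
    counts = np.array([np.sum(private_values == s) for s in support],
                      dtype=float)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=len(support))
    return dict(zip(support, noisy))

# Hypothetical usage: the support comes from public 2019 analysis;
# the mechanism is then applied to (simulated) private 2020 data.
public_support = ["A", "B", "C"]  # learned from the public dataset
private_2020 = np.array(["A", "A", "B", "C", "C", "C"])
release = laplace_counts(private_2020, public_support, epsilon=1.0)
```

The point of the sketch is the separation of roles: the public data only determines fixed parameters of the mechanism, while the privacy budget is spent exclusively on the private 2020 records.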