Track B output format

Hi @jayqi,

We have some last-minute clarification questions regarding the output format for Track B (pandemic):

  • Does the evaluation pipeline depend on the predicted scores being between 0 and 1? Or will we receive the same score as long as the ranking is equivalent (which is expected for AUPRC)?
  • Is each federation unit’s output evaluated separately or are all output predictions concatenated across units for a global AUPRC score?

Thank you,

Hi @hhcho,

  1. I believe confidence scores outside of [0.0, 1.0] should technically work and give you the same AUPRC score. However, it is not officially documented or supported, so you would be doing so at your own risk.
  2. For the federated solution evaluation, the outputs from all federation units are concatenated together and used to calculate a global AUPRC score. This is done separately for each of the three partitioning scenarios, so your federated solution will produce three AUPRC scores.
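
To make both points concrete, here is a minimal sketch, not the official evaluation code. It assumes AUPRC is computed as step-wise average precision, which may differ from what the pipeline actually uses. The function `average_precision`, the unit names, and the labels and scores are all hypothetical and exist only for illustration.

```python
import math


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Assumption: this mirrors a common average-precision definition;
    the real evaluation pipeline may compute AUPRC differently.
    Tied scores are treated as a single threshold.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank predictions from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ap, tp, prev_recall = 0.0, 0, 0.0
    i = 0
    while i < len(order):
        # Consume every prediction that shares the current score.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            tp += labels[order[j]]
            j += 1
        recall = tp / total_pos
        precision = tp / j  # j predictions have been ranked so far
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Hypothetical per-unit outputs: (true labels, predicted scores).
unit_outputs = {
    "unit_a": ([1, 0, 1], [0.9, 0.8, 0.3]),
    "unit_b": ([0, 1, 0, 0], [0.7, 0.6, 0.2, 0.1]),
}

# Point 2: concatenate all units, then compute one global score.
all_labels = [y for labels, _ in unit_outputs.values() for y in labels]
all_scores = [s for _, scores in unit_outputs.values() for s in scores]
global_ap = average_precision(all_labels, all_scores)

# Point 1: a strictly increasing transform pushes scores outside [0, 1]
# but leaves the ranking, and therefore the score, unchanged.
shifted_scores = [10.0 * s - 5.0 for s in all_scores]
shifted_ap = average_precision(all_labels, shifted_scores)

print(global_ap, shifted_ap)
```

One consequence of the concatenation, under the same assumptions: only the ranking across all units matters, so scores from different federation units need to be on a comparable scale. A transform applied identically to every unit is harmless, while a different transform per unit could change the global ranking.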