Track B - training data problem - R->S transitions present

I was just looking through the Track B data as described on the competition page (PETs Prize Challenge: Phase 1).

According to the documentation, the simulation uses an SIR model, and it states that “Once an individual is in the recovered R state, they will not be able to become infected again within this simulation.” There is an additional quirk for asymptomatic infections, which manifest as direct transitions from an S label to an R label.

However, in the va_synthetic_outbreak_training.csv.gz dataset, there are many examples of disallowed transitions.

  • For example, individual 654975 goes SIRSIR; that is to say, they start susceptible, are infected and recovered, but then become susceptible again and reinfected.
  • Alternatively, individual 2605690 goes SIS; after they are no longer infected, they become labelled as susceptible again.
  • Looking through the data, I found hundreds of such forbidden R → S transitions, as well as the occasional forbidden I → S transition.
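
For anyone who wants to reproduce this kind of check, here is a minimal sketch. It assumes each row can be reduced to a (person ID, day, state) triple; that layout and the name `forbidden_transitions` are illustrative assumptions, not the actual CSV schema or the exact code behind the counts above. The toy rows are made up to mimic the SIRSIR and SIS patterns.

```python
from collections import defaultdict

# Transitions the documented SIR model permits
# (S -> R covers the asymptomatic-infection case).
ALLOWED = {("S", "I"), ("I", "R"), ("S", "R")}


def forbidden_transitions(records):
    """Return {pid: [(day, prev_state, new_state), ...]} for disallowed changes.

    `records` is an iterable of (pid, day, state) tuples. This layout is an
    assumption for illustration, not the real column names in the CSV.
    """
    by_person = defaultdict(list)
    for pid, day, state in records:
        by_person[pid].append((day, state))

    violations = {}
    for pid, history in by_person.items():
        history.sort()  # order each person's observations by day
        bad = [
            (day, prev, cur)
            for (_, prev), (day, cur) in zip(history, history[1:])
            if prev != cur and (prev, cur) not in ALLOWED
        ]
        if bad:
            violations[pid] = bad
    return violations


# Toy data only (not real rows from the dataset).
toy = (
    [(1, d, s) for d, s in enumerate("SIRSIR")]
    + [(2, d, s) for d, s in enumerate("SIS")]
    + [(3, d, s) for d, s in enumerate("SSIIRR")]
)
result = forbidden_transitions(toy)
print(result)
```

On the toy data, person 1 is flagged for an R → S change, person 2 for an I → S change, and person 3 (a valid S → I → R history) is not flagged.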

Am I just misunderstanding the description on the website or badly parsing the CSV somehow? It seems that the model description does not match the model actually used to generate the training data.

Thanks so much for any help or clarifications!


Aside: one thought my teammates and I had was that perhaps this is simply meant to model noise in the data. However, we also see examples in the data of R → S only, without any later infection. A relabelling from recovered to susceptible with no subsequent infection does not seem like something public health authorities could observe in the vast majority of cases.

Hi @yunwilliamyu,

Your understanding of the data is correct. The disallowed transitions you found in the dataset were due to an error. That error was fixed and a corrected dataset was published on August 18. There is an announcement about the correction on the announcements page. Please redownload the dataset to get the corrected version. The announcement has file hashes to help you ensure that you have the updated files.
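
A minimal sketch of checking a downloaded file against a published hash is below. It assumes SHA-256 as the default; the announcement specifies the actual algorithm and hash values, and the helper name `file_digest` is hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to limit memory use.

    The default algorithm is an assumption; use whichever one the
    announcement lists.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Compare the result of `file_digest("va_synthetic_outbreak_training.csv.gz")` with the value in the announcement; a mismatch suggests the file on disk is still the old version.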

Thanks so much for the update! (and apologies for not having seen that announcement)