If training on the reference data is not recommended, but the required model output has to include a reference id, how can one connect the training data to the reference data, given that the model knows nothing about the reference data?
I’d encourage you to check out the “Getting Started” blog post. The passage below might point you in the right direction:
Unlike for a typical supervised machine learning competition, developing a model for this challenge is not as simple as just training a model on provided labeled training data. The query and reference sets are intended for evaluation and not for training.
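To make that a bit more concrete, here is a minimal sketch of one common way to read it. It assumes an embedding-based approach, which is my assumption and not something the blog post prescribes: the model is trained elsewhere to produce descriptors, and the reference ids only come in at inference time, when each query descriptor is compared against the reference descriptors. The names `match_queries`, `l2_normalize`, and the ids `R1` to `R3` are hypothetical, and the toy vectors stand in for whatever your trained model would output.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def match_queries(query_emb, ref_emb, ref_ids):
    """Return (reference_id, cosine_similarity) of the best match for each query.

    The embeddings are assumed to come from a model that was trained
    WITHOUT the query or reference sets; those sets are only embedded here.
    """
    q = l2_normalize(np.asarray(query_emb, dtype=float))
    r = l2_normalize(np.asarray(ref_emb, dtype=float))
    sims = q @ r.T                 # shape: (num_queries, num_references)
    best = sims.argmax(axis=1)     # index of the most similar reference per query
    return [(ref_ids[j], float(sims[i, j])) for i, j in enumerate(best)]


# Toy descriptors standing in for real model outputs (hypothetical values).
ref_ids = ["R1", "R2", "R3"]
ref_emb = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query_emb = [[0.9, 0.1], [0.1, 2.0]]

matches = match_queries(query_emb, ref_emb, ref_ids)
for ref_id, score in matches:
    print(ref_id, round(score, 3))
```

In other words, the model never needs to have seen a reference id during training. The id is just a label attached to whichever reference descriptor ends up closest to the query, and the similarity score can serve as a confidence value if the submission format asks for one.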
Hope this helps.