I was doing some simple analysis on both files and noticed that there are 68 unique
grid_ids in submission_format but only 66 unique
grid_ids in train_labels. The two missing grid_ids in train_labels correspond to
Are we supposed to train a model without those grid ids?
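A check like this can be sketched as follows. This is only a minimal illustration: it assumes both files are CSVs with a `grid_id` column, and the helper names (`unique_ids`, `missing_ids`) and the toy rows are made up for the example, not taken from the competition data.

```python
import csv
import io


def unique_ids(csv_file, column="grid_id"):
    """Collect the distinct values of `column` from an open CSV file."""
    return {row[column] for row in csv.DictReader(csv_file)}


def missing_ids(submission_file, labels_file, column="grid_id"):
    """Return IDs that appear in the submission format but not in the labels."""
    return sorted(unique_ids(submission_file, column) - unique_ids(labels_file, column))


# Toy stand-ins for submission_format and train_labels (hypothetical rows).
submission_csv = io.StringIO(
    "datetime,grid_id,value\n"
    "2021-01-01,A,0\n"
    "2021-01-01,B,0\n"
    "2021-01-01,C,0\n"
    "2021-01-02,A,0\n"
)
labels_csv = io.StringIO(
    "datetime,grid_id,value\n"
    "2021-01-01,A,1.5\n"
    "2021-01-02,A,2.0\n"
    "2021-01-01,B,0.7\n"
)

missing = missing_ids(submission_csv, labels_csv)
print(missing)  # grid ids with no training labels
```

With the real files, the two `io.StringIO` objects would be replaced by `open(...)` calls on the downloaded CSVs.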
Hi @bryanyahir03 - thanks for the observation! Yes, for the NO2 track there are two grid ids in the
submission_format that are not present in
train_labels. Models should be able to generalize to new unseen grid cells.
Thanks for your reply,
I have another question, about the following statement:
winning solutions must be able to produce predictions for the same grid cells on a single new day.
Is the satellite data available in the S3 bucket for the new date, or do we need to develop a model that downloads the data from the original source?
The satellite data already hosted on S3 will remain hosted on S3 for new dates. Your final submission should include any code needed to retrieve ancillary data or other satellite data needed for a new date.