Grid IDs that are in submission_format but not in train_labels

I was doing some simple analysis on both files and noticed that there are 68 unique grid IDs in submission_format but only 66 unique grid IDs in train_labels. The two grid IDs missing from train_labels are 7F1D1 and WZNCR.
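For anyone who wants to reproduce this kind of check, here is a minimal sketch using only the Python standard library. It assumes both files are CSVs with an ID column named `grid_id`; that column name, the helper names, and the tiny inline data are illustrative assumptions, not the real competition files.

```python
import csv
import io


def unique_ids(csv_file, column="grid_id"):
    """Return the set of distinct values in `column` of an open CSV file."""
    return {row[column] for row in csv.DictReader(csv_file)}


def unseen_ids(submission_file, train_file, column="grid_id"):
    """IDs present in the submission format but absent from the training labels."""
    return unique_ids(submission_file, column) - unique_ids(train_file, column)


# Made-up stand-ins for the two files (hypothetical IDs and values).
train_csv = io.StringIO(
    "datetime,grid_id,value\n"
    "2021-01-01,AAAAA,1.0\n"
    "2021-01-02,AAAAA,2.0\n"
    "2021-01-01,BBBBB,3.0\n"
)
submission_csv = io.StringIO(
    "datetime,grid_id,value\n"
    "2021-02-01,AAAAA,0\n"
    "2021-02-01,BBBBB,0\n"
    "2021-02-01,CCCCC,0\n"
)

missing = unseen_ids(submission_csv, train_csv)
print(sorted(missing))  # ['CCCCC']
```

With the real data, the same functions could be called on file handles opened with `open(...)` for the two files, adjusting `column` if the ID column is named differently.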

Are we supposed to train a model without those grid IDs?

Hi @bryanyahir03 - thanks for the observation! Yes, for the NO2 track there are two grid ids in the submission_format that are not present in train_labels. Models should be able to generalize to new unseen grid cells.

Thanks for your reply,

I have another question about the following statement:

winning solutions must be able to produce predictions for the same grid cells on a single new day.

Will the satellite data for the new date be available in the S3 bucket, or do we need to develop a model that downloads the data from the original source?

The satellite data already hosted on S3 will remain hosted on S3 for new dates. Your final submission should include any code needed to retrieve ancillary data or other satellite data needed for a new date.
