Data files and train labels

I am looking at the downloaded data files and trying to match the filename with the train labels.
For example, in Train_label.csv the first line is


When I then look through the training data files for the date 20180201 (February 1, 2018), I don’t see a matching file. Could someone please help me understand what I am missing?

I think you might be looking at the wrong file; if you download the training labels CSV file, the first line (after the header) should be:
If you then open the satellite metadata file, you can look for a satellite data granule corresponding to this date. For example, the first file listed there is:
That file would be a subset of the TROPOMI data collected on the same day, covering the “LA” area.
I think you may have the wrong files, since I don’t think February 2018 is covered by any of the datasets; I would suggest re-downloading the training labels file and taking another look.
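To make the lookup above concrete, here is a minimal sketch of filtering metadata rows by date and location. The field names (`granule_id`, `location`, `time_start`) and the rows themselves are placeholders made up for illustration, not the actual contents or schema of the competition's metadata file.

```python
from datetime import date, datetime

# Placeholder rows standing in for the satellite metadata file.
# Field names and values are assumptions, not the real schema.
metadata = [
    {"granule_id": "granule_A", "location": "la", "time_start": "2019-01-01T20:10:00"},
    {"granule_id": "granule_B", "location": "dl", "time_start": "2019-01-01T07:30:00"},
    {"granule_id": "granule_C", "location": "la", "time_start": "2019-01-02T20:05:00"},
]


def granules_for(day, location, rows):
    """Return IDs of granules for `location` whose start time falls on `day`."""
    return [
        row["granule_id"]
        for row in rows
        if row["location"] == location
        and datetime.fromisoformat(row["time_start"]).date() == day
    ]


# A date that the placeholder metadata covers for "la".
matches = granules_for(date(2019, 1, 1), "la", metadata)

# A date with no coverage returns an empty list rather than raising.
missing = granules_for(date(2018, 2, 1), "la", metadata)
```

An empty result for a given date and location is the case where no granule is listed for that sample.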

@nayeemmz It may also be the case that satellite data is missing for some samples. This is just a reality of the temporal sparsity of the satellite data. See the thread here for more discussion on the topic: Satellite data not available for some train/test samples


I guess it may be missing data. I checked the label file again at Sign In

The first line is indeed

Is that not the right label file?

Yes, that is the correct label file for the Particulate Matter track (but not for the Trace Gas track). The first few lines of the corresponding metadata file should look like:


Note that the granule ID timestamps won’t match up exactly with the train label data. Here’s another thread on that topic: Clarification on features' dates used for prediction - #2 by Carl_Malings
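Because the timestamps do not line up exactly, an equality check between a label time and a granule time will usually find nothing. One possible workaround, sketched below, is to take the granule closest in time within some tolerance. The 24-hour tolerance, the granule IDs, and the timestamps are all assumptions for illustration; the linked thread is the place to check which granules are actually appropriate to use for a given label.

```python
from datetime import datetime, timedelta


def nearest_granule(label_time, granule_times, tolerance=timedelta(hours=24)):
    """Return the ID of the granule closest to `label_time`.

    Returns None when no granule lies within `tolerance`.
    """
    best_id, best_gap = None, tolerance
    for granule_id, granule_time in granule_times.items():
        gap = abs(granule_time - label_time)
        if gap <= best_gap:
            best_id, best_gap = granule_id, gap
    return best_id


# Placeholder granule timestamps (hypothetical, not taken from the real metadata).
granule_times = {
    "granule_A": datetime(2019, 1, 1, 20, 10),
    "granule_C": datetime(2019, 1, 2, 20, 5),
}

# The label time differs from every granule time, but one granule is within tolerance.
closest = nearest_granule(datetime(2019, 1, 1, 8, 0), granule_times)

# Nothing is within tolerance of this label time, so the result is None.
far = nearest_granule(datetime(2018, 2, 1, 8, 0), granule_times)
```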

Thank you very much @cszc
