Data Extraction and Processing Times

Hello, just for info: I would like to know how long, on average, it takes you to download the satellite data and process it into the correct format so that it is ready to feed into the model as features. Thank you!

Hi @kovila. I can’t speak for others, but our benchmark notebooks take about 1-1.5 hours to run end to end.

In addition to code for downloading and processing data, the notebooks include code for training a rudimentary model. They only process MAIAC data (for the PM2.5 blog post) and OMI data (for the NO2 blog post). I imagine TROPOMI will take the longest to download and process given the size of the dataset. FWIW, using a machine in the same region as the bucket greatly speeds up the download.
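
If you want to compare numbers on your own setup, here is a minimal sketch for timing each stage separately. It is not taken from the benchmark notebooks: `timed`, `download_files`, and `process_files` are hypothetical names, and the two functions are placeholders you would swap for your own download and processing steps.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per pipeline stage.
timings = {}

@contextmanager
def timed(stage):
    """Record how long the enclosed block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed

def download_files():
    # Hypothetical placeholder: replace with your own download code.
    time.sleep(0.01)

def process_files():
    # Hypothetical placeholder: replace with your own processing code.
    time.sleep(0.01)

with timed("download"):
    download_files()

with timed("process"):
    process_files()

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.2f} s")
```

Timing the stages separately should show whether download or processing dominates, which is useful when deciding whether moving to a machine in the bucket's region is worth it.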


Where are you getting satellite data from?
Send me the links and procedure to download
Also, send me the train_labels.csv file with SWE values for 11K cell_ids. Could not find it.


Hi @vkmr123. Instructions for accessing the satellite data for the NASA Airathon competitions can be found on the data download pages (NO2 and PM2.5).

You can direct SWE-related questions to the Snowcast Showdown - DrivenData Community topic. Posts under this topic that are not related to the Airathon competitions will be deleted.