Hi all! I was wondering what your strategies are for handling the missing data in the satellite files. I imagine it is at the very core of the problem statement. I am approaching this challenge as an opportunity to learn (and I have been reading so many papers - this field is amazing!).
I’d love to know what you have tried or looked at, and whether you are considering downscaling. If so, how would you do it (which data points are you considering as ground truth)?
There are a couple of strategies I would personally try out when confronting missing data:
- The best option (in my opinion) is to build something that is robust to missing data. That is, it will use data when it is available, but when it recognizes that data are missing, it falls back to a “next best” data source to base its estimates on.
- The next option, which I would use if the missing data is a relatively small fraction of the total, is to use some sort of interpolation based on surrounding data points. We know that satellite observations are usually at least somewhat “smooth”, so you can typically fill in a missing pixel by averaging or interpolating between the surrounding pixels. Of course this is not always accurate, and in some cases (e.g., cloud cover) the surrounding pixels are missing too.
- For filling in large chunks of missing data I like to fall back on “climatology”: that is, do we have data from “similar” periods in the past (e.g., the same day of the week for the previous week, or the average of the previous week) which we can use to create a picture of what the “typical” situation might look like?
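To make the fallback idea concrete, here is a minimal sketch of how the three strategies above could be chained: keep the observation when it exists, otherwise average the valid neighbouring pixels, otherwise use a climatology value. Everything in it is an assumption for illustration: the function names `neighbor_mean` and `fill_missing`, the use of NaN to mark missing pixels, the 8-pixel neighbourhood, and the toy grids. It uses plain Python lists to stay self-contained; with real satellite rasters you would more likely work on NumPy or xarray arrays.

```python
import math

NAN = math.nan


def neighbor_mean(grid, r, c):
    """Mean of the valid (non-NaN) 8-connected neighbours, or NaN if none exist."""
    vals = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the pixel itself
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                v = grid[rr][cc]
                if not math.isnan(v):
                    vals.append(v)
    return sum(vals) / len(vals) if vals else NAN


def fill_missing(grid, climatology):
    """Fill NaN pixels, preferring observation > neighbour mean > climatology.

    Returns the filled grid plus a same-shaped grid of labels recording
    which source each value came from.
    """
    filled, source = [], []
    for r, row in enumerate(grid):
        f_row, s_row = [], []
        for c, v in enumerate(row):
            if not math.isnan(v):
                f_row.append(v)
                s_row.append("observed")
                continue
            # Interpolate from the ORIGINAL grid so that filled values
            # never feed into other filled values.
            est = neighbor_mean(grid, r, c)
            if not math.isnan(est):
                f_row.append(est)
                s_row.append("interpolated")
            else:
                f_row.append(climatology[r][c])
                s_row.append("climatology")
        filled.append(f_row)
        source.append(s_row)
    return filled, source


# Toy example: the bottom-right pixel has no valid neighbours at all.
obs = [
    [1.0, 2.0, 3.0],
    [4.0, NAN, NAN],
    [7.0, NAN, NAN],
]
clim = [[5.0] * 3 for _ in range(3)]  # hypothetical "typical" values

filled, source = fill_missing(obs, clim)
for f_row, s_row in zip(filled, source):
    print(f_row, s_row)
```

Keeping the `source` labels around is a deliberate choice: a downstream model can then treat interpolated or climatological pixels with less confidence than real observations, which is one way of being "robust to missing data".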
These are a couple of strategies I like to use. Of course, imputing missing data in general (and satellite data in particular) is a huge area of study!