Clouds in dataset

In my opinion, figuring out how to deal with bad images is going to be what makes or breaks a great solution for this challenge.

See @hengcherkeng’s post here.
My understanding is that he basically uses a transformer to choose the best time step, and then takes that as the prediction.

The way I see the problem is that there are 12 possible time steps, and the differences between them can be very informative, e.g. the way vegetation behaves in winter vs. summer can tell you a lot about how much biomass there is at a location (also, see the comment in the post above about snow).

The hard part is that you often don’t have all 12 images, or the images are cloudy or otherwise degraded. So the question is: how do we deal with a sequence with missing data? This reminds me of how transformers like BERT are trained: some of the input tokens are masked out, and the model has to predict them from the rest of the sequence.

So I’m thinking some kind of model that sees the entire series and then predicts an output, even with missing steps.
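To make that concrete, here is a rough sketch of one way to pool a series with missing steps: attention pooling with a validity mask, so that missing or cloudy months get exactly zero weight. This is only an illustration in NumPy, not anyone's actual solution. The per-step feature vectors, the `valid` flags, and the `query` vector are all assumptions (in a real model the features would come from an image encoder and the query would be learned).

```python
import numpy as np

def masked_attention_pool(features, valid, query):
    """Pool per-time-step features into one vector, ignoring invalid steps.

    features: (T, D) array, one feature vector per time step (hypothetical encoder output)
    valid:    (T,) boolean array, False where the image is missing or too cloudy
    query:    (D,) scoring vector (learned in a real model, fixed here)
    """
    features = np.asarray(features, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    if not valid.any():
        raise ValueError("at least one time step must be valid")

    # Zero out invalid steps so NaNs or garbage there cannot leak into the output.
    features = np.where(valid[:, None], features, 0.0)

    # Scaled dot-product scores; invalid steps are set to -inf before the softmax.
    scores = features @ query / np.sqrt(features.shape[1])
    scores = np.where(valid, scores, -np.inf)
    scores = scores - scores[valid].max()  # subtract the max for numerical stability

    weights = np.exp(scores)               # exp(-inf) == 0 for masked steps
    weights = weights / weights.sum()
    return weights @ features, weights

# Toy usage: 12 monthly steps, 8-dim features, three months unusable.
rng = np.random.default_rng(0)
T, D = 12, 8
features = rng.normal(size=(T, D))
valid = np.ones(T, dtype=bool)
valid[[1, 5, 6]] = False
query = rng.normal(size=D)
pooled, weights = masked_attention_pool(features, valid, query)
```

The same masking idea carries over to a full transformer, where the mask would be applied inside each attention layer instead of only at the pooling step.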

I wonder if some kind of SSL pre-training would help?
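One possible SSL pretext task, in the spirit of BERT-style masking, would be to hide some of the usable time steps and train the model to reconstruct them from the remaining ones. Below is a minimal sketch of how such training examples could be built. The function names, the 25% masking fraction, and the per-step feature representation are my assumptions, and the model that does the reconstruction is left out entirely.

```python
import numpy as np

def make_masked_example(series, valid, mask_frac=0.25, rng=None):
    """Hide a random subset of the valid time steps for a reconstruction task.

    series: (T, D) array of per-time-step features (hypothetical)
    valid:  (T,) boolean array, False where the step is missing or cloudy
    Returns (inputs, visible, hidden): inputs has every non-visible step zeroed,
    visible marks steps the model may look at, hidden marks reconstruction targets.
    """
    rng = np.random.default_rng() if rng is None else rng
    series = np.asarray(series, dtype=float)
    valid = np.asarray(valid, dtype=bool)

    valid_idx = np.flatnonzero(valid)
    if valid_idx.size < 2:
        raise ValueError("need at least two valid steps: one to hide, one to see")

    # Hide at least one step, but always leave at least one visible.
    n_hide = int(round(mask_frac * valid_idx.size))
    n_hide = min(max(1, n_hide), valid_idx.size - 1)

    hidden = np.zeros_like(valid)
    hidden[rng.choice(valid_idx, size=n_hide, replace=False)] = True
    visible = valid & ~hidden

    inputs = np.where(visible[:, None], series, 0.0)
    return inputs, visible, hidden

def masked_mse(pred, target, hidden):
    """Mean squared error computed only on the hidden (target) steps."""
    diff = (np.asarray(pred) - np.asarray(target))[hidden]
    return float(np.mean(diff ** 2))

# Toy usage: 12 steps, 3 of them already unusable, so 9 candidates for hiding.
rng = np.random.default_rng(0)
series = rng.normal(size=(12, 8))
valid = np.ones(12, dtype=bool)
valid[[1, 5, 6]] = False
inputs, visible, hidden = make_masked_example(series, valid, rng=rng)
baseline_loss = masked_mse(inputs, series, hidden)  # "predict zeros" baseline
```

Because the loss only looks at steps that were valid to begin with, genuinely missing or cloudy images never show up as targets.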
