Clouds in dataset

fnands · November 28, 2022, 9:25pm

In my opinion, how you deal with bad images is going to make or break a solution for this challenge.

See @hengcherkeng’s post here.
My understanding is he basically uses a transformer to choose the best time step, and then takes that as a prediction.

The way I see it, there are 12 possible time steps, and the differences between them can be very informative, e.g. the way vegetation behaves in winter vs. summer can tell you a lot about how much biomass there is at a location (also, see the comment about snow in the post above).

The hard part is that you often don’t have all 12 images, or the images might be cloudy or otherwise degraded. So the question is: how do we deal with a sequence with missing data? This reminds me of how transformers like BERT are trained, where part of the input is masked out and the model has to predict it from the rest.

So I’m thinking some kind of model that sees the entire series and then predicts an output, even with missing steps.
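To make the "see the whole series, ignore the bad steps" idea concrete, here is a minimal sketch of attention-style pooling over the time steps where missing or cloudy steps get zero weight. Everything in it is my own assumption for illustration: the function name `masked_attention_pool`, the `valid` flags, and the hand-set `scores` (in a real model a transformer or similar would learn them). It is plain Python for readability, not speed.

```python
import math


def masked_attention_pool(features, scores, valid):
    """Pool per-time-step features into one vector, skipping bad steps.

    features: list of T feature vectors (lists of floats), one per time step;
              invalid steps should hold a finite placeholder such as zeros
    scores:   list of T raw attention scores (hypothetical; a model would learn these)
    valid:    list of T booleans, False where the image is missing or cloudy
    """
    if not any(valid):
        raise ValueError("at least one time step must be valid")

    # Softmax restricted to valid steps; invalid steps get weight exactly 0.
    max_score = max(s for s, v in zip(scores, valid) if v)
    exps = [math.exp(s - max_score) if v else 0.0 for s, v in zip(scores, valid)]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the feature vectors, one output value per feature dimension.
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Toy example with 3 steps; the middle one is flagged as cloudy.
features = [[1.0, 2.0], [100.0, 100.0], [3.0, 4.0]]
scores = [0.0, 5.0, 0.0]
valid = [True, False, True]
pooled, weights = masked_attention_pool(features, scores, valid)
print(weights)  # [0.5, 0.0, 0.5]
print(pooled)   # [2.0, 3.0]
```

The cloudy step has the highest raw score but contributes nothing, so the prediction only depends on the steps you trust. Where the `valid` flags come from (cloud mask, missing file, some heuristic) is a separate question.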

I wonder if some kind of self-supervised (SSL) pre-training would help?
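One way such a pretext task could be set up, BERT-style: hide a few of the good time steps on purpose and ask the model to reconstruct them from the rest. Below is a rough sketch of just the masking part. The function name `make_masked_example` and the 25% `mask_ratio` are assumptions I picked for illustration, not something taken from the challenge or from BERT itself.

```python
import random


def make_masked_example(valid, mask_ratio=0.25, rng=None):
    """Choose which valid time steps to hide for a reconstruction pretext task.

    valid: list of T booleans, False where the image is missing or cloudy
    Returns (visible, targets):
      visible[t] is True if the model is allowed to see step t
      targets[t] is True if the model should reconstruct step t
    """
    rng = rng or random.Random()
    valid_idx = [t for t, v in enumerate(valid) if v]

    # Hide roughly mask_ratio of the valid steps, but always keep at least
    # one valid step visible (and hide nothing if there is nothing to spare).
    n_hide = max(1, round(mask_ratio * len(valid_idx)))
    n_hide = max(0, min(n_hide, len(valid_idx) - 1))

    hidden = set(rng.sample(valid_idx, n_hide))
    visible = [v and t not in hidden for t, v in enumerate(valid)]
    targets = [t in hidden for t in range(len(valid))]
    return visible, targets


# 12 monthly steps, the last 4 already missing or cloudy.
valid = [True] * 8 + [False] * 4
visible, targets = make_masked_example(valid, rng=random.Random(0))
print(sum(visible), sum(targets))  # 6 2
```

Steps that are already bad are never chosen as targets, since there is no clean image to compare the reconstruction against.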
