Hi there again, regarding this note:
"Your models may take advantage of the temporal data provided for each storm up to the point of prediction. Keep in mind that the goal of this competition is to produce an operational model that uses recent images to estimate future wind speeds."
Slightly related: what about using future test images to infer or affect predictions for earlier (in time) test images? I can think of a few ways to use those, e.g. unsupervised training. I understand that for live/production use this is obviously invalid (we don't have future images), but here we do have them, and the way the competition is set up doesn't seem to block it explicitly. It is also probably complex to enforce. Only by carefully checking all the code?
Though I read this:
"Keep in mind that the goal of this competition is to produce an operational model that uses recent images to estimate future wind speeds"
The competition doesn't seem to be about future wind, only about wind in the current images (at least that is the stronger signal). Unless the wind label was actually shifted into the future relative to the image?
So I'm a bit confused. Could this be made clearer?
To be clear: are we allowed to use the full information provided, including future images/metadata, to inform or contribute to predictions for earlier images?
Personally, since the setup certainly allows for this, I would say it should be allowed: the only way to enforce the rule is a rather exhaustive check of the code for any related leakage, and even then nothing would prevent teams from using future information for private model selection/evaluation without reporting it (not visible). The temporal aspect could be improved in a future competition with multiple stages.
It could also be that this is evaluated differently for objective vs. qualitative winners (e.g. allowed for objective scores but penalized qualitatively), so that should be made clear too.
Hi @rquintino - I think you’ve answered this yourself, but just in case it’s not clear:
You are not. Per the problem description: your solution may not use images captured later in a storm to estimate the wind speeds of images captured earlier in a storm.
Hope that helps!
Just to make sure - can we use pseudo-labelling of test data and include it during training?
It will be interesting to get confirmation from @glipstein,
though I would say, from what is stated above and the spirit of the problem, that use of any future information shouldn't be allowed. That would include pseudo-labelling future test data, and a few other things, like
using any test storm data (images/metadata) in a way that affects other storms' predictions (because the temporal axis is relative, we don't have absolute time)
Hi @azkalot1 - Thanks for confirming. @rquintino is correct. You can imagine that you are running the prediction on a real storm in real time. The only permitted use of the test set is to incorporate the images in the same storm up to the time of prediction.
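A minimal sketch of that rule, assuming each test image carries a storm identifier and a time offset within its storm. The `ImageRecord` type, its field names, and `allowed_context` are hypothetical and only for illustration; they are not part of any competition tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ImageRecord:
    image_id: str
    storm_id: str
    relative_time: int  # assumed: seconds since the storm's first image


def allowed_context(records: List[ImageRecord], target: ImageRecord) -> List[ImageRecord]:
    """Return the images that may inform the prediction for `target`.

    Only images from the same storm, captured at or before the target's
    time, are kept. Later images and other storms are excluded.
    """
    context = [
        r for r in records
        if r.storm_id == target.storm_id and r.relative_time <= target.relative_time
    ]
    return sorted(context, key=lambda r: r.relative_time)


records = [
    ImageRecord("abc_000", "abc", 0),
    ImageRecord("abc_001", "abc", 1800),
    ImageRecord("abc_002", "abc", 3600),
    ImageRecord("xyz_000", "xyz", 0),
]
context = allowed_context(records, records[1])
print([r.image_id for r in context])  # ['abc_000', 'abc_001']
```

Anything that fits a model or computes statistics over the whole test set (pseudo-labelling, unsupervised training, cross-storm normalization) would fall outside this filter.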
Some storms are shared between train and test. Can we use the labels present in train to improve predictions in test? Since the train/test split is based on time, this would comply with the requirement to use only preceding information.
Hi @Brasnold - Yes you can use the training data for storms that are shared. This is set up to be in compliance with the requirement not to use any data after the point of prediction.
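For a shared storm, this could look roughly like the sketch below, assuming training labels and test images share a per-storm time offset. `LabeledImage`, `prior_train_labels`, and the field names and units are hypothetical. The explicit time filter may be redundant if the training portion of a storm always precedes its test portion, but it keeps the rule enforced either way.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LabeledImage:
    storm_id: str
    relative_time: int  # assumed: seconds since the storm's first image
    wind_speed: int     # assumed: label in knots


def prior_train_labels(
    train: List[LabeledImage], storm_id: str, prediction_time: int
) -> List[LabeledImage]:
    """Training labels from the same storm, strictly before the prediction time."""
    prior = [
        x for x in train
        if x.storm_id == storm_id and x.relative_time < prediction_time
    ]
    return sorted(prior, key=lambda x: x.relative_time)


train = [
    LabeledImage("abc", 0, 30),
    LabeledImage("abc", 1800, 35),
    LabeledImage("xyz", 0, 50),
]
# A test image from storm "abc" at t=3600 may use both earlier "abc" labels.
history = prior_train_labels(train, "abc", 3600)
print([x.wind_speed for x in history])  # [30, 35]
```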