Can we use a time-series model?

Can we use the history data of the same cell when inferring? I noticed that some cells in the submission do not exist in train_label.csv. So I guess either time-series models are forbidden, or the historical data cannot be obtained in Phase 2?
This is important because the label of last week is a very strong feature.

Another question: can we use the date, lat and lon? Or are we only allowed to use the images?

Hi @qqerret - Thanks for the questions.

Can we use the history data of the same cell when inferring?

If you are asking whether you can use historical ground measures for the test cells, the answer is no. You can see this thread for more context. This section may help:

You are permitted to use SNOTEL and CDEC data from the set of stations contained in ground_measures_metadata.csv shared in the train features and test features files. Your solution cannot use any other historical ground measure data because your solution should be able to estimate SWE in locations where there is no historical ground station or ASO data available.

To your other question:

can we use the date, lat and lon? Or the image is only allowed to use

Can you clarify this? The information about date and grid cell location should be useful (in fact, required) for cross referencing with approved features, so I think what you’re asking is fine.

Hope that helps!

If you are asking whether you can use historical ground measures for the test cells, the answer is no.

No, I mean we have to predict the whole range of 2020-2021 in Phase I. I know we can’t use future data, but will we have the label for 2020-01-07 when predicting 2020-01-14 in Phase II? It’s a strong feature.

Can you clarify this? The information about date and grid cell location should be useful (in fact, required) for cross referencing with approved features, so I think what you’re asking is fine.

That’s helpful!

I’m still not sure if I’m understanding the question. You cannot use any future data past the day of prediction, and you cannot use any labels for the test set grid cells as features when making predictions. In Stage 2, you will be submitting SWE predictions each week using the model you trained in Stage 1. In Stage 2b, you will not be able to retrain your model or update your model weights; you will simply be applying your existing model onto a new data set.

Does that help? It may be helpful to read over the Stage 2 challenge description once it’s published next week.

Hello @glipstein, does your last answer mean you cannot improve your model during stage 2a? The description indicates you can:

Stage 2a: Submission Testing (Jan 11 - Feb 15) : Package everything needed to perform weekly inference on a set of 1km grid cells. Use this stage to make any final model improvements

@Galeros93 Good question. No, you can still make improvements during Stage 2a. The previous response has been updated to clarify.

In Stage 2b, you will not be able to retrain your model or update your model weights; you will simply be applying your existing model onto a new data set.

I updated the question: Can we use the 2020-01-07 label of the same cell as a feature when predicting 2020-01-14 in Phase 2b?

Hi @qqerret - Thanks for clarifying. This is what I thought you were asking originally, so the first response is correct. The answer is no. As described on the site, only ground measures from grid cells listed in the test features file (which are distinct from the submission grid cells) are permitted during inference. Again, the idea is that a solution should be able to estimate SWE in locations with no ground station data available.
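For anyone building a feature pipeline, the rules discussed in this thread can be sketched as a simple filter. This is only an illustrative sketch, not official competition code: the function name, the tuple layout `(station_id, measured_on, swe)`, and the example IDs are all hypothetical, and it assumes the permitted station IDs have already been read from ground_measures_metadata.csv.

```python
from datetime import date


def allowed_ground_measures(measures, permitted_station_ids,
                            submission_cell_ids, prediction_date):
    """Keep only the readings that the rules in this thread would permit.

    measures: iterable of (station_id, measured_on, swe) tuples (hypothetical layout).
    permitted_station_ids: IDs assumed to come from ground_measures_metadata.csv.
    submission_cell_ids: IDs of the grid cells being predicted.
    prediction_date: the day the prediction is made for.
    """
    allowed = []
    for station_id, measured_on, swe in measures:
        if station_id not in permitted_station_ids:
            continue  # not one of the approved stations
        if station_id in submission_cell_ids:
            continue  # never use a label from a cell being predicted
        if measured_on > prediction_date:
            continue  # no data from after the day of prediction
        allowed.append((station_id, measured_on, swe))
    return allowed


# Hypothetical example data, for illustration only.
measures = [
    ("station_a", date(2020, 1, 7), 10.0),   # approved station, past date
    ("station_a", date(2020, 1, 21), 12.0),  # approved station, future date
    ("station_x", date(2020, 1, 7), 5.0),    # station not in the metadata file
    ("cell_1", date(2020, 1, 7), 8.0),       # a submission grid cell
]
result = allowed_ground_measures(
    measures,
    permitted_station_ids={"station_a"},
    submission_cell_ids={"cell_1"},
    prediction_date=date(2020, 1, 14),
)
```

In this example only the first reading survives: last week's value from an approved station is usable, while last week's label for the cell being predicted is not.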

THANK YOU!

the idea is that a solution should be able to estimate SWE in locations with no ground station data available

That’s exactly the point we were asking about.