Maybe this isn’t the best place to ask, but I don’t want to create a new question. Can we train on the NRCS and RFC monthly naturalized flow data from sites other than the 26 used in the challenge?
Hi @dmitry_v,
I think your question is sufficiently different and notable that I split it into a separate thread.
Currently, the challenge does not have any naturalized flow data available for sites other than the primary 26. Additionally, participants are not authorized to use naturalized flow data directly obtained from NRCS or the RFCs.
I have brought your question to the challenge organizers, and they will consider whether or not to allow training on data from other sites. I will follow up here when there is an update.
Hi @dmitry_v,
Supplemental naturalized flow data for other NRCS sites for training has been made available. Please see this announcement.
Hi,
Can we also get the test set of supplemental naturalized flow data for other NRCS sites?
Hi @jimking100,
There are currently no plans to release the supplemental naturalized flow data for test years during the Hindcast Stage. You should not train on any supplemental naturalized flow from test years, and we currently have not approved naturalized flow from other NRCS sites as features.
Hi,
I’m not training on data from test years, but the other NRCS sites could be useful proxies for those sites without monthly NRCS data. It seems odd that you would include training data for the other sites but not allow us to use it as a training feature.
@jimking100 this data is being provided as primarily additional training examples of the target variable—you can sum up months in the spring to get a seasonal water supply value for that site.
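As a rough illustration of summing months into a seasonal value, here is a minimal Python sketch. The tuple layout, the `seasonal_totals` name, the sample numbers, and the April to July default are all assumptions for illustration, not the actual format of the supplemental data; the season for a given site may differ.

```python
from collections import defaultdict


def seasonal_totals(monthly_rows, season_months=(4, 5, 6, 7)):
    """Sum monthly naturalized flow over the season for each (site, year).

    monthly_rows: iterable of (site_id, year, month, volume) tuples
    (a hypothetical layout). Site-years missing any season month are
    dropped rather than returned as partial sums.
    """
    sums = defaultdict(float)
    seen = defaultdict(set)
    for site_id, year, month, volume in monthly_rows:
        if month in season_months and volume is not None:
            sums[(site_id, year)] += volume
            seen[(site_id, year)].add(month)
    wanted = set(season_months)
    return {key: total for key, total in sums.items() if seen[key] == wanted}


# Made-up example values, not real flow data.
rows = [
    ("site_a", 2001, 3, 10.0),  # outside the season, ignored
    ("site_a", 2001, 4, 20.0),
    ("site_a", 2001, 5, 35.0),
    ("site_a", 2001, 6, 30.0),
    ("site_a", 2001, 7, 15.0),
    ("site_a", 2002, 4, 18.0),  # incomplete season, dropped
]
print(seasonal_totals(rows))  # {('site_a', 2001): 100.0}
```

Dropping incomplete site-years is one possible choice; it avoids treating a partial sum as a full seasonal volume.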
So I can’t use the individual months as the target variable in my training - correct?
Hi @jimking100,
There is no prohibition on using individual months as an intermediate target variable within your modeling approach. The submission format just requires you to be able to perform inference on held out test data where the target variable is the seasonal water supply of the 26 sites in the challenge. How you do that is up to you, subject to the requirements in the “Time and data use” section.
Hi, I wanted to confirm: up to the month of April, is the target variable the net sum of naturalized flow over April through July (or June), and for subsequent months’ predictions do we subtract the previous month’s value? Am I right?
Drainage area is missing from the metadata. Is it possible to update it?
Hi, I wanted to confirm: up to the month of April, is the target variable the net sum of naturalized flow over April through July (or June), and for subsequent months’ predictions do we subtract the previous month’s value? Am I right?
That is not correct. For the 26 primary sites in the challenge, the target variable is always the total seasonal water supply. It is the same value for a given site in a given year, no matter the issue date. See this thread for additional discussion.
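To make that concrete, here is a small Python sketch of how training rows could be labeled. The function name, field names, issue dates, and volume are hypothetical and only illustrate that the label does not change with the issue date.

```python
def build_training_rows(seasonal_target, issue_dates):
    """Pair every issue date with the same seasonal label per (site, year).

    seasonal_target: dict mapping (site_id, year) -> total seasonal volume.
    issue_dates: iterable of "MM-DD" strings (illustrative only).
    """
    rows = []
    for (site_id, year), volume in seasonal_target.items():
        for month_day in issue_dates:
            rows.append(
                {
                    "site_id": site_id,
                    "issue_date": f"{year}-{month_day}",
                    "target": volume,  # identical for every issue date
                }
            )
    return rows


# Made-up example values.
target = {("site_a", 2001): 100.0}
training_rows = build_training_rows(target, ["01-01", "04-01", "06-01"])
for row in training_rows:
    print(row["issue_date"], row["target"])
```

Only the features available at each issue date would differ between these rows; nothing is subtracted from the target as the season progresses.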
Drainage area is missing from the metadata. Is it possible to update it?
The drainage area for the supplementary sites is not currently available, and we will likely not be able to provide it.