Date of monthly naturalized streamflow

EmilRyd · December 3, 2023, 11:24am

Hi,

We have a question regarding what date the naturalized monthly streamflow data is generated/published. To be specific, if I take the naturalized monthly streamflow data for a given basin for the month of March, on what day specifically did that data come out, and what days does it cover (is it calendar months, e.g. March 1 - March 31, or 30 days back from when the streamflow data was published)? Thank you!

jayqi · December 4, 2023, 3:42pm

Hi @EmilRyd,

To be specific, if I take the naturalized monthly streamflow data for a given basin for the month of March, on what day specifically did that data come out, and what days does it cover (is it calendar months, e.g. March 1 - March 31, or 30 days back from when the streamflow data was published)?

The monthly naturalized streamflow is the total naturalized streamflow for that calendar month. So the March value is the sum over March 1–March 31, the April value is the sum over April 1–April 30, etc.
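For anyone who wants the exact date range a monthly value covers, here is a minimal sketch using only the Python standard library. The helper name `month_coverage` is made up for illustration and is not part of any competition code.

```python
import calendar
from datetime import date


def month_coverage(year: int, month: int) -> tuple[date, date]:
    """Return the first and last calendar day covered by a monthly value."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


start, end = month_coverage(2023, 3)
print(start, end)
```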

what date the naturalized monthly streamflow data is generated/published

Unfortunately, we do not have any record of when the data is published or modified. For the Hindcast Stage, we are making the simplifying assumption that the data is available once the month is over, e.g., the March data becomes available on April 1. In the Forecast Stage, the data will become available whenever it is published by the data providers. We recommend that you set up your model in a way that will work flexibly using whatever latest data is available.
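One way to stay flexible is to pick, per site, the most recent month that actually has a value as of the issue date. The sketch below is only an illustration: it assumes the file has `site_id`, `year`, `month`, and `volume` columns (check the column names in your copy), and the function name `latest_available_flow` is hypothetical. It applies the Hindcast Stage assumption that a month becomes usable on the first day of the following month.

```python
import pandas as pd


def latest_available_flow(flow: pd.DataFrame, issue_date: str) -> pd.DataFrame:
    """Most recent non-missing monthly value per site as of issue_date.

    Assumes columns site_id, year, month, volume (an assumption about the
    file layout, not a documented schema).
    """
    issue = pd.Timestamp(issue_date)
    # Ignore months whose value has not been published yet (read as NaN)
    df = flow.dropna(subset=["volume"]).copy()
    # First day of each record's month
    month_start = pd.to_datetime(df.assign(day=1)[["year", "month", "day"]])
    # Simplifying assumption: a month is available on the 1st of the next month
    df["available_on"] = month_start + pd.offsets.MonthBegin(1)
    df = df[df["available_on"] <= issue]
    # Keep the latest available row for each site
    return df.sort_values("available_on").groupby("site_id").tail(1)
```

Sites with no usable months simply do not appear in the result, so downstream code should be prepared to handle a site that is absent.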

mmiron · December 28, 2023, 5:11pm

Hi again @jayqi,

Am I right in assuming that throughout the forecast period, “test_monthly_naturalized_flow.csv” will – in the same exact manner and format as during the hindcast stage – always have antecedent flow values for the 23 sites for which it’s available?

That is to say, when our code is executed on Feb 1st 2024 for a prediction issue_date of Feb 1st 2024, test_monthly_naturalized_flow.csv from the data_dir directory (i.e. /code_execution/data/test_monthly_naturalized_flow.csv) will contain antecedent flow values from Oct 2023 through Jan 2024 for the same 23 sites as the hindcast stage (within the limitation of that data not being released yet)?

I’m just double checking to be sure I don’t need to modify that for the Forecast stage, and can leave it as-is from the Hindcast stage.

jayqi · December 31, 2023, 5:24pm

Hi @mmiron,

Yes, that is generally correct. For each issue date, the mounted data drive will contain a test_monthly_naturalized_flow.csv file in the same format with whatever data is available as of that date. Note that we don’t have control over when the data is available from NRCS, so some or all sites might not have January 2024 available immediately on February 1, 2024.

rasyidstat · December 31, 2023, 9:01pm

Hi @jayqi, does that mean that if data is not available for some sites in a particular month, the file will not have exactly 23 records for that month? Or will there still be 23 records, with some values missing?

jayqi · January 2, 2024, 7:00pm

Hi @rasyidstat,

Currently, for sites where some months have observations but others do not, the months that are missing values will be empty (NA if you’re reading with pandas). Sites that have no observations at all will not have any rows included.
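Because of these two cases (empty values versus absent rows), it can help to reindex the data onto the full set of sites and months you expect, so that both show up as NaN. This is a rough sketch, assuming columns named `site_id`, `year`, `month`, and `volume` with one row per site and month; the function name `to_full_grid` is hypothetical.

```python
import pandas as pd


def to_full_grid(flow: pd.DataFrame, site_ids: list, months: list) -> pd.Series:
    """Volume indexed by every expected (site_id, year, month) combination.

    months is a list of (year, month) tuples. Sites or months with no row in
    flow come back as NaN, same as rows whose value is empty.
    """
    full_index = pd.MultiIndex.from_tuples(
        [(site, year, month) for site in site_ids for (year, month) in months],
        names=["site_id", "year", "month"],
    )
    volume = flow.set_index(["site_id", "year", "month"])["volume"]
    return volume.reindex(full_index)


# Toy data: site "A" has one empty month, site "B" has no rows at all
flow = pd.DataFrame({
    "site_id": ["A", "A"],
    "year": [2023, 2023],
    "month": [10, 11],
    "volume": [12.5, float("nan")],
})
grid = to_full_grid(flow, ["A", "B"], [(2023, 10), (2023, 11)])
print(grid)
```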

tabumis · January 4, 2024, 4:31pm

Hi @jayqi,

Could you please clarify whether test_monthly_naturalized_flow.csv will only contain data for the 2024 forecast year? Or will it also include historical observations, as in the hindcast stage?

jayqi · January 4, 2024, 4:40pm

Hi @tabumis,

That is correct. It only contains data expected to be used at inference time for the current water year.

For the Hindcast Stage, it only contained data for the water years in the test set.

If you need data from other years, e.g., to have a longer lookback window when deriving features, you should explicitly make a request with details (e.g., how far back). See the blue info box under “Time and data use” for the Forecast Stage.

tabumis · January 4, 2024, 5:05pm

Thanks for the clarification, @jayqi. I've made a request in this thread.

Btw, does this also apply to the files in the teleconnections folder? Will they also contain only observations relevant to the current water year?

jayqi · January 4, 2024, 5:17pm

Hi @tabumis,

No, the teleconnections files contain all historical data available. We don’t do anything to subset those datasets.
