Difficulty wrapping my head around the units being used and the target values for training, etc


I’ve been pondering this issue for a couple of days now, and I can’t seem to come to any resolution on a few show-stopping misconceptions of mine. Hopefully the answers will be quick and easy to explain (otherwise I may not be able to score better than ~400 even when 2024 rolls around, hah).

  • The USGS streamflow data uses units of cubic feet per second (CFS). What’s the formula for turning this number into KAF – for example, assuming a trivially simple classroom exercise where the streamflow in cubic feet per second is constant and never changes, can it possibly be:
600CFS = 600*(seconds per year) / 43559.9 = Y KAF

… where CFS = cubic feet per second, KAF is thousand acre-feet, and Y is the variable of interest? My submissions are scoring erratically no matter what my local tests and validation are saying, and I can’t seem to come up with the same numbers as I see in the train.csv file provided.

  • I’m also stumbling a little trying to comprehend the periods of time involved in the relevant data. For example, a cubic foot per second has a time component, so if it’s measured at high noon, I understand 100 CFS as being 100 cubic feet of water over a time period of one second. But thousand acre-feet have no time component. Something just isn’t clicking in my head about how to interpret the various numbers and measurements available and combine them (and if I can’t do it myself, I can’t very well write a program that can do it either).

Any clarifications would be welcome. Thanks for your time, folks.

Hi @mmiron,

You’ve generally got some of the right ideas.

  • The USGS streamflow data (cubic feet per second) is a rate of water flow. You can think of the units as volume per unit time.
  • The target seasonal water supply variable is a volume of water (in KAF, which is thousand acre-feet).

So you have the right idea that you need to integrate streamflow over some period of time to get from a volumetric flow rate to a volume. If you assume a constant flow rate then your idea of multiplication is correct.
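When the flow is not constant, the integral can be approximated by a sum. Here is a minimal sketch, assuming the streamflow comes as one mean CFS value per day (the function name `volume_cubic_feet` and the sample values are made up for illustration, not taken from the competition data). It returns cubic feet only and does not convert to any other unit of volume.

```python
SECONDS_PER_DAY = 86_400  # 24 hours * 60 minutes * 60 seconds


def volume_cubic_feet(daily_mean_cfs):
    """Approximate the time integral of flow as a sum of daily volumes.

    Each entry is assumed to be the mean flow (cubic feet per second)
    over one full day, so one day contributes flow * 86,400 cubic feet.
    """
    return sum(flow * SECONDS_PER_DAY for flow in daily_mean_cfs)


# Constant flow: the sum reduces to plain multiplication.
constant = volume_cubic_feet([100.0] * 10)

# Varying flow: each day contributes its own volume.
varying = volume_cubic_feet([100.0, 200.0, 300.0])
```

With a constant flow the sum collapses to flow × days × 86,400, which is the multiplication described above.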

A few key things that you’re missing:

  • The target seasonal water supply value is not the total amount of water over the whole year. Each forecast site has a particular forecast season. You can find the forecast season for each site in the metadata.csv file. For most of the sites, that season is April through July. Make sure to read the “Forecasting task” section carefully.
  • Cubic feet and thousand acre-feet are two different units of volume, so you will need to convert between them as well.

So let’s say we pick a constant streamflow that is 600 CFS and a site that has a water supply forecast season of April through July. The number of days from April 1 through July 31 is 30+31+30+31 = 122 days.

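Here’s the calculation, using 86,400 seconds per day and 1 acre-foot = 43,560 cubic feet. Each unit that appears in both a numerator and a denominator cancels out, leaving only KAF.

$$
600\ \frac{\text{ft}^3}{\text{s}} \times 122\ \text{days} \times \frac{86{,}400\ \text{s}}{1\ \text{day}} \times \frac{1\ \text{acre-foot}}{43{,}560\ \text{ft}^3} \times \frac{1\ \text{KAF}}{1{,}000\ \text{acre-feet}} \approx 145.2\ \text{KAF}
$$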
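For anyone who prefers to check this in code, here is a small sketch of the same constant-flow conversion. The function and constant names are my own, not part of the competition materials. It assumes 1 acre-foot = 43,560 cubic feet and a flow that stays fixed for the whole season.

```python
SECONDS_PER_DAY = 86_400
CUBIC_FEET_PER_ACRE_FOOT = 43_560
ACRE_FEET_PER_KAF = 1_000


def constant_flow_to_kaf(flow_cfs, days):
    """Convert a constant flow (CFS) sustained for `days` into KAF."""
    cubic_feet = flow_cfs * days * SECONDS_PER_DAY        # rate * time = volume
    acre_feet = cubic_feet / CUBIC_FEET_PER_ACRE_FOOT     # cubic feet -> acre-feet
    return acre_feet / ACRE_FEET_PER_KAF                  # acre-feet -> thousand acre-feet


season_days = 30 + 31 + 30 + 31  # April 1 through July 31
print(round(constant_flow_to_kaf(600, season_days), 2))  # 145.19
```

Real streamflow is not constant, so in practice you would add up the volume for each day of the forecast season instead of multiplying a single rate by the season length.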

Hope that helps! Also welcome anyone else to chime in with different approaches to the explanation.

Just wanted to thank you for taking the time to help me understand, @jayqi – I’m doin’ my best to make sure your time was well spent. Thanks!