Import issues

I’m having problems with imports that work locally but fail remotely.
Specifically, I can’t find a workaround for this import:

from wsfr_download.utils import site_geospatial_buffered

In contrast, I was able to find a workaround for cpc that worked both locally and remotely:
try:
    from data_reading.wsfr_read.climate.cpc_outlooks import read_cpc_outlooks_temp, read_cpc_outlooks_precip  # works on my machine
except ImportError:
    from wsfr_read.climate.cpc_outlooks import read_cpc_outlooks_temp, read_cpc_outlooks_precip  # works remotely
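This fallback pattern can be generalized into a small helper so every module resolves through the same logic. A sketch using the stdlib `importlib`; the module names passed in below are stand-ins (stdlib `json` plays the role of the wsfr_read paths):

```python
import importlib

def import_first(*module_names):
    """Return the first module from module_names that imports cleanly.

    Useful when the same package is importable under different paths
    locally vs. in the code execution runtime.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {module_names} could be imported")

# Stdlib stand-ins for the two wsfr_read import paths:
mod = import_first("no_such_local_path.cpc_outlooks", "json")
```

Catching `ImportError` specifically (rather than a bare `except:`) keeps unrelated failures inside the imported module from being silently masked.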

Hi @scottmreed,

I have made your message into a public thread, as it is a general question. Please post general questions in the forum instead of privately messaging me, so that other participants may also see responses.

The wsfr_download package is intentionally not available in the runtime environment. It is used by DrivenData to populate the mounted data volume, and it is not installed in the runtime because solutions should not be using it to redundantly download any data.

If you need the site_geospatial_buffered function, then I recommend that you just copy-and-paste it into your own code.
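If copying the real function over, it can help to keep the same call shape so the rest of your code doesn't change. Purely as a hypothetical illustration of that idea (this is not the actual wsfr_download implementation, which operates on site geometries with geospatial libraries; the signature and return value here are assumptions):

```python
def site_geospatial_buffered_stub(lat, lon, buffer_deg=0.1):
    """Hypothetical stand-in: return a buffered bounding box
    (min_lon, min_lat, max_lon, max_lat) around a site coordinate.

    Copy the real function body from wsfr_download.utils rather than
    relying on this simplified sketch.
    """
    return (lon - buffer_deg, lat - buffer_deg,
            lon + buffer_deg, lat + buffer_deg)

bbox = site_geospatial_buffered_stub(39.0, -106.0)
```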


The fact that you are using from data_reading.wsfr_read.climate.cpc_outlooks import suggests to me that you are not actually testing your solution with Docker. When testing your solution locally, I strongly recommend you use Docker so that you are doing the same thing as the code execution runtime.

If you really want to test outside of Docker, then I recommend creating a conda environment where you install all of the packages the same way the Docker container does. You can see how we install dependencies in the Dockerfile. Either way, wsfr_read should be an installed package; importing it via a local path like data_reading.wsfr_read is a sign that your setup does not match the runtime.
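One quick way to check whether an import resolves to an installed package or to a repo checkout is to inspect where Python finds it; `json` stands in here for `wsfr_read`:

```python
import importlib.util

# Substitute "wsfr_read" when running inside the competition environment.
spec = importlib.util.find_spec("json")

# An installed package resolves under site-packages (or the stdlib, for
# json) rather than under a path inside your repository checkout.
origin = spec.origin
```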

Thanks. Recreating site_geospatial_buffered was easy enough.

Within docker I get a different (unrelated) error:
No module named ‘dataretrieval’, even though I’m using the same version of dataretrieval locally as is listed in environment-cpu.yml/environment-gpu.yml. The import works locally but not in Docker. Any suggestions?

Also, I’m only using data_reading.wsfr_read.climate.cpc_outlooks locally and wsfr_read.climate.cpc_outlooks remotely. That’s why they are in a try/except. I don’t know exactly how I ended up there, but it works!

The dataretrieval package was added to the runtime environment on December 12. Make sure your local Docker image is up to date. There have been many updates since it was first available.

Thanks. The updated docker image solved that.

On Docker I’m able to test my preprocess function but not my solution function, because the Docker image is still serving the odd-year Hindcast Stage requests.

My solution.py is deliberately very specific to water year 2024 to eliminate any risk of leakage of data from prior years. I’d rather not remove those protections just to make the script more general. So do I need to do my final testing by uploading a zip, or am I missing something, like a way to run 2024 on Docker?
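One way to keep a strict water-year-2024 guard while still allowing local smoke tests is an explicit check on the issue date. A sketch (the helper names are made up; the water-year convention itself is standard: water years run Oct 1–Sep 30 and are labeled by the calendar year in which they end):

```python
from datetime import date

def water_year(d: date) -> int:
    """Water years run Oct 1 - Sep 30, labeled by the ending calendar year."""
    return d.year + 1 if d.month >= 10 else d.year

def guard_wy2024(issue_date: date) -> None:
    """Refuse to run outside water year 2024 to avoid leakage from prior years."""
    if water_year(issue_date) != 2024:
        raise ValueError(f"refusing to run outside water year 2024: {issue_date}")

guard_wy2024(date(2024, 1, 8))       # Forecast Stage issue date: passes
wy = water_year(date(2023, 12, 15))  # Dec 2023 already belongs to WY2024
```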

Also, it wasn’t clear to me which URL was for a “User-uploaded submission” and which was for an “Admin-scheduled submission”: submissions/code/ or competitions/259/reclamation-water-supply-forecast/submissions/

Or are they the same, and if a user-uploaded submission works it will be re-run on Jan 8 as an admin-scheduled submission? That’s probably it; the second URL seems to redirect to the first.

Hi @scottmreed,

When you run the Docker container locally, it mounts the local path/to/repo/water-supply-forecast-rodeo/data/ directory into the container. This simulates the mounted data volume present during runtime. The data isn’t part of the Docker image, and you need to set up the data directory yourself. You are seeing Hindcast Stage data because you put it there while working on the Hindcast Stage and haven’t replaced it with Forecast Stage data.

  • Documentation about the data directory is here.
  • Bulk download config files for the Forecast Stage are in the data_download/ directory. Make sure you’ve pulled the latest commits for the repository.
  • You can also instead directly download the exact files that we have in the mounted data volume for the Forecast Stage, per the instructions here.
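A small sanity check before launching the container can catch a stale data directory (leftover Hindcast Stage files) early. The file names below are illustrative placeholders, not the real layout documented in the repo:

```python
import tempfile
from pathlib import Path

def missing_files(data_dir, required):
    """Return the names in `required` that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in required if not (root / name).exists()]

# Demonstration with a throwaway directory and placeholder file names:
with tempfile.TemporaryDirectory() as d:
    Path(d, "forecast_sites.csv").touch()
    gaps = missing_files(d, ["forecast_sites.csv", "cpc_outlooks.txt"])
```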

if a user-uploaded submission works it will be re-run on Jan 8 as an admin scheduled submission

Yes, this is correct.

My site_geospatial_buffered function works locally and on docker but fails on my code submission. Any suggestions?

Hi @scottmreed ,

We took a look, and it was indeed an issue with the code submission runtime. The runtime was blocking network access to the CDN that serves up projections, which caused site_geospatial_buffered to fail. That also explains why the submission worked locally (the local runtime allows all network access).

We updated the runtime to allow access to that CDN, and confirmed that the code works now. Apologies for the error, and let us know if you have any other issues.
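If you want to surface this class of failure locally instead of only in the code submission runtime, PROJ’s on-demand grid downloads can be switched off. This assumes, per the fix above, that the projection data was being fetched from PROJ’s CDN; PROJ_NETWORK is a standard PROJ 7+ environment variable and must be set before pyproj or geopandas is imported:

```python
import os

# Disable PROJ's network grid downloads so transforms that would have
# silently fetched from the CDN fail fast during local testing.
os.environ["PROJ_NETWORK"] = "OFF"
```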

Thanks. That’s a relief.

The other difference I’m seeing is that on my last zip submission I logged errors at lines 288 and 309 of my preprocess.py. These errors come after calling read_cpc_outlooks_temp and read_cpc_outlooks_precip from climate.cpc_outlooks, and everything behaves as expected locally and on Docker. Any chance there is a similar fix there?

I could dig in some more on my end to narrow it down, assuming I can do more than three code submissions today. But there are only 7 lines of code between the CPC read and the exception, and none look suspect.

I can’t say for sure without looking more closely if the recent fix will address that error. Since the submission limit only counts successful submissions, you can try submitting again without risking hitting the submission limit if it fails.

Thanks. I think I have it all sorted out now.