ERA5-Land Issues: Accessing the cdsapi Python client in the runtime environment

Hi @jitters,

Regarding Copernicus CDS credentials: you will need to use your own credentials to download data in the remote runtime environment, which means including them in your submission ZIP archive. These credentials will be stored in plain text in DrivenData’s storage backend for competition code execution. DrivenData staff, and only DrivenData staff, will be able to access your submission contents. We will only use these credentials while running your code to download data from CDS as part of your submission. If you have further questions or concerns, please let us know.

There are a few different ways you can authenticate the cdsapi client during the code execution runtime:

  1. CDSAPI_RC environment variable (probably the simplest): cdsapi supports the environment variable CDSAPI_RC for setting a path to your .cdsapirc file. This means, for example, you can include your .cdsapirc file in your submission, and then set os.environ["CDSAPI_RC"] = str(src_dir / ".cdsapirc") before you instantiate cdsapi.Client.
  2. CDSAPI_URL and CDSAPI_KEY environment variables: cdsapi will read these environment variables if they are set. Note that this happens when the cdsapi module is first imported, so you must set them before import cdsapi is ever run.
  3. Explicitly pass credentials when instantiating Client: you can instantiate a cdsapi.Client with keyword arguments url and key, i.e., client = Client(url=..., key=...).
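
As a rough sketch of how the three options above could look in code (the helper names here are made up for illustration, and I'm assuming your .cdsapirc uses the usual `url: ...` and `key: ...` lines):

```python
import os
from pathlib import Path


def read_cdsapirc(path):
    """Parse 'name: value' lines (e.g. url and key) from a .cdsapirc-style file."""
    config = {}
    for line in Path(path).read_text().splitlines():
        if ":" in line:
            # Split on the first colon only, since URLs and keys contain colons.
            name, value = line.split(":", 1)
            config[name.strip()] = value.strip()
    return config


def use_rc_file(src_dir):
    """Option 1: point CDSAPI_RC at the .cdsapirc shipped with the submission."""
    rc_path = Path(src_dir) / ".cdsapirc"
    os.environ["CDSAPI_RC"] = str(rc_path)
    return rc_path


def use_env_vars(url, key):
    """Option 2: set CDSAPI_URL and CDSAPI_KEY. Call this before importing cdsapi."""
    os.environ["CDSAPI_URL"] = url
    os.environ["CDSAPI_KEY"] = key


def make_client(url, key):
    """Option 3: pass the credentials explicitly to the client."""
    import cdsapi  # imported lazily, so options 1 and 2 can run first

    return cdsapi.Client(url=url, key=key)
```

You would pick one of these, for example calling use_rc_file(src_dir) at the top of your entry point and then creating cdsapi.Client() as usual. Here src_dir stands for wherever your submission files end up at runtime.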

To note another gotcha: shell globs (* in shell commands) normally do not match files that begin with a dot. That means if you’re doing something like zip src/* then it will not include a file like .cdsapirc. The make pack-submission command in the runtime repository uses a glob, so it will not include .cdsapirc in the archive it creates.
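
One way to sidestep the glob issue is to build the archive from Python instead of the shell. This is only a sketch and not what make pack-submission does; pack_submission is a made-up helper, and it assumes the output path lives outside the directory being packed:

```python
import zipfile
from pathlib import Path


def pack_submission(src_dir, zip_path):
    """Zip every file under src_dir, including dotfiles such as .cdsapirc."""
    src_dir = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Unlike shell globs, Path.rglob("*") also returns names starting with a dot.
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so files sit at the archive root.
                archive.write(path, path.relative_to(src_dir).as_posix())
    return zip_path
```

Whichever way you build the archive, it is worth listing its contents (for example with zipfile.ZipFile(...).namelist()) to confirm .cdsapirc is really in there before you submit.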

Regarding approval for the monthly aggregations for ERA5-Land, I will confirm with challenge organizers and follow up.

Regarding storage capacity: the hardware specifications for the runtime nodes are available on the code submission format page. These nodes have 180 GiB of disk in total. Not all of that will be available in practice given the space needed by the operating system, but you should be able to use most of it.
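
If you want to see how much space is actually left before starting a large download, something like the following should work. It is a minimal sketch using the standard library; free_gib is a made-up helper name, and the path to check is up to you:

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


if __name__ == "__main__":
    print(f"Free disk space: {free_gib():.1f} GiB")
```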