SSL Certificate Error when retrieving UA SWANN CSV files

Hi @jayqi,

I’m not sure what changed in the environment, but in the most recent run I am getting the following error when trying to retrieve the CSV files from the Arizona SWANN website (https://snowview.arizona.edu/csv/Download/Watersheds/):

ssl.SSLEOFError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)

In previous runs this worked, and I am not able to replicate the issue when I run the container locally. I’ve tried a number of fixes, including using the Python requests module with verify=False or with verify set to the path of a certificate downloaded from the Arizona website. Both of those work when I run locally but not in the DrivenData environment.

Please let me know how to proceed, because I’m out of ideas besides removing the UA SWANN data from my model.

Hi @oshbocker,

Do you know if it’s happening immediately on the first request made to the UA server, or does it happen after some requests succeed? It would be helpful if you put in some logging there to check that.
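A minimal sketch of the kind of logging that would answer this, assuming the files are fetched one at a time with the standard library. The function name fetch_all, the logger name, and the opener parameter are hypothetical and not taken from the actual submission code:

```python
import logging
import urllib.request

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("swann_download")  # hypothetical logger name


def fetch_all(urls, opener=urllib.request.urlopen):
    """Fetch each URL in order, logging which request number succeeds or fails."""
    results = {}
    for i, url in enumerate(urls, start=1):
        logger.info("Request %d/%d: %s", i, len(urls), url)
        try:
            with opener(url) as response:
                results[url] = response.read()
        except Exception:
            # logger.exception records the full traceback, including SSL errors.
            logger.exception("Request %d failed: %s", i, url)
            raise
        logger.info("Request %d succeeded (%d bytes)", i, len(results[url]))
    return results
```

If the log shows a failure on request 1, the problem is with reaching the server at all; a failure after several successes would point more toward rate limiting or dropped connections.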

Thanks @jayqi, I added some logging, and it looks like it happens on the very first request, which attempts to retrieve the file /csv/Download/Watersheds/17010209.csv.

There is definitely something wonky with the Arizona certificate authentication. I was getting this error locally too and was originally able to fix it using the following line of code: ssl._create_default_https_context = ssl._create_unverified_context

This had been working in the DrivenData environment too up until this last run.
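For reference, a narrower variant of that workaround is sketched below. Instead of replacing the default HTTPS context for the whole process, it builds an unverified context with the public ssl API and passes it to a single urlopen call. The helper names are hypothetical, and this assumes the download goes through urllib rather than another HTTP client. Like the one-line fix, it turns off certificate checking, so it only makes sense for a host you already trust:

```python
import ssl
import urllib.request


def make_unverified_context():
    """Build an SSL context that skips certificate and hostname checks."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before verify_mode is relaxed
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_unverified(url, timeout=30):
    """Download one URL without certificate verification (hypothetical helper)."""
    context = make_unverified_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return response.read()
```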

Hi @oshbocker,

There are two things happening here:

  1. It does indeed seem like there’s an SSL certificate issue with the University of Arizona. I am getting the same error, "certificate verify failed: unable to get local issuer certificate", when trying to use requests both from the cluster and locally.

  2. It looks like the University of Arizona has made some kind of server/network configuration change so that https://climate.arizona.edu/snowview/ now redirects to https://snowview.arizona.edu. Our firewall was configured only to allow climate.arizona.edu, so it was blocking snowview.arizona.edu.
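The second point can be checked from a participant's side with a small helper like the sketch below, which reports any hostname in a redirect chain that is not on an allowlist. The function name and the allowlist constant are hypothetical; the two hostnames are the ones named in this thread, not a statement of the full firewall configuration:

```python
from urllib.parse import urlsplit

# Hostnames named in this thread; the real firewall rules may include others.
ALLOWED_HOSTS = {"climate.arizona.edu", "snowview.arizona.edu"}


def blocked_hosts(url_chain, allowed_hosts=ALLOWED_HOSTS):
    """Return hostnames in a redirect chain that are not on the allowlist, in order."""
    blocked = []
    for url in url_chain:
        host = urlsplit(url).hostname
        if host not in allowed_hosts and host not in blocked:
            blocked.append(host)
    return blocked
```

With requests, the chain for a completed request can be built as [r.url for r in response.history] + [response.url]. A chain that starts on climate.arizona.edu and ends on snowview.arizona.edu would be flagged if only the first hostname were allowed.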

I’ve updated our firewall to additionally allow snowview.arizona.edu. With this firewall change, using verify=False works for me now from our cluster using either of the two hostnames:

>>> requests.get("https://climate.arizona.edu/snowview/csv/Download/Watersheds/17010209.csv", verify=False).status_code
/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:1099: InsecureRequestWarning: Unverified HTTPS request is being made to host 'climate.arizona.edu'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:1099: InsecureRequestWarning: Unverified HTTPS request is being made to host 'snowview.arizona.edu'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
200
>>> requests.get("https://snowview.arizona.edu/csv/Download/Watersheds/17010209.csv", verify=False).status_code
/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py:1099: InsecureRequestWarning: Unverified HTTPS request is being made to host 'snowview.arizona.edu'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
200

@oshbocker Since this looks to be a general issue that affects access to UA/SWANN data for other participants’ submissions, I’ve removed your personal information and I am making this thread public for overall visibility.

Thank you @jayqi, I appreciate your help on this issue! My code is back to working using the verify=False parameter in requests.