Data Reading Error

I tried to read the CPC outlooks data using the git repo (GitHub - drivendataorg/water-supply-forecast-rodeo-runtime: Data and runtime repository for the Water Supply Forecast Rodeo competition on DrivenData) and received an error. The error appears for some dates, such as 2005-01-08 and 2005-01-15. Is the downloaded data formatted incorrectly? By the way, I'm using the same repo to download the data.

from wsfr_read.climate import cpc_outlooks
from wsfr_read.streamflow import usgs_streamflow
from wsfr_read.teleconnections import mjo, oni, pdo, pna, soi


cpc_outlooks.read_cpc_outlooks_precip('2005-01-01', 'hungry_horse_reservoir_inflow')

TypeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key)
3789 try:
→ 3790 return self._engine.get_loc(casted_key)
3791 except KeyError as err:

index.pyx in pandas._libs.index.IndexEngine.get_loc()

index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '[20, 21]' is an invalid key

During handling of the above exception, another exception occurred:

InvalidIndexError Traceback (most recent call last)
13 frames
InvalidIndexError: [20, 21]

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/multi.py in _get_level_indexer(self, key, level, indexer)
3287 if not locs.any():
3288 # The label is present in self.levels[level] but unused:
→ 3289 raise KeyError(key)
3290 return locs
3291

KeyError: 20

Hi @Chinchpokli, I just made an update that I think should address this problem. (commit)

Let me know if you continue to run into issues.

For whatever reason, the 2004 precipitation data file has a slightly different format than all of the other ones. :sweat:
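As a rough illustration of the kind of lookup that fails in the traceback above, here is a minimal sketch of checking which labels are absent from a MultiIndex level before selecting on them. The toy frame, the level names `YEAR` and `CD`, and the reading of `[20, 21]` as labels for a single index level are all assumptions made for illustration; this is not the code in `wsfr_read`.

```python
import pandas as pd

# Toy frame with a two-level index; level names and values are made up.
index = pd.MultiIndex.from_tuples(
    [(2005, 21), (2005, 22), (2006, 21)], names=["YEAR", "CD"]
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=index)

wanted = [20, 21]  # labels we intend to select from the "CD" level

# Find labels that never occur in that level before indexing with them.
present = set(df.index.get_level_values("CD"))
missing = [label for label in wanted if label not in present]
print("missing labels:", missing)

# Keep only rows whose "CD" label is one of the wanted labels that exist.
available = [label for label in wanted if label in present]
subset = df[df.index.get_level_values("CD").isin(available)]
print(subset)
```

A check like this can help tell apart a file whose layout differs from the others and a file that simply lacks rows for one of the requested labels.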


Sorry to ask again, but the 2006 file also seems to have a different format. You can check all of the years with:

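import pandas as pd
from wsfr_read.climate import cpc_outlooks

meta_data = pd.read_csv('metadata_TdPVeJC.csv')
site_ids = meta_data.site_id.unique()

# Try every site/year combination and print the ones that fail to read.
for site in site_ids:
    for year in range(1995, 2024):
        try:
            _ = cpc_outlooks.read_cpc_outlooks_precip(str(year), site)
        except Exception:
            print(site, year)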

For the temperature files, these site and year combinations give an error:
stehekin_r_at_stehekin 2010
stehekin_r_at_stehekin 2015
stehekin_r_at_stehekin 2017
stehekin_r_at_stehekin 2019
detroit_lake_inflow 2019

Hi @Chinchpokli,

I’ve made an update to address 2006 for the precipitation data.

I am not able to reproduce your errors with the temperature files. Do you get the same KeyError from your first post, or a different error?

In [3]: cpc_outlooks.read_cpc_outlooks_temp('2019-03-01', 'stehekin_r_at_stehekin')
Out[3]: 
                               R    98.    95.    90.    80.    70.    60.    50.    40.    30.    20.    10.     5.     2.  F MEAN  C MEAN    F SD  C SD
issue_date YEAR MN LEAD CD                                                                                                                               
2018-11-14 2018 10 1    74  0.30  29.66  30.65  31.51  32.56  33.33  33.98  34.57  35.20  35.84  36.61  37.66  38.53  39.51   34.57   33.30  2.3957  2.51
                        75  0.32  35.99  36.60  37.14  37.80  38.28  38.68  39.06  39.45  39.85  40.33  40.99  41.53  42.14   39.06   38.26  1.4971  1.58
                   2    74  0.25  28.73  29.76  30.66  31.77  32.58  33.26  33.88  34.54  35.22  36.02  37.13  38.03  39.07   33.88   32.39  2.5152  2.60
                        75  0.19  35.16  35.84  36.44  37.17  37.70  38.15  38.56  38.99  39.44  39.97  40.70  41.30  41.98   38.56   37.57  1.6580  1.69
...
                   12   74  0.23  32.81  33.76  34.59  35.60  36.34  36.96  37.54  38.14  38.77  39.50  40.52  41.35  42.30   37.54   36.70  2.3089  2.37
                        75  0.23  36.72  37.49  38.16  38.99  39.59  40.10  40.57  41.05  41.56  42.16  42.99  43.66  44.43   40.57   39.88  1.8780  1.93
                   13   74  0.17  39.52  40.22  40.84  41.60  42.14  42.61  43.04  43.48  43.95  44.49  45.25  45.87  46.57   43.04   42.42  1.7145  1.74
                        75  0.33  40.49  41.13  41.70  42.39  42.89  43.31  43.70  44.11  44.53  45.04  45.73  46.29  46.93   43.70   43.11  1.5681  1.66