Test and Train data


Please correct me if I am wrong.

What is meant by the statement that, for the Hindcast stage, hold-out validation will be used with 10 years of data in the test set (odd years from 2005-2023)?

Does it mean:

  • Training set: contains data from the even years from 2005-2022 (18 years of data)
  • Test set: contains data from the odd years from 2005-2023 (10 years of data)?

The train dataset is available on the data download page.

The train dataset contains the full available history. For some sites, data is available from the 1890s.

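The test dataset contains the odd years in the period 2005-2023 (these years are removed from the train dataset). So the train dataset is not limited to the even years from 2005 onwards; it also includes all the earlier history.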
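
To make the split concrete, here is a minimal sketch in Python. It assumes each record is a (year, value) pair; the names TEST_YEARS and split_by_year are hypothetical and only illustrate the rule described above, they are not part of the competition's data tooling.

```python
# Odd years from 2005 to 2023 inclusive are held out for testing.
TEST_YEARS = set(range(2005, 2024, 2))  # 2005, 2007, ..., 2023


def split_by_year(records):
    """Split (year, value) records into train and test lists.

    Assumption: each record is a tuple whose first element is the year.
    Every year that is not a held-out odd year goes to train, including
    all history before 2005.
    """
    train = [r for r in records if r[0] not in TEST_YEARS]
    test = [r for r in records if r[0] in TEST_YEARS]
    return train, test


if __name__ == "__main__":
    # Made-up records covering 1890-2023, one per year, for illustration only.
    records = [(year, float(year)) for year in range(1890, 2024)]
    train, test = split_by_year(records)
    print(len(TEST_YEARS), len(train), len(test))
```

With this rule the test set has 10 years, and the train set keeps the even years from 2006-2022 plus everything before 2005.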


Alright, thank you. I was trying to work this out from the problem description alone and haven't explored the data downloads yet.