It may be a bit late to ask this question, but I have only just realised it. Which of the following two is the final expected outcome of this challenge?

A model (or set of models) that can be used for future issue dates (i.e. going forward from now: 2024, 2025, 2026, etc.)? This model would be developed so that it has the minimum LOOCV error across the last 20 years.
OR

A machine learning algorithm that produces the lowest errors over the last 20 years, designed so that each year is left out in turn while training on the others. The example.py provided seemingly follows this route, as it saves each model with a name combining site, year and quantile (line 168: f"{site}-{year}-{quantile}.joblib"). Here, for example, the model saved with year 2005 won't have any future use, but it is developed so that it gives the minimum error for year 2005. Hence, the example code looks like just an ALGORITHM aimed at minimizing LOOCV scores across all years, without producing any model (or set of models) for use in the future.

In my opinion, if we are doing CV then approach 1 should be the expected outcome.

The goal of the challenge is to get the most accurate modeling methodology that can lead to models that can be used operationally in the future. The cross-validation is an evaluation procedure to estimate the accuracy of a modeling approach. As such, your “approach 1” describes this most closely.

What you describe in “approach 2” (producing a set of models that each individually minimize error for their respective test years) is certainly possible to do, but would be considered overfitting. This will be assessed from your model report, and solutions that are overfitting will be considered as having weak statistical rigor.
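As a rough illustration of why choosing a model per test year is optimistic, here is a minimal sketch on purely synthetic numbers. The configuration names, the year range 2004 to 2023, and the error values are all made up for the example. Every configuration is equally good on average, so any gap between the two scores comes from selecting on the held-out year itself and not from real skill.

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical setup: 20 held-out years and three candidate configurations.
years = list(range(2004, 2024))
candidates = ["config_a", "config_b", "config_c"]

# Synthetic held-out errors: identical expected error, differences are noise only.
errors = {c: {y: random.uniform(0.8, 1.2) for y in years} for c in candidates}

# One configuration chosen for all years, judged by its mean held-out error.
shared_choice_score = min(mean(errors[c].values()) for c in candidates)

# A different configuration picked for each year using that year's own error.
per_year_choice_score = mean(min(errors[c][y] for c in candidates) for y in years)

print(f"one configuration for all years: {shared_choice_score:.3f}")
print(f"best configuration per test year: {per_year_choice_score:.3f}")
```

The per-year score can never be larger than the shared one, because the mean of per-year minima is at most the minimum of the means. That lower number does not carry over to a year that was not used for the selection.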

The example code saves out the models from each cross-validation iteration mainly for reproducibility and diagnostic purposes.

The conceptual “final product” would be a model trained on all 20 years of the cross-validation period, and the cross-validation procedure is an estimate of the performance of that final model. We don’t do this in the example because we’re not expecting any submission based on it in the Final Stage. However, the model you submitted to the Forecast Stage is basically such a final model.
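The relationship between the cross-validation and the final model can be sketched roughly as follows. This is not the challenge's example.py. The toy "model" (an empirical quantile of the training targets), the synthetic data, the year range, and the use of pinball loss as the score are all assumptions made for illustration.

```python
import random
from statistics import mean


def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a single quantile level q."""
    return mean(max(q * (y - p), (q - 1) * (y - p)) for y, p in zip(y_true, y_pred))


def fit_quantile_model(targets, q):
    """Toy stand-in for a real model: the empirical q-quantile of the targets."""
    ordered = sorted(targets)
    idx = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[idx]


def leave_one_year_out(data, q):
    """Score the same fixed methodology on each year, training on all the others."""
    scores = {}
    for held_out in data:
        train = [y for year, ys in data.items() if year != held_out for y in ys]
        prediction = fit_quantile_model(train, q)
        test = data[held_out]
        scores[held_out] = pinball_loss(test, [prediction] * len(test), q)
    return scores


random.seed(0)
# Synthetic targets: five observations per year for 20 hypothetical years.
data = {year: [random.gauss(100, 20) for _ in range(5)] for year in range(2004, 2024)}

# The cross-validation estimates how well the methodology performs on unseen years.
cv_scores = leave_one_year_out(data, q=0.5)
cv_estimate = mean(cv_scores.values())

# The "final product": the same methodology trained once on all 20 years.
final_model = fit_quantile_model([y for ys in data.values() for y in ys], q=0.5)

print(f"estimated pinball loss from CV: {cv_estimate:.2f}")
print(f"final model prediction for future years: {final_model:.2f}")
```

The 20 per-fold models exist only to produce `cv_estimate`. The single model fitted on all years is the one that would be applied to future issue dates.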