Hi, can someone explain what “true mean consumption over the prediction window under consideration” means in the metric description? Thanks!
Take for example series_id = 100454. Each measurement of consumption in cold_start_test.csv is made hourly, and you have to predict the consumption for a week. As you can see in the row below, the prediction window is daily, which means that the number you insert there is the consumption on that day:
7021 100454 2016-03-22 00:00:00 14.805556 0 daily
Of course, you don’t know that number. Let’s say that the model predicts a consumption of 124568 watt-hours, but the real number is 541268 watt-hours. The number 541268 is the true mean consumption over the prediction window under consideration.
As you may know by now, the measurements in consumption_train.csv are almost always (I’d say always) made hourly. So in this case, for 2016-03-22, the model may predict a consumption for every hour (24 in total), but you have to sum those 24 predictions and insert the total there:
7021 100454 2016-03-22 00:00:00 14.805556 124568 daily
Yes, I said you have to sum instead of taking the mean, because “we include a prediction_window column to help indicate what level of temporal aggregation we want you to predict.” Here, aggregation means sum.
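For what it’s worth, here is a minimal sketch of that summing step in plain Python. The hourly values are made up for illustration, and aggregate_daily is a hypothetical helper name, not something from the competition materials:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical hourly predictions for 2016-03-22 (made-up values, in watt-hours).
start = datetime(2016, 3, 22)
hourly_predictions = [
    (start + timedelta(hours=h), 5000.0 + 10.0 * h) for h in range(24)
]


def aggregate_daily(predictions):
    """Sum hourly (timestamp, consumption) pairs into one total per calendar day."""
    totals = defaultdict(float)
    for timestamp, consumption in predictions:
        totals[timestamp.date()] += consumption
    return dict(totals)


daily_totals = aggregate_daily(hourly_predictions)
print(daily_totals)
```

The single total it produces for 2016-03-22 is what would go in the consumption column of the daily row, assuming the sum interpretation above is right.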
I hope I am not mistaken about this.
Thank you so much, @rosgori! Very clear.