
About the Cold Start Energy Forecasting category

This category is for posts about the Power Laws: Cold Start Energy Forecasting competition. Please keep all posts here specific to the competition.

How is the winner selected? Is it the best score on the private leaderboard? Or do we have to select one submission?


In the consumption_train data, what exactly does the temperature column mean?
And what exactly does a base_temperature of low or high mean for each series in the metadata?

@ironbar Winner is the best score on the private leaderboard.

@vamsy19 The temperature column is the outdoor temperature measured at the nearest weather station. The base_temperature indicates whether the setting for the building’s system is generally low (cooler) or high (warmer) than a certain threshold.

What happens to the evaluation metric if the ground truth is 0 (seems to exist in training set)?

So we just have to predict the daily, hourly, or weekly consumption? Is that correct?

In this case we would offset from zero by a very small epsilon.

The submission format indicates what predictions we expect. For those time periods, you must predict the amount of consumption.

Thanks. Just to be sure, you are only offsetting in the denominator of c_i=w_i/m_i, right?

Yep. Then the true value will be close to, but not equal to, zero.

Hmm… still a bit confused. Could you write down the evaluation formula with epsilon in it?

I think it matters because as I understand it now, getting one of those small values even a little bit wrong can be quite literally disastrous, even though they only account for a negligible fraction of the values.
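To make the concern concrete, here is a minimal sketch of how I read the replies above. It is not the official metric: the function name, the choice to apply epsilon only when the denominator is exactly zero, and the value of `eps` are all my assumptions. The only part taken from this thread is that each absolute error is scaled by c_i = w_i / m_i and that the offset is applied in the denominator.

```python
def scaled_abs_error(y_true, y_pred, weights, normalizers, eps=1e-6):
    """Mean of |y_pred - y_true| * c_i with c_i = w_i / m_i.

    Hypothetical reading of the thread, not the organizers' code:
    a zero denominator m_i is replaced by a small eps (assumed value).
    """
    total = 0.0
    for y, y_hat, w, m in zip(y_true, y_pred, weights, normalizers):
        m_safe = m if m != 0 else eps  # assumption: offset only when m_i is zero
        total += abs(y_hat - y) * (w / m_safe)
    return total / len(y_true)


# One ordinary row and one row whose denominator is zero (made-up numbers).
ordinary_only = scaled_abs_error([100.0], [110.0], [1.0], [100.0])
with_zero_row = scaled_abs_error(
    [100.0, 0.0], [110.0, 0.5], [1.0, 1.0], [100.0, 0.0]
)
print(ordinary_only, with_zero_row)
```

Under these assumptions, the row with the zero denominator dominates the average even though its absolute error is small, which is why the exact placement and size of epsilon matter.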


Okay! Sounds good. One more query: even in the submission format, the temperature data is missing for some entries. That means temperature need not be predicted, right?

Correct, you only need to predict consumption.


I have two more questions related to leaderboard:

  1. What percentage of the test set belongs to the public leaderboard?
  2. Is the split between the public and private test sets random?



In our submission, for the daily and weekly predictions, are we supposed to predict the average kW over that period or the summed kWh? So, if the average usage is 20 kW per hour, is the correct prediction on a daily entry 20 or is it 20 times 24, 480?

@gijskoot This is answered in the “Consumption Aggregation Types” thread.