This post is a short description of my final approach in this competition. I tried a lot of different ideas, which I’m going to describe in detail in a paper/blog post on Medium, but 90% of the success is due to:
Min-max scaling within ‘series_id’; deleting all targets that have more than 4 constant values in a row (I guess those are missing values that were replaced by the median). Missing temperatures are filled with the hourly mean / monthly mean, computed from train data only.
Categorical features from the timestamp: year, month, day, day of year, hour of year, day of week, hour of day. These categories were transformed with a sin/cos transformation and used as numerical features.
Additional features: is_day_off, is_next_day_off, type of building (sum of day_off columns as string), series_id as category.
10-fold StratifiedKFold on series_id.
Feed-forward NN, 5 layers, 512 neurons in each layer, ReLU activation. Adam optimizer, MAE loss (MAPE and RMSE gave worse results).
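To make the preprocessing steps above more concrete, here is a minimal pure-Python sketch of three of them: the sin/cos encoding of a periodic timestamp feature, masking runs of more than 4 identical target values, and min-max scaling within each series_id. The function names (`cyclical`, `drop_constant_runs`, `min_max_by_series`) and the choice to mark dropped targets as `None` are assumptions made for illustration, not the code actually used in the competition; in practice the same logic would more likely be written with a dataframe library.

```python
import math
from itertools import groupby


def cyclical(value, period):
    """Map a periodic feature (e.g. hour of day, period=24) to (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def drop_constant_runs(values, max_run=4):
    """Replace every run of more than `max_run` identical values with None.

    Assumption: such runs are imputed placeholders rather than real readings.
    """
    out = []
    for _, run in groupby(values):
        run = list(run)
        out.extend(run if len(run) <= max_run else [None] * len(run))
    return out


def min_max_by_series(rows):
    """Scale values to [0, 1] separately for each series_id.

    `rows` is a list of (series_id, value) pairs. None values are ignored
    when computing each series' range and stay None in the output.
    """
    bounds = {}
    for sid, v in rows:
        if v is None:
            continue
        lo, hi = bounds.get(sid, (v, v))
        bounds[sid] = (min(lo, v), max(hi, v))

    scaled = []
    for sid, v in rows:
        if v is None:
            scaled.append((sid, None))
            continue
        lo, hi = bounds[sid]
        span = hi - lo
        # A series with a single distinct value has zero range; map it to 0.0.
        scaled.append((sid, (v - lo) / span if span else 0.0))
    return scaled
```

The point of the sin/cos pair is that neighbouring values across the wrap-around (hour 23 and hour 0, for example) end up close together in feature space, which a plain integer encoding does not give you.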
Two weeks ago I found a bug in my code: I was predicting the consumption 24-336 hours backward. It ruined the daily and weekly predictions, but the hourly ones were OK (I was in the top 10 with a score of 3307). After fixing this bug, I moved to 1st place with a gap to 2nd place, but it seems to me that I slightly overfitted the leaderboard, and after the shakeup I ended up in 4th place.
Congrats to the winners, and thank you all for the competition. Good luck in the next events!