For anyone still struggling to write Python code to compute the performance metric, here is my version (read all the code!):
import numpy as np


def metric(y_true, y_pred, season=None):
    """
    It computes the score for a specific id and for a specific prediction
    window. This function doesn't check that the two arguments have the same
    size or the same timestamps. The predictions in y_pred are made hourly,
    so for 'daily' and 'weekly' the number of rows must be a multiple of
    24 and 168 respectively.
    Args:
        y_true::DataFrame from pandas, Series from pandas
            DataFrame with one column or more (it has to include consumption);
            consumption has the true values.
        y_pred::DataFrame from pandas, Series from pandas
            DataFrame with one column or more (it has to include consumption);
            consumption has the predicted values.
        season::str
            There are three options: 'daily', 'weekly' and 'hourly'.
    Returns:
        score::float
            It gives the score for a specific id
    """
    window = {'daily': 24 / 7, 'weekly': 24 / 2, 'hourly': 24 / 24}
    true = y_true.consumption.values
    pred = y_pred.consumption.values
    if season == 'hourly':
        # Each hour is its own step, normalized by that hour's true value.
        ci = window['hourly'] / true
        score = np.mean(np.abs(true - pred) * ci)
    elif season in ('daily', 'weekly'):
        # Hours per step: 24 for a day, 168 for a week.
        hours = 24 if season == 'daily' else 168
        # np.split needs an integer number of sections, hence //.
        div = pred.shape[0] // hours
        y_pred_div = np.split(pred, div)
        y_true_div = np.split(true, div)
        last_sum = []
        for true_step, pred_step in zip(y_true_div, y_pred_div):
            # Normalize each step by its total true consumption, using the
            # weight of the requested season (the weekly branch used to
            # take the daily weight by mistake).
            ci = window[season] / np.sum(true_step)
            last_sum.append(np.abs(np.sum(true_step) - np.sum(pred_step)) * ci)
        score = np.mean(last_sum)
    else:
        raise ValueError("season must be 'daily', 'weekly' or 'hourly'")
    return score
You will have to modify this function if you want a single score for all ids
together.
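
One possible way to get a combined score without touching the function itself is to group the data by id, score each group, and average the results. The sketch below is only an illustration under assumptions that are not in the post: the id column is called series_id (a hypothetical name), the overall score is taken to be the plain mean of the per-id scores, and hourly_score is a small stand-in for a per-id scorer such as metric with season='hourly'.

```python
import numpy as np
import pandas as pd


def score_all_ids(y_true, y_pred, metric_fn, id_col="series_id"):
    """Score every id separately, then average the per-id scores.

    metric_fn(true_group, pred_group) must return one float for one id.
    id_col is an assumed column name; change it to match your data.
    """
    # Index the prediction groups by id so they can be paired with the truth.
    pred_groups = {key: group for key, group in y_pred.groupby(id_col)}
    per_id = {}
    for key, true_group in y_true.groupby(id_col):
        per_id[key] = float(metric_fn(true_group, pred_groups[key]))
    # Assumption: every id weighs the same in the overall score.
    overall = float(np.mean(list(per_id.values())))
    return overall, per_id


def hourly_score(true_group, pred_group):
    """Stand-in per-id scorer: mean absolute error relative to the truth."""
    true = true_group["consumption"].to_numpy()
    pred = pred_group["consumption"].to_numpy()
    return np.mean(np.abs(true - pred) / true)


# Toy data with two ids and two hourly readings each.
y_true = pd.DataFrame({"series_id": [1, 1, 2, 2],
                       "consumption": [10.0, 20.0, 5.0, 5.0]})
y_pred = pd.DataFrame({"series_id": [1, 1, 2, 2],
                       "consumption": [12.0, 18.0, 5.0, 4.0]})

overall, per_id = score_all_ids(y_true, y_pred, hourly_score)
print(per_id)
print(overall)
```

Passing the scorer in as an argument keeps the per-id logic in one place; to score daily or weekly windows you could pass something like lambda t, p: metric(t, p, season='daily') instead, provided each id has the right number of rows.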