
Code for metric

For anyone still struggling to write Python code that computes the performance metric, here is my version (read all of the code before using it!):

import numpy as np


def metric(y_true, y_pred, season=None):
    """
    Compute the score for a single id and a single prediction window.

    This function does not check that the two arguments have the same size
    or the same timestamps. Both y_true and y_pred must hold hourly values;
    for 'daily' and 'weekly' they are summed into blocks of 24 and 168 hours.

    Args:
        y_true::pandas DataFrame
            DataFrame with one or more columns; it must include a
            consumption column holding the true values.

        y_pred::pandas DataFrame
            DataFrame with one or more columns; it must include a
            consumption column holding the predicted values.

        season::str
            One of three options: 'hourly', 'daily' or 'weekly'.

    Returns:
        score::float
            The score for the given id.
    """
    
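    # Weight for each prediction window.
    window = {'daily': 24 / 7, 'weekly': 24 / 2, 'hourly': 24 / 24}

    # Fail early instead of reaching `return score` with score undefined.
    if season not in window:
        raise ValueError("season must be 'hourly', 'daily' or 'weekly'")

    if season == 'hourly':
        # One error term per hour, each scaled by weight / true value.
        ci = window.get('hourly') / y_true.consumption.values
        score = np.mean(np.abs(y_true.consumption.values -
                                y_pred.consumption.values) * ci)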
    
    elif season == 'daily':
        # Split the hourly values into blocks of 24 hours (one block per day).
        div = y_pred.consumption.values.shape[0] // 24
        y_pred_div = np.split(y_pred.consumption.values, div)
        y_true_div = np.split(y_true.consumption.values, div)
        last_sum = []
        for i, array in enumerate(y_pred_div):
            # Scale each day's absolute error by weight / true daily total.
            ci = window.get('daily') / np.sum(y_true_div[i])
            last_sum.append(np.abs(np.sum(y_true_div[i]) -
                                   np.sum(array)) * ci)

        score = np.mean(last_sum)
    
    elif season == 'weekly':
        # Split the hourly values into blocks of 168 hours (one block per week).
        div = y_pred.consumption.values.shape[0] // 168
        y_pred_div = np.split(y_pred.consumption.values, div)
        y_true_div = np.split(y_true.consumption.values, div)
        last_sum = []
        for i, array in enumerate(y_pred_div):
            # Use the weekly weight here (the 'daily' weight was a copy-paste slip).
            ci = window.get('weekly') / np.sum(y_true_div[i])
            last_sum.append(np.abs(np.sum(y_true_div[i]) -
                                   np.sum(array)) * ci)

        score = np.mean(last_sum)
        
    return score
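
If you want to sanity-check the daily branch by hand, here is a small sketch that does the same arithmetic with a reshape instead of np.split, on made-up numbers. The name `daily_score` and the data are my own assumptions for illustration; they are not part of the competition code.

```python
import numpy as np


def daily_score(true_hourly, pred_hourly, weight=24 / 7):
    """Hypothetical helper: same arithmetic as the 'daily' branch above."""
    # Sum the hourly values into one total per day (rows of 24 hours).
    true_days = np.asarray(true_hourly, dtype=float).reshape(-1, 24).sum(axis=1)
    pred_days = np.asarray(pred_hourly, dtype=float).reshape(-1, 24).sum(axis=1)
    # Absolute daily error, scaled by weight / true daily total.
    return np.mean(np.abs(true_days - pred_days) * weight / true_days)


# Two days of made-up data: the true value is 1.0 every hour (24 per day).
true_hourly = np.ones(48)
# Day 1 is over-predicted by 12 in total, day 2 is exact.
pred_hourly = np.concatenate([np.full(24, 1.5), np.full(24, 1.0)])

score = daily_score(true_hourly, pred_hourly)
print(score)
```

By hand: day 1 gives |24 - 36| * (24 / 7) / 24 = 12 / 7, day 2 gives 0, so the mean is 6 / 7 (about 0.857).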

You will have to modify this function if you want a single score for all ids together.
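
One possible starting point is to score each id separately with a groupby. This is only a sketch under assumptions: the id column name `series_id`, the helper names, and the requirement that both frames share the same index and row order are mine, and `hourly_score` is just a stand-in for the hourly case so that the example runs on its own.

```python
import numpy as np
import pandas as pd


def hourly_score(y_true, y_pred):
    """Stand-in for the hourly case (weight 24 / 24 = 1); hypothetical helper."""
    true_vals = y_true["consumption"].to_numpy(dtype=float)
    pred_vals = y_pred["consumption"].to_numpy(dtype=float)
    return np.mean(np.abs(true_vals - pred_vals) / true_vals)


def score_per_id(y_true, y_pred, id_col="series_id", score_fn=hourly_score):
    """Score each id separately.

    Assumes y_true and y_pred have the same index and the same row order.
    """
    scores = {}
    for series_id, true_group in y_true.groupby(id_col):
        # Pick the prediction rows that belong to the same id.
        pred_group = y_pred.loc[true_group.index]
        scores[series_id] = score_fn(true_group, pred_group)
    return pd.Series(scores, name="score")


# Made-up data for two ids.
y_true = pd.DataFrame({"series_id": [1, 1, 2, 2],
                       "consumption": [10.0, 20.0, 5.0, 5.0]})
y_pred = pd.DataFrame({"series_id": [1, 1, 2, 2],
                       "consumption": [11.0, 18.0, 5.0, 10.0]})

per_id = score_per_id(y_true, y_pred)
print(per_id)
```

With these numbers id 1 scores 0.1 and id 2 scores 0.5. How the per-id scores should be combined into one number is not covered here, so check the competition's metric definition before averaging them.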
