Weighted Brier score

The competition uses a multi-class weighted Brier score.
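
As far as I can tell, it has roughly this form, where $w_c$ is the per-class weight, $y_{nc}$ the target probability and $p_{nc}$ the predicted probability for sample $n$ and class $c$:

$$\mathrm{Brier} = \frac{1}{N} \sum_{n=1}^{N} \sum_{c=1}^{C} w_c \, (y_{nc} - p_{nc})^2$$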

My implementation seems to be a bit off (or it could be my cross-validation, which is wack). Can someone see if there is something wrong with it:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# get w_c for the weighted Brier formula
weights = np.genfromtxt('../data/class_weights.json', delimiter=',',
                        skip_header=1, skip_footer=1, usecols=[0])

# y is the activity annotations vector, e.g. [4,4,7,7], which we need to convert into
# a probability matrix to compare with our predictions, so I one-hot-encode it
# yp is our matrix of probabilistic predictions (one row of class probabilities per sample)

def brier_score(y, yp):
    # one-hot encode the integer labels into an (n_samples, 20) matrix
    yy = OneHotEncoder([20], sparse=False).fit_transform(y[:, np.newaxis])
    return (1. / len(yy)) * np.sum(weights * ((yy - yp) ** 2))
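
For reference, it would be called with something like this (just a sketch; the uniform predictions are placeholders, and it assumes weights has been loaded as above):

y = np.array([4, 4, 7, 7])             # integer activity labels, as in the annotation files
yp = np.full((len(y), 20), 1.0 / 20)   # dummy predictions: uniform over the 20 classes
print(brier_score(y, yp))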

Hi rpmcruz,

The code that computes the weighted Brier score is actually given in the blog post:
http://blog.drivendata.org/2016/06/06/sphere-benchmark/

import json
import numpy as np

def brier_score(targets, predicted):
    weight_vector = np.asarray(json.load(open('path/to/class_weights.json')))

    return np.power(targets - predicted, 2.0).dot(weight_vector).mean()

Hope this helps!


I should have read that information more carefully, oops! :slight_smile: The code is equivalent, though.
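
Given the same targets matrix, the two computations do agree; here is a quick check with random data (sketch only, assuming 20 classes and an arbitrary weight vector):

import numpy as np

rng = np.random.RandomState(0)
n, c = 100, 20
labels = rng.randint(c, size=n)                 # integer annotations
targets = np.eye(c)[labels]                     # their one-hot encoding
predicted = rng.dirichlet(np.ones(c), size=n)   # random rows of class probabilities
weight_vector = rng.rand(c)                     # stand-in for class_weights.json

mine = (1. / n) * np.sum(weight_vector * (targets - predicted) ** 2)
blog = np.power(targets - predicted, 2.0).dot(weight_vector).mean()
assert np.isclose(mine, blog)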

I see… I was one-hot encoding the annotation files; I did not notice targets.csv, which seems to do the one-hot encoding for us and probably averages the annotations, since they are real-valued.

Yep. That’s it exactly!