Benchmark Blog Error? Validation set taken from test set

JM1000 · January 13, 2021, 5:47pm

Hello, I have been working through the MagNet benchmark blog notebook (link below) and I think the validation set has been taken from the test set, i.e. the two share the same data points.

In notebook cell 19, an interim set is created which excludes the test set. However, the validation set is taken from the master data set. Given that the splitting method holds out the last 6000/3000 rows for test/val, don’t we end up with the validation set being the same as half of the test set?
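
A minimal sketch of the overlap on toy data, for anyone who wants to check it quickly. The frame, the period labels, and the small hold-out sizes (4 and 2 standing in for 6000 and 3000) are made up for illustration and are not the notebook's actual data; only the split logic follows what the notebook appears to do.

```python
import pandas as pd

# Toy stand-in for the master data set: two periods of 10 rows each (hypothetical).
data = pd.DataFrame({
    "period": ["train_a"] * 10 + ["train_b"] * 10,
    "value": range(20),
})

test_per_period = 4  # stands in for 6000
val_per_period = 2   # stands in for 3000

# Hold out the last rows of each period as the test set.
test = data.groupby("period").tail(test_per_period)
interim = data[~data.index.isin(test.index)]

# Validation drawn from the full data set rather than from interim.
val = data.groupby("period").tail(val_per_period)

# Rows that appear in both the validation and the test set.
overlap = val.index.intersection(test.index)
print(len(overlap), len(val), len(test))
```

If every validation row also shows up in the test set, the overlap count equals the size of the validation set.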

If I’m wrong, and I probably am, can anyone put me straight?

If I’m right and others are re-using the code, this will likely mean the models won’t generalise well.

Any comments appreciated. Thanks.

benchmark

ngcferreira · January 14, 2021, 12:24pm

@JM1000 Thanks for pointing this out, I hadn’t noticed it. In fact, I had started to reply that you were missing the fact that the data already used is removed from each dataset via the interim dataframe. But I hadn’t noticed that they introduced a bug in the get_train_test_val function:
…
test = data.groupby("period").tail(test_per_period)
interim = data[~data.index.isin(test.index)]

The problem is here:
val = data.groupby("period").tail(val_per_period)  # BUG: data still contains the test rows

It should be:
val = interim.groupby("period").tail(val_per_period)
…
train = interim[~interim.index.isin(val.index)]

Hope this helps others too.

ngcferreira · January 14, 2021, 12:32pm

This might explain part of the big difference between my local test score and the leaderboard. I’m in fact calculating the test score using half of the validation data.

cszc · January 14, 2021, 3:51pm

@JM1000 Thanks for catching this bug! You are indeed right: validation should be taken from the interim set instead of data. @ngcferreira’s fixes are correct. I’ll update the blog today to reflect this. Thanks again to both of you!

JM1000 · January 19, 2021, 2:23pm

Hi, thanks for the responses. Glad this was sorted.

iorana · January 19, 2021, 4:02pm

Thanks a lot, great catch!
