As this is my first DrivenData competition, I don’t know whether solutions are usually posted. I would love to see what ZFTurbo did to get twice my score!
I am currently preparing a solution write-up for the organizers. I also plan to publish a small solution write-up on arXiv after that (I will post the link in this thread if you are interested).
Code will be posted on the DrivenData GitHub a little later, as it was in previous competitions.
My solution is based on two libraries I prepared and already posted:
- The best model for me was a 3D DenseNet121 with input shape (96, 128, 128, 3) and batch size 6. I used a large dropout of 0.5 at the classification layers to prevent overfitting. I started with ImageNet weights, which I converted for the 3D variants of the nets.
- I trained only on the ROI part of the videos, which I extracted using your code from the forum: Python code to find the roi - #4 by Shanka ))
- Batches were generated in a 25/75 proportion (25% stalled == 1, 75% stalled == 0).
- I validated using MCC from the beginning and then switched to ROC AUC. My validation score was around 0.96–0.98 ROC AUC.
- I started with the micro dataset and then increased the number of used videos to ~50K (using all available stalled == 1 videos).
- The last trick, which allowed me to increase my score from 0.82 to 0.86 on the public LB, was to fine-tune only on tier1 data (it looks like the test set contains only tier1?).
- I applied augmentations with the volumentations library, which I reworked a little to increase speed and add some more useful augs.
- I used 5-fold cross-validation. My validation MCC score wasn’t the same as the LB, but the direction was similar: improving locally gave me a better result on the LB.
- Loss function: binary cross-entropy. Optimizer: AdamAccumulate.
- I chose the THR for converting probabilities into binary labels using the leaderboard (so there was some chance of overfitting to the LB). I found that the optimal number of stalled videos in the test set was around 600–700.
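The ROI crop from the second bullet above can be sketched generically. The actual outline detection comes from the linked forum code; `is_outline` below is a hypothetical placeholder for whatever color check that code performs:

```python
import numpy as np

def crop_roi(video, is_outline):
    """Crop every frame to the bounding box of the marked ROI.

    video:      (frames, height, width, 3) uint8 array
    is_outline: function mapping an (H, W, 3) frame to a boolean
                mask of the ROI-outline pixels
    """
    # Union of outline pixels across all frames, in case the
    # outline is not drawn identically on every frame
    mask = np.zeros(video.shape[1:3], dtype=bool)
    for frame in video:
        mask |= is_outline(frame)
    ys, xs = np.where(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return video[:, y0:y1, x0:x1, :]
```

The crop is computed once per video, so every frame gets the same bounding box and the result can still be stacked into a fixed-shape 3D input.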
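The validation setup from the bullets above (5-fold CV, scored with MCC and ROC AUC) can be sketched with scikit-learn. Here `train_model` is a hypothetical stand-in for the actual 3D DenseNet training loop:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import matthews_corrcoef, roc_auc_score

def cross_validate(videos, labels, train_model, threshold=0.5, n_splits=5):
    """5-fold CV; stratified so each fold keeps the stalled ratio."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    mcc_scores, auc_scores = [], []
    for train_idx, val_idx in skf.split(videos, labels):
        model = train_model(videos[train_idx], labels[train_idx])
        probs = model(videos[val_idx])  # predicted P(stalled == 1)
        auc_scores.append(roc_auc_score(labels[val_idx], probs))
        # MCC needs hard labels, so apply the chosen threshold
        mcc_scores.append(matthews_corrcoef(labels[val_idx], probs > threshold))
    return np.mean(mcc_scores), np.mean(auc_scores)
```

ROC AUC is threshold-free, which makes it a convenient model-selection metric even though the competition itself scores thresholded MCC.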
Thank you! I went the LSTM way, which was probably the wrong decision.
Hi,
Congratulations on first place!
I hope you don’t mind answering a few questions about your solution?
What’s the intuition behind using unbalanced batches? I’ve not seen that approach used before!
Did you/How did you decide on a threshold for the test dataset?
And how did you convert 2D ImageNet weights to 3D weights? I’ve had a look at the code but I can’t work out what you’re doing! (I’m less familiar with Keras.)
Thanks
Congratulations on first place!
Thank you )
What’s the intuition behind using unbalanced batches? I’ve not seen that approach used before!
In the early stage of the competition I found out that the test set is very imbalanced: it has around 20 times more stalled == 0 videos in it. So by generating imbalanced mini-batches I tried to make the neural net more “optimistic”, i.e. predict “1” only if it is really sure the video has label stalled == 1. I would probably have tried a 1:20 ratio in the batches, but I was limited by the batch size of 6 that I used, and I didn’t want to have “only zeros” batches during training.
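A hypothetical generator for such fixed-ratio mini-batches might look like this (the names and exact rounding are mine, not from the posted code):

```python
import numpy as np

def imbalanced_batches(labels, batch_size=6, pos_frac=0.25, rng=None):
    """Yield index batches with a fixed stalled==1 / stalled==0 mix.

    With batch_size=6 and pos_frac=0.25 each batch gets
    round(6 * 0.25) = 2 positives and 4 negatives, so no batch
    is ever "only zeros".
    """
    rng = rng or np.random.default_rng()
    pos = np.where(labels == 1)[0]
    neg = np.where(labels == 0)[0]
    n_pos = max(1, round(batch_size * pos_frac))
    while True:
        batch = np.concatenate([
            rng.choice(pos, n_pos, replace=False),
            rng.choice(neg, batch_size - n_pos, replace=False),
        ])
        rng.shuffle(batch)
        yield batch
```

Sampling indices rather than data keeps the generator cheap; the heavy video loading happens only for the few indices in each batch.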
Did you/How did you decide on a threshold for the test dataset?
In the beginning I tried different THRs and checked the LB to find which was better. At the later stages I chose the THR to keep the number of stalled == 1 predictions in submission.csv in the range of ~600–700.
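Picking a threshold to hit a target count of positives amounts to cutting at the k-th largest predicted probability. A minimal sketch (ignoring ties, which could push the count slightly above the target):

```python
import numpy as np

def threshold_for_count(probs, target_positives):
    """Pick the threshold that keeps roughly `target_positives`
    predictions above it (i.e. labeled stalled == 1)."""
    # The k-th largest probability becomes the cut-off
    order = np.sort(probs)[::-1]
    return order[target_positives - 1]

# Hypothetical usage, e.g. to keep ~650 stalled predictions:
#   probs = model.predict(test_videos).ravel()
#   thr = threshold_for_count(probs, 650)
#   submission["stalled"] = (probs >= thr).astype(int)
```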
And how did you convert 2D ImageNet weights to 3D weights? I’ve had a look at the code but I can’t work out what you’re doing! (I’m less familiar with Keras.)
My answer to this question was long enough that it’s better to read the PDF I prepared.
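For readers who just want the gist: a common way to inflate 2D ImageNet kernels to 3D (the approach popularized by I3D-style "inflation") is to repeat each 2D kernel along the new depth axis and divide by the depth so activations keep the same scale. A minimal numpy sketch of that idea, not necessarily the exact procedure from the PDF:

```python
import numpy as np

def inflate_kernel_2d_to_3d(kernel_2d, depth=3):
    """Turn a 2D conv kernel (kh, kw, c_in, c_out) into a 3D one
    (depth, kh, kw, c_in, c_out).

    The kernel is copied `depth` times along the new axis and
    divided by `depth`, so an input that is constant along depth
    produces the same response as the original 2D convolution.
    """
    tiled = np.repeat(kernel_2d[np.newaxis, ...], depth, axis=0)
    return tiled / depth
```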