My solution (2nd place so far)

n01z3 · January 27, 2020, 6:58pm

I used the dataset prepared by Pavel Pleskov, with images downscaled to 512 px on the wide side. He used PIL with ANTIALIAS interpolation.
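As a rough illustration of that preprocessing, here is a minimal sketch. It assumes "wide side" means the longer side of the image; the function names are hypothetical and this is not Pavel's actual script. `Image.LANCZOS` is the same filter that older Pillow versions exposed as `Image.ANTIALIAS`.

```python
def long_side_size(width, height, target=512):
    """Return (new_width, new_height) with the longer side equal to target.

    Note: this also upscales images smaller than target; whether the
    original preprocessing did that is not stated.
    """
    scale = target / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_long_side(path, target=512):
    """Load an image with Pillow and resize it with the Lanczos filter."""
    from PIL import Image  # imported lazily; requires Pillow to be installed

    with Image.open(path) as img:
        return img.resize(long_side_size(*img.size, target), Image.LANCZOS)
```

For example, `long_side_size(2048, 1536)` gives `(512, 384)`.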
What did not work in this competition:

Any way of reading images (OpenCV, jpeg4py, DALI) other than the one Pavel used.
I got my first result with the DALI data loader and was very happy, but then I struggled to reproduce my local score on the LB.
Sampling.
I tried sampling rare classes more often and the empty class less often; the score got worse. I also tried to speed up training by discarding the 70% easiest samples, where the loss is already almost 0; that did not work either.
Imagenet-style RandomCrop.
This is the default in torchvision and DALI. I racked my brain trying to choose parameters that would give the same object scale as at test time. In the end I switched to Albumentations and everything worked out.
Small resolution.
For a very long time (a couple of days) I experimented with input sizes around 224 and could not get the loss below 0.0040. Then I increased the size and everything worked out.
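For readers curious about the easy-sample filtering mentioned under Sampling, it could be sketched roughly like this. This is an assumed reconstruction, not my actual code (the helper name and the use of per-sample losses from a previous pass are illustrative), and as noted above it did not improve the score.

```python
def keep_hardest(sample_losses, drop_fraction=0.7):
    """Return indices of samples kept after dropping the easiest ones.

    sample_losses: per-sample loss values recorded on a previous pass.
    drop_fraction: share of lowest-loss samples to discard.
    """
    n_drop = round(len(sample_losses) * drop_fraction)
    # Sort sample indices from lowest loss (easiest) to highest (hardest).
    order = sorted(range(len(sample_losses)), key=lambda i: sample_losses[i])
    return sorted(order[n_drop:])


losses = [0.0, 0.5, 0.01, 0.9, 0.001, 0.02, 0.3, 0.0, 0.002, 0.7]
kept = keep_hardest(losses)  # indices of the 3 highest-loss samples
```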

What worked, and what my solution looks like:

swsl_resnext50, wsl_resnext101d8. The first convolution and the first BN are frozen during all stages of training.
pytorch-lightning, apex O1, distributed
WarmUp, CosineDecay, initLR 0.005, SGD, WD 0.0001, 6-8 GPUs, Batch 256 per GPU
Loss / metric: torch.nn.MultiLabelSoftMarginLoss
Progressive increase in size during training. Wide side resize: 256 -> 320 -> 480
During training, the image is resized to the ResizeCrop size on the wide side, followed by a RandomCrop of size ResizeCrop / 1.14. The crop is not square but rectangular, with the proportions of the original image. During inference, the image is only resized to ResizeCrop.
Augmentations: flip, contrast, brightness, with default parameters from albumentations.
TTA: flip
Averaging within one series - gmean
TTA prediction and model averaging - gmean
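Two of the steps above, the crop geometry and the gmean averaging, can be sketched as follows. This is a simplified illustration rather than the competition code: it assumes "wide side" means the longer side and that both crop sides are the resized sides divided by 1.14. The function names and the `eps` clamp are hypothetical.

```python
import math


def train_shapes(width, height, resize_crop, ratio=1.14):
    """Return ((resized_w, resized_h), (crop_w, crop_h)) for one image.

    The longer side is resized to resize_crop; the crop is the resized
    shape divided by `ratio` on both axes, so it keeps the proportions
    of the original image instead of being square.
    """
    scale = resize_crop / max(width, height)
    resized = (round(width * scale), round(height * scale))
    crop = (round(resized[0] / ratio), round(resized[1] / ratio))
    return resized, crop


def gmean(probabilities, eps=1e-12):
    """Geometric mean of predicted probabilities for one class.

    Values are clamped to eps so that a zero probability does not
    break the logarithm.
    """
    logs = [math.log(max(p, eps)) for p in probabilities]
    return math.exp(sum(logs) / len(logs))


resized, crop = train_shapes(2048, 1536, resize_crop=480)
# resized == (480, 360), crop == (421, 316)
```

The same `gmean` helper would apply at each averaging level: across the frames of one series, across TTA variants, and across models.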

BGU_DS · February 1, 2020, 12:21pm

Nice work.
Thanks for sharing!

brismith · February 3, 2020, 6:22pm

Excellent - well done @n01z3 and thanks for sharing your approach.

bull · February 5, 2020, 6:25am

Thanks for sharing!
