Seeing as no Python baseline was provided, I put together a quick notebook using PyTorch to get started.
You can find the notebook here.
It’s just a simple UNet that takes a single Sentinel-2 image as input.
I tried using TorchGeo, although I only ended up using its pre-trained model weights.
The notebook includes a (somewhat inefficient) prediction step, and scored ~56 after 20 epochs of training.
Some baselines (using a single UNet model with an encoder of similar complexity to se-resnext26d, no ensemble):
Just learn segmentation per image. At submission, average over all 12 months without considering whether the input is good or bad: LB score of about 50.0
Learn segmentation per image. Use max/softmax pooling at the head (select the best out of 12 months): LB score of 35.0
Better selection with a transformer (e.g. at the encoder and/or decoder): LB score of 31.0
Choose a better loss than MSE (because there is spike noise in the ground-truth lidar): LB score of 29.0
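To illustrate the last point about the loss: the post does not say which loss was used in place of MSE, so the sketch below simply assumes Huber loss as one common robust alternative. The tensor shapes, the `delta` value, and the injected spike are all made up for illustration. It shows how a single spiky ground-truth pixel inflates MSE far more than it inflates Huber loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical 8x8 target patch and a prediction that is close to it.
target = torch.rand(1, 1, 8, 8) * 100.0
pred = target + torch.randn_like(target)

# Simulate spike noise: one ground-truth pixel is wildly off.
target_spiky = target.clone()
target_spiky[0, 0, 3, 3] += 1000.0

mse = nn.MSELoss()
# delta is an assumed threshold: errors above it are penalised linearly.
huber = nn.HuberLoss(delta=10.0)

mse_clean = mse(pred, target).item()
mse_spiky = mse(pred, target_spiky).item()
huber_clean = huber(pred, target).item()
huber_spiky = huber(pred, target_spiky).item()

print(f"MSE:   clean={mse_clean:.2f}  spiky={mse_spiky:.2f}")
print(f"Huber: clean={huber_clean:.2f}  spiky={huber_spiky:.2f}")
```

Because Huber loss grows linearly rather than quadratically for large errors, the spike contributes much less to the total, so the gradients are less dominated by bad labels. `nn.L1Loss` would be another option along the same lines.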
The TorchGeo dataloaders are cool: they let you very easily change the patch size and get samples on the fly. However, to benefit from this functionality you would first need to mosaic the provided patches. I am currently experiencing projection issues with the data.
Yeah, I think TorchGeo is great for the general case where you are reading from a large satellite image/raster and want to pick sensible patches, especially if multiple rasters are involved. But as the images here are already patched sensibly, it’s not that useful.
Nice to see you here, heng! Could you explain what “use max/softmax pooling at the head” means? Does it mean aggregating all the months using the maximum instead of the mean? Or does it mean tweaking the architecture by replacing the segmentation head with a max pool layer (not only for testing but also for training)?
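To make the question concrete, here is a rough sketch of the aggregation options being discussed. It is only one possible reading, not a confirmed description of heng's method. The function name `aggregate_months`, the `(B, T, H, W)` layout, and the extra per-month `scores` tensor are all assumptions for illustration.

```python
import torch

def aggregate_months(preds, mode="mean", scores=None):
    """Combine per-month predictions into one map.

    preds:  (B, T, H, W) tensor, one prediction per month (T = 12 here).
    scores: optional (B, T, H, W) tensor of per-month confidence logits,
            only used for mode="softmax" (a hypothetical extra model output).
    """
    if mode == "mean":
        # Plain average over months, ignoring input quality.
        return preds.mean(dim=1)
    if mode == "max":
        # Per-pixel maximum over months.
        return preds.max(dim=1).values
    if mode == "softmax":
        if scores is None:
            raise ValueError("mode='softmax' needs per-month scores")
        # Soft selection: weights sum to 1 across months at each pixel.
        weights = torch.softmax(scores, dim=1)
        return (weights * preds).sum(dim=1)
    raise ValueError(f"unknown mode: {mode}")

# Toy data: batch of 2, 12 months, 4x4 patches.
preds = torch.rand(2, 12, 4, 4)
scores = torch.zeros(2, 12, 4, 4)
scores[:, 5] = 50.0  # pretend month 5 is by far the most trustworthy

mean_map = aggregate_months(preds, "mean")
max_map = aggregate_months(preds, "max")
soft_map = aggregate_months(preds, "softmax", scores)
```

If the pooling is part of the model (applied during training as well as testing), the `scores` would be learned jointly with the predictions. If it is only applied at submission time, `mode="max"` versus `mode="mean"` is just a post-processing choice.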