A quick and dirty pytorch baseline

fnands · November 13, 2022, 2:55pm

Seeing as no Python baseline was provided, I put together a quick notebook using PyTorch to get started.

You can find the notebook here.

It’s just a simple UNet that takes a single Sentinel-2 image as input.
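For readers who have not opened the notebook, here is a minimal sketch of what a UNet of this kind can look like. This is not the notebook's model (which uses a pre-trained encoder); the class name `TinyUNet`, the 11 input bands, the layer widths, and the 64x64 dummy input are all assumptions chosen for illustration.

```python
import torch
from torch import nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions, each followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Hypothetical two-level U-Net that regresses one value per pixel."""

    def __init__(self, in_channels: int = 11, base: int = 16):
        super().__init__()
        # in_channels=11 is an assumed band count, adjust to your data.
        self.enc1 = conv_block(in_channels, base)
        self.enc2 = conv_block(base, base * 2)
        self.bottleneck = conv_block(base * 2, base * 4)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)  # upsampled + skip from enc2
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = conv_block(base * 2, base)  # upsampled + skip from enc1
        self.head = nn.Conv2d(base, 1, kernel_size=1)  # one regression channel

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)


model = TinyUNet()
dummy = torch.randn(2, 11, 64, 64)  # (batch, bands, height, width)
out = model(dummy)  # one predicted value per input pixel
```

The output keeps the input's spatial size with a single channel, so it can be compared pixel by pixel against a target raster.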

I tried using TorchGeo, although I only ended up using the pre-trained model weights from there.

The notebook includes a (somewhat inefficient) prediction step, and scored ~56 after 20 epochs of training.

hengcherkeng · November 15, 2022, 1:32pm

some baseline results (using a single unet model with an encoder of the same complexity as se-resnext26d, no ensemble):

just learn segmentation per image. at submission, average over all 12 months without considering whether the input is good or bad: lb score of about 50.0
learn segmentation per image. use max/softmax pooling at the head (select the best out of 12 months): lb score of 35.0
better selection with a transformer (e.g. at the encoder and/or decoder): lb score of 31.0
choose a better loss than MSE (because there is spike noise in the ground-truth lidar): lb score of 29.0
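Two of the ideas in the list above can be sketched in a few lines of PyTorch. This is one possible reading, not necessarily what was actually implemented: the helper `aggregate_months`, the per-month quality logits `scores`, and the choice of Huber loss as the "better loss than MSE" are all assumptions made for illustration.

```python
from typing import Optional

import torch
import torch.nn.functional as F


def aggregate_months(preds: torch.Tensor,
                     scores: Optional[torch.Tensor] = None) -> torch.Tensor:
    """Collapse the month axis of per-month predictions.

    preds:  (batch, months, H, W) per-month prediction maps.
    scores: (batch, months, H, W) hypothetical per-month quality logits.
            If None, fall back to a plain mean over months.
    """
    if scores is None:
        return preds.mean(dim=1)
    weights = torch.softmax(scores, dim=1)  # weights sum to 1 over months
    return (weights * preds).sum(dim=1)


# Toy data: 11 "good" months predict 10, one "bad" month predicts 100.
preds = torch.full((1, 12, 4, 4), 10.0)
preds[:, 11] = 100.0
scores = torch.zeros(1, 12, 4, 4)
scores[:, 11] = -20.0  # assumed: the model learned to distrust month 11

mean_pred = aggregate_months(preds)          # 17.5 everywhere, pulled up by the bad month
soft_pred = aggregate_months(preds, scores)  # ~10.0, the bad month is nearly ignored

# A single spike in the target dominates MSE but not Huber loss.
pred = torch.tensor([10.0, 10.0, 10.0, 10.0])
target = torch.tensor([10.0, 10.0, 10.0, 110.0])  # last value is a spike
mse = F.mse_loss(pred, target)                  # 2500.0
huber = F.huber_loss(pred, target, delta=1.0)   # 24.875
```

In a real model the `scores` would come from an extra output head trained jointly with the predictions; taking the argmax of `scores` instead of the softmax would give the hard "select the best month" variant.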

Geethen · November 16, 2022, 10:27am

The TorchGeo dataloaders are cool: they let you very easily change the patch size and get samples on the fly. However, to benefit from this functionality you would first need to mosaic the provided patches. I am currently experiencing projection issues with the data.

fnands · November 16, 2022, 10:50am

Yeah, I think torchgeo is great for the general case where you are reading from a large satellite image/raster, and want to pick sensible patches, especially if multiple rasters are involved, but as the images here are already patched sensibly, it’s not that useful.

cayala · November 18, 2022, 7:59pm

Nice to see you here, heng! Could you explain what “use max/softmax pooling at the head” means? Does it mean aggregating all the months using the maximum instead of the mean? Or does it mean tweaking the architecture by replacing the segmentation head with a max pool layer (not only for testing but also for training)?

lewfish · December 1, 2022, 12:28am

Using weights from a model trained on another Sentinel dataset is a good idea, but my understanding is that it would violate the contest rules, since you are not allowed to use any external data. See Competition: The BioMassters

fnands · December 1, 2022, 7:54am

Hey @lewfish,
On the problem description page it says:

Participants can use pre-trained computer vision models as long as they were available freely and openly in that form at the start of the competition.

Otherwise, no one would be able to use models trained on ImageNet etc.
