Number 1 Solution Writeup

Hi all,

Here is our solution writeup for the Kelp Wanted: Segmenting Kelp Forests competition. For us, as a student team from TU Delft, this was our second competition, and we enjoyed it a lot. You can read the report on our solution here:
Technical report: iv-q2-detect-kelp/Detect_Kelp___Technical_Report.pdf at main · TeamEpochGithub/iv-q2-detect-kelp · GitHub
Open GitHub code solution: GitHub - TeamEpochGithub/iv-q2-detect-kelp

We are looking forward to your solution as well!

Kind regards,

Team Epoch IV


thanks for the writeup!!!

What do you think is the killer point of your solution?

I would think it could be the feature engineering (NDVI, NDWI, …) and the feature selection by XGBoost. Do you have results with and without feature engineering?

(I skipped this step because I thought the test and train data come from the same location, the Falkland Islands, so neural nets can automatically learn the combinations of the channels.)
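For reference, the spectral indices mentioned above can be computed directly from the raw bands. A minimal sketch with toy data (band order and scaling here are assumptions, not the competition's exact layout):

```python
import numpy as np

def spectral_indices(nir, red, green, eps=1e-6):
    """Compute NDVI and NDWI from single-band float arrays.

    NDVI = (NIR - Red) / (NIR + Red)     -- highlights vegetation (kelp)
    NDWI = (Green - NIR) / (Green + NIR) -- highlights open water
    """
    ndvi = (nir - red) / (nir + red + eps)
    ndwi = (green - nir) / (green + nir + eps)
    return ndvi, ndwi

# toy 2x2 reflectance values, just to show the shapes
nir = np.array([[0.6, 0.1], [0.5, 0.2]])
red = np.array([[0.2, 0.1], [0.1, 0.2]])
green = np.array([[0.1, 0.3], [0.2, 0.4]])
ndvi, ndwi = spectral_indices(nir, red, green)
```

These per-pixel indices can then be stacked as extra input channels or fed to a tabular model such as XGBoost.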

On a side note, about 100 train images and 20 test images are misaligned. You can check this by drawing the DEM outline on the RGB and NIR images.

It seems that the kelp ground truth is aligned with the DEM, while RGB, NIR, and cloud are aligned with each other.

You can get better training data (and a more accurate model) if you correct the alignment.
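The outline check described above can be sketched with plain NumPy (hypothetical helpers; assuming the DEM counts as land where elevation is positive):

```python
import numpy as np

def land_outline(dem, land_thresh=0.0):
    """Boundary pixels of the DEM land mask (land where dem > land_thresh)."""
    land = dem > land_thresh
    padded = np.pad(land, 1, constant_values=False)
    # a pixel is on the outline if it is land but has a non-land 4-neighbour
    neigh_all_land = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                      & padded[1:-1, :-2] & padded[1:-1, 2:])
    return land & ~neigh_all_land

def overlay_outline(band, dem):
    """Copy of a spectral band with the DEM outline burned in at max value."""
    out = band.copy()
    out[land_outline(dem)] = band.max()
    return out
```

Plotting `overlay_outline(nir, dem)` for a misaligned image makes the offset between the DEM coastline and the NIR coastline visible by eye.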

Thank you so much for the detailed report of your approach. I was wondering how you were able to access the H100 GPUs, or rather which platform you used, because personally that was my hindrance.

I am now trying to use your code and solution to do an ablation study.
The issue with DrivenData is that I cannot do a post-deadline submission, so it is difficult to judge whether I have run your code correctly.

If it doesn’t take too much time and effort to prepare, I would like to see some local validation results (e.g. the list of train/validation images and the validation dice). This would enable me to check whether I have run the code correctly.
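As a reference point for comparing local validation numbers, the dice score on binary masks is a one-liner (a minimal sketch; `dice` here is a hypothetical helper, not the competition's exact metric code):

```python
import numpy as np

def dice(pred, target, eps=1e-6):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# toy example: 1 of 2 predicted pixels overlaps 1 of 2 labelled pixels
p = np.array([[1, 1, 0, 0]])
t = np.array([[0, 1, 1, 0]])
score = dice(p, t)  # 2*1 / (2+2) = 0.5
```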

Thanks a lot!

By the way, I tried some of the “what we haven’t tried” items from your technical report:

  1. Upsampling of satellite images:
    not much effect (note that I use a UNet with ResNet and ViT backbones, with additional convolutional encoder layers so that the encoder output scales are 1, 1/2, 1/8, 1/16, 1/32)

  2. Tuning the threshold for submission:
    once you correct the misalignment in validation, the dice is quite flat over thresholds 0.3, 0.4, 0.5. The peak is about the same for the public and private test data (i.e. the public test is close to local CV; the private test is higher).

There are some mislabelled images (e.g. false-positive annotations where there is clearly no visible kelp). Including these in the validation set lowers your local CV score but does not change the peak, i.e. the optimal threshold does not change.
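The flat-threshold observation can be reproduced with a small sweep over candidate thresholds on local validation. A minimal sketch (the `best_threshold` helper and the toy data are hypothetical, not the authors' code):

```python
import numpy as np

def dice(pred, target, eps=1e-6):
    """Dice coefficient between two boolean masks."""
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def best_threshold(probs, masks, thresholds=np.arange(0.1, 0.9, 0.1)):
    """Mean validation dice for each binarization threshold."""
    scores = {}
    for t in thresholds:
        scores[round(float(t), 2)] = float(np.mean(
            [dice(p > t, m.astype(bool)) for p, m in zip(probs, masks)]))
    best_t = max(scores, key=scores.get)
    return best_t, scores

# toy example: one 2x2 predicted probability map and its ground-truth mask
probs = [np.array([[0.9, 0.2], [0.8, 0.1]])]
masks = [np.array([[1, 0], [1, 0]])]
best_t, scores = best_threshold(probs, masks)
```

Plotting `scores` over the thresholds shows how flat (or peaked) the response is; a flat plateau means the exact submission threshold matters little.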


It’s true that there are lots of misaligned or “underlabelled” images in the dataset, yet fixing them could only make things worse: it was an easy walk for the models to predict green kelp (it took no more than 1 epoch), yet it took longer to predict what people actually labelled. I was once surprised by a very small and unnatural OOF prediction for an obvious kelp region, but later I found the same region present in several train images with the same small masks from the annotators, so the model had actually trained to underdeliver based on human performance.

We have also noticed quite a few poorly labelled images. We tried removing 15 images that we judged to be poorly labelled (based on model errors), and it improved our local dice scores by around 0.01 for all models. However, these models performed worse on the public LB, and after seeing a post that said the test set has the same label quality as the training data, we decided to keep these images.


We just went back and tried some ablations as well. Surprisingly, in hindsight, the XGBoost channel might have actually held back our latest models. We had been using it from the start, as early in the competition it seemed to make the simpler models more stable. The plot below is for a single 20% validation fold, but the jumps seem quite big. There is thus a chance that our submissions could have gotten significantly higher scores if we had re-trained our best ensemble without this feature.


“However these models performed worse on the public LB”
There are two types of error in the dataset:

  1. wrong labels (false positives, missed kelp, etc.)
  2. misaligned annotations

Do not remove the wrong-label images, because that could change the distribution; the results may be better or worse.

But correcting the misalignment definitely improves results!
As an example (in one dimension):
train data:
image = [0,0,0,x,0,0,0],
annotation = [0,1,0,0,0,0,0]

corrected image

This happens in the test images also. We need to align everything to the DEM.

Example of an aligned test image I used
(I built a predictor to estimate a land mask from the RGB input, then automatically align the predicted mask + RGB to the DEM.)
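The automatic alignment step described here can be approximated by a brute-force search for the integer offset that maximizes overlap between the predicted land mask and the DEM land mask. A sketch under those assumptions (the `best_shift` helper is hypothetical, not the author's actual code):

```python
import numpy as np

def best_shift(mask, ref, max_shift=20):
    """Integer (dy, dx) that best aligns boolean `mask` to boolean `ref`.

    Exhaustive search over small offsets, scoring each candidate by the
    number of overlapping True pixels. Fine for small tiles; note np.roll
    wraps around, so features near the border need care.
    """
    best, best_dydx = -1, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
            score = np.logical_and(shifted, ref).sum()
            if score > best:
                best, best_dydx = score, (dy, dx)
    return best_dydx

# toy example: a land blob shifted down 3 px and left 2 px is recovered
ref = np.zeros((30, 30), dtype=bool)
ref[10:20, 10:20] = True
shifted = np.roll(np.roll(ref, 3, axis=0), -2, axis=1)
dy, dx = best_shift(shifted, ref, max_shift=5)
```

The recovered `(dy, dx)` can then be applied to the RGB/NIR/cloud stack so that everything lines up with the DEM and the kelp annotation.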

a good train image

a misaligned train image

“NIR” shows the superimposed image of NIR + DEM;
the last one at the bottom right is the superimposed image of NIR + DEM + kelp annotation.

How was the XGBoost channel created, in terms of train/validation splits? Maybe you guys did n-fold CV with the XGB model only, and used this to generate the out-of-fold XGBoost predictions on all the training data?
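The out-of-fold scheme described in this question can be sketched as follows. This is a hypothetical illustration, not the team's actual pipeline; `GradientBoostingClassifier` stands in for XGBoost, and the toy data is made up:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for XGBoost

def oof_channel(X, y, n_splits=5, seed=0):
    """Out-of-fold probabilities: each sample is predicted by a model that
    never saw it during training, so the extra channel is leak-free."""
    oof = np.zeros(len(X))
    for tr, va in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        model = GradientBoostingClassifier(random_state=seed)
        model.fit(X[tr], y[tr])
        oof[va] = model.predict_proba(X[va])[:, 1]
    return oof

# toy per-pixel feature table: 200 samples, 3 spectral features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
oof = oof_channel(X, y)
```

For the test set, one would instead refit on all training data (or average the fold models) and predict once.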

@hengcherkeng, I understand the idea of the images being misaligned, however I cannot visualize it or understand it from your examples. None of those images appear to be misaligned to me (visually). I never saw any image in the training data that I believed was misaligned. Are you saying that the channels were swapped? Meaning channel #4 (green) was swapped with channel #2 (NIR)?

He means that there is a geographical offset between the DEM/kelp and the spectral layers.

It’s a bit hard to see from the images posted, as the offset is on the order of tens of pixels.
It’s easier to see when they are actually overlaid on top of each other.

But in the bad image posted above, have a look at the bottom right corner of the DEM and the RGB image. The DEM would have you believe there is already water there, but the RGB image shows there is still land. And inversely for the top right corner.

The kelp seems to have the same offset as the DEM.

So, how is this misalignment problem solved?

Can you explain a little more how you solved the misalignment problem here? I am not able to understand what you did.