I wonder how the evaluation metric code on the test server handles outlier noise in the lidar ground truth.
A least-squares loss is sensitive to outliers. I think spike noise (very high values, > 400) is common in the lidar images; see below:
Besides spike noise, there are also other label errors like missing regions. I wonder if these are used in the computation of the leaderboard metrics?
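To make the outlier concern concrete, here is a minimal sketch with synthetic numbers (not competition data). It assumes the score is an RMSE-style metric. The helper names `rmse` and `mae` and the arrays `clean` and `spiky` are made up for illustration.

```python
import numpy as np

def rmse(pred, target):
    """Root-mean-square error over all pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def mae(pred, target):
    """Mean absolute error over all pixels, shown for comparison."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(np.abs(pred - target)))

# Synthetic example: 100 pixels with a true value of 50,
# and a label copy where one pixel carries a spike of 450.
clean = np.full(100, 50.0)
spiky = clean.copy()
spiky[0] = 450.0

# A prediction that is perfect on the clean signal is still
# penalised heavily once the single spike is in the label.
print("RMSE against spiky label:", rmse(clean, spiky))
print("MAE against spiky label:", mae(clean, spiky))
```

Under these assumptions, one bad pixel out of 100 accounts for the whole error, and squaring makes its contribution to RMSE much larger than its contribution to MAE.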
Thank you for accurately pointing out some messiness in the data. It is true that some 0 labels indicate a true lack of biomass, while others represent missing data. Unfortunately there is no way to distinguish between the two in the labels, and they are treated equally in the metric.
This is a cool observation. Can you please point me to a chip ID with such an (assumed) missing lidar value?
Train a model, then sort the validation samples from high to low RMSE. Inspect the samples with exceptionally high RMSE.
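The ranking step can be sketched roughly as follows, assuming you already have validation predictions and labels as arrays whose first axis indexes chips. The function names `per_chip_rmse` and `rank_chips_by_rmse` are hypothetical, not part of any competition code.

```python
import numpy as np

def per_chip_rmse(preds, targets):
    """RMSE per chip; the first axis indexes chips, the rest are pixels."""
    preds = np.asarray(preds, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    sq_err = (preds - targets) ** 2
    # Flatten every non-chip axis so the mean runs over all pixels of a chip.
    return np.sqrt(sq_err.reshape(sq_err.shape[0], -1).mean(axis=1))

def rank_chips_by_rmse(chip_ids, preds, targets):
    """Return (chip_id, rmse) pairs sorted from highest to lowest RMSE."""
    scores = per_chip_rmse(preds, targets)
    order = np.argsort(scores)[::-1]  # descending order
    return [(chip_ids[i], float(scores[i])) for i in order]
```

For example, `rank_chips_by_rmse(val_ids, val_preds, val_labels)[:20]` (with your own arrays) would list the 20 worst chips, which are the ones worth opening and checking for spikes or missing regions.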