Bad masks (ahfi, cutk, iboy ...)

MPWARE · December 12, 2021, 5:26pm

Hi,

I’m just joining the competition, and after a quick sanity check of the data it looks like I’ve found some bad masks. For example, look at chip_id = iboy: the mask is all zero (no cloud), but the true color image has clouds. The same goes for cutk, hxhj, ahfi and mpbf.
I’ve downloaded all the labels twice to make sure it was not a download issue.

Am I missing something, or do we have bad masks?
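
A quick screen for this kind of mismatch could look like the sketch below. It assumes each chip is already loaded as NumPy arrays: a visible-band image scaled to [0, 1] and a binary mask. The function name `flag_empty_but_bright` and both thresholds are hypothetical, and brightness is only a rough proxy for cloud (snow or sand will trigger it too), so flagged chips still need a visual check.

```python
import numpy as np

def flag_empty_but_bright(image, mask, bright_thresh=0.6, min_bright_frac=0.2):
    """Return True if the mask is all zero but many pixels look bright.

    image: float array (H, W, 3) with values in [0, 1] (assumed scaling).
    mask:  integer array (H, W), 1 = cloud, 0 = no cloud.
    Both thresholds are illustrative guesses, not tuned values.
    """
    if mask.any():
        return False  # the mask marks at least one cloud pixel
    brightness = image.mean(axis=-1)  # per-pixel mean over the bands
    bright_frac = (brightness > bright_thresh).mean()
    return bool(bright_frac >= min_bright_frac)

# Synthetic demo: a dark chip with a bright patch covering 25% of the pixels
image = np.full((8, 8, 3), 0.1)
image[:4, :4, :] = 0.9
empty_mask = np.zeros((8, 8), dtype=np.uint8)
print(flag_empty_but_bright(image, empty_mask))
```

Running this over all chips and sorting the hits by bright fraction gives a short list to inspect by eye, rather than a definitive verdict on any label.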

dfulu · December 12, 2021, 5:48pm

In some of my output during training I’ve noticed that there are more bad masks as well. I’d guess a couple of percent are wrong.

e.g.

It would be really good if one of the organisers could comment on how clean the test data for the leaderboard is.

MPWARE · December 12, 2021, 6:13pm

Some are inverted like this one:

Some look very bad, like the one you’ve spotted:
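
Inverted masks of the kind mentioned above can be screened for by comparing a model's prediction with both the label and its complement. This is a minimal sketch, assuming binary NumPy masks and a reasonably trained model; `iou` and `looks_inverted` are hypothetical helper names and the `margin` value is arbitrary.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(a, b).sum() / union)

def looks_inverted(pred, label, margin=0.5):
    """True if pred agrees with the inverted label far better than with the label."""
    label = label.astype(bool)  # cast first so ~ is a logical NOT
    return iou(pred, ~label) - iou(pred, label) > margin

# Synthetic demo: the label is the exact complement of the prediction
pred = np.zeros((4, 4), dtype=np.uint8)
pred[:2] = 1
label = 1 - pred
print(looks_inverted(pred, label))
```

A hit only means the prediction and label disagree in a mirrored way; the model could be the one that is wrong, so flagged chips are candidates for review, not confirmed bad labels.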

leigh.plt · December 13, 2021, 1:03pm

5-10% of masks are bad. That’s my opinion after some short tests.

rbgb · December 13, 2021, 8:08pm

Thanks for bringing that up. In preparing the dataset, we definitely saw examples of noisy labels. Mostly we saw examples of noise like @MPWARE shared in the first post – chips with scattered clouds that were labeled as all 0 or 1. @dfulu the example you shared is interesting; could you share the chip ID for that one so we could look into it?

dfulu · December 14, 2021, 12:12am

Hi @rbgb, I don’t have the ID of that one handy. It was a random example I logged during training.

Here are a few examples I do have

dfulu · December 14, 2021, 5:55pm

Here’s a bonus example (don’t know the ID)


MPWARE · December 23, 2021, 4:05pm

I would also say 5% of the masks are bad.
Another example like dfulu:

@rbgb Can we assume the same for both the public and private test datasets?

rbgb · December 24, 2021, 8:45pm

@MPWARE Yes, you can assume a similar distribution of bad masks in the test set.

Sentinel-1 · January 21, 2022, 12:49pm

I also found afyr by accident, though it seems you have already discovered this one. Thanks for sharing your observations.
