
Data Quality Issues?

Has anyone else noticed that there is overlap between the classes? For example, there are images containing only tree pixels in both the other and healthy metal classes.


Yes, there are many instances of labels swapped between classes. However, removing the wrongly labelled samples and retraining affects the LB a lot. I’m afraid the test set has the same issues too.


Can you explain “it affects LB a lot”?
Does correcting the wrongly labelled samples improve or worsen your leaderboard rank?

IME, the LB loss increases after training on the cleaned data. Trust your CV and pray that the private LB is labelled the same way as the released training data.
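
To make that concrete, here is a rough sketch of why a model trained on cleaned labels can score worse when the test labels carry the same noise. It assumes the leaderboard metric is log loss (the thread only says "LB loss"), and every number in it is made up for illustration: ten tree-only images whose true class is other, half of them mislabelled as healthy metal, and two hypothetical models.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary log loss; probs are predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Hypothetical setup: 10 tree-only images, true class "other" (0).
clean_labels = [0] * 10
# Assumed noise: every second image is mislabelled "healthy metal" (1).
noisy_labels = [0, 1] * 5

# Assumed predictions of P(healthy metal) from two hypothetical models.
confident = [0.05] * 10  # trained on cleaned labels, sure these are "other"
hedged = [0.5] * 10      # trained on noisy labels, learned to hedge

scores = {
    "confident_vs_clean": log_loss(clean_labels, confident),
    "hedged_vs_clean": log_loss(clean_labels, hedged),
    "confident_vs_noisy": log_loss(noisy_labels, confident),
    "hedged_vs_noisy": log_loss(noisy_labels, hedged),
}

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the confident model wins when scored against clean labels but loses when scored against the noisy ones, because log loss punishes confident predictions on every mislabelled image. That would be consistent with a CV computed on the released (noisy) labels tracking the LB better than a CV on cleaned labels.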