Issue with training data points

Overlaying the training data points on a Sentinel mosaic shows some odd patterns: closely spaced points arranged in straight lines that don’t cover any water body.
Is this an issue with the data collection process? I see a number of similar patterns, which might affect the training scores.

Points are color-coded by severity score.
[Image: animated GIF of the training points overlaid on the Sentinel mosaic]

Hi @srmsoumya! That’s a great question.

It is possible that there is noise in the underlying data. The data was collected from a large number of public health and water quality managers in the real world, and may be subject to human error. However, these cases are rare and there are no underlying systematic issues with the competition data.

This mirrors most real-world problems, where data is not perfect :slight_smile: It is a good observation, and it is worth considering whether there is a better way to identify and handle these cases.
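
As an illustration only, and not an official method from the competition, one way to screen a group of nearby points for the straight-line pattern described above is to measure how much of their spread lies along a single axis. The sketch below assumes you have already picked out a group of neighbouring points as (latitude, longitude) pairs; the function name `linearity` and any cutoff you apply to its output are hypothetical. Treating latitude and longitude as flat coordinates is an approximation that is only reasonable over small areas.

```python
import numpy as np


def linearity(latlon):
    """Return the fraction of variance along the main axis of a point group.

    `latlon` is an (n, 2) array-like of (latitude, longitude) pairs, n >= 2.
    A value near 1.0 means the points are almost perfectly collinear;
    a value near 0.5 means they are spread evenly in both directions.
    """
    pts = np.asarray(latlon, dtype=float)
    centered = pts - pts.mean(axis=0)
    # Singular values of the centered points give the spread along each
    # principal axis; squaring them gives (unnormalized) variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0:
        # All points coincide, so there is no direction to speak of.
        return 1.0
    return float(variances[0] / total)


# Hypothetical example: ten points spaced evenly along a straight line.
line = [(35.0 + 0.001 * i, -79.0 + 0.002 * i) for i in range(10)]
print(linearity(line))
```

Groups with a score very close to 1.0 that also sit away from any water body could then be inspected by hand before deciding whether to down-weight or drop them.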

Good luck!
