The train_labels document (766) contains 102 rows of zeros that do not represent any label. I noticed this when I converted every line to its decimal number. These zeros could represent an 11th label. How should we interpret them?
I have attached the image
Hi @onurk83, welcome to the challenge.
Am I correct in understanding your question to be about samples in train_labels.csv which have 0 for all ten label classes?
If so, then this is expected and correct. As stated in the problem description page:
Each sample can have any number of class assignments. A 1 indicates that a compound from that family is present in the sample, and a 0 indicates otherwise.
A 0 in every class means that the sample’s composition does not include components that belong to any of the ten label class chemical families. Whether or not you want to treat it as an eleventh label class is a modeling strategy choice that is up to you.
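If you do want to try the eleventh-class idea, here is a minimal sketch of one way to flag the all-zero samples. It uses a tiny made-up stand-in for train_labels.csv. The column names (`sample_id`, `class_a`, and so on), the `none_of_the_above` column, and the `add_none_class` helper are all hypothetical and only for illustration; substitute the real column names from the file.

```python
import csv
import io

# Hypothetical miniature of train_labels.csv: an id column plus label columns.
# These column names are placeholders, not the competition's real class names.
SAMPLE_CSV = """sample_id,class_a,class_b,class_c
S0000,1,0,0
S0001,0,0,0
S0002,0,1,1
S0003,0,0,0
"""


def add_none_class(rows, id_column="sample_id", none_column="none_of_the_above"):
    """Return copies of rows with an extra column that is 1 when every label is 0."""
    out = []
    for row in rows:
        # Every column except the id column is treated as a 0/1 label.
        labels = [int(value) for key, value in row.items() if key != id_column]
        new_row = dict(row)
        new_row[none_column] = 0 if any(labels) else 1
        out.append(new_row)
    return out


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
augmented = add_none_class(rows)

# Count how many samples have no positive label at all.
n_all_zero = sum(r["none_of_the_above"] for r in augmented)
print(f"{n_all_zero} of {len(augmented)} samples have no positive label")
```

Whether the extra column helps is an empirical question. An alternative is to keep the ten labels as they are and let the model predict low probabilities for every class on such samples.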
Thank you for the explanation. I understand now.
I am very happy to get the opportunity to work on this wonderful project.