Question on Specific Image Classification

While looking through the images, I noticed several that seem (potentially) misclassified. The paper that discusses how hate speech is determined contains the following statement:

“That is, creators and distributors of hateful memes aim to incite hatred, enmity or humiliation of groups of people based on the grounds of gender, nationality, etc. and if that group falls in one of the protected categories, it constitutes hate speech under our definition”

Images 90471, 20568 and 84269 all seem to be labeled as “non hateful” in the training data set. I was wondering whether someone knowledgeable about the creation of the data set could confirm those labels.

They are definitely misclassified, but I would wait for word from the organizers.

I’m not sure whether others have found any discrepancies, but if you have, it might be good to start recording them in one place here for the organizers. I have found several more that I think should be considered misclassifications.


I also found some such examples in the dev split.

Hi @GregKuhlmann - It’s not permitted to post the data outside of the competition. If you’d like to ask about examples, you may reference image IDs without the text.

So is there any feedback on the examples provided?

To follow up on the examples, thanks for bringing these to our attention. As with any dataset collected using human annotators, it is definitely possible there are misclassifications in the dataset. Since the competition has already started, we will not change the training data. It is up to you to decide how to work with the data to create the best solution.

Apologies. The external data post has been removed.

Here is a list of the IDs I believe deserve another review: 01258, 01598, 01823, 02358, 02471, 02475, 02519, 02647, 02793, 03528, 03794, 04356, 04926, 05164, 05261, 05329, 05421, 05479, 06194, 06319, 06534, 06579, 06714, 06985, 07523, 08126, 08439, 08546, 08719, 09156, 09364, 09547, 09587, 10259, 10583, 10732, 10943, 12483, 12957, 13486, 14695, 14830, 15493, 16280, 17235, 19430, 23615, 23957, 27056, 27134, 27685, 28405, 30489, 31429, 32470, 32490, 34586, 34985, 35402, 36048, 36081, 37265, 37601, 42065, 43521, 43826, 46712, 54891, 56249, 56423, 59678, 61204, 72936, 83150, 92405, 92408, 92413, 93057
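Since the organizers have said the training data won’t change and it’s up to us how to work with it, here is a minimal sketch of one option: dropping the flagged examples before training. This assumes the annotations live in a JSONL file (one JSON object per line with an `id` field); the file names and the `filter_flagged` helper are my own, and the ID set below is truncated — fill it in with the full list from the post above.

```python
import json

# A few of the IDs flagged in this thread as possibly mislabeled
# (zero-padded 5-digit strings); extend with the full list above.
FLAGGED_IDS = {"01258", "01598", "01823", "02358", "02471"}

def filter_flagged(in_path="train.jsonl", out_path="train_filtered.jsonl"):
    """Write a copy of the training file with flagged examples removed.

    Returns (kept, dropped) counts so you can sanity-check the result.
    """
    kept, dropped = 0, 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            example = json.loads(line)
            # Normalize the id to a zero-padded 5-digit string before comparing,
            # in case it was parsed as an integer.
            if str(example["id"]).zfill(5) in FLAGGED_IDS:
                dropped += 1
                continue
            dst.write(line)
            kept += 1
    return kept, dropped
```

Whether dropping, relabeling, or down-weighting these examples helps is an empirical question; this just makes the experiment easy to run.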

Thank you for pointing these out, @lkollmorgen and @GregKuhlmann! We had a look at each of the examples flagged in this thread, and you are right that some of them have unfortunately been misclassified. There are a couple of points we wanted to make here to avoid any confusion:

  1. Hate speech is a very difficult problem. One reason for introducing the dataset and challenge is that we wanted to show how difficult this problem is, and to try to get the AI community to make progress on this important problem. As we all know as data science practitioners, it is normal for training data to be noisy. It would have been great if we could have caught these and other examples before the competition started, but at this point in time, we should treat the dataset as a fixed, unfortunately imperfect, sample. Misclassifications in the dev and seen test sets, although possible, should be very rare, however. We will take extra care with the to-be-announced unseen test set.

  2. We’re pretty clear about this in the paper, but I just want to make sure it is clear for everyone: the hate speech definition we employ in the paper is NOT the official Facebook policy, and the annotators we used are NOT Facebook annotators. This is on purpose: we want the field to be able to work on important problems like this without having to get into a discussion about the broader policy questions, and to focus on the scientific question instead. Those policy questions are obviously very important, but they are not what this work is about. In other words, this is an open scientific challenge, and we’re not asking the community to solve Facebook’s platform or policy challenges.