Two questions about the use of the Hateful Memes dataset

I have two questions about the use of the Hateful Memes dataset:

  1. Can I construct a dataset based on the characteristics of the Hateful Memes dataset and use it for model training?
  2. Can I delete the noise samples in the training set and use the cleaned data for training?

You can do anything with the dataset to train your model, but use of external data might not be permitted.

What do you base that on? It’s not accurate according to this post:


Hi @arch! For (1), it’s not exactly clear what this would entail, but as a reminder, derivative works of the dataset aren’t allowed. You can refer to the data license terms for more information. On (2), pruning the samples used for training is totally fine. Hope this helps!
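
For anyone wondering what the pruning in (2) might look like in practice, here is a minimal sketch. It assumes the training annotations are stored as a JSON Lines file where each record has an `id` field, and that you have already decided which ids are noisy. The function names, the file paths, and the `noisy_ids` list are hypothetical and only for illustration.

```python
import json


def load_jsonl(path):
    """Read one JSON record per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def prune_samples(records, noisy_ids):
    """Return the records whose 'id' is not in noisy_ids (input order kept)."""
    noisy = set(noisy_ids)
    return [r for r in records if r["id"] not in noisy]


def save_jsonl(records, path):
    """Write one JSON record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")


# Hypothetical usage; the paths and ids below are placeholders:
# records = load_jsonl("train.jsonl")
# cleaned = prune_samples(records, noisy_ids=[123, 456])
# save_jsonl(cleaned, "train_cleaned.jsonl")
```

This only drops samples from the provided training set. It does not add or modify data, which is the part the license terms restrict.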
