Languages, tasks and validity of audio features

Hi, we were trying to convert the audio into transcripts and found that the dataset contains several different languages. Have all the pre-generated acoustic features been validated for every language included here?

Also, is the task in the training audio files just Cookie Theft, or is there a list of different tasks the participants are doing?

Thanks for writing in! We don’t have additional information to share about the features, or about the tasks participants were completing, beyond what is in the problem description.

We also noticed that there are several languages in this dataset. Is it possible to train and test only on English-language data (both linguistic and acoustic) and submit those predictions?

Hello! Yes, you can explore any modeling strategy that is reproducible and adheres to the challenge rules.