Is the smoke test evaluated on the audio files that are available for download from the competition website? I was only able to find a little over 2,000 audio files.
I used those files together with the provided score.py script to calculate the WER locally, but the result is quite different from the WER shown on the website. Is this expected, or could it be that I downloaded the wrong dataset?
The smoke test is evaluated on audio files from the training data. The training data consist of two corpora: one is hosted by DrivenData, and the other is hosted on TalkBank. Instructions for accessing the TalkBank data are available on the data download page. You’ll need both datasets to reconstruct the smoke test data.
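To illustrate why a score computed on only part of the data can differ from the website, here is a minimal sketch of a corpus-level WER calculation in Python. It is not the competition's score.py, which may normalize text or aggregate differently. The function names and the toy transcripts are made up for illustration.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        total_edits += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words


# Toy data (hypothetical): two utterances, one transcribed imperfectly.
refs = ["the cat sat", "a dog ran fast"]
hyps = ["the cat", "a dog ran fast"]

print(f"{word_error_rate(refs, hyps):.3f}")          # both utterances: 0.143
print(f"{word_error_rate(refs[:1], hyps[:1]):.3f}")  # first utterance only: 0.333
```

Scoring only a subset of the utterances changes both the edit count and the reference word count, so the result will generally not match a score computed over the full smoke test set.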