Hi Solvers,
We just published another reference implementation! This tutorial covers the Children’s Speech Recognition Challenge: Phonetic Track.
In this in-depth blog post and companion repo, we walk through how to fine-tune Wav2Vec2 using the Hugging Face Transformers library. The resulting model scores 0.3460 IPA-CER on the public leaderboard.
In the tutorial, we:
- Demonstrate how to load and explore the data.
- Provide a basic framework for building a model.
- Walk through how to package your work correctly for submission.
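If you want to sanity-check your own predictions locally against the 0.3460 figure, here is a minimal sketch of a character error rate, assuming IPA-CER is a standard CER (total character edits divided by total reference characters) computed over IPA strings. The function names `edit_distance` and `cer` are ours for illustration, and the competition's official scorer may normalize text differently, for example in how it treats diacritics or whitespace, so treat this as an approximation rather than the official metric.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical toy transcriptions: one substitution over six reference characters.
    refs = ["kæt", "dɔg"]
    hyps = ["kɑt", "dɔg"]
    print(f"CER: {cer(refs, hyps):.4f}")
```

Note that Python counts combining diacritics as separate characters, so a phone written with a combining mark contributes more than one character here. Whether the official metric does the same is something to confirm in the blog post.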
Whether you’re just getting started or looking to benchmark your approach, this should give you a strong foundation to build on. Read the full post here: Competition: On Top of Pasketti: Children’s Speech Recognition Challenge - Phonetic Track
And if you haven’t already, you can check out the Word Track tutorial here: https://drivendata.co/blog/child-asr-word-benchmark
Happy modeling!