Me Ranting over this Competition

Fine-tuning nemo-asr models in this competition is a nightmare.

  1. Compute is a joke: 16GB on a P100 is barely enough to breathe. I frequently run into OOM crashes mid-run. It’s hard to iterate fast and climb the LB when you’re just babysitting memory usage.
  2. Dependency Hell: I wanted to use the native NeMo Hydra configs and PEFT frameworks, but Kaggle’s Python version makes it a total mess. I was stuck debugging environment conflicts for a good amount of time, only to give up and move to HF-style code.
  3. The Validation Gap: My val_wer looks great, but the test set feels like a completely different distribution. The delta makes me feel like I’m shooting in the dark, and sometimes like giving up entirely.
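On the val_wer point: it’s worth sanity-checking the local metric with a minimal, dependency-free word-level edit-distance implementation, so at least the number itself isn’t in question. A small sketch (the `wer` helper and the example transcripts here are illustrative, not the competition’s scoring code):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one-row DP)."""
    dp = list(range(len(hyp) + 1))  # dp[j] = distance(ref[:i], hyp[:j])
    for i, r in enumerate(ref, 1):
        prev = dp[0]
        dp[0] = i
        for j, h in enumerate(hyp, 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,        # deletion
                        dp[j - 1] + 1,    # insertion
                        prev + (r != h))  # substitution (free if tokens match)
            prev = cur
    return dp[-1]

def wer(refs, hyps):
    """Corpus-level WER: total word edits / total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words

print(wer(["the cat sat on the mat"], ["the cat sit on mat"]))  # 2 edits / 6 words ≈ 0.333
```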

<rant_over>


Quite true, the LB seems to come from an entirely different distribution. I’ve tried (and am still trying) multiple local validation strategies, but the delta remains, and most of the time even the trend doesn’t correlate.
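One strategy that sometimes shrinks that delta for ASR is making local folds speaker-disjoint, so no validation speaker ever appears in training. A minimal sketch, assuming utterance metadata with a (hypothetical) `speaker` field:

```python
import random
from collections import defaultdict

def speaker_disjoint_split(rows, val_frac=0.2, seed=0):
    """Split utterances so no speaker appears in both train and val."""
    by_speaker = defaultdict(list)
    for row in rows:
        by_speaker[row["speaker"]].append(row)
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)       # deterministic shuffle of speakers
    n_val = max(1, int(len(speakers) * val_frac))
    val_speakers = speakers[:n_val]
    train = [r for s in speakers[n_val:] for r in by_speaker[s]]
    val = [r for s in val_speakers for r in by_speaker[s]]
    return train, val

# Illustrative usage, e.g. rows like:
# rows = [{"audio": "a1.wav", "text": "hello there", "speaker": "spk01"}, ...]
```

This mirrors what `sklearn.model_selection.GroupKFold` does with speaker IDs as groups, just without the dependency.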

Thanks for sharing your thoughts!

Well, I got LB and local to align for the phonetic track, but for the word track my model fine-tuned on the train data was much worse than the official baseline…

If Gezi cannot get his CV-LB aligned, there’s no hope for the rest of us! :upside_down_face:

Hi @oknaitik - We totally hear your frustrations - and thank you for being such an active and helpful presence in the forums!

A quick reminder to everyone: if you’re using external infrastructure, please make sure it complies with the competition rules. Competition data must remain private and encrypted, and you may not use services that retain or train on the data. It is the participants’ responsibility to ensure the data is treated in accordance with the rules.

On the modeling side, NeMo fine-tuning can definitely be finicky. We recently shared a reference solution that you may find useful.

And re: the validation gap - as noted in the problem description, the test set includes out-of-sample data. We encourage solvers to focus on approaches that generalize well across speakers, recording conditions, and speech types.


Long Save & Commit queues, some reporting sub-50s, have become a prominent issue and are being frequently reported on Kaggle’s forum. I’m literally waiting 1-2 hours for every run.

There’s another global crisis going on - P100 compute.