Now that we are done, who wants to talk about what worked?

Are we allowed to do this?

Did we even need noise augmentation on the training data for the phonetic task? I suspect the Talkbank+Drivendata distribution was similar to the private set, given how close my LB scores were to clean-val-cer.

What about the word track? For me, CV and LB never matched.

Never bothered because training was expensive.

I did the phonetic track and had an eval set that was 50-50 talkbank and DD. It ended up within 0.01 CER of the blind set.

0.01 as in 1% or 0.01% CER?

Did you use any noise augmentation? I couldn't decide whether augmentation was even needed for this track, or to what degree.

0.01 CER, my bad. Tried all kinds of augmentation and nothing really worked.