Dear Conser-visionists,
ML / Deep Learning newbie here. I’ve tried several approaches, but couldn’t get any improvement over the benchmark results.
To better understand what’s going on, I used the great benchmark script and varied the fraction of training images and the number of training epochs, so that the model trains on different amounts of data (roughly as sketched below).
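For reference, a minimal sketch of the change, assuming the benchmark loads its features and labels from the competition CSVs; the file names, column names, and sampling details here are my placeholders, not necessarily the benchmark’s exact code:

```python
import pandas as pd

# Placeholder loading step standing in for the benchmark's own --
# the file and column names are assumptions.
train_features = pd.read_csv("train_features.csv", index_col="id")
train_labels = pd.read_csv("train_labels.csv", index_col="id")

train_frac = 0.25   # fraction of the training images to use this run
num_epochs = 10     # number of training epochs for this run

# Subsample the labels, then align the features to the same rows,
# so each run trains on a different amount of data.
labels_subset = train_labels.sample(frac=train_frac, random_state=42)
features_subset = train_features.loc[labels_subset.index]
```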
Below, I’ve plotted the training loss as well as the public loss for some configurations. The training loss goes down as expected, but the public loss keeps increasing.
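This is roughly how the plot was produced (a sketch with made-up placeholder values; in the real plot the training losses come from my runs and the public losses are typed in from the leaderboard):

```python
import matplotlib.pyplot as plt

# Hypothetical placeholder values for illustration only.
fractions   = [0.10, 0.25, 0.50, 1.00]   # fraction of training images used
train_loss  = [1.10, 0.85, 0.60, 0.35]   # final training log loss per run
public_loss = [1.45, 1.55, 1.70, 1.90]   # public leaderboard log loss per run

plt.plot(fractions, train_loss, marker="o", label="training loss")
plt.plot(fractions, public_loss, marker="o", label="public loss")
plt.xlabel("fraction of training images")
plt.ylabel("log loss")
plt.legend()
plt.show()
```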
So it seems that while the model is learning to distinguish the training data, including unseen validation images held out from that set, this doesn’t carry over to the test data at all.
Any thoughts on this behaviour, and on how the model can be made to actually learn the “right” things?
Best regards,
David