This is the first time we have worked with such a big dataset and so many pre-trained models in a competition. It seems to me that Google Colab (free version) is not sufficient to get us through the competition, since training a model takes more than 30 hours and we have limited disk space.
Currently, we are thinking about Google Colab Pro or AWS SageMaker. Could you please recommend some services that are promising for this kind of competition? Thank you very much for your help.
I am working with Colab (non-pro version) and have been able to use it very easily by limiting epochs and batch sizes. You can checkpoint your models and resume training from the last checkpoint if your session time runs out.
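To illustrate the checkpoint-and-resume idea, here is a minimal sketch using only the Python standard library. It is not MMF code: the `train` function, the toy "weights" update, and the `stop_after` argument (which stands in for a Colab session timing out) are all hypothetical names made up for this example.

```python
import os
import pickle


def train(total_steps, ckpt_path, save_every=100, stop_after=None):
    """Toy training loop that resumes from ckpt_path when it exists.

    stop_after simulates a session being cut off after that many steps.
    """
    # Start fresh unless a checkpoint from an earlier session is found.
    state = {"step": 0, "weights": 0.0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)

    steps_this_session = 0
    while state["step"] < total_steps:
        if stop_after is not None and steps_this_session >= stop_after:
            break  # the session "timed out"

        state["weights"] += 0.5  # placeholder for a real optimizer step
        state["step"] += 1
        steps_this_session += 1

        if state["step"] % save_every == 0:
            # Write to a temp file first so an interruption mid-write
            # cannot corrupt the previous checkpoint.
            tmp_path = ckpt_path + ".tmp"
            with open(tmp_path, "wb") as f:
                pickle.dump(state, f)
            os.replace(tmp_path, ckpt_path)

    return state
```

Calling `train` again with the same `ckpt_path` picks up from the last saved step, so at most `save_every` steps of work are lost per interrupted session. On Colab it probably makes sense to point `ckpt_path` at storage that survives the session.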
Thank you very much for sharing! I will dig into the mmf package and see which variable I can change to save more space.
MMF creator here. You can increase training.checkpoint_interval so that checkpoints are not saved at the regular interval of 1000. We will soon provide an option to keep only the last n models.
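Until an option to keep only the last n models is available, one possible workaround is to prune old checkpoint files yourself between runs. The sketch below is not part of MMF: `prune_checkpoints` is a hypothetical helper, and the `model_<number>.ckpt` file name pattern is an assumption, so check the names in your own save directory and adjust `pattern` before using it.

```python
import re
from pathlib import Path


def prune_checkpoints(folder, keep_last=3, pattern=r"model_(\d+)\.ckpt"):
    """Delete all but the keep_last highest-numbered checkpoints in folder.

    The default pattern is an assumed naming scheme, not a documented one.
    Returns the names of the deleted files, oldest first.
    """
    regex = re.compile(pattern)

    # Collect (iteration, path) pairs for files that match the pattern.
    found = []
    for path in Path(folder).iterdir():
        match = regex.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path))

    # Sort by iteration number so the newest checkpoints come last.
    found.sort()
    to_delete = found[:-keep_last] if keep_last > 0 else found

    for _, path in to_delete:
        path.unlink()
    return [path.name for _, path in to_delete]
```

Files that do not match the pattern (for example a separately saved best model) are left alone, which is why the match is done on the full file name rather than on the extension.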