It seems the smoke test lets users see execution time and memory usage if they add logging to their main.py, but this is not possible for the normal test, which disables all logs.
Is there any way to see the peak VRAM usage during our submission? We found that the same code runs without OOM on a Colab T4 but hits OOM on the cluster here.
@cuongk14 Good question. Unfortunately, there is not. All logs are disabled on the full test to protect data confidentiality. I recommend getting the information you need either from the logs of a smoke test submission or by testing your submission locally.
The smoke test submission does run on the same node type as the full test submission, so it should still give useful information.
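For anyone who wants to capture these numbers in a smoke test or a local run, here is a minimal sketch of a logging helper that could be called from main.py. It assumes the submission uses PyTorch (the thread does not say which framework is in use), and the name `log_resources` is hypothetical, not part of the challenge tooling. If PyTorch or a GPU is not available, it only reports wall-clock time and peak Python heap usage.

```python
import time
import tracemalloc
from contextlib import contextmanager

try:
    import torch  # assumption: the submission is PyTorch-based
    _HAS_CUDA = torch.cuda.is_available()
except Exception:
    torch = None
    _HAS_CUDA = False


@contextmanager
def log_resources(label, sink=print):
    """Hypothetical helper: time a block and report its peak memory."""
    stats = {}
    if _HAS_CUDA:
        # Reset the peak counters so the report covers only this block.
        torch.cuda.reset_peak_memory_stats()
    tracemalloc.start()
    start = time.perf_counter()
    try:
        yield stats
    finally:
        stats["seconds"] = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        stats["peak_python_mib"] = peak / 2**20
        if _HAS_CUDA:
            # Peak memory held by tensors vs. reserved by PyTorch's caching allocator.
            stats["peak_vram_allocated_mib"] = torch.cuda.max_memory_allocated() / 2**20
            stats["peak_vram_reserved_mib"] = torch.cuda.max_memory_reserved() / 2**20
        sink(f"[{label}] " + ", ".join(f"{k}={v:.2f}" for k, v in stats.items()))


if __name__ == "__main__":
    with log_resources("example"):
        data = [0] * 100_000  # stand-in for the real inference code
```

Note that the PyTorch counters only cover memory managed by PyTorch in the current process, so the figure shown by `nvidia-smi` can be higher. Comparing the reserved peak from a smoke test against the GPU memory of the cluster node may help explain an OOM that does not appear on a Colab T4.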