I’m getting the following error but no further details. If I had to guess, it happens during predict_generator, since I never get the logged message saying inference is complete. Any ideas/tips to debug the issue? @bull
/inference/entrypoint.sh: line 49: 20 Killed python main.py
Is that the entire log for the submission? Does the submission run for you in a local Docker container with the scripts in the repo? How big is the entire submission once unzipped?
It’s possible you tried to allocate too much disk space or memory.
Which of your submissions is this?
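As an aside, that bare `Killed` line is the shell reporting a SIGKILL, which in a memory-limited container usually means the kernel OOM killer ended the process. A shell encodes that as exit status 137 (128 + signal 9). A quick Python demonstration (the self-kill is just a stand-in for what the OOM killer does):

```python
import subprocess
import sys

# Spawn a child that sends itself SIGKILL, mimicking an OOM kill.
proc = subprocess.run(
    [sys.executable, "-c",
     "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
)
# On POSIX, returncode -9 means "killed by signal 9"; a shell would
# report this as exit status 137 alongside the "Killed" message.
print("returncode:", proc.returncode)
```

If you see 137 in your local runs, lowering memory usage (or raising the container’s memory limit when testing locally) is the usual fix.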
No, but I can paste the whole log here if that’s easier for you.
I’m not sure how I should do that, given that I don’t have files to test it with. What I’ve done is generate a file similar to your demo test_metadata.csv but using files from season 10; the structure of the file is the same. I could test using Docker — so far I have run it in my py-gpu environment created from your conda YAML files.
My last submission was at “2020-01-15 21:08:22 UTC” — not sure I can identify it in any other way.
I adjusted the number of workers and the batch size, and inference now runs to completion! I’m just stuck on a glitch while generating the CSV — hopefully I can figure out the bug.
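For anyone hitting the same wall: smaller batches help because only one batch of inputs has to be held in memory at a time. A dependency-free sketch of the idea (`predict_batch` here is a hypothetical stand-in for the real model call, not the actual inference code):

```python
def batches(items, batch_size):
    """Yield fixed-size chunks so only one batch is held in memory at a time."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

def run_inference(paths, predict_batch, batch_size=8):
    """Stream predictions batch by batch instead of loading everything at once."""
    preds = []
    for batch in batches(paths, batch_size):
        preds.extend(predict_batch(batch))
    return preds

# Hypothetical stand-in: "predict" the length of each filename.
fake_predict = lambda batch: [len(p) for p in batch]
print(run_inference(["a.jpg", "bb.jpg", "ccc.jpg"], fake_predict, batch_size=2))
```

Dropping the batch size trades a bit of throughput for a much lower peak memory footprint, which is what keeps the container under its limit.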