@ajijohn One recommendation is to use a GPU rather than the CPU. Another participant posted a few helpful pointers for code modification in the thread "IMPORTANT: How to speed up Inference & Queue time".
I believe you'll also have to make sure the Google Colab runtime has a GPU enabled when you start it (Runtime > Change runtime type, then pick a GPU as the hardware accelerator).
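If it helps, here is a rough way to check from a notebook cell whether the runtime actually sees a GPU. This is only a sketch: it assumes an NVIDIA GPU and relies on the `nvidia-smi` tool being on the path, and the helper name `gpu_visible` is made up for illustration. It does not depend on whichever framework your model uses.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is installed and lists at least one GPU.

    Hypothetical helper for illustration; assumes an NVIDIA GPU runtime.
    """
    # No nvidia-smi binary on PATH usually means a CPU-only runtime.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # "nvidia-smi -L" prints one line per detected GPU.
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False`, the runtime is most likely CPU-only and you would need to switch the runtime type before rerunning. Even when it prints `True`, your code still has to place the model and inputs on the GPU, which depends on the framework you are using.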