How to do pre-processing with spaCy or NLTK if they require external download?

tomaz-suller · November 11, 2024, 3:59pm

Hi there everyone. My team is using spaCy to pre-process the narratives before modelling, and we don’t know how to make it work given that we can’t download the pre-trained models on the fly in the competition environment. I’ve opened a PR on the runtime repo to add the model we use, but I’m not sure it’ll be reviewed in time.

Is there a way around that? Has anyone done pre-processing using these libraries but without needing any external dependencies?

Thanks!

I honestly should’ve asked this earlier, but we focused on developing the solution a bit too much (":

tomaz-suller · November 11, 2024, 4:18pm

I think I might have found a solution to my own problem haha. Will try this out later, but if anyone’s confirmed a way around it I’d still love to hear about it!
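
The post above doesn't say what the solution was. One general approach, assumed here and not confirmed for this competition's runtime, is to save the pipeline to disk beforehand (e.g. with `nlp.to_disk(...)`), ship that directory with the submission, and load it by path instead of by package name. A minimal sketch; the helper names, the `assets` folder and the model directory name are hypothetical:

```python
from pathlib import Path
from typing import Iterable, Optional


def find_local_model(name: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first directory called `name` that looks like a saved spaCy pipeline."""
    for base in search_dirs:
        candidate = Path(base) / name
        # Pipelines written by nlp.to_disk() have a config.cfg at the top level.
        if (candidate / "config.cfg").is_file():
            return candidate
    return None


def load_pipeline(name: str, search_dirs: Iterable[Path]):
    """Load a pipeline from disk, or fall back to a tokenizer-only English pipeline."""
    import spacy  # imported lazily so the path helper works without spaCy installed

    model_dir = find_local_model(name, search_dirs)
    if model_dir is not None:
        return spacy.load(model_dir)  # spacy.load accepts a filesystem path
    # Fallback: needs no download, but has no tagger, parser or NER.
    return spacy.blank("en")


# Hypothetical usage inside a submission:
# nlp = load_pipeline("en_core_web_sm", [Path(__file__).parent / "assets"])
```

For NLTK, the analogous trick would be to ship the corpora you need and add their folder to the search path with `nltk.data.path.append(...)` before using them. Whether bundled files like these are allowed is up to the competition rules, so check those first.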
