Clarification on "No External Data" Rule and Computing Resources

Our team has started developing ML pipelines for the competition and would appreciate clarification on a couple of rule interpretations:

1. Pre-trained models and “external data”: Are pre-trained models (e.g., ImageNet-trained CNNs, CLIP, pre-trained transformers) considered “external data” under the competition rules? Using them is standard practice in modern ML, but they are technically trained on external datasets not provided by the competition.

2. Open-source model restrictions: Are there any restrictions on using large open-source multimodal models (e.g., Llama, Qwen)? While these meet the open-source license requirement, they can require substantial compute resources and might create advantages based on hardware access rather than methodology.

Would appreciate any guidance on these points to ensure we’re all interpreting the rules consistently!