As stated in the rules, prize-eligible submissions may not use:
software that is not licensed under an Open Source Initiative approved license; or
software under an open source license that prohibits commercial use
Common OSI-approved licenses include MIT and Apache 2.0. The full list of approved licenses is available on the Open Source Initiative website.
Training on the MIMIC-III dataset is a slightly different question, since it concerns data rather than models. Training on MIMIC-III and MIMIC-IV is allowed in this competition, and this is now specified on the problem description page.
Regarding ada embeddings: you are not allowed to pass the clinical notes into OpenAI models, both because those models are not open source and because doing so violates the data terms from PhysioNet. Remember that the clinical notes are extremely sensitive and cannot be shared. If you need embeddings for RAG techniques, you could use an open source embedding model that you run locally, such as FlagEmbedding.
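To illustrate the retrieval step of a RAG pipeline that never sends text to an external service, here is a minimal sketch. It is not an official baseline. The `toy_embed` function is a hypothetical placeholder (a hashed bag-of-words) standing in for whatever locally run, OSI-licensed embedding model you choose, and the example strings are made up, not real clinical notes.

```python
import hashlib
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Placeholder embedder: hashed bag-of-words, L2-normalized.

    Assumption: in real use, swap this for a locally run open source
    embedding model. This stand-in makes no network calls.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def top_k(
    query: str,
    docs: Sequence[str],
    embed: Callable[[str], Vector] = toy_embed,
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query by cosine similarity."""
    q = embed(query)
    # Vectors are unit length, so the dot product is the cosine similarity.
    scored = [(sum(a * b for a, b in zip(q, embed(d))), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    # Synthetic snippets for illustration only.
    snippets = ["chest pain on exertion", "knee surgery follow up visit"]
    for score, text in top_k("chest pain", snippets, k=1):
        print(round(score, 3), text)
```

Because `embed` is passed in as a parameter, the retrieval logic stays the same when you replace the placeholder with a real local model; the key point is that everything runs on your own machine.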