Pre-trained models and external data

rosgori · September 2, 2018, 4:06am

Are pre-trained models and external data allowed?

Well, about the external data:

Unless otherwise expressly stated on the Competition Website, Participants must not use data other than the Data to develop and test their models and Submissions.

But… pre-trained models use external data, so…

bull · September 4, 2018, 1:59pm

It is ok to use pretrained models as long as the model and the weights can be released under an open source license. We have added this note to the problem description.

ironbar · September 9, 2018, 7:04am

Do we have to publish which external model or data we are using? Or only in the case of winning?

bull · September 11, 2018, 5:46pm

Per the problem description, as long as the model and weights can be released under an open source license, you do not need to share which pretrained models you are using ahead of time.

itdxer · September 12, 2018, 7:00pm

I feel that allowing pre-trained models while disallowing external data might be a bit tricky. I assume somebody could take any dataset (available for commercial use), heavily overfit some neural network on it, and then release it under an open license (in some very hidden place that isn’t even indexed by search engines, or one day before the deadline). After that, it’s possible to use any dataset that is “wrapped” as a pre-trained model.

Not sure what rule could be added to ensure that this type of model won’t be used.

c3josh · September 13, 2018, 8:34am

@itdxer - maybe that is OK in the eyes of the hosts. Ultimately, their aim is to have the best possible model for the task requested, and what better way than potentially training a model on some closed-source industry data? As a bonus, ML hobbyists and energy enthusiasts gain access to an open-source pre-trained SOTA model.

I guess, on the other hand, it might seem unfair to those who do not have access to this data. However, there is plenty of open source data available to train it on https://toolbox.google.com/datasetsearch/search?query=energy&docid=PJh7N6zoe9YcbhqhAAAAAA%3D%3D
