Hello. During the contest, we created a large children’s speech dataset using voice cloning techniques and external text datasets. Since our data is derived from the voices in the contest dataset, and the rules forbid sharing the original contest data, we would like to clarify a point. Given that our generated dataset isn’t strictly the original contest data, are we allowed to publish it?
The competition data was approved strictly for use in the challenge for the duration of the challenge period. For uses beyond the challenge, you will need to follow up with the original data providers. We are putting together a list of providers that we will share on the About page of each track.
Thank you for the clarification. Does this restriction also apply to sharing our trained models, given that their weights inherently encode the original contest data?
Please see the competition rules for terms relevant to sharing solutions, including the following:
Participants are permitted to publicly share source or executable code developed in connection with or based upon the Data, or otherwise relevant to the Competition, provided that such sharing does not violate the intellectual property rights of any third party. By so sharing, the sharing Participant is thereby deemed to have licensed the shared code under the MIT License (an open source software license commonly described at The MIT License - Open Source Initiative).