Rules clarifications

stefan.istrate · July 1, 2022, 8:54am

Dear organizers,

I’m trying to understand a couple of rules about this competition that has just finished. I think it’s important if you can clarify them now, so that participants in your future competitions will be assured they will play in a fair environment.

My first question is about your requirement on sharing external datasets and pre-trained models on the forums.

“If you are using any external datasets or pre-trained models, you are required to publicly share about them in the competition discussion forum. If you have any questions, please ask on the forum.”

What is the point of this rule, especially since everyone announced what models they used in the final few days of the competition? Clearly, these late announcements didn’t give anyone else any chance to incorporate those pre-trained models in their submissions. If the disclosures are made after the solutions were submitted and not before, the rule doesn’t have any effect; we would find out about the winning solutions anyway from your blogposts.

A message from @mike-dd seems to point at the fact that ineligibility is defined per-submission, and not per-competitor.

“We’ll be removing ineligible submissions at the end of the competition, and will follow up then. Any submissions that are not disqualified may still be considered for the private leaderboard.”

Technically I would be allowed to use whatever techniques I want, within or outside the rules of the competition, as long as my last submissions are within the rules. Is this the case? What happens with the knowledge that someone accumulates about the data while not playing by the rules? How is that fair to the other competitors?

I know these are hard questions about the fairness of a competition, but it will benefit no one if they remain in a gray area. Your clarifications would be appreciated.

Thanks,
Ștefan

Ammarali32 · July 1, 2022, 10:07am

Thanks for pointing that out:
1. On “sharing external datasets and pre-trained models on the forums”: I totally disagree with this rule. If we shared the models at the beginning, the solutions would all be similar and differ only in post-processing and tricks, which is not so good. The models are available, and it depends on your effort and experience to choose among them. (What is the point if someone spends 2-3 weeks trying to find the best models and then has to share them with someone who has just started the competition? That is completely unfair.) With a little reading and searching, you will be able to find your own models.
2. I think the organizers should be able to handle this problem properly (though it would also be unfair to penalize someone who sends an ineligible submission only because they missed a rule, especially if the rule is not written in the rules section). If someone used an ineligible submission to get information about the test data, I agree he/she should be disqualified. But if you mean me, I will share my submission on GitHub soon, and you can see that I didn’t use cache files or anything that could give any useful information about the test data. That is why I think you cannot set a fixed rule for this case: a fixed rule either way would be unfair to someone, but the reviewers and organizers can check and accept or disqualify according to the particular situation.

stefan.istrate · July 1, 2022, 4:22pm

@Ammarali32, don’t take my message personally. My intention is to get the organizers to work on their rules, so it’s a level playing field for everyone. I thought about the gray areas of the rules myself in the early days of the competition and worried that bad actors could still get away with illegal tricks that no ecologist would benefit from. Do the organizers verify ALL submissions from the winners?

Also, @Ammarali32, congratulations on your top submission! I’m curious to see what worked best.

jayqi · July 1, 2022, 5:38pm

Hi @stefan.istrate and @Ammarali32,

Regarding the requirement to share external datasets and pre-trained models, this has been a standard part of our rules for many past competitions that allow external data, but we agree that it does not effectively achieve its intent. We plan to reevaluate the details of this rule for future competitions. With that said, hopefully the actual practice of sharing this information in the final days of the competition has provided interesting insights to all competitors, while not undermining the efforts of those sharing the info.
In general, as detailed in the rules, DrivenData reserves the right to disqualify any submissions or any participant that are deemed to be in violation of competition rules.
- If a participant has individual submissions which are not compliant with the rules but are otherwise made in good faith, we will disqualify those noncompliant submissions but accept compliant submissions.
- Any participants who are found to be abusing submissions or gaining an unfair advantage will be disqualified from the competition, and, depending on the severity, may have their DrivenData account banned.

Please rest assured that the organizers conduct a thorough review of the winners’ submissions and we don’t hesitate to disqualify a participant in cases where they are clearly violating the rules to gain an unfair advantage.

We definitely hear your thoughts. Thank you for participating in this competition and for providing questions and feedback on the forum.

Ammarali32 · July 3, 2022, 12:48pm

@jayqi Thanks for the detailed response. I am just wondering: is it OK to make my solution repo public now?
