Hi @rasyidstat,
Thanks for your patience while the challenge organizers have been reviewing your questions.
- I am afraid that a late fix will confer a competitive advantage, since it uses newer downloaded data.
Challenge organizers anticipate that the advantage gained from occasional fixes that may have access to more recent data will be relatively small and limited. We will be monitoring job failures and fixes, and teams that deliberately fail jobs in order to run submissions later and access more up-to-date data may be subject to disqualification.
- For what severity of issue should we return an error? A download issue, a format issue, a missing data issue?
In general, we are expecting that submissions should not deliberately fail for any reason. We expect that by default, your submission should run automatically without any manual intervention. Fixes are being allowed to address unforeseen runtime errors.
- If there’s no error, can we update our submission code to change the preprocessing logic and handle missing data or other data issues? And can we also request a rerun of an issue date with a known data issue? Imagine a scenario where a sensor at a weather site has an error and reports a very abnormal value, say 9999. Our code runs fine, but we want to fix it because the input data does not make sense, and then rerun that issue date. Can we do that?
For data issues that do not result in any runtime error, we are generally not permitting you to update your code. You may want to consider designing your solution defensively to handle such cases.
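As an illustration of what defensive design might look like, here is a minimal sketch. The column name, sentinel value, and plausibility thresholds are assumptions for the example, not part of the challenge specification:

```python
import numpy as np
import pandas as pd

def clean_sensor_values(df: pd.DataFrame, col: str = "value",
                        sentinel: float = 9999.0,
                        lo: float = -50.0, hi: float = 60.0) -> pd.DataFrame:
    """Replace a known sentinel and physically implausible readings with NaN,
    then interpolate short gaps. Hypothetical thresholds for illustration."""
    out = df.copy()
    bad = (out[col] == sentinel) | ~out[col].between(lo, hi)
    out.loc[bad, col] = np.nan
    # Fill short gaps from neighboring readings; longer gaps stay NaN so
    # downstream code can fall back to a default (e.g. climatology).
    out[col] = out[col].interpolate(limit=3)
    return out
```

The point is that abnormal inputs are handled inside the pipeline, so the submission does not need a code change or a rerun when a data source misbehaves.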
In exceptional cases, challenge organizers may consider interventions, such as rerunning submissions after a data source fixes a data issue or excluding an issue date. These situations will be evaluated on a case-by-case basis while considering the impact on the fairness and quality of the competition. In general, active interventions will only be made for exceptional and extreme circumstances, and you should generally not expect that they will happen.
- At which exact time does the mounted volume download code run, and when does the inference code run?
- It’s stated that the inference code can run later than the issue date. Is there a maximum time limit, e.g. no later than 4 days after the issue date? And does that mean the download code will also run on a later date?
We are not guaranteeing any exact time for when the mounted volume data is downloaded or when admin-scheduled jobs occur. In general, we expect they will happen at some point during the day of the issue date, U.S. time.
- Is it permitted to download the data by ourselves if the data is already available in mounted volume? Should we always use mounted volume data whenever possible?
We discourage you from redundantly downloading data that is already available in the mounted volume, and we encourage you to use the mounted volume data whenever possible. This improves consistency between runs, no matter when they happen, reduces the likelihood of a failure from a data issue, and reduces the likelihood that you unintentionally do something that is not permitted.
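A minimal sketch of this "mounted volume first" pattern follows. The mount point and file name are hypothetical, not the challenge's actual layout:

```python
from pathlib import Path

def load_from_mount(filename: str, mount_dir: str = "/data") -> bytes:
    """Read a file from the mounted volume if present; otherwise raise,
    leaving any permitted fallback download to the caller. The default
    mount point is an assumption for this example."""
    local = Path(mount_dir) / filename
    if local.exists():
        # Reading the mounted copy keeps reruns consistent regardless of
        # when the job actually executes.
        return local.read_bytes()
    raise FileNotFoundError(
        f"{filename} not found in mounted volume; "
        "download it yourself only if the rules permit"
    )
```

Centralizing data access in one function like this also makes it easy to audit that nothing is fetched from the network when a mounted copy exists.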
- Is there a chance that a specific issue date will not be evaluated for scoring? Why might this happen?
Currently, only 2024-01-01 and 2024-01-08, which are trial issue dates during the open submission period, will be excluded from your score. However, challenge organizers reserve the right to exclude other issue dates as a result of unforeseen circumstances. This will be evaluated on a case-by-case basis, considering the impact on the fairness and quality of the competition. This would be an exceptional situation, and you should assume by default that it won’t happen.
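For illustration only, the exclusion works like dropping those dates before aggregating. The per-date scores and simple averaging below are assumptions; the actual metric is defined elsewhere in the challenge rules:

```python
# Trial issue dates excluded from scoring, per the answer above.
EXCLUDED_ISSUE_DATES = {"2024-01-01", "2024-01-08"}

def average_score(scores_by_date: dict) -> float:
    """Average per-issue-date scores, skipping excluded dates.
    Hypothetical aggregation for illustration only."""
    kept = {d: s for d, s in scores_by_date.items()
            if d not in EXCLUDED_ISSUE_DATES}
    return sum(kept.values()) / len(kept)
```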