About the Runway Functions category

Aha, I understand now. No, it isn’t necessary to add your professor to your team. Sharing code with your professor for grading is allowed. The bit about sharing code is intended to prevent sharing among competitors.

Is it possible to see the full content of the logs for a submission to the prescreened arena? The DrivenData pane seems to cut the logs at 1,000 lines and for the rest it shows the following: < … WARNING: logs capped at 1,000 lines; dropping 1,365 more >. It would be great to have access to the entire log file in order to see where it fails. It seems to be a time/memory issue: after different executions with the same code it fails at different timestamps, at about the 4h mark of execution, with the message "Your submission did not output the expected file so it could not be scored." Thanks!

@alsaco Done! I increased the max number of log lines to 5,000 so you should be able to see those later messages now.

Hello, I have a question about using data from the training set in the prescreened arena. Are we allowed to create csv files from the training data, include them in our submission.zip, and read from them during code execution? Thank you!
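To give a concrete example of what I mean (the file and column names below are just placeholders I made up, not anything from the runtime repo):

```python
from pathlib import Path

import pandas as pd

# Placeholder file: a lookup table precomputed offline from the training data
# and bundled inside submission.zip next to this script.
ASSETS_DIR = Path(__file__).resolve().parent
config_priors = pd.read_csv(ASSETS_DIR / "config_priors.csv")


def prior_for(airport: str) -> pd.DataFrame:
    """Return the precomputed rows for one airport (illustrative only)."""
    return config_priors[config_priors["airport"] == airport]
```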

Thanks, much appreciated. I just wanted to double check that the runtime limit was 8h, as specified in one of the comments above. After executing the code again and retrieving all the logs, it failed at the 2h mark without any error registered in the log file. Thanks again!

I don’t think there is any problem with that. If you plan on using that as your final submission, would you mind sending a little more detail about your strategy via email just to be sure (robert@drivendata.org)? Thanks!

My apologies, you are absolutely correct. That time limit was incorrectly specified in the cluster. Please try again, and your submission should get the full 8 hours.

Thanks for the change; it ran perfectly now. Could you confirm whether we will also get the 8h/week limit during the blind test period? And is there any uncertainty in the cluster's capacity we should account for, meaning that the same code could have very different execution times depending on network traffic / time of day? Thanks again!

Yes, you can count on the same 8h/week limit in the final evaluation period. We will use the same cluster as well.

I hear your point about compute requirements varying based on specifics of the features that are impossible to know definitively ahead of time. One important difference is that the final evaluation period is roughly 1 month, whereas the prescreened test set covers 1 week. Do your best to ensure that your solution won't exceed the cluster capacity on the final evaluation set. We will try to be flexible if, for example, the final evaluation features turn out to be vastly more compute intensive than the prescreened test features.

Hi! I just wanted to double check that the log-loss score published in the prescreened arena is not weighted in some way (such as weighting the earlier lookaheads more than the later ones from 30 to 360, or weighting specific airports, etc.). The prescreened arena tests this over a whole week, and I find it a bit suspicious that the score I currently have on the leaderboard seems just a tad lower than any validation score I have gotten haha. Probably just some variance in the data and my algo, but I just wanted to double check. Thanks!

Can we assume that air traffic data will be sorted by timestamp during the prescreened testing period? Thanks!

I just wanted to double check that the log-loss score published in the prescreened arena is not weighted in some way (such as weighting the earlier lookaheads more than the later ones from 30 to 360, or weighting specific airports, etc.)

Correct: the score is not weighted, and the open, prescreened, and final evaluation datasets all use the same loss metric. I would expect some deviation in scores between datasets due to a variety of factors. All of that is part of the challenge!
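To spell out what "not weighted" means in practice, here is a rough sketch of a plain mean log loss over every prediction row (the clipping and renormalization details are illustrative, not an official reference implementation):

```python
import numpy as np


def mean_log_loss(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-15) -> float:
    """Unweighted multiclass log loss.

    y_true: one-hot array, shape (n_rows, n_configs)
    y_pred: predicted probabilities, same shape and column order

    Every row (airport, timestamp, lookahead) counts equally.
    """
    y_pred = np.clip(y_pred, eps, 1 - eps)
    y_pred = y_pred / y_pred.sum(axis=1, keepdims=True)  # renormalize each row
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))
```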

Can we assume that air traffic data will be sorted by timestamp during the prescreened testing period? Thanks!

Yes, for the prescreened and final evaluation periods, all of the features will be sorted by timestamp.
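That said, it costs very little to sort defensively when you load the features; a minimal sketch, assuming a pandas DataFrame with a timestamp column (the column name is just for illustration):

```python
import pandas as pd


def load_sorted(path: str) -> pd.DataFrame:
    """Load a feature CSV and make sure it is ordered by timestamp."""
    df = pd.read_csv(path, parse_dates=["timestamp"])
    # Stable sort preserves the original order of rows that share a timestamp.
    return df.sort_values("timestamp", kind="stable").reset_index(drop=True)
```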

I just wanted to validate my understanding regarding point (1). If I understand correctly, some airports might be dropped from the evaluation and hence will not appear in the submission format csv, but we will still have data for them. Is it safe to assume that there will be data available for all the airports, i.e. past configurations, past weather, etc.?

That’s right, we might say “you don’t need to predict for airport ABC since the data quality is poor” (i.e., ABC is not in submission format) but we will still include airport ABC features (past configurations, past weather, etc.).
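So in practice you can drive your prediction loop off whatever appears in the submission format, while still reading features for every airport; a rough sketch (the file and column names here are assumptions for illustration):

```python
import pandas as pd

# Predict only for airports listed in the submission format; features for
# other airports can still be loaded and used as extra context if helpful.
submission_format = pd.read_csv("submission_format.csv")  # assumed file name

for airport in submission_format["airport"].unique():  # assumed column name
    # ... load this airport's features and fill in its rows of the submission ...
    pass
```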

Was there any noise added to the data in the Open Arena?

Nope, we did not add any noise.

With the new local build, I think there is an issue with typer and the latest click=8.1.0 (Add click 8.1.0 support by madkinsz · Pull Request #375 · tiangolo/typer · GitHub). In the next few days I might make a pull request to add a few data analysis packages to environment-cpu.yml in the repo. Would it be more appropriate to update to typer=0.4.1, which adds click 8.1.0 support, or to "lock" click to a previous version? Thanks!

Runtime updates are a bit of a balancing act: we try to change as little as possible to avoid breaking code that previously ran, but we also don't want there to be too many hurdles to development as a result of using old code versions. That said, I'm not too worried about a minor version update to typer (0.4.0 to 0.4.1) causing any problems, so I'd say go for it. Just a note: we like to keep the CPU and GPU environments as similar as possible, so try to make the same updates to environment-gpu.yml in your PR. Thanks!
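For reference, the change would just be a version bump in both environment files, along these lines (I am writing typer as a conda dependency purely for illustration; match however it is actually listed in the files):

```yaml
# Excerpt sketch of environment-cpu.yml / environment-gpu.yml; only the typer
# bump matters here, the surrounding entries are illustrative.
dependencies:
  - typer=0.4.1  # bumped from 0.4.0 to pick up click 8.1.0 support
```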

Unfortunately I don't have a GPU to test that environment (although I don't particularly see a reason why it would fail). :'')

Ah, that's not a problem! You can make those changes to environment-gpu.yml anyway. When you make the PR, GitHub Actions will build the GPU image and run some automated tests to make sure it works :sparkles:
