Probing, etc.

Competitors are known to probe. Is this legal?

(E.g. printing the date distribution, % of images overlapping with train, etc.)

Related: is it fully legal to upload precomputed cache files for train images?

Hi @_NQ,

Thanks for the question. Printing of data to the logs is grounds for disqualification, per this section of the competition website:

Your code may not read and inspect the files in /data directly. Doing so is grounds for disqualification. Instead, you will implement a script that passes the data into your model for inference. Using I/O or global variables to pass information between calls, printing out data to the logs, or other attempts to circumvent the setup of this prediction challenge are grounds for disqualification.

To your second question, yes, you may upload precomputed files for the train images.

Good luck!

There’s a very high probability that teams have been implicitly probing: a) the fraction of new whales and new images, as the latter is legal to probe via inference time, and b) the fraction of top-lateral comparisons.

It would be good to publish these to the nearest ~10%, as there is a 100% chance that multiple teams have already probed them using inference time and cached related queries. They are all discoverable without “printing data to logs” through timing tells, especially with unlimited timeouts and visible tqdm logging.

@mike-dd It is stated that we can only use information from the specific query and database for each scenario's inference. Does that mean we can also use our train-image database, with its cached embeddings, during each scenario's inference? From the discussion it seems that this is allowed, but I still have doubts, so I just wanted to confirm.
Thank you for the clarification.

Hi @flamethrower,

Yes, you can use cached embeddings from the training images during inference.

Thank you for the prompt response.
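
For anyone wondering what "cached embeddings from the training images" could look like in practice, here is a minimal sketch, not the competition's API. The file name `train_embeddings.npz`, the helper names (`save_cache`, `load_cache`, `nearest_train_ids`), and the random toy vectors are all assumptions made for illustration; real embeddings would come from your own model. It deliberately prints nothing, in line with the rule about logs quoted above.

```python
import os
import tempfile

import numpy as np


def save_cache(path, ids, embeddings):
    """Offline step: store L2-normalised train embeddings with their ids."""
    emb = np.asarray(embeddings, dtype=np.float32)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    np.savez(path, ids=np.asarray(ids), embeddings=emb)


def load_cache(path):
    """Inference step: read the uploaded cache instead of the train images."""
    with np.load(path) as data:
        return data["ids"], data["embeddings"]


def nearest_train_ids(query_embedding, ids, embeddings, k=3):
    """Return the k train ids with the highest cosine similarity to the query."""
    q = np.asarray(query_embedding, dtype=np.float32)
    q = q / np.linalg.norm(q)
    scores = embeddings @ q  # cosine similarity, since both sides are unit length
    top = np.argsort(-scores)[:k]
    return [(str(ids[i]), float(scores[i])) for i in top]


# Toy data standing in for real model outputs (hypothetical ids and vectors).
rng = np.random.default_rng(0)
train_ids = ["train_000", "train_001", "train_002", "train_003"]
train_emb = rng.normal(size=(4, 8))

# Hypothetical cache location; in a submission this would be an uploaded file.
cache_path = os.path.join(tempfile.mkdtemp(), "train_embeddings.npz")
save_cache(cache_path, train_ids, train_emb)

ids, emb = load_cache(cache_path)
# A query that is a slightly perturbed copy of train_002.
query = train_emb[2] + 0.01 * rng.normal(size=8)
matches = nearest_train_ids(query, ids, emb, k=2)
```

Normalising once when the cache is written means each lookup at inference time is a single matrix-vector product, which helps if the train set is large.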