Custom descriptor_eval.py and retrieval evaluation requirements

I have two questions about the retrieval evaluation requirements for the descriptor track:

  1. Can we customize descriptor_eval.py, for example by using the provided SSCD score_normalization from vsc/baseline/sscd_baseline.py? If we do something like this, how does the scoring work? Does the runtime use the provided code to evaluate the uploaded embeddings and also execute main.py separately? Or does it use some default code to evaluate the uploaded embeddings and just run main.py separately to verify that it executes?

  2. Assuming we can customize evaluation, can we do additional computation after computing the embeddings (as part of evaluation)? In my case, I would like to compute embeddings and then combine query and reference embeddings with some (very cheap) function. This would output meta-embeddings that could be fed to, e.g., _global_threshold_knn_search as normal; a rough sketch of what I mean follows this list. The downside is that the meta-embeddings would exist per query-reference pair, and I’m not sure whether that violates the “one embedding per second” requirement. I would consider it part of the retrieval lookup function. Any guidance on this?
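To make the idea concrete, here is a rough sketch of the kind of cheap pairwise combination I have in mind (the names and the combination function are purely illustrative, not actual challenge code):

```python
import numpy as np

def combine(query_emb: np.ndarray, ref_emb: np.ndarray) -> np.ndarray:
    """Cheap pairwise combination producing one 'meta-embedding' per
    query-reference pair (purely illustrative)."""
    q = query_emb / np.linalg.norm(query_emb)
    r = ref_emb / np.linalg.norm(ref_emb)
    return q * r  # e.g. element-wise product of normalized embeddings

# 10 query descriptors and 20 reference descriptors (one per second)
queries = np.random.rand(10, 512).astype(np.float32)
refs = np.random.rand(20, 512).astype(np.float32)

# The combined output is per query-reference pair, not per second,
# which is what raises the "one embedding per second" question.
meta = np.stack([combine(q, r) for q in queries for r in refs])
print(meta.shape)  # (200, 512)
```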

Hey @simple_machine-

Does the runtime use the provided code to evaluate the uploaded embeddings and also execute main.py separately? Or does it use some default code to evaluate the uploaded embeddings and just run main.py separately to verify that it executes?

The runtime runs the entrypoint.sh shell script, which first attempts to run the provided main.py, if it exists, to generate a subset of the query descriptors, and then runs descriptor_eval.py to evaluate this subset against the corresponding subset of the ground truth. It then runs descriptor_eval.py on the entire set of submitted query and reference descriptors.
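Roughly, the flow looks like this (Python used here only for illustration; the actual entrypoint.sh is a shell script, and the file names and arguments below are placeholders rather than the exact ones used):

```python
import subprocess
from pathlib import Path

# Illustrative sketch of the entrypoint flow only; file names and
# command-line arguments are placeholders.

if Path("main.py").exists():
    # 1. Run the submitted code to regenerate descriptors for a subset
    #    of the query videos.
    subprocess.run(["python", "main.py"], check=True)

    # 2. Evaluate that regenerated subset against the matching subset
    #    of the ground truth.
    subprocess.run(
        ["python", "descriptor_eval.py",
         "--query_features", "subset_query_descriptors.npz",
         "--ref_features", "reference_descriptors.npz",
         "--ground_truth", "subset_ground_truth.csv"],
        check=True,
    )

# 3. Evaluate the full set of submitted query and reference descriptors.
subprocess.run(
    ["python", "descriptor_eval.py",
     "--query_features", "query_descriptors.npz",
     "--ref_features", "reference_descriptors.npz",
     "--ground_truth", "ground_truth.csv"],
    check=True,
)
```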

Because retrieval must be run in the same way for all participants, it is not possible to customize descriptor_eval.py.
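For intuition only, the fixed retrieval step is conceptually an exhaustive similarity search with a single global score threshold applied across all query-reference pairs. The snippet below is a simplified illustration of that idea, not the actual descriptor_eval.py code:

```python
import numpy as np

def global_threshold_search(query: np.ndarray, ref: np.ndarray, threshold: float):
    """Conceptual sketch of global-threshold retrieval (not descriptor_eval.py).

    Returns (query_idx, ref_idx, score) for every descriptor pair whose
    inner-product similarity exceeds one threshold shared by all queries.
    """
    scores = query @ ref.T                   # all pairwise similarities
    qi, ri = np.nonzero(scores > threshold)  # one global cutoff, not per-query
    return qi, ri, scores[qi, ri]
```

The resulting candidate pairs are then scored against the ground truth in the same way for every submission.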

Can we do additional computation after computing the embeddings (as part of evaluation)?

I believe this is answered above, as it is not possible to modify descriptor_eval.py. In addition, please note the independence criterion in the rules:

Submitted descriptors for a video may not make use of other videos (query or reference) in the test set.

Please let me know if you have any additional questions or clarifications!

-Chris

Thanks, that answers everything!