Are there a minimum of 20 database matches for all test set queries?

flamethrower · May 8, 2022, 1:47am

Hello,

Please help me clarify something; it may be obvious, but I’m not sure. Are we expected to return exactly 20 database matches for each query in the test set? Or does the number of database matches evaluated per query vary, with 20 as the maximum, so that we have to handle this ourselves?

For each query, should we return
a) 20 matches, or
b) at most 20 matches, depending on how many matches a criterion defines for that query?

Thank you.

mike-dd · May 10, 2022, 4:15pm

Hi @flamethrower,

Great question. I suspect others might wonder about the same thing.

The number of matches for each test set query can vary. It will not be 20 in all cases.

However, we recommend returning the maximum of 20 matches allowed for each query, because the additional matches (even if they are false positives) will not hurt your score. To see why, consider an example where there are 5 positives and you have already found the first 4 with your first 4 guesses, resulting in a precision of 100% and a recall of 80% up to this operating threshold. Additional guesses A, B, C in the diagram below will lower your precision, but your recall will stay at 80% for each of them. Because AP is weighted by the increase in recall from the previous threshold, and that increase is 0, these additional false positives do not penalize your score. Eventually returning the remaining positive with guess D, however, will improve your score.

In other words, if you were to restrict your submission to the first 4 guesses, your score could not be greater than 80%. Additional guesses can only increase your score.
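The arithmetic above can be checked with a minimal sketch. It assumes per-query AP is computed as the sum of precision at each rank times the increase in recall at that rank, as described in this post; `average_precision` is a hypothetical helper written for illustration, not the competition's scoring code.

```python
def average_precision(hits, n_positives):
    """Sum of precision-at-rank weighted by the increase in recall at that rank.

    hits: list of 1/0 flags, one per guess in ranked order (1 = true match).
    n_positives: total number of true matches that exist for the query.
    """
    ap = 0.0
    found = 0
    prev_recall = 0.0
    for rank, is_hit in enumerate(hits, start=1):
        found += is_hit
        precision = found / rank
        recall = found / n_positives
        # A false positive leaves recall unchanged, so it contributes 0 here.
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


first_four = [1, 1, 1, 1]              # 4 of the 5 positives found
with_abc = first_four + [0, 0, 0]      # guesses A, B, C are false positives
with_abcd = with_abc + [1]             # guess D finds the last positive

print(round(average_precision(first_four, 5), 3))  # 0.8
print(round(average_precision(with_abc, 5), 3))    # 0.8
print(round(average_precision(with_abcd, 5), 3))   # 0.925
```

Under these assumptions, the three false positives leave the score at 0.8, and the late correct guess D lifts it to 0.925, since it adds the remaining 20% of recall at a precision of 5/8.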

I hope this helps.

flamethrower · May 10, 2022, 8:44pm

Thank you very much for the response. It seems I hadn’t properly taken into account that AP is weighted by the increase in recall.

In local testing, I also noticed that the scores stayed the same even when I varied the number of database matches returned.

It is much clearer with your illustration.
