Two questions about Track 2

qqerret · August 27, 2021, 2:47am

Hi,
Are we allowed to generate each 256-d vector of a test image and then modify it using multiple related images in the reference set? If so, everyone could reproduce their Track 1 result.

When will we get the Phase 2 dataset? If we can only download it once Phase 2 starts, the download will take up a large part of the final 48 hours. Would it be possible to publish it ahead of time as a password-protected zip?

wenhaowang · August 27, 2021, 2:43pm

As I recall, the following applies to both Track 1 and Track 2: “This means that even if the dataset had a single query image and a single reference image, the score of the image pair would be the same.”
Therefore, I think:
(1) Modifying one query image vector using many ref images breaks the rule;
(2) Even when modifying one query image vector using a single ref image, to be effective you would probably have to choose the most similar one (or the second/third/… most similar one). However, ranking to find the most similar one requires the other ref images, and that breaks the rule.

I’m not quite sure about the above statements.

wenhaowang · August 27, 2021, 2:59pm

“The scoring of a query image w.r.t. a reference image should be independent of other query images and reference images. This means that even if the dataset had a single query image and a single reference image, the score of the image pair would be the same. The intent of this rule is to avoid (1) that algorithms overfit to the reference set, for example by building a gigantic classifier with 1M outputs that predicts the matches, (2) that algorithms use irrelevant dataset statistics like the fact that there is at most one query image per reference image. This rules out methods based on query expansion [11] or neighborhood graphs [45].”

This passage from the competition paper may help.
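To make the quoted rule concrete, here is a minimal sketch (not taken from the competition code) of a score that depends only on the two descriptors in the pair. The function names, the toy 3-d vectors, and the choice of cosine similarity are all assumptions for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def pair_score(query_vec, ref_vec):
    """Cosine similarity: uses only this one query and this one reference."""
    q = l2_normalize(query_vec)
    r = l2_normalize(ref_vec)
    return sum(a * b for a, b in zip(q, r))

query = [1.0, 0.0, 1.0]
refs = [[1.0, 0.0, 0.9], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]

# Score computed while the full (toy) reference set exists.
score_full = pair_score(query, refs[0])
# Score computed as if the dataset held only this single pair.
score_alone = pair_score(query, [1.0, 0.0, 0.9])
```

Because `pair_score` never looks at the other entries of `refs`, the two scores are identical, which is the property the passage describes.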

qqerret · August 30, 2021, 9:15am

OK, that’s what I needed. What about question 2? @mike-dd

glipstein · August 30, 2021, 3:43pm

@qqerret We’ll share more information about how you can access the unseen query set for Phase 2 in October. We appreciate that the phase is short and so participants will be looking to access the data quickly once it’s published.

qqerret · August 31, 2021, 2:52pm

Some questions here about the rule:

that algorithms use irrelevant dataset statistics like the fact that there is at most one query image per reference image

1. We train a model (not a classifier) on the ref data without augmentation, but the model somehow remembers the fingerprints of the 1M images. Information from all the fingerprints is necessarily used at inference, and the softmax breaks the rule that we can’t choose the most similar one. Is this legal in Track 1?
(I think that if we are allowed to train a model on ref images, the rule is already broken.)

2. We train a model (not a classifier) on the training data without augmentation, but the model somehow obtains the fingerprints of the 1M images, and we use all of these fingerprints to predict the score for a single query & ref pair. Is this legal in Track 1?

3. We train a model (not a classifier) on the training data without augmentation, but the model somehow obtains the fingerprints of the 1M images, and we use a single fingerprint to predict the score for a single query & ref pair. Is this legal in Track 1?

4. We train a classifier on the training data and infer the hidden vectors of the query and ref images, then use a separate matching model to predict similarities between a single query image and its top 5 ref images in order to modify the vector. Is this legal in Track 2? What about using only the top 1 ref image?

5. If statement 4 is not legal, does that mean that in Track 2 we must infer the vector as if the dataset had only a single query image, rather than a single query image and a single reference image?

@glipstein @wenhaowang

wenhaowang · August 31, 2021, 3:17pm

I do NOT understand what you mean by “footprints”.

wenhaowang · August 31, 2021, 3:23pm

For statement 4, my understanding:
The process of choosing the top 5 or top 1 ref image(s) breaks the rule that “if the dataset had a single query image and a single reference image, the score of the image pair would be the same”, because if we only have one query and one reference image, then after getting the hidden vectors of the query and the reference image, we do NOT have any other images with which to modify the generated vectors.
The core point is that you CANNOT get the top 1/5 under the rule.
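The argument above can be sketched with a toy example. The refinement step below (averaging the query with its top-1 reference) is a hypothetical stand-in for whatever matching model is used; the vectors and function names are assumptions chosen only to show the effect.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top1_index(query_vec, ref_vecs):
    """Index of the most similar reference; needs every reference vector."""
    return max(range(len(ref_vecs)), key=lambda i: dot(query_vec, ref_vecs[i]))

def refine_with_top1(query_vec, ref_vecs):
    """Hypothetical refinement: average the query with its top-1 reference."""
    best = ref_vecs[top1_index(query_vec, ref_vecs)]
    return [(q + r) / 2.0 for q, r in zip(query_vec, best)]

query = [1.0, 0.0]
target = [0.6, 0.8]   # the reference whose pair score we care about
other = [0.9, 0.1]    # an unrelated reference that happens to rank higher

# Dataset with only the single pair: the target is trivially the top 1.
refined_alone = refine_with_top1(query, [target])
# Dataset with one more reference: the top 1 changes, and so does the query.
refined_full = refine_with_top1(query, [target, other])

score_alone = dot(refined_alone, target)
score_full = dot(refined_full, target)
```

Here the score of the same (query, target) pair differs depending on whether `other` is present, so the pair score is no longer independent of the rest of the reference set.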

wenhaowang · August 31, 2021, 3:58pm

I do NOT understand the statement. Sorry!

qqerret · September 1, 2021, 3:17am

I think it is fingerprint. lol.

For statement 4, I mean: in Track 2, can we modify the vector using the top 1 image? Regarding “if we only have one query and one reference image”: could that one reference be the top 1? Or can’t we use any reference image to train or predict the vector?

wenhaowang · September 1, 2021, 4:11am

Note that to get the top 1, all the reference images have to be used. Otherwise, how can you know that the selected image is the top 1? The method you describe does not use reference images for training. As a competitor, I would like to hear the organizers’ judgement @mike-dd @glipstein. Thanks!

glipstein · September 1, 2021, 3:24pm

@qqerret @wenhaowang As you’ve already pointed out, it seems like the rule is clear on this one.

Submitted individual predictions may not take into account more than one query image or more than one reference image at a time.

As mentioned in the paper,

The scoring of a query image w.r.t. a reference image should be independent of other query images and reference images. This means that even if the dataset had a single query image and a single reference image, the score of the image pair would be the same.

This is the case for both competition tracks.
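One way a participant might sanity-check a pipeline against this wording is to score the same pair with and without additional reference images and compare. The helper below is only a sketch under that assumption: `score_fn` and its `(query, ref, all_refs)` signature are hypothetical, and passing or failing this check is not an official ruling.

```python
def is_context_independent(score_fn, query, ref, other_refs, tol=1e-9):
    """Return True if score_fn gives the same score with or without other_refs.

    score_fn is a hypothetical callable taking (query, ref, all_refs).
    """
    alone = score_fn(query, ref, [ref])
    with_context = score_fn(query, ref, [ref] + list(other_refs))
    return abs(alone - with_context) <= tol

def plain_dot(query, ref, all_refs):
    # Ignores all_refs: depends only on the pair.
    return sum(q * r for q, r in zip(query, ref))

def scaled_by_best(query, ref, all_refs):
    # Divides by the best score over all references: depends on the whole set.
    best = max(sum(q * r for q, r in zip(query, x)) for x in all_refs)
    return sum(q * r for q, r in zip(query, ref)) / best

query, ref, others = [1.0, 0.0], [0.5, 0.5], [[1.0, 0.0]]

pair_only_ok = is_context_independent(plain_dot, query, ref, others)
set_dependent_ok = is_context_independent(scaled_by_best, query, ref, others)
```

The check only detects dependence that shows up for the particular extra references supplied, so it can flag a problem but cannot prove that a method is compliant.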

wenhaowang · September 1, 2021, 3:35pm

Thanks! It helps a lot!
