In the submission_format file, there is a row for each frame of a video. As the fisherman places a fish on the ruler, the fish shows up in multiple adjacent frames. However, in the annotations for a video in the training set, it seems that for each unique fish, only one of the adjacent frames is labeled as having a fish. My question is whether the other adjacent frames should also be predicted as having a fish in the submission file. When computing the first part of the score, is a sequence of unique fish in a video fed into the score formula?
Hi @XiaokangWang, yes, you should predict for each of the adjacent frames where you can clearly see the fish. As long as the fish_number is the same, those frames will be grouped when creating the sequence for the edit distance part of the metric!
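To illustrate the grouping described above, here is a minimal sketch, not the official scoring code. It assumes each row is a hypothetical (frame, fish_number, species) tuple sorted by frame, with fish_number set to None for frames without a fish; the species labels and the helper names fish_sequence and edit_distance are made up for the example.

```python
from itertools import groupby


def fish_sequence(rows):
    """Collapse per-frame rows into one sequence item per unique fish.

    rows: list of (frame, fish_number, species) tuples, assumed sorted by
    frame. Rows whose fish_number is None are treated as "no fish".
    """
    labelled = [(num, species) for _, num, species in rows if num is not None]
    # Consecutive rows that share a fish_number become a single item;
    # the species of the first row in each group is kept.
    return [next(group)[1] for _, group in groupby(labelled, key=lambda r: r[0])]


def edit_distance(a, b):
    """Plain Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


# Hypothetical predictions: fish 1 spans two frames, fish 2 spans three.
predicted_rows = [
    (0, None, None),
    (1, 1, "species_a"),
    (2, 1, "species_a"),
    (3, None, None),
    (4, 2, "species_b"),
    (5, 2, "species_b"),
    (6, 2, "species_b"),
]
predicted = fish_sequence(predicted_rows)
truth = ["species_a", "species_c", "species_b"]  # hypothetical ground truth
distance = edit_distance(predicted, truth)
```

Under these assumptions, labeling several adjacent frames with the same fish_number still contributes only one item to the sequence, so predicting every clearly visible frame does not inflate the edit distance.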
Hi bull. Thanks for your clear reply.
If we should only consider the cases where we can clearly see the fish, should we include the following cases:
- when the worker’s hand is above the fish and partly overlaps it
- when the fish is moving because the worker is throwing it away?
Thank you very much.