Facing problems in submission

bibhabasumohapatrabm · January 19, 2023, 8:35am

I am not able to make a successful submission for my baseline. I wanted to submit simple 1-D embeddings/descriptors for the middle frame of every video, but I cannot read the videos: my reading function fails on an empty array and gets a NoneType.
To debug my submission script I added print() statements, for example:

if query_video_ids[0] in os.listdir(QRY_VIDEOS_DIRECTORY):
    print("### yes its there #####")
    print(query_subset_video_ids[0])

In the logs:

File "/code_execution/main.py", line 55, in main
    if query_video_ids[0] in os.listdir(QRY_VIDEOS_DIRECTORY):
FileNotFoundError: [Errno 2] No such file or directory: '/data/query'

I expected /data/query to contain a list of videos such as Q100001.mp4.
The variable names are the same as in the example main.py script given in the instructions.
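For debugging this kind of failure, a check like the one in the traceback above can be made non-fatal, so the log shows where the path breaks instead of stopping at FileNotFoundError. The following is only a minimal sketch: describe_directory is a hypothetical helper name, not part of the competition code, and /data/query is simply the path from the error message.

```python
from pathlib import Path


def describe_directory(directory, limit=5):
    """Return a short, printable summary of a directory without raising.

    Hypothetical debugging helper; not part of the competition runtime.
    """
    directory = Path(directory)
    if not directory.is_dir():
        # Report the nearest existing ancestor so the log shows where the path breaks.
        existing = next((p for p in directory.parents if p.exists()), None)
        return f"{directory} is missing; nearest existing parent: {existing}"
    names = sorted(p.name for p in directory.iterdir())
    return f"{directory} has {len(names)} entries, first {limit}: {names[:limit]}"


if __name__ == "__main__":
    # Path taken from the error message in the logs.
    print(describe_directory("/data/query"))
```

Printing this summary at the start of main() would have shown whether /data/query was absent or merely empty.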

bibhabasumohapatrabm · January 19, 2023, 8:45am

Even my submission of a main.py that generates empty descriptors is failing. What am I doing wrong in this case?
Here is the code for main.py (I submitted a submission.zip that contained this main.py file):

from pathlib import Path
import pandas as pd
import numpy as np

ROOT_DIRECTORY = Path("/code_execution")
DATA_DIRECTORY = Path("/data")
QRY_VIDEOS_DIRECTORY = DATA_DIRECTORY / "query"
OUTPUT_FILE = ROOT_DIRECTORY / "subset_query_descriptors.npz"
QUERY_SUBSET_FILE = DATA_DIRECTORY / "query_subset.csv"


def generate_query_descriptors(query_video_ids):
    video_ids = query_video_ids
    descriptors = np.zeros((len(video_ids), 256), dtype=np.float32)
    timestamp_intervals = [0] * len(video_ids)
    return video_ids, descriptors, timestamp_intervals


def main():
    # Loading subset of query images
    query_subset = pd.read_csv(QUERY_SUBSET_FILE)
    query_subset_video_ids = query_subset.video_id.values.astype("U")

    # Generation of query descriptors happens here
    query_video_ids, query_descriptors, query_timestamps = generate_query_descriptors(
        query_subset_video_ids
    )

    np.savez(
        OUTPUT_FILE,
        video_ids=query_video_ids,
        features=query_descriptors,
        timestamps=query_timestamps,
    )


if __name__ == "__main__":
    main()

chrisk-dd · January 19, 2023, 4:19pm

Hi @bibhabasumohapatrabm-

Apologies for this error - there was a typo in the folder name on the code execution environment leading to an empty folder of videos when you attempted to access them in the environment. This has now been corrected.

In addition, after taking a look at your submission, it appears that you have not included the query and reference descriptors as required by the Code Submission Format for the descriptor track of the competition. This will be required for your submission to evaluate properly.

Please let me know if you continue to experience any issues!

Thanks,
Chris

bibhabasumohapatrabm · January 19, 2023, 6:02pm

Thanks for the update.

"after taking a look at your submission, it appears that you have not included the query and reference descriptors as required by the Code Submission Format for the descriptor track of the competition. This will be required for your submission to evaluate properly."

Actually, the main.py script section (Link) says:
"Our compute cluster will also run main.py to measure computational costs and performance on the test subset. Your main.py script just needs to write out a subset of query descriptors to the same format as the .npz files you are already submitting."

So I only need to write a main.py that works for a subset of query descriptors, and it will generate the query and reference descriptors for the test data when executed, right? Please correct me if I have misunderstood or am mixing things up.

PS: I am trying to run inference through main.py, which will generate the query and reference descriptors. Will uploading a submission with only main.py and model_assets generate the descriptors and submit them? For context, see the main.py code above.

chrisk-dd · January 19, 2023, 6:21pm

I don’t believe that is correct. Your submission.zip should look like this (as documented here):

submission.zip                    # this is what you submit
├── query_descriptors.npz         # npz file containing query set descriptors
├── reference_descriptors.npz     # npz file containing reference set descriptors
├── main.py                       # your script that will generate descriptors for
│                                 #   a subset of test set query videos
└── model_assets/                 # any assets required by main.py script, like a
                                  #   model checkpoint

Your main.py script will only operate on a subset of query videos, not on the full query set and not on any reference videos. You must submit the full set of descriptors for all query and reference videos. Your submitted reference descriptors will then be used to conduct a similarity search, against the full set of query descriptors you submit as well as against the subset of descriptors generated during code execution.
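The layout above can be checked locally before uploading. What follows is a rough sketch, not an official validator: missing_entries is a hypothetical helper, the required file names are copied from the tree shown in this post, and treating model_assets/ as optional is an assumption.

```python
import zipfile

# File names copied from the layout above. model_assets/ is not checked here,
# on the assumption that it is only needed when main.py loads assets.
REQUIRED_ENTRIES = ("query_descriptors.npz", "reference_descriptors.npz", "main.py")


def missing_entries(zip_path):
    """Return the required top-level files that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    return [name for name in REQUIRED_ENTRIES if name not in names]
```

For example, calling missing_entries("submission.zip") on an archive that contains only main.py would return the two descriptor file names, which is the situation described earlier in this thread.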

bibhabasumohapatrabm · January 19, 2023, 6:28pm

Thanks, I understand now. I have to submit the query and reference descriptors myself; there is no way to have them generated by inference on the platform. The only valid submission is the descriptors together with the corresponding main.py and model_assets.
Thanks, Chris.
