Hi!
I am running into an issue when running my test-submission script.
I am getting this error:
main.DataValidationError: Arrays lengths for query do not match. video_ids: 8295; timestamps: 682608; features: 341304.
This comes from the validate_lengths function under /opt/validation:
def validate_lengths(dataset: str, features_npz):
    n_video_ids = len(features_npz["video_ids"])
    n_timestamps = len(features_npz["timestamps"])
    n_features = len(features_npz["features"])
    if not (n_video_ids == n_timestamps == n_features):
        raise DataValidationError(
            f"Arrays lengths for {dataset} do not match. "
            f"video_ids: {n_video_ids}; "
            f"timestamps: {n_timestamps}; "
            f"features: {n_features}. "
        )
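As far as I can tell, this check only passes when all three arrays have the same first dimension, i.e. one video_id entry and one timestamp row per descriptor row. Here is a minimal sketch of arrays that would satisfy it, using made-up toy data (the video ids, per-video counts, and feature dimension are hypothetical and not from the real dataset):

```python
import numpy as np

# Toy data (hypothetical): 2 videos with 3 and 2 descriptors respectively.
per_video_counts = [3, 2]
video_ids_per_video = np.array(["Q100001", "Q100002"])

# Repeat each video id once per descriptor so it lines up with the feature rows.
video_ids = np.repeat(video_ids_per_video, per_video_counts)

n_rows = sum(per_video_counts)
features = np.zeros((n_rows, 4), dtype=np.float32)    # one row per descriptor
timestamps = np.zeros((n_rows, 2), dtype=np.float32)  # (start, end) per descriptor

# len() of a 2D array is its first dimension, so all three lengths agree.
lengths = (len(video_ids), len(timestamps), len(features))
print(lengths)
```

In my case, video_ids has one entry per video (8295) rather than one per descriptor, which may be part of the mismatch.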
However, from the Code Submission Format page, I understood that the timestamps sub-array is defined like this:
"timestamps is a 1D or 2D array of timestamps indicating the start and end times in seconds that the descriptor describes."
The code in my main.py looks like this:
def generate_query_descriptors(query_video_ids) -> np.ndarray:
    # Initialize return values
    video_ids = []
    timestamps = []
    descriptors = []
    # Generate descriptors for each video
    for i in tqdm.tqdm(range(query_video_ids.shape[0])):
        try:
            video_id = query_video_ids[i]
            video_file = f'{QRY_VIDEOS_DIRECTORY}/{video_id}.mp4'
            start_timestamps, end_timestamps, qry_descriptor = extract_descriptor(video_file)
            descriptors.append(qry_descriptor)
            timestamps.append(np.hstack([start_timestamps, end_timestamps]))
            video_ids.append(video_id)
        except Exception as e:
            print(query_video_ids[i], e)
    descriptors = np.concatenate(descriptors).astype(np.float32)
    timestamps = np.concatenate(timestamps).astype(np.float32)
    return video_ids, descriptors, timestamps
where the start and end timestamps come from:
start_timestamps = np.array(tuple(start_timestamps.values()), dtype=np.float32)
end_timestamps = np.array(tuple(end_timestamps.values()), dtype=np.float32)
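I notice that the reported timestamps count (682608) is exactly twice the features count (341304). A minimal sketch with made-up values, assuming both timestamp arrays are 1D with one entry per descriptor, shows how np.hstack and np.stack differ in the resulting length:

```python
import numpy as np

# Toy timestamps (hypothetical): 3 descriptors for one video.
start = np.array([0.0, 1.0, 2.0], dtype=np.float32)
end = np.array([1.0, 2.0, 3.0], dtype=np.float32)

# hstack on 1D inputs concatenates them end to end: length doubles.
flat = np.hstack([start, end])
print(flat.shape, len(flat))

# stack along axis=1 pairs them up: one (start, end) row per descriptor.
paired = np.stack([start, end], axis=1)
print(paired.shape, len(paired))
```

So if extract_descriptor returns 1D timestamp arrays, the hstack call in my loop would produce a 1D array twice as long as the descriptors, whereas a 2D (n, 2) layout would keep the first dimension equal to the number of descriptors.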
Any hints as to what I might be doing wrong?
Thanks!