The dataset loader we wrote for the benchmark was designed for the preprocessed videos we provide, which have been converted to square resolutions and a small, constant number of frames. The raw videos vary in resolution and frame count, so to use them with the benchmark you will need to either edit the data loader substantially or downsample the raw videos into an appropriate format (and make minor edits to the loader).
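
As a rough illustration of the downsampling route, here is a minimal sketch in Python. It is not the preprocessing we used. It assumes a raw video has already been decoded into a NumPy array of shape `(frames, height, width, channels)`, and the function name `downsample_video` and the defaults `num_frames=16` and `size=64` are hypothetical placeholders; substitute whatever the loader actually expects. The sketch samples frames uniformly in time, center-crops to a square, and resizes with nearest-neighbor indexing.

```python
import numpy as np


def downsample_video(frames, num_frames=16, size=64):
    """Reduce a decoded video to a fixed frame count and square resolution.

    frames: array of shape (T, H, W, C).
    num_frames, size: hypothetical targets, not the benchmark's actual values.
    """
    frames = np.asarray(frames)
    if frames.ndim != 4:
        raise ValueError("expected an array of shape (T, H, W, C)")
    t, h, w, _ = frames.shape
    if t == 0:
        raise ValueError("video has no frames")

    # Pick num_frames indices spread evenly over the clip. If the clip is
    # shorter than num_frames, some frames are repeated.
    frame_idx = np.linspace(0, t - 1, num_frames).round().astype(int)
    clip = frames[frame_idx]

    # Center-crop to the largest square that fits in the frame.
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    clip = clip[:, top:top + side, left:left + side]

    # Nearest-neighbor resize: map each output pixel center to a source pixel.
    pix = ((np.arange(size) + 0.5) * side / size).astype(int)
    pix = np.clip(pix, 0, side - 1)
    return clip[:, pix][:, :, pix]
```

For example, calling `downsample_video` on an array of shape `(37, 120, 160, 3)` returns an array of shape `(16, 64, 64, 3)`. Nearest-neighbor resizing keeps the sketch dependency-free but aliases when shrinking by a large factor; an area-averaging or antialiased resize from an image library is likely a better choice for real preprocessing.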