@polymathAB and @uamo Not sure if there’s a version mismatch at play here but I’m having trouble reproducing this error. Can you share your pytorch model .pt files from the benchmark notebook and post a pip freeze to show the versions you have been using?
I have trained an EfficientNet model locally and saved it as a .ckpt file. The error traceback shows:
```
  File "/srv/conda/envs/condaenv/lib/python3.9/site-packages/segmentation_models_pytorch/unet/model.py", line 65, in __init__
    self.encoder = get_encoder(
  File "/srv/conda/envs/condaenv/lib/python3.9/site-packages/segmentation_models_pytorch/encoders/__init__.py", line 62, in get_encoder
    encoder.load_state_dict(model_zoo.load_url(settings["url"]))
  File "/srv/conda/envs/condaenv/lib/python3.9/site-packages/torch/hub.py", line 524, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "/srv/conda/envs/condaenv/lib/python3.9/site-packages/torch/hub.py", line 394, in download_url_to_file
    u = urlopen(req)
  File "/srv/conda/envs/condaenv/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/srv/conda/envs/condaenv/lib/python3.9/urllib/request.py", line 517, in open
    response = self._open(req, data)
  File "/srv/conda/envs/condaenv/lib/python3.9/urllib/request.py", line 534, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/srv/conda/envs/condaenv/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/srv/conda/envs/condaenv/lib/python3.9/urllib/request.py", line 1389, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/srv/conda/envs/condaenv/lib/python3.9/urllib/request.py", line 1349, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
ERROR conda.cli.main_run:execute(33): Subprocess for 'conda run ['python', 'main.py']' command failed. (See above for error)
```
I am also using the same versions of the packages as above ^. Also, the baseline notebook instructs that a checkpoint file has to be saved in the assets sub-dir, and the utility code provided uses the PyTorch Lightning module's load_from_checkpoint function.
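For reference, this is roughly how I understood the loading step to work (the class name FloodModel and the checkpoint filename are my guesses based on flood_model.py, not the baseline code verbatim):

```python
from pathlib import Path

from flood_model import FloodModel  # Lightning module from the baseline's flood_model.py (assumed name)

# Restore the trained weights bundled in the submission's assets sub-dir using
# PyTorch Lightning's standard load_from_checkpoint(); the filename is an assumption.
ckpt_path = Path("assets") / "flood_model.ckpt"
model = FloodModel.load_from_checkpoint(str(ckpt_path))
model.eval()
```

The catch seems to be that if the module's __init__ builds the smp.Unet with pretrained encoder weights, even this restore path triggers a download, which is what the traceback above shows.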
I suspect the issue is with using a pretrained resnet34 backbone. Is there any way to include the ResNet weights (.pth file) in the assets dir so that we don't have to download them at all? From what I could see of the code that loads pre-trained weights, it always downloads them from the internet.
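If it helps, here is a minimal sketch of what I had in mind: copy the bundled weights into torch.hub's download cache before the model is built, so load_state_dict_from_url() finds a local file and skips the download. The assets location and the filename are assumptions; the filename has to match the basename of the URL the library would otherwise fetch (for torchvision's resnet34 that is usually resnet34-333f7ec4.pth).

```python
import os
import shutil
from pathlib import Path

# Seed torch.hub's download cache with the weights shipped in the submission so
# that load_state_dict_from_url() finds a cached file and never opens a connection.
# Assumption: the weights live in assets/ and are named after the URL basename.
ASSETS_DIR = Path("assets")
WEIGHTS_FILE = "resnet34-333f7ec4.pth"  # assumed filename for torchvision's resnet34 weights

# torch.hub caches downloads under $TORCH_HOME/hub/checkpoints (default ~/.cache/torch).
torch_home = Path(os.environ.get("TORCH_HOME", str(Path.home() / ".cache" / "torch")))
checkpoints_dir = torch_home / "hub" / "checkpoints"
checkpoints_dir.mkdir(parents=True, exist_ok=True)

src = ASSETS_DIR / WEIGHTS_FILE
dst = checkpoints_dir / WEIGHTS_FILE
if src.exists() and not dst.exists():
    shutil.copy(src, dst)  # get_encoder() should now load from this cache instead of the network
```

Would running something like this at the top of main.py work, or is there a cleaner way?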
EDIT 2: The issue was solved after putting the resnet34 weights file in assets and copying it into the container in the submission. However, the score received is 0. Is this right, given that the validation IoU was >0.44?
EDIT 3: According to my logs, the test_features directory is now not being read by main.py at all. Could this be a bug in the path definitions in the flood_model.py and main.py provided in the baselines? Please help with a resolution.
We fixed a couple of other issues along the way, and the blog will be updated within the next few minutes. Thanks for bearing with us on this; there were a lot of layers to dig through in the PyTorch Hub and segmentation_models_pytorch code to figure out the best way to prevent a download.
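For anyone hitting the same error, one way to avoid the download entirely is to build the U-Net with encoder_weights=None and restore everything from the checkpoint bundled in assets. This is only a sketch, not the exact baseline code: the checkpoint path, channel/class counts, and the "model." key prefix are assumptions about how the Lightning checkpoint was saved.

```python
import torch
import segmentation_models_pytorch as smp

# Build the architecture without pretrained encoder weights so that
# segmentation_models_pytorch never calls torch.hub (no network access needed).
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights=None,  # prevents the URL download that failed in the traceback above
    in_channels=2,         # assumed input channels; adjust to the training setup
    classes=2,             # assumed number of classes
)

# Restore the trained weights from the checkpoint shipped in assets/ (assumed path).
ckpt = torch.load("assets/flood_model.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)
# Lightning prefixes keys with the module attribute name (often "model."); strip it if present.
state_dict = {k[len("model."):] if k.startswith("model.") else k: v for k, v in state_dict.items()}
model.load_state_dict(state_dict)
model.eval()
```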