Data download from server/cli

Hello,

I’d like to know if there is a way to download the datasets using a command line on a server.

Thanks.

Hi. I was able to download the data with wget -O train_values.csv --no-check-certificate --no-proxy "url_link_to_csv". The URL needs to be inside quotation marks.
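For anyone scripting this on a headless server, here is a minimal sketch of the same command wrapped in a function. The name download_csv is a hypothetical helper chosen for illustration, and "url_link_to_csv" remains a placeholder for the link copied from the data page. The wget flags are the ones from the post above.

```shell
#!/bin/sh
# Sketch: download one competition file with wget.
# download_csv is a hypothetical helper name, not part of wget or the site.
download_csv() {
  url="$1"   # the expiring link copied from the data page
  out="$2"   # local filename, e.g. train_values.csv
  # Quoting "$url" stops the shell from treating characters such as &
  # as operators, which would otherwise cut the URL short.
  wget -O "$out" --no-check-certificate --no-proxy "$url"
}

# Usage (placeholder URL; replace it with the link you copied):
# download_csv "url_link_to_csv" train_values.csv
```

The call is left commented out so that loading the snippet does not start a download with a placeholder URL.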


Thanks a lot @davebel that works !!!

Are the links still valid? There have been some issues with obtaining the data, and an aws ls command indicates that the bucket doesn’t exist.

The URLs at least seem to expire after 24 hours. Have you checked to see if the URL has changed since you copied it?

Kieran–

Thank you for the response. It seems there may be some other issue: the repository now seems to be entirely unavailable or inactive. Even attempts with the AWS CLI were unsuccessful.

Take a careful look at what happens to the URL when you paste it into your shell. Some shells will “helpfully” escape some of the URL characters like &, and that escaping will break the link if it is in quotes, but not if you leave the quotes out.

If you can click the link in the browser and the download starts, then your URL is not yet expired and there is some problem with what you pasted in the shell.
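A quick way to spot the escaping problem described above is to look for a stray backslash in the URL that ended up in your command. The sketch below uses two made-up example.com URLs (not real download links) and a hypothetical helper named has_backslash. Inside quotes the backslash is kept literally, so the server receives a different query string.

```shell
#!/bin/sh
# Both URLs are made-up placeholders for illustration only.
clean='https://example.com/train_values.csv?Expires=123&Signature=abc'
# What the same URL can look like after a shell auto-escapes it on paste:
escaped='https://example.com/train_values.csv?Expires=123\&Signature=abc'

# Hypothetical helper: succeeds if the argument contains a backslash.
has_backslash() {
  case "$1" in
    *\\*) return 0 ;;
    *)    return 1 ;;
  esac
}

for u in "$clean" "$escaped"; do
  if has_backslash "$u"; then
    printf 'suspicious (contains a backslash): %s\n' "$u"
  else
    printf 'looks fine: %s\n' "$u"
  fi
done
```

If your pasted URL is flagged, delete the backslashes by hand and keep the quotes.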

Initial attempts to obtain the data produced a redirect link that allowed downloading or opening the datasets, but the download eventually timed out. The most recent attempts yield the following:

AccessDenied | Request has expired | 86400 | 2020-09-09T22:17:35Z | 2020-09-14T21:15:21Z | 466A2211FDD5CD9B | 1m7jOH8IX0NHNQGSzIm/z+rEFuaw8bS0VQsH9dfivjBpNOANsQtNbPLiJlwzhLxXk3j6eh1DMOE=

Several attempts were made to obtain datasets via aws-cli to no avail.

It appears that the datasets have either been removed or had their access otherwise restricted.
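One possible reading of the error message above, offered as an assumption rather than a confirmed diagnosis: 86400 is the link lifetime in seconds, 2020-09-09T22:17:35Z is when the link expired, and 2020-09-14T21:15:21Z is the server time when the request arrived. Under that reading, the small sketch below does the arithmetic. The variable names are illustrative.

```shell
#!/bin/sh
# Values copied from the error message; their meaning is an assumption.
lifetime_seconds=86400
expires='2020-09-09T22:17:35Z'
server_time='2020-09-14T21:15:21Z'

# 86400 seconds divided by 3600 seconds per hour
lifetime_hours=$((lifetime_seconds / 3600))
printf 'link lifetime: %s hours\n' "$lifetime_hours"

# UTC timestamps in this fixed format can be compared as plain strings.
if [ "$(expr "$expires" \< "$server_time")" -eq 1 ]; then
  status=expired
else
  status=valid
fi
printf 'status: %s\n' "$status"
```

That would match the 24-hour expiry mentioned earlier in the thread, and it suggests the link was about five days old when it was used, so a freshly copied link is worth trying before concluding the data is gone.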

Is anyone looking into this? I have been trying to obtain the data for several weeks.

Does clicking on the download link on the data page start a download for you? (If you get access denied, refresh the page and try again; each user gets their own expiring URLs generated for them.)
https://www.drivendata.org/competitions/63/genetic-engineering-attribution/data/

The wget command mentioned above also works if you are not on a machine with a web browser.

You can DM me the command you pasted in the shell, and I’ll take a look.

We haven’t had other reports of problems downloading the data, and have had many successful submissions, so I expect there is an issue with the command you are trying to run.

Thank you for your help!