Submission format

Question re: required format for submission.

I’m saving my predictions as TIFF images: 1024px a side, single band (shape = 1024x1024), values 0 or 1 only (max 1, min 0, no floats), dtype int32. I have tried saving with both PIL (im.save) and imageio. All images are zipped into submission.zip (no directories). No luck getting a score so far. Are we required to submit some sort of geodata along with the submission?

The error I get when submitting is:

[Screenshot from 2019-12-21 20-23-44: image of the submission error, not preserved]

All 11481 IDs are present, named as instructed, e.g. 0a0a36.TIFF

Could it be an image size issue? Uncompressed, they are rather large (4.2MB each).

Perhaps an example of saving an array to a TIFF file could be provided?

Geodata isn’t required for prediction tiffs.

4MB per image file seems too large. They should be more like 10-20 KB per 1024x1024 1-band chip in TIFF file format, with a total zip archive file size between 100 and 200 MB.

Are all pixel values either 0 or 1 in all chips? And can you try saving as dtype uint8?

All pixels are either 0 or 1.

4MB was the uncompressed size - they compress down very small as expected.

Changing to uint8 before saving shrank the uncompressed to 1MB (~87MB for the whole zip after compression) and solved the problem described above - thanks for the help.
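For anyone wondering where the 4.2MB and 1MB figures come from, here is a rough back-of-the-envelope sketch using only the standard library. It ignores TIFF headers, and the all-zero mask is just an assumed extreme case of a sparse mask to show how well deflate handles this kind of data; real masks will compress somewhat less.

```python
import struct
import zlib

SIDE = 1024  # chip width and height in pixels, as described in the thread


def raw_size_bytes(bytes_per_pixel: int, side: int = SIDE) -> int:
    """Uncompressed pixel data size of a single-band square chip (no headers)."""
    return side * side * bytes_per_pixel


int32_size = raw_size_bytes(struct.calcsize("=i"))  # 4 bytes per pixel
uint8_size = raw_size_bytes(struct.calcsize("=B"))  # 1 byte per pixel

print(f"int32: {int32_size / 1e6:.1f} MB")  # int32: 4.2 MB
print(f"uint8: {uint8_size / 1e6:.1f} MB")  # uint8: 1.0 MB

# Assumed extreme case: a mask with no positive pixels at all.
all_zero_mask = bytes(uint8_size)
compressed = zlib.compress(all_zero_mask)
print(f"deflated all-zero uint8 mask: {len(compressed)} bytes")
```

So the dtype alone accounts for the 4x difference in uncompressed size, before any compression is applied.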

I then got a different error: “Found unexpected file ‘fe7ba8.TIFF’.” I used the naming convention from the problem description. Changing from .TIFF to .tif finally allowed me to submit, and get a score 🙂
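In case it helps others hitting the same naming or layout errors, below is a minimal sketch of packaging masks into a flat zip and checking it before upload. The helper names (build_submission, unexpected_files) are hypothetical and not part of any competition tooling, the .tif extension is simply the one that worked in this thread, and the demo writes placeholder bytes rather than real TIFFs.

```python
import tempfile
import zipfile
from pathlib import Path


def build_submission(mask_dir, zip_path, expected_ids, ext=".tif"):
    """Zip one mask per expected ID at the archive root (no directories)."""
    mask_dir = Path(mask_dir)
    missing = [i for i in expected_ids if not (mask_dir / f"{i}{ext}").is_file()]
    if missing:
        raise FileNotFoundError(f"{len(missing)} masks missing, e.g. {missing[0]}{ext}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for chip_id in expected_ids:
            name = f"{chip_id}{ext}"
            # An arcname without any directory part keeps the archive flat.
            zf.write(mask_dir / name, arcname=name)


def unexpected_files(zip_path, expected_ids, ext=".tif"):
    """Return archive members that are not exactly '<id><ext>' at the root."""
    expected = {f"{i}{ext}" for i in expected_ids}
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(set(zf.namelist()) - expected)


# Demo with two IDs mentioned in the thread and placeholder file contents.
with tempfile.TemporaryDirectory() as tmp:
    ids = ["0a0a36", "fe7ba8"]
    for chip_id in ids:
        (Path(tmp) / f"{chip_id}.tif").write_bytes(b"placeholder")
    archive = Path(tmp) / "submission.zip"
    build_submission(tmp, archive, ids)
    leftovers = unexpected_files(archive, ids)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(names, leftovers)
```

An empty list from unexpected_files means every member is a root-level file named after an expected ID, which is the situation the “Found unexpected file” error complains about when it is not met.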

Hey @johnowhitaker,

Awesome, glad the submission worked! The original submission timed out partly because of the large file sizes. Here’s an example of how we wrote small masks from numpy arrays using the Pillow library:

import numpy as np
from PIL import Image

# id of the tiff to write
file_id = 'abcdef'

# for now, just random data (about 5% of pixels set to 1)
rng = np.random.default_rng()  # any NumPy random Generator works here
arr = rng.choice(a=[0, 1], size=(1024, 1024), p=[0.95, 0.05]).astype(bool)

# use compression='tiff_deflate' for space savings
Image.fromarray(arr).save(f'{file_id}.tif', compression='tiff_deflate')

We’ve increased the scoring time limit to make it less likely this happens in the future and ensured that any variation of tif, tiff, TIF, and TIFF will work for the extension. Let us know if you have any future issues!

Thanks,
Peter

Thanks @bull! In addition to using PIL to save TIFFs with deflate compression, here’s a quick gist showing use of rasterio to save prediction arrays as GeoTIFFs with LZW, JPEG, or other supported compression. Again, submissions are not required to have geodata, but this would be a geospatial way to do it: https://gist.github.com/daveluo/e76fc89d59d166c8e839541262f6cd4b

The notebook also shows a quick end-to-end workflow that you can drop your inference code into, from loading test chips via STAC to packaging everything into the correct compressed .tgz submission file (verified via a LB submission that evaluated correctly).

@bull is it possible to add an RLE-based CSV submission format?

@SamSepiol nope, but feel free to share any issues you’re having with the TIFF archive format in the forum, and hopefully the community here can make that format work for you!