
Getting Started

Thanks for sharing! It is a great help!

How did you choose the zoom level (19) and tile_size (256)?
My first thought would be to use the tile size of the test set (1024).


Tensor size of (batch_size,1024,1024,3) is overkill imo.
Microsoft RefineNet made use of 256x256 images with 1ft/px resolution for the same task as this competition.
You could go higher, say to 512x512, but anything above that picks up detail that may not improve the final results much.
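To put rough numbers on the input size, here is a minimal sketch of the arithmetic. It assumes float32 inputs and a batch size of 16 (both are illustrative choices, not taken from any notebook in this thread), and it counts only the input tensor, not the activations inside the network.

```python
def batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes needed for one input batch of shape (batch_size, height, width, channels)."""
    return batch_size * height * width * channels * bytes_per_value


# Batch size 16 and float32 (4 bytes per value) are assumptions for illustration.
for side in (256, 512, 1024):
    mib = batch_bytes(16, side, side) / 2**20
    print(f"{side}x{side}: {mib:.0f} MiB per batch of 16")
```

Doubling the tile side quadruples the memory, so a 1024x1024 input needs 16 times the memory of a 256x256 one at the same batch size.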


I upped it to 512px (but think I left the folder names 256) for opencitiestilesmasked. 1024 is a lot of data for training - I figured most would be down-sizing the test tiles or splitting them into multiple sub-tiles anyway.
There is definitely room for experimentation around the zoom level etc. 19 was used in a tutorial I found, and I stuck with it because it made sense for me - visually inspecting the images shows that buildings at that scale look large enough to be easily identified, but are still small enough that several fit in each image (as opposed to a higher zoom that might only show half a building at a time). One thing I don’t like about this approach is that the test tiles are provided at the resolution that they were captured at - in other words, not a consistent zoom level. So a model needs to be robust to buildings at different scales - it might be better to preserve the differing resolutions of the training data and slice it up some other way.


This is spot on! A key part of the challenge is to develop more robust models that work across a range of resolutions, imagery capture conditions, and diverse geographies. Your ideas to incorporate training data at native resolutions and of different sizes (256x256 up to 1024x1024) could be promising, and could be achieved using rasterio’s windowed reads.

In general, windowed reads should offer some more flexibility to chip the image rasters to whatever size and shape you like. There’s a small example of a 1024x1024 window read at native resolution at the bottom of the pystac starter colab notebook provided in the STAC Resources competition page.
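As a rough sketch of what chipping with windowed reads might look like (this is not the code from the starter notebook): `window_grid` and `read_chips` are hypothetical helper names, and the 1024 default is just the test tile size mentioned earlier. The rasterio import sits inside the function so the grid helper can be used on its own.

```python
def window_grid(width, height, size=1024, stride=None):
    """Yield (col_off, row_off, w, h) windows covering a width x height raster.

    Windows on the right and bottom edges are clipped to the raster bounds.
    """
    stride = stride or size
    for row_off in range(0, height, stride):
        for col_off in range(0, width, stride):
            yield (col_off, row_off,
                   min(size, width - col_off), min(size, height - row_off))


def read_chips(path, size=1024):
    """Yield (bands, h, w) arrays read window by window at native resolution."""
    import rasterio  # imported lazily; window_grid needs only the stdlib
    from rasterio.windows import Window

    with rasterio.open(path) as src:
        for col_off, row_off, w, h in window_grid(src.width, src.height, size):
            # Only the pixels inside the window are read from disk.
            yield src.read(window=Window(col_off, row_off, w, h))
```

Passing a `stride` smaller than `size` gives overlapping chips, and calling `window_grid` with several sizes is one way to mix chip sizes from 256 up to 1024.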

One other thing to be aware of if you’re creating training chips at standardized webmap tile zoom levels and sizes is that there’s almost always some resampling and reprojection (affine transform) of the original image as a result. I’m not saying that’s good or bad for your model performance. I don’t know; it’s worth experimenting! For instance, here’s how rio-tiler works under the hood:


Hi @daveluo_gfdrr and @johnowhitaker,

I’ve been having issues installing and importing solaris. From the printouts it seems to install, but when I import it I get an error message saying that no module is found.

I’ve been pip installing on Google Colab while going through both of your example notebooks. Do you have any insight into why this might be, or other resources to check out? I can’t seem to find anything on common resources like Stack Exchange. Thanks in advance for anything you might have to add!

Hi @butlerbt,

Could you double-check the full printout from !pip install solaris? If you’re installing onto a fresh Colab instance, you may be running into an error like the below while trying to install the gdal>=3.0.2 dependency:

ERROR: Command errored out with exit status 1: python egg_info Check the logs for full command output.

This popped up recently with the most recent version of solaris (0.2.1) requiring gdal 3 which can be temperamental to install.

The following should work (it adds the ubuntugis-unstable packages, which include gdal 3.0.2):

!add-apt-repository ppa:ubuntugis/ubuntugis-unstable -y
!apt-get update
!apt-get install python-numpy gdal-bin libgdal-dev python3-rtree

!pip install solaris

Or if you prefer the stable ubuntugis packages, you could install an older version of solaris for now:

!add-apt-repository ppa:ubuntugis/ppa -y
!apt-get update
!apt-get install python-numpy gdal-bin libgdal-dev python3-rtree

!pip install solaris==0.2.0
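
After either install, a quick way to see whether Python can actually locate the packages is sketched below. It only uses the standard library; `importable` is a hypothetical helper name, and `osgeo` is the module name of the GDAL Python bindings.

```python
import importlib.util


def importable(module_names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}


# The two modules discussed in this thread.
for name, found in importable(["solaris", "osgeo"]).items():
    print(f"{name}: {'found' if found else 'NOT found'}")
```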

Thanks for the info @daveluo_gfdrr. I noticed errors without libgdal on GCP, but hadn’t realized the version has also changed on Colab and broken the code there as well. Appreciate the clean fixes :slight_smile:

@daveluo_gfdrr your first solution worked like magic! Thank you!!! :pray:

Thanks a lot for sharing this.


I am facing this error when trying to run the code on Google Colab.

TypeError Traceback (most recent call last)
in ()
1 lr = 1e-5 # trying a v conservative LR quickly
----> 2 learn.fit_one_cycle(1, slice(lr))

15 frames
/usr/local/lib/python3.6/dist-packages/torch/ in array(self, dtype)
484 def array(self, dtype=None):
485 if dtype is None:
–> 486 return self.numpy()
487 else:
488 return self.numpy().astype(dtype, copy=False)

TypeError: can’t convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Changing that line to return self.cpu().numpy() does not help either. Any help would be appreciated.

Are you running it on a GPU runtime? If not, you can change this in Colab by going to Runtime -> Change Runtime Type.

Thanks for your fast reply. I am running it on a GPU runtime.

The full trace is below.

TypeError Traceback (most recent call last)
in ()
1 lr = 1e-5 # trying a v conservative LR quickly
----> 2 learn.fit_one_cycle(1, slice(lr))

15 frames
/usr/local/lib/python3.6/dist-packages/fastai/ in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, final_div, wd, callbacks, tot_epochs, start_epoch)
21 callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start,
22 final_div=final_div, tot_epochs=tot_epochs, start_epoch=start_epoch))
—> 23, max_lr, wd=wd, callbacks=callbacks)
25 def fit_fc(learn:Learner, tot_epochs:int=1,, moms:Tuple[float,float]=(0.95,0.85), start_pct:float=0.72,

/usr/local/lib/python3.6/dist-packages/fastai/ in fit(self, epochs, lr, wd, callbacks)
198 else:,self.opt.wd = lr,wd
199 callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
–> 200 fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
202 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/usr/local/lib/python3.6/dist-packages/fastai/ in fit(epochs, learn, callbacks, metrics)
104 if not cb_handler.skip_validate and not
105 val_loss = validate(learn.model,, loss_func=learn.loss_func,
–> 106 cb_handler=cb_handler, pbar=pbar)
107 else: val_loss=None
108 if cb_handler.on_epoch_end(val_loss): break

/usr/local/lib/python3.6/dist-packages/fastai/ in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
61 if not is_listy(yb): yb = [yb]
62 nums.append(first_el(yb).shape[0])
—> 63 if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
64 if n_batch and (len(nums)>=n_batch): break
65 nums = np.array(nums, dtype=np.float32)

/usr/local/lib/python3.6/dist-packages/fastai/ in on_batch_end(self, loss)
306 “Handle end of processing one batch with loss.”
307 self.state_dict[‘last_loss’] = loss
–> 308 self(‘batch_end’, call_mets = not self.state_dict[‘train’])
309 if self.state_dict[‘train’]:
310 self.state_dict[‘iteration’] += 1

/usr/local/lib/python3.6/dist-packages/fastai/ in call(self, cb_name, call_mets, **kwargs)
248 “Call through to all of the CallbakHandler functions.”
249 if call_mets:
–> 250 for met in self.metrics: self._call_and_update(met, cb_name, **kwargs)
251 for cb in self.callbacks: self._call_and_update(cb, cb_name, **kwargs)

/usr/local/lib/python3.6/dist-packages/fastai/ in _call_and_update(self, cb, cb_name, **kwargs)
239 def call_and_update(self, cb, cb_name, **kwargs)->None:
240 “Call cb_name on cb and update the inner state.”
–> 241 new = ifnone(getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs), dict())
242 for k,v in new.items():
243 if k not in self.state_dict:

/usr/local/lib/python3.6/dist-packages/fastai/ in on_batch_end(self, last_output, last_target, **kwargs)
342 if not is_listy(last_target): last_target=[last_target]
343 self.count += first_el(last_target).size(0)
–> 344 val = self.func(last_output, *last_target)
345 if
346 val = val.clone()

in jq(y_pred, y_true, thresh)
6 for i in range(len(y_true)):
7 binary_preds = (y_pred[i][0].flatten()>thresh).int()
----> 8 score = sklearn.metrics.jaccard_score(y_true[i].flatten(), binary_preds, average='micro')
9 scores.append(score)
10 return torch.tensor(sum(scores)/len(scores))

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/ in jaccard_score(y_true, y_pred, labels, pos_label, average, sample_weight)
791 “”"
792 labels = _check_set_wise_labels(y_true, y_pred, average, labels,
–> 793 pos_label)
794 samplewise = average == ‘samples’
795 MCM = multilabel_confusion_matrix(y_true, y_pred,

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/ in _check_set_wise_labels(y_true, y_pred, average, labels, pos_label)
1299 str(average_options))
-> 1301 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
1302 present_labels = unique_labels(y_true, y_pred)
1303 if average == ‘binary’:

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/ in _check_targets(y_true, y_pred)
79 “”"
80 check_consistent_length(y_true, y_pred)
—> 81 type_true = type_of_target(y_true)
82 type_pred = type_of_target(y_pred)

/usr/local/lib/python3.6/dist-packages/sklearn/utils/ in type_of_target(y)
245 raise ValueError(“y cannot be class ‘SparseSeries’ or ‘SparseArray’”)
–> 247 if is_multilabel(y):
248 return ‘multilabel-indicator’

/usr/local/lib/python3.6/dist-packages/sklearn/utils/ in is_multilabel(y)
136 “”"
137 if hasattr(y, ‘array’) or isinstance(y, Sequence):
–> 138 y = np.asarray(y)
139 if not (hasattr(y, “shape”) and y.ndim == 2 and y.shape[1] > 1):
140 return False

/usr/local/lib/python3.6/dist-packages/numpy/core/ in asarray(a, dtype, order)
84 “”"
—> 85 return array(a, dtype, copy=False, order=order)

/usr/local/lib/python3.6/dist-packages/torch/ in array(self, dtype)
484 def array(self, dtype=None):
485 if dtype is None:
–> 486 return self.cpu().numpy()
487 else:
488 return self.cpu().numpy().astype(dtype, copy=False)

TypeError: can’t convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Must be a problem with my silly metric - maybe leave that out. I think you can use dice (with iou=True) to get the proper Jaccard index anyway. Sorry about that!
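
For anyone who wants to keep a custom metric: the trace shows CUDA tensors reaching scikit-learn, which needs NumPy arrays in host memory. One way around that, sketched here, is to count the overlap with torch operations so that only two Python numbers leave the GPU. The assumptions: `jq_torch` and `iou_from_counts` are hypothetical names, the 0.5 default threshold is a guess (the original default is not visible in the trace), mask pixels are assumed to be 0 for background, and an empty image is scored as 1.0 by convention. The score is the IoU of the building class only, so it will not match the micro-averaged scikit-learn call exactly.

```python
def iou_from_counts(intersection, union):
    """Jaccard index from pixel counts; an empty union scores 1.0 (assumed convention)."""
    return 1.0 if union == 0 else intersection / union


def jq_torch(y_pred, y_true, thresh=0.5):
    """Mean per-image IoU of the positive class, computed without NumPy."""
    import torch  # imported lazily; iou_from_counts needs only the stdlib

    scores = []
    for i in range(len(y_true)):
        pred = y_pred[i][0].flatten() > thresh   # boolean tensor, stays on the GPU
        true = y_true[i].flatten() > 0           # assumes 0 marks background
        inter = (pred & true).sum().item()       # .item() copies one number to the host
        union = (pred | true).sum().item()
        scores.append(iou_from_counts(inter, union))
    return torch.tensor(sum(scores) / len(scores))
```

The smaller change would be to keep the scikit-learn call and add `.cpu()` to both tensors inside the metric before passing them on, rather than editing the torch source.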


Thanks for your reply. I will try that out

Hello there, @daveluo_gfdrr,
first of all, thanks for your great efforts.
Here:

I noticed that you used zoom_level=19 when creating 256px tiles. What is the relation between the zoom_level and the size of the tile? If I want to create 512px tiles, which zoom_level should I choose?
Which zoom_level was chosen to create the 1024px test images?


Hi @Hasan_N,

The zoom_levels at 256x256 correspond to these meters/pixel at the equator:

E.g. for zoom_level=19 and tile_size=256, it’s ~0.3m/pixel. If you increase tile_size to 512x512 at the same zoom, then m/pixel would be halved to ~0.15m/pixel because the area covered by each tile polygon at each zoom level remains the same.
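
The relation can be written down directly from the standard Web Mercator tiling scheme, where zoom 0 covers the whole equator with a single tile. A small sketch (the function name is illustrative; the radius is the WGS84 value used by Web Mercator):

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis used by Web Mercator


def meters_per_pixel(zoom, tile_size=256, latitude_deg=0.0):
    """Ground resolution of a web map tile at a given zoom, tile size and latitude."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    tiles_across = 2 ** zoom  # number of tiles spanning the equator at this zoom
    return circumference * math.cos(math.radians(latitude_deg)) / (tile_size * tiles_across)


for zoom, tile_size in [(19, 256), (19, 512), (18, 512)]:
    res = meters_per_pixel(zoom, tile_size)
    print(f"zoom={zoom}, tile_size={tile_size}: {res:.4f} m/pixel at the equator")
```

So a 512px tile at zoom 19 has twice the detail of a 256px tile at zoom 19, while a 512px tile at zoom 18 keeps the same m/pixel as 256px at zoom 19 and covers four times the area.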

In that tutorial, rio-tiler uses rasterio’s WarpedVRT under the hood to resample raster images from their native resolutions and projections to these standardized web map tiles, but that’s not the only way to create training chips, as mentioned in my earlier post.


Hi Dave @daveluo_gfdrr, Hello all,

I opened Accra 665946 scene using ERDAS and it shows that pixel size is 0.02 meters while in your official webpage description you mentioned that the resolution for this scene is 3 cm:

Using gdalinfo, I found the following:
Pixel Size = (0.020015187071028,-0.020015187071028).

Can you please confirm which one of those three (ERDAS, GDAL or your webpage) is accurate?
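
For reference, the pixel size that gdalinfo prints comes from the raster's affine transform, and it can be read from Python as well. This is a sketch with hypothetical helper names; the rasterio import is kept inside the function that opens a file.

```python
import math


def pixel_size(a, b, d, e):
    """Pixel (width, height) in CRS units from the affine terms
    x = a*col + b*row + c and y = d*col + e*row + f (rotation is handled)."""
    return math.hypot(a, d), math.hypot(b, e)


def raster_pixel_size(path):
    """Open a raster and return its pixel size from the stored transform."""
    import rasterio  # imported lazily; pixel_size needs only the stdlib

    with rasterio.open(path) as src:
        t = src.transform
        return pixel_size(t.a, t.b, t.d, t.e)


# The gdalinfo value quoted above, for a north-up raster (no rotation terms).
pixel_w, pixel_h = pixel_size(0.020015187071028, 0.0, 0.0, -0.020015187071028)
print(f"{pixel_w:.4f} x {pixel_h:.4f} CRS units per pixel")
```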



Hi @aghandour, thanks for looking into it. Yes you’re right, that scene should be listed as 2cm resolution. We’ll correct that on the site.

Hi all,
here is the full list of train_tier_1 image resolutions, in meters per pixel (I hope it will be useful for someone):
665946: 0.02001518707102818
a42435: 0.032029411960186015
ca041a: 0.035820209694930036
d41d81: 0.05179965064903244
401175: 0.07879995472604662
493701: 0.0420000000000015
207cc7: 0.0777476098863814
f15272: 0.039999999999998634
abe1a3: 0.2
f49f31: 0.2
4e7c7f: 0.03533483541714589
a017f9: 0.04162000000000368
b15fce: 0.05071999999999411
353093: 0.04833999999999547
f883a0: 0.05113
42f235: 0.07266
0a4c40: 0.07222
33cae6: 0.0774800032377243
3b20d4: 0.06792999804019927
076995: 0.06411000341176987
75cdfa: 0.06035000085830688
9b8638: 0.06646999716758728
06f252: 0.059179998934268944
c7415c: 0.06451000273227692
aee7fd: 0.07398000359535216
3f8360: 0.07880999892950058
425403: 0.07528000324964522
bd5c14: 0.07842999696731566
e52478: 0.0720999985933304
bc32f1: 0.07466000318527222
825a50: 0.1007506920751152

As far as I can see, the resolutions of the test images have been removed.
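
Since the resolutions above span a factor of ten, a fixed chip size in pixels covers very different ground areas from scene to scene. A small sketch of that arithmetic (helper names are hypothetical; the 256px at 0.3 m/pixel target is the zoom-19 tile discussed earlier, and only three of the scenes are copied in):

```python
def footprint_m(chip_px, res_m):
    """Ground distance in meters covered by one side of a chip."""
    return chip_px * res_m


def native_px_for(target_px, target_res_m, native_res_m):
    """Native pixels per side needed to cover the same ground as a target chip."""
    return round(target_px * target_res_m / native_res_m)


# Three values copied from the list above (scene id: meters per pixel).
resolutions = {
    "665946": 0.02001518707102818,
    "f15272": 0.039999999999998634,
    "abe1a3": 0.2,
}

for scene, res in resolutions.items():
    side = native_px_for(256, 0.3, res)
    print(f"{scene}: 1024 px spans {footprint_m(1024, res):.1f} m; "
          f"a 256 px tile at 0.3 m/px matches about {side} native px")
```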


Hi everyone,

I think there was a question about how to create training image & label chips with windowed reads at native resolution. Here is some new starter code I added to the bottom of the notebook “A quick intro to accessing Open Cities AI Challenge data…” to show how you can use rasterio’s rasterize() and geopandas to do that:
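
In case the notebook is not at hand, the general shape of such code might look like the sketch below. It is not the notebook's code: the function names, arguments and the 1024 default are assumptions, and it presumes the label file is something geopandas can read (e.g. GeoJSON) with a CRS defined.

```python
def clamp_window(col_off, row_off, size, width, height):
    """Clip a square window request so it stays inside a width x height raster."""
    col_off = max(0, min(col_off, width - 1))
    row_off = max(0, min(row_off, height - 1))
    return col_off, row_off, min(size, width - col_off), min(size, height - row_off)


def read_chip_and_mask(image_path, labels_path, col_off, row_off, size=1024):
    """Return (chip, mask): chip is (bands, h, w), mask is (h, w) uint8 with 1 = building."""
    # Third-party imports live here so clamp_window works without them.
    import geopandas as gpd
    import numpy as np
    import rasterio
    from rasterio.features import rasterize
    from rasterio.windows import Window

    with rasterio.open(image_path) as src:
        window = Window(*clamp_window(col_off, row_off, size, src.width, src.height))
        chip = src.read(window=window)                  # pixels at native resolution
        chip_transform = src.window_transform(window)   # georeferencing of this chip only
        minx, miny, maxx, maxy = src.window_bounds(window)
        raster_crs_wkt = src.crs.to_wkt()

    # Reproject the footprints to the raster CRS, then keep those touching the window.
    buildings = gpd.read_file(labels_path).to_crs(raster_crs_wkt)
    buildings = buildings.cx[minx:maxx, miny:maxy]
    shapes = [(geom, 1) for geom in buildings.geometry
              if geom is not None and not geom.is_empty]

    out_shape = chip.shape[1:]
    if not shapes:  # rasterize() rejects an empty list of shapes
        return chip, np.zeros(out_shape, dtype="uint8")
    mask = rasterize(shapes, out_shape=out_shape, transform=chip_transform,
                     fill=0, dtype="uint8")
    return chip, mask
```

Because the mask is burned with the window's own transform, the chip and mask line up pixel for pixel without resampling the imagery.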

