I have downloaded the competition data and am trying to load the image patches into QGIS, but the image data does not seem to have any projection information. Projection information would be useful for spatial operations on the data (mosaicking, for instance), or for sampling data at points to train a tabular model that lends itself to faster feature selection. Please advise.
The rasters here have no geolocation info (you can confirm with gdalinfo), so you won’t be able to do projections.
Can I ask why you would want to do this? The data is nicely prepared already.
The geospatial stuff has mostly been hidden from us and it’s basically just a straightforward ML challenge.
If you really want to run a tabular model you can work in pixel space.
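For what it's worth, here is a rough sketch of what working in pixel space could look like. It assumes the patches have already been read into a NumPy array of shape (n_patches, bands, height, width); that layout, the function name patches_to_table and the column names are my own placeholders, not anything defined by the competition data. Each pixel becomes one row, with its patch index and row/col position standing in for the missing coordinates.

```python
import numpy as np


def patches_to_table(patches):
    """Flatten image patches into one row per pixel.

    patches: array of shape (n_patches, bands, height, width) (assumed layout).
    Returns (table, columns), where table has shape
    (n_patches * height * width, 3 + bands).
    """
    n, b, h, w = patches.shape
    # Index grids giving the patch id and pixel position of every pixel.
    pid, rows, cols = np.meshgrid(
        np.arange(n), np.arange(h), np.arange(w), indexing="ij"
    )
    # Move bands to the last axis, then flatten so each pixel is one row.
    values = patches.transpose(0, 2, 3, 1).reshape(-1, b)
    table = np.column_stack([pid.ravel(), rows.ravel(), cols.ravel(), values])
    columns = ["patch_id", "row", "col"] + [f"band_{i}" for i in range(b)]
    return table, columns


# Tiny synthetic example: 2 patches, 3 bands, 2x2 pixels each.
demo = np.arange(2 * 3 * 2 * 2, dtype=float).reshape(2, 3, 2, 2)
table, columns = patches_to_table(demo)
print(columns)
print(table.shape)
```

The patch_id column is kept so that pixels from the same patch can be grouped together later, for example when building validation folds.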
It is more useful for exploration. Spatial autocorrelation is easier to cope with if the spatial info is available. There are also options to explore the data in GEE with other features (elevation, soil type) to come up with better training strategies, maybe weighting samples. TorchGeo's geo samplers have benefits as well but need spatial metadata. There are probably a whole bunch of other reasons as well.
Hi @Geethen! Please see the latest response regarding geolocation here:
Also note that the use of outside data, including elevation and soil type, is not allowed in this competition.
Good luck!