Supervised classification - train on raster

Hello everyone,

I am trying to run a supervised classification (Random Forest) on a subset of a resampled Sentinel-2 image for educational purposes, and I want to use the “Train on Raster” option. Running it works fine, but there is one big issue I cannot solve. I want to use the first level of the CORINE Land Cover nomenclature (a raster with a pixel size of 100 x 100 m) as training classes, but most of my subset (30 x 30 km) is covered only by classes 1, 2 and 3, and classes 4 and 5 occur in just a few small areas. So when I run the classification and set, for example, the number of training samples to 50000, the classifier is trained only on the first three classes, because the selection of training pixels is apparently random and the classes with the fewest pixels are not selected.

Is there any solution for this? Or could an option be added that selects training pixels randomly from each class rather than from the whole raster?
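
To illustrate the kind of per-class selection asked about here, below is a minimal NumPy sketch of stratified random sampling done outside the classification tool. It is not how “Train on Raster” works internally; the function name `stratified_sample`, the `n_per_class` parameter and the toy 300 x 300 class array are all hypothetical and only stand in for a real CORINE level-1 raster.

```python
import numpy as np

def stratified_sample(class_raster, n_per_class, seed=0):
    """Pick up to n_per_class random pixel positions from every class present.

    Hypothetical helper: class_raster is a 2-D integer array of class codes.
    Returns three 1-D arrays: row indices, column indices and class labels.
    """
    rng = np.random.default_rng(seed)
    rows, cols, labels = [], [], []
    for cls in np.unique(class_raster):
        r, c = np.nonzero(class_raster == cls)
        # Rare classes may have fewer pixels than requested: take them all.
        k = min(n_per_class, r.size)
        idx = rng.choice(r.size, size=k, replace=False)
        rows.append(r[idx])
        cols.append(c[idx])
        labels.append(np.full(k, cls))
    return np.concatenate(rows), np.concatenate(cols), np.concatenate(labels)

# Toy class raster (assumed data): classes 1-3 dominate, 4 and 5 are tiny.
demo = np.ones((300, 300), dtype=np.uint8)
demo[100:200, :] = 2
demo[200:, :] = 3
demo[0:3, 0:3] = 4   # 9 pixels of class 4
demo[5:7, 5:7] = 5   # 4 pixels of class 5

rows, cols, labels = stratified_sample(demo, n_per_class=100)
classes, counts = np.unique(labels, return_counts=True)
print(dict(zip(classes.tolist(), counts.tolist())))
```

Every class contributes samples this way, although a very rare class is still limited by the number of pixels it actually has.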

Thank you for any response.


I see the problem of sparsely represented classes. Would it work to digitize areas in which all classes occur in more or less equal proportions and use this geometry as a mask in the valid pixel expression in the band properties before using the raster for training?
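
The effect of such a mask can be sketched with NumPy, independent of any particular software. This is only an illustration under assumptions: the no-data value `0`, the helper names `apply_training_mask` and `class_shares`, and the toy arrays are all made up, and the boolean mask stands in for the rasterized digitized geometry.

```python
import numpy as np

NODATA = 0  # assumed no-data value, not taken from any real product

def apply_training_mask(class_raster, mask, nodata=NODATA):
    """Keep class codes inside the mask, set everything else to no-data."""
    return np.where(mask, class_raster, nodata)

def class_shares(class_raster, nodata=NODATA):
    """Share of each class among the valid (non no-data) pixels."""
    valid = class_raster[class_raster != nodata]
    values, counts = np.unique(valid, return_counts=True)
    return {int(v): float(c) / valid.size for v, c in zip(values, counts)}

# Toy class raster (assumed data): class 4 covers 1 % of the area.
classes = np.ones((100, 100), dtype=np.uint8)
classes[:10, :10] = 4

# Hypothetical digitized area covering equal parts of class 4 and class 1.
mask = np.zeros(classes.shape, dtype=bool)
mask[:10, :20] = True

before = class_shares(classes)
after = class_shares(apply_training_mask(classes, mask))
print(before, after)
```

Random sampling restricted to the masked area then draws from a far more balanced class distribution than sampling over the whole subset.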

Sorry for my late response. Yes, I think this could be a solution, although I haven’t tested it yet. Thanks for your suggestion, even though I am not the biggest fan of it. But I guess the only other option is to increase the number of training samples and hope for the best.
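
How much “hoping for the best” is involved can be estimated with a short calculation. The sketch below assumes simple random sampling with replacement over all pixels, which is an assumption and not a confirmed description of how the tool selects samples; the class share `p` is an invented value.

```python
def expected_samples(n, p):
    """Expected number of samples from a class with pixel share p in n draws."""
    return n * p

def prob_no_samples(n, p):
    """Probability that n independent draws never hit a class with share p."""
    return (1.0 - p) ** n

n = 50000    # number of training samples mentioned in the thread
p = 0.0001   # assumed share of a rare class (0.01 % of the pixels)

expected = expected_samples(n, p)
p_zero = prob_no_samples(n, p)
print(expected, p_zero)
```

Under these assumptions the rare class would usually get a handful of samples, which is far too few compared with the dominant classes even when it is not missed entirely.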