Issue with Supervised random forest classification

You can use point samples, but as you say, if you have fewer than 5000, each tree is generated from all available points, so the advantage of the randomization in a random forest classifier is partly lost (it also shuffles the input rasters).
You can create polygons around your samples to increase the number of potential samples. Your points collected in the field are a good basis for this, and I think it is scientifically sound to extend these points to a representative area around them.
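
To illustrate the polygon idea, here is a minimal sketch in plain Python that turns each field point into a small square polygon and prints it as WKT. The coordinates, class labels, the 15 m half-size, and the function names `square_buffer` and `to_wkt` are all assumptions for the example and are not part of any particular toolbox. It assumes a projected CRS in metres; pick the size so the polygon only covers an area you know is homogeneous for that class.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def square_buffer(x: float, y: float, half_size: float) -> List[Point]:
    """Return a closed square ring (5 vertices) centred on (x, y)."""
    if half_size <= 0:
        raise ValueError("half_size must be positive")
    return [
        (x - half_size, y - half_size),
        (x + half_size, y - half_size),
        (x + half_size, y + half_size),
        (x - half_size, y + half_size),
        (x - half_size, y - half_size),  # repeat first vertex to close the ring
    ]


def to_wkt(ring: List[Point]) -> str:
    """Format a closed ring as a WKT POLYGON string."""
    coords = ", ".join(f"{px} {py}" for px, py in ring)
    return f"POLYGON (({coords}))"


# Hypothetical field samples: (x, y, class label), projected CRS in metres.
field_points = [
    (500000.0, 4649000.0, "forest"),
    (500250.0, 4649100.0, "water"),
]

# One square training polygon per field point (half_size is an assumed value).
polygons = [
    (label, to_wkt(square_buffer(x, y, half_size=15.0)))
    for x, y, label in field_points
]

for label, wkt in polygons:
    print(label, wkt)
```

The resulting WKT strings can be loaded as vector geometries in most GIS software, but it is worth checking each polygon against the imagery before using it for training.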

If you have enough samples, you can also leave some of them out of the classification process and use them later for validation (did the classifier predict the correct class at the sampled location?).
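
A minimal sketch of such a hold-out validation in plain Python, assuming each sample is a `(sample_id, class_label)` pair. The function names `stratified_holdout` and `overall_accuracy`, the 30 % test fraction, and the example data are assumptions made for illustration. The split is done per class so that every class stays represented in both the training and the validation set.

```python
import random
from collections import defaultdict


def stratified_holdout(samples, test_fraction=0.3, seed=42):
    """Split (sample_id, label) pairs into train and test sets, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample in samples:
        by_class[sample[1]].append(sample)

    train, test = [], []
    for group in by_class.values():
        group = group[:]  # copy so the caller's data is not reordered
        rng.shuffle(group)
        # Hold out at least one sample per class, unless the class has only one.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of held-out samples whose predicted class matches the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("no samples to evaluate")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical samples: 10 forest points and 10 water points.
samples = [(i, "forest") for i in range(10)] + [(i, "water") for i in range(10, 20)]
train, test = stratified_holdout(samples, test_fraction=0.3)
print(len(train), "training samples,", len(test), "validation samples")
```

After classification, read the predicted class at each held-out location and pass the true and predicted labels to `overall_accuracy`. If you created polygons around your points, split by point or polygon rather than by pixel, otherwise neighbouring pixels from the same polygon end up in both sets and the accuracy looks better than it is.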
