How to save the Random Forest Classifier and load it for future classification

I have trained and applied the Random Forest Classifier in the SNAP toolbox. Now that the classification is done, I want to use the trained model on a different image. How can I save the trained model, and how can I load and apply the classifier later?

If you have imported vector data, you can save it: right-click the product and choose Save Product. To check the result, expand the product and open its Vector Data folder; it should look like the following.

Similar issue in the following post

Source of the post

Whenever you perform a new RF classification you are asked to enter a name for the classifier in the second tab. If you later have an input raster with exactly the same bands (same names), you can click “load and apply classifier” and select the RF model you previously created.
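Since the saved classifier only applies to a raster with exactly the same band names, it can help to compare the band lists before loading it. The following is a minimal sketch in plain Python, not a SNAP API call: the function `same_bands` and the band names are hypothetical, and it assumes that band order does not matter, which this thread does not confirm.

```python
def same_bands(trained_bands, new_bands):
    """Compare two lists of band names (order ignored, an assumption).

    Returns (ok, missing, extra):
      ok      - True if both products carry exactly the same band names
      missing - names the classifier was trained on but the new product lacks
      extra   - names in the new product that the classifier has never seen
    """
    trained = list(trained_bands)
    new = list(new_bands)
    missing = [b for b in trained if b not in new]
    extra = [b for b in new if b not in trained]
    return (not missing and not extra), missing, extra


# Hypothetical band names, for illustration only
trained = ["Sigma0_VV", "Sigma0_VH", "B4", "B8"]
candidate = ["Sigma0_VV", "Sigma0_VH", "B4", "B08"]

ok, missing, extra = same_bands(trained, candidate)
print(ok, missing, extra)  # False ['B8'] ['B08']
```

In a case like this, renaming `B08` to `B8` in the new product would be needed before "load and apply classifier" can match the bands.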


Thanks @falahfakhri, the link you provided has some very good information.

Thanks @ABraun, that answers my question. One more question: is there a way to detect overfitting of the trained Random Forest model?

Unfortunately not. The parameters in the module are quite limited, as you can see (number of trees), and don’t allow pruning of the trees.
In my experience, the higher the number of trees, the lower the impact of single over-fitted trees, because the final result is averaged into an ensemble result.
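The averaging effect can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not how SNAP works internally: it treats each tree as an independent voter that is right with a fixed probability (`p_correct`, a made-up value) on a two-class problem, and combines them by majority vote. Real trees are correlated, so the gain in practice is smaller.

```python
import random


def simulate(n_trees, p_correct=0.7, n_pixels=2000, seed=0):
    """Fraction of pixels a majority vote of n_trees labels correctly.

    Toy assumption: every tree is independently correct with
    probability p_correct on every pixel (two classes only).
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_pixels):
        # Count how many trees vote for the correct class on this pixel
        votes = sum(rng.random() < p_correct for _ in range(n_trees))
        if votes > n_trees / 2:
            correct += 1
    return correct / n_pixels


# Odd tree counts avoid ties in the majority vote
for n in (1, 11, 101):
    print(n, simulate(n))
```

Under these assumptions, the accuracy of the ensemble rises with the number of trees even though no single tree gets any better.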

Additionally, the parameter that best controls over-fitting is the share of training features (input rasters) used per tree. In SNAP it is predefined as the square root of the total number of input rasters (2 of 4, 3 of 9, 4 of 16, and so on). This is quite a small fraction, and some programs allow larger proportions. Based on this, over-fitting is nearly impossible as long as you have a considerable number of input bands.
If you only have 2 input bands, over-fitting is hard to avoid.
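The square-root rule described above can be written out as a short calculation. This is a sketch, not SNAP code: the function name `features_per_tree` is hypothetical, and rounding down for counts that are not perfect squares is an assumption, since the post only gives perfect-square examples.

```python
import math


def features_per_tree(n_rasters):
    """Number of input rasters per tree under the square-root rule.

    Assumption: the square root is rounded down for counts that are
    not perfect squares, and at least one raster is always used.
    """
    return max(1, math.isqrt(n_rasters))


for n in (4, 9, 16):
    k = features_per_tree(n)
    # Share of the input rasters that each tree sees
    print(f"{k} of {n} rasters ({k / n:.0%})")
```

The share shrinks as rasters are added (50% at 4, about 33% at 9, 25% at 16), which is the sense in which many input bands leave each tree with only a small part of the information.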

Breiman argues similarly: https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

The only thing you can do is check “evaluate classifier” and look at the feature importance text file afterwards. It tells you about the impact of single training features (rasters). It does not tell you the degree of over-fitting, but it does at least show whether single input rasters dominate.


Thanks @ABraun. I am also considering using scikit-learn to expand further on the statistical analysis with RF. However, for now, I will be using SNAP’s RF for testing purposes.