Random forest classification steps

marjanmarbouti · October 22, 2017, 11:08am

Dear all,
I plan to run a random forest classification on TanDEM-X data using the SNAP software.
First of all, I applied the ‘Grey Level Co-occurrence Matrix’ (GLCM) to the TanDEM-X backscatter image and got results such as ‘Contrast’, ‘Dissimilarity’, ‘Homogeneity’ and so on. You can see my steps below.

After that, I plan to apply different classification methods, especially random forest, to my GLCM results in SNAP, but I do not know how to start.
Would you please explain what the options in the box below mean? What is the difference between training bands and feature bands? What should I put there?
I did not find any example of this.

MCG · October 23, 2017, 7:14am

If you use the ‘Train on Vectors’ option, the RF classifier will ask you for shapefiles that represent the training areas (the same as many other supervised classification methods). For that you need to create those shapefiles in SNAP or import them. Personally, I do not have experience with the ‘Train on Raster’ option.

If you use ‘Train on Vectors’:
Training bands (vectors): here you select the shapefile(s)/vectors to be used for training (the geometry).
Feature bands: here you select the bands whose values are used for training (in combination with the geometry of the training vectors).
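
To make the distinction concrete, here is a minimal Python sketch of what the two selections amount to conceptually. It is not SNAP's implementation: the band names, the tiny 3x3 rasters and the `build_training_set` helper are made up for illustration, and the training vectors are assumed to be already rasterised into sets of (row, col) pixels.

```python
# Conceptual sketch only; not SNAP code. All names and values are hypothetical.

# Feature bands: co-registered rasters (here 3x3) that supply the predictor values.
feature_bands = {
    "GLCM_Contrast":    [[0.1, 0.2, 0.9], [0.1, 0.8, 0.9], [0.2, 0.8, 0.7]],
    "GLCM_Homogeneity": [[0.9, 0.8, 0.2], [0.9, 0.3, 0.1], [0.8, 0.2, 0.3]],
}

# Training vectors: one geometry per class, assumed already rasterised
# into the (row, col) pixels that each polygon covers.
training_vectors = {
    "water": {(0, 0), (1, 0)},
    "ice":   {(0, 2), (1, 2), (2, 1)},
}


def build_training_set(bands, vectors):
    """Return (X, y): one feature row and one class label per training pixel."""
    band_names = sorted(bands)
    X, y = [], []
    for label in sorted(vectors):
        for row, col in sorted(vectors[label]):
            X.append([bands[name][row][col] for name in band_names])
            y.append(label)
    return X, y


X, y = build_training_set(feature_bands, training_vectors)
print(len(X), "samples with", len(X[0]), "features each")
print(y)
```

The vectors only decide which pixels are sampled and which label they get; the feature bands decide which values the classifier sees at those pixels.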

Hope it's clear.
M

ABraun · October 23, 2017, 7:28am

You could, for example, add a land cover band (right-click on the product) and let SNAP train on your SAR data based on these raster values. But these are automatically generated maps at global scale, so they won’t be very good for randomized selection of training points.

johngan · October 23, 2017, 10:34am

Hi,

In order to perform a supervised classification such as random forest in SNAP, you have to collect training datasets for each class as shown in the image below.

The image above shows different sea ice types and the vector files are the training data. Training data (shapefile format) can be created either using SNAP or QGIS. If training data are not specified, random forest cannot be performed.
The parameters for random forest in SNAP are specified as follows:

In the first box, we select the training data we created for each sea ice type; in the second box, we select the feature bands (the GLCM bands).
If we press Run, the random forest algorithm runs without any issues.

I hope this helps

marjanmarbouti · October 23, 2017, 3:25pm

Dear @johngan
I have three images (not only one), and I want to use all three for the classification.
Image1:

Image2:

Image3:

But when I put them together for the classification, I get this error. How should I change their dimensions?

MCG · October 24, 2017, 6:51am

Hi @marjanmarbouti,

The error most probably shows up because your images do not have the same extent/resolution. Make sure all your inputs share the same extent and resolution, then try again.

johngan · October 24, 2017, 8:30am

In order to perform classification with more than one layer, they should have the same dimensions. If this is not the case, you can use a shapefile with the area of interest and crop all of them, so that they have the same dimensions.
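
As a rough illustration of the check described above, the following Python sketch compares the extent and pixel size of several layers and computes the area they all share. It is a sketch under stated assumptions and not a SNAP API: the `Layer` class, the helper names and all numbers are hypothetical, and the layers are assumed to be in the same coordinate system already.

```python
# Sketch under stated assumptions; not a SNAP API. Names and numbers are made up.
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    x_min: float       # west edge in map units
    y_min: float       # south edge
    x_max: float       # east edge
    y_max: float       # north edge
    pixel_size: float  # map units per pixel


def common_extent(layers):
    """Intersection of all layer extents, or None if they do not overlap."""
    x_min = max(layer.x_min for layer in layers)
    y_min = max(layer.y_min for layer in layers)
    x_max = min(layer.x_max for layer in layers)
    y_max = min(layer.y_max for layer in layers)
    if x_min >= x_max or y_min >= y_max:
        return None
    return (x_min, y_min, x_max, y_max)


def check_stackable(layers):
    """List the reasons the layers cannot be classified together (empty if none)."""
    problems = []
    if len({layer.pixel_size for layer in layers}) > 1:
        problems.append("pixel sizes differ: resample to a common resolution")
    extents = {(layer.x_min, layer.y_min, layer.x_max, layer.y_max) for layer in layers}
    if len(extents) > 1:
        problems.append("extents differ: crop all layers to the common extent")
    return problems


layers = [
    Layer("intensity", 0.0, 0.0, 1000.0, 800.0, 10.0),
    Layer("coherence", 50.0, 0.0, 1000.0, 750.0, 10.0),
    Layer("phase", 0.0, 20.0, 980.0, 800.0, 10.0),
]
print(check_stackable(layers))
print(common_extent(layers))
```

Cropping every layer to the rectangle returned by `common_extent` plays the same role as cropping them all with one area-of-interest shapefile.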

In your screenshots, I can see three layers, two of which are phase and coherence (if I am not mistaken). I do not know what you are trying to achieve, but why are you using the phase layer for classification?

For classification, what you need is the intensity of the image. If you want to improve the classification results, you can do some texture analysis (GLCM) and combine both image intensity and GLCM layer.
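
For readers unfamiliar with what the GLCM layers contain, here is a small pure-Python sketch of three common texture measures using their textbook definitions. It is illustrative only and is not SNAP's operator: it uses a single pixel offset over the whole image rather than a moving window, and the quantisation to 4 grey levels and the sample values are made up.

```python
# Pure-Python GLCM sketch for one pixel offset; illustrative, not SNAP's operator.


def glcm(image, levels, d_row=0, d_col=1):
    """Normalised, symmetric co-occurrence matrix for the given pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, dissimilarity and homogeneity of a normalised GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Backscatter quantised to 4 grey levels (made-up values).
smooth = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
rough = [[0, 3, 0, 3], [3, 0, 3, 0], [0, 3, 0, 3]]
print(texture_features(glcm(smooth, levels=4)))
print(texture_features(glcm(rough, levels=4)))
```

A uniform surface gives low contrast and high homogeneity, while a strongly alternating one gives the opposite, which is why these layers can add information that intensity alone does not carry.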

marjanmarbouti · October 24, 2017, 8:58am

Thanks, my friend. Yes, you are right. I used other GLCMs, but I was curious about the phase as well because I saw a good result in one part.
As I told you, I have three images of the same location, and I chose training data on one of them (they were added to the vector section), as you can see in the image below.

But I want to add the same training data (same amount, same location) that I chose on the image above to the other images.
What should I do?

johngan · October 24, 2017, 9:04am

When you say you have three images, do you mean three SAR images of the same location captured at different times?
And you want to produce three classification images and compare them with each other? Is that what you are trying to do?
If yes, then you follow the same process for each of your images:

collect training data for each class
perform classification for each image
compare
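
The final comparison step could be sketched as follows in Python, assuming the classified maps have been exported as equally sized grids of class labels. The `compare_maps` helper and the two tiny class maps are hypothetical examples and not part of SNAP.

```python
# Sketch: compare two class maps pixel by pixel. The class maps are hypothetical.
from collections import Counter


def compare_maps(map_a, map_b):
    """Return (agreement fraction, Counter of (class_a, class_b) transitions)."""
    same_shape = len(map_a) == len(map_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(map_a, map_b)
    )
    if not same_shape:
        raise ValueError("class maps must have the same dimensions")
    transitions = Counter(
        (a, b) for row_a, row_b in zip(map_a, map_b) for a, b in zip(row_a, row_b)
    )
    total = sum(transitions.values())
    agree = sum(n for (a, b), n in transitions.items() if a == b)
    return agree / total, transitions


date_1 = [["water", "water", "ice"], ["water", "ice", "ice"]]
date_2 = [["water", "ice", "ice"], ["water", "ice", "ice"]]

agreement, transitions = compare_maps(date_1, date_2)
print(f"agreement: {agreement:.2f}")
for (before, after), n in sorted(transitions.items()):
    print(before, "->", after, n)
```

The transition counts show not only how much the maps differ but also which class changed into which.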

MCG · October 24, 2017, 9:07am

On the other hand, if the objective is to use the three images to create a single classification output, it is enough to add the training data to one of them only. There is no need to duplicate the training data in each one.

marjanmarbouti · October 24, 2017, 9:10am

Actually, I have one backscatter image with its coherence and phase (meaning the images are from the same date and even the same second). I plan to apply the training data I defined on the backscatter image to the coherence and phase images as well, and then run the classification with all 3 images (intensity, coherence and phase) simultaneously.

marjanmarbouti · October 24, 2017, 9:18am

Yes, my friend. I got it now. I think it is what you described. I hope I did it correctly.

marjanmarbouti · October 24, 2017, 10:03am

Only one more thing, @johngan:
Suppose we have two images of the same location (backscatter and coherence in my case), and we added training samples (water, fast ice, close ice and others) as vectors on the backscatter image.
Is it possible to use the training vectors from the backscatter image to run a classification ONLY on the coherence image?
I think it is possible in the way shown below. Am I right?

johngan · October 24, 2017, 11:01am

If you have datasets for a region and you derived the coherence image from those same datasets, then you need to create the training data only once (let's say using the backscatter image). You do not need to replicate the training data for the coherence image.

marjanmarbouti · October 24, 2017, 9:11pm

Thank you for your answers.
I have just one more question.
Looking at the result of my random forest classification with the backscatter image, it works (you can see the result below). But I do not know why, when I add the backscatter image in the ‘ProductSet-Reader’, all the fields (I mean type, acquisition, track and orbit) are empty.
Is this a problem? I think the classification works even when these fields are not filled in.

Image1: Adding backscattering image in ‘productset reader’

Image2: My classification result (Random forest) with backscattering image

ABraun · October 26, 2017, 6:26am

You can add this information by clicking on the blue arrows.

ngtrananrsc · February 23, 2018, 11:37am

Hi all,
When I run this random forest classification with only 2 vector classes, water and non-water, I get the error “bound must be positive”, as in the screenshot below.
I repeated all the previous steps as described above, and I even checked the Pixel Info: the pixel intensity in each band is > 0, but the error still appears.
Do you know how to solve this problem?
Thank you in advance.

zengsalma · May 8, 2018, 11:47pm

Hi, I also encountered the same problem. Did you solve it?

Thank you.

ABraun · May 9, 2018, 5:38am

It has something to do with the projection of your data. Which coordinate reference system did you select in the terrain correction step?

ngtrananrsc · May 9, 2018, 3:51pm

Yes, I solved it. It happened because I chose WGS 84 in the terrain correction step, so I needed to reproject my data. Go to Raster >> Geometric Operations >> Reprojection and choose Geographic Lat/Lon (WGS84) as the projection.
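
A generic sanity check for projection mismatches of this kind is to confirm that the training polygons actually fall inside the raster's extent in the raster's own coordinate system. The Python sketch below only illustrates that idea: the `vertices_outside` helper and all coordinates are made up, it uses no SNAP API, and it does not claim to reproduce the cause of the error message above.

```python
# Hypothetical helper and coordinates; plain Python, no SNAP API involved.


def vertices_outside(extent, polygons):
    """Map each class name to its vertices that lie outside the raster extent."""
    x_min, y_min, x_max, y_max = extent
    outside = {}
    for name, vertices in polygons.items():
        bad = [
            (x, y)
            for x, y in vertices
            if not (x_min <= x <= x_max and y_min <= y <= y_max)
        ]
        if bad:
            outside[name] = bad
    return outside


# Raster extent in geographic degrees (lon_min, lat_min, lon_max, lat_max).
raster_extent = (105.0, 10.0, 106.0, 11.0)

training = {
    # Digitised in lon/lat: lines up with the raster.
    "water": [(105.2, 10.3), (105.3, 10.3), (105.3, 10.4)],
    # Digitised in projected metres: cannot overlap a lon/lat raster.
    "non_water": [(630000.0, 1150000.0), (631000.0, 1150000.0), (631000.0, 1151000.0)],
}

for name, bad in vertices_outside(raster_extent, training).items():
    print(f"{name}: {len(bad)} vertices outside the raster extent")
```

If a whole class falls outside the raster, the vectors and the image are most likely in different coordinate systems and one of them needs reprojecting.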