Classification of GRD product

The performance should be the same, but Python is known to struggle with memory allocation and clearing. While GPT fully clears the cache after each processed product in the list (the Java virtual machine is restarted), Python accumulates temporary variables and becomes slow. In addition, Python often uses only one processor core (instead of processing in parallel), which makes it potentially slower on a strong machine with multiple cores.
As far as I know, there are plans to tackle these problems in the next releases of SNAP, but there is no date for this yet.

Automating the GPT calls on your graph XMLs from a Python script seems a good trade-off. It is nicely described here:

You can also write a batch script for this (instead of a Python script) as described here:

Also make sure that gpt makes use of your computational resources (~80% of your RAM is usually suggested). It is described here:
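
The Python-driven automation mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the graph XML is assumed to contain `${input}` and `${output}` placeholders that gpt fills from `-Pname=value` parameters, and the function names, the `_processed.dim` suffix and the `runner` argument are made up for this example.

```python
import subprocess
from pathlib import Path


def build_gpt_command(graph_xml, source, target):
    """Return the argument list for one gpt call.

    Assumption: the graph uses ${input} and ${output} placeholders.
    """
    return ["gpt", str(graph_xml), f"-Pinput={source}", f"-Poutput={target}"]


def run_batch(graph_xml, sources, out_dir, runner=subprocess.run):
    """Call gpt once per product, so every product gets a fresh Java process."""
    out_dir = Path(out_dir)
    targets = []
    for source in sources:
        # Hypothetical naming scheme for the intermediate products
        target = out_dir / (Path(source).stem + "_processed.dim")
        runner(build_gpt_command(graph_xml, source, target), check=True)
        targets.append(target)
    return targets
```

Calling `run_batch("preprocessing.xml", list_of_zip_files, "output_folder")` would then process the products one after another. Because each call starts gpt anew, memory is released between products, which is the advantage described above.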


@ABraun 1. In the batch files, I couldn’t understand how to pass the output I get from gpt as input to external_command.exe. In other words, if external_command.exe is a Python script, how should I write the code so that it takes this product as input? Could you kindly show me an example of such external code?
2. Here we are reading and writing the product twice (first in gpt and then in the external command). Because of this, I thought the whole process would take much longer than running a single script with pre-processing and analysis combined using snappy. Is this true?

Hard to tell without knowing what exactly your external command is and what input it takes. Please explain how far you got and at which point you struggle.

Unless all of the data is processed by snappy (and the external command within Python), writing the data as an intermediate product before calling the external command is probably unavoidable.

But this does not necessarily take more time, because once an intermediate product is written, the memory (RAM) used to store it is cleared and your machine runs faster for the subsequent steps. Processing too many steps within one chain can be slower as well. So maybe one “breakpoint” is not so bad. And after the process is finished, you can easily delete the intermediate files (also possible from within Python).
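
Deleting the intermediate files from within Python could be done along these lines. This is a minimal sketch that assumes the intermediate product was written in BEAM-DIMAP format, which consists of a `.dim` header file and a `.data` folder of the same name; the function name is made up for this example.

```python
import os
import shutil


def remove_intermediate(dim_path):
    """Delete a BEAM-DIMAP product: the .dim header and its .data folder."""
    data_dir = os.path.splitext(dim_path)[0] + ".data"
    if os.path.isfile(dim_path):
        os.remove(dim_path)
    if os.path.isdir(data_dir):
        shutil.rmtree(data_dir)
```

If you write the intermediate product in another format (for example GeoTIFF), a plain `os.remove` on the single file is enough.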

I expect gpt to do the preprocessing (apply orbit file, calibration, speckle filtering, terrain flattening, terrain correction, Sigma0) on the raw data. I want to use the resulting Sigma0 data in my further analysis/processing.
I wrote Python code to carry out my analysis. It includes a multi-segmentation thresholding approach to generate a flood inundation map.
I want the whole process to be automatic, from the raw data to the flood inundation map. But my work is currently divided into two parts: one is carried out in gpt (preprocessing) and the other in Python (analysis/processing). I am having trouble connecting the two for continuous processing.

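You can call gpt from within Python as well:

import subprocess

# input_file and intermediate_output are placeholders for your own paths
cmd = "gpt preprocessing.xml input_file intermediate_output"
subprocess.run(cmd, shell=True, check=True)  # run the preprocessing graph

# then pass the intermediate product to your own analysis code,
# e.g. with a (hypothetical) function of yours:
# final_output = my_processing(intermediate_output)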

As @ABraun suggested you can invoke gpt from python as a separate process, but you can also use the API of the Graph Processing Framework.

Some examples can be found here (by @antonio19812 ):

Also these forum threads can be helpful:

In “2. 1. original --> callibrated -> glcm --> terrain correction”, can I include terrain flattening after GLCM and before terrain correction to improve the accuracy of the classification?

  1. When classifying a GRD product with algorithms like maximum likelihood or a random forest classifier, is it necessary to convert the data into decibels after terrain correction? How does it affect the results?
  2. Of the Sigma0 and Gamma0 products, which is better suited for generating a binary image of water and non-water using maximum likelihood, a random forest classifier, or a thresholding technique?
  1. You can only calculate terrain flattening if you calibrated to beta0.

  2. Maximum likelihood is not well suited to radar data because the data is not distributed equally (VH is generally lower than VV); random forest works better. In both cases, conversion to dB is helpful because it changes the histogram [please see the examples here and here]. If you want to include image textures, random forest is necessary because it is based on thresholds and not on clusters. If you don’t want to include textures, please have a look at these comments on feature space here.

  3. Please have a look at the difference between Sigma0 and Gamma0 here. As these refer to topographically induced radiometric distortions (and water bodies are usually flat), it barely makes a difference. Calculate a scatter plot between Sigma0 and Gamma0 to see where they are different.
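
The dB conversion mentioned in point 2 is simply 10 * log10 of the linear backscatter value. A minimal sketch is shown below; the function name and the small `floor` value that guards against log(0) are assumptions made for this example, not something taken from SNAP.

```python
import math


def to_db(linear, floor=1e-10):
    """Convert a linear backscatter intensity to decibels (10 * log10)."""
    # Values at or below zero are clipped to `floor` to avoid a math error
    return 10.0 * math.log10(max(linear, floor))
```

For example, a linear Sigma0 of 0.1 corresponds to -10 dB and 0.01 to -20 dB. The logarithm stretches the many low values and compresses the few very high ones, which is why the histogram changes shape.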

I am aware of it, but I want to know whether the accuracy will improve if terrain flattening is applied to the GLCM products before classification.

Gamma0 is more likely to represent the actual backscatter of a surface (impact of topography is reduced), so basically yes. But for water bodies, the difference will be small.

Hello @ABraun !

import os
cmd = "gpt preprocessing.xml input_file intermediate_output"

This works perfectly for me. Thank you.
Now the issue is that I am using the internet through a proxy, so my command prompt cannot access the internet. I found online that the following command has to be entered in the command prompt to get access:

set http_proxy=http://username:password@your_proxy:your_port

When I enter this command in cmd it works fine, but I want to pass it from the Python code itself, as we did for gpt. How can I achieve this?

I don’t know, because I haven’t done this so far, sorry.
Maybe someone else can help.

It’s ok. Thank You.

Please have a look at this page:
At the bottom, it describes how you can configure the proxy without using the SNAP GUI.
If you can use the GUI, go to Tools / Options in the menu, switch to the WWW tab and configure the proxy.
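
As for the `set http_proxy=...` command quoted earlier in the thread: its general Python equivalent is to pass an environment containing that variable to the child process. The sketch below only demonstrates this mechanism with a child Python process; the helper name is made up, the URL is the placeholder from the thread, and whether gpt itself honours the `http_proxy` variable is not confirmed here, so the SNAP proxy configuration described above may still be needed.

```python
import os
import subprocess
import sys


def env_with_proxy(proxy_url):
    """Return a copy of the current environment with http_proxy set."""
    env = dict(os.environ)
    env["http_proxy"] = proxy_url
    return env


# Placeholder from the thread, replace with your own proxy settings
env = env_with_proxy("http://username:password@your_proxy:your_port")

# Demonstration: the child process sees the variable.
# Replace this command with your own gpt call.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['http_proxy'])"],
    env=env, capture_output=True, text=True, check=True,
)
```

Setting `os.environ["http_proxy"]` directly in the script has the same effect for every process started afterwards, since child processes inherit the environment of the Python process.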

Hi team,
I’m very sorry for this question.
While doing the supervised classification, I get the error that the source products are of different dimensions.
I followed the discussions above, but I’m not able to fix it.

Could you please help me overcome this?

Thank you in advance.

This error is discussed here. Read the post carefully; many solutions and suggestions are available.

Source of the post

Here also, you could find the step of the classification,

Source of the post

Sir, in the posts above this was done with Sentinel-1 and Sentinel-2,
but I’m working with only 3 Sentinel-1 images.

Thank you in advance.

You have to make sure that all your rasters are in the same coordinate reference system (the one you selected for terrain correction).

You can use this tool [image] to check whether all of them have the same projection information.
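
Such a consistency check can also be scripted. The sketch below is purely illustrative: it assumes you have already read the width, height and CRS of every product into a dictionary (how you read them depends on your tools), and the function name and the example values are made up.

```python
def find_mismatches(products):
    """Return the names of products whose (width, height, crs) differ from the first one."""
    names = list(products)
    reference = products[names[0]]
    return [name for name in names[1:] if products[name] != reference]


# Hypothetical metadata of three terrain-corrected Sentinel-1 images
products = {
    "image_1": (25000, 16000, "EPSG:4326"),
    "image_2": (25000, 16000, "EPSG:4326"),
    "image_3": (24980, 16010, "EPSG:4326"),
}
mismatched = find_mismatches(products)
```

Here `mismatched` contains only `image_3`, the product whose dimensions differ from the first one even though its projection is the same.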

As @ABraun mentioned be sure

But in your case, I think your input is S-1 GRD, is that right?

In this case, your vector (the one you train on) should have the same reference system as the S-1 data: Lat/Lon, EPSG:4326, WGS84.