Doppler terrain correction not running in gpt

divething · September 9, 2021, 10:01am

hi all, me again…
for those who follow, i have crossed all hurdles and got stuck here…
i have run the following: split, orbit, backgeocoding, esd, interferogram, deburst, topo phase removal, goldstein filter, snaphu (in and out), and the additional outputs recommended somewhere (phase to height, dem and displacement). now it's time for terrain correction (TC) - and i can't make it work.

i'm running all this on a win10 computer with 128g ram and 12 cores. i am using python to edit xml graphs and running them via gpt … my snap is 8.0.0 (i rolled back after the upgrade did not work - see Topographic Phase Removal Error)

so
i have successfully run the TC process on my input file from the menu (about 2 min)
i have also run it successfully using the graph builder (about 2.5 min)
but now i have tried to run it via gpt (essentially i used python to change the input and output file and create a new xml graph). the xml files look similar, but when i run it - it hangs for a long long time (hrs, overnight, until i stopped it…). i can't see an obvious reason, since it works fine as a graph in snap but fails in gpt. i suspected it had something to do with the indents of the xml and did some trials, but no solution so far…

the most annoying thing is that the edited xml (the one edited by python) works just fine if i run it in the snap graph interface but not in gpt… it just hangs for hrs and hrs until i break the process… i don't think it's the gpt engine, seeing that i have just run 15 previous steps just fine…
very weird…
has anyone had a similar problem? any idea how to tackle this?

below i will paste the edited and saved graph (the one that runs in snap) and the graph window…

============================================================

1.0
Read: D:\tamir\SAR\output_dir\20190705-20200711srtm1abcbc3abc4abc5IW1_3-3.dim
Terrain-Correction: SRTM 1Sec HGT 0.0 true BISINC_21_POINT_INTERPOLATION BISINC_21_POINT_INTERPOLATION 13.98 1.255844767199091E-4 GEOGCS["WGS84(DD)", DATUM["WGS84", SPHEROID["WGS84", 6378137.0, 298.257223563]], PRIMEM["Greenwich", 0.0], UNIT["degree", 0.017453292519943295], AXIS["Geodetic longitude", EAST], AXIS["Geodetic latitude", NORTH]] false 0.0 0.0 true false false false false false true false false false false false Use projected local incidence angle from DEM Use projected local incidence angle from DEM Latest Auxiliary File
Write: D:\tamir\SAR\output_dir\20190705-20200711srtm1abcbc3abc4abc56IW1_3-3.dim BEAM-DIMAP

step6_dtc_2run.xml (3.2 KB)
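
The workflow described in this post (using Python to swap the input and output paths in a saved graph before handing it to gpt) can be sketched roughly as follows. This is a minimal illustration, not the poster's actual script: the inline graph is a cut-down stand-in for the real one, `set_graph_io` is a hypothetical helper name, and the code assumes the Read and Write nodes each keep their path in a `file` parameter. Because the graph is parsed and re-serialised as XML, indentation between elements should not matter to the result.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for a saved SNAP graph (assumption: the real
# Terrain-Correction node carries many more parameters, left untouched).
GRAPH_TEMPLATE = """<graph id="Graph">
  <version>1.0</version>
  <node id="Read">
    <operator>Read</operator>
    <sources/>
    <parameters>
      <file>IN_PLACEHOLDER.dim</file>
    </parameters>
  </node>
  <node id="Terrain-Correction">
    <operator>Terrain-Correction</operator>
    <sources>
      <sourceProduct refid="Read"/>
    </sources>
    <parameters>
      <demName>SRTM 1Sec HGT</demName>
    </parameters>
  </node>
  <node id="Write">
    <operator>Write</operator>
    <sources>
      <sourceProduct refid="Terrain-Correction"/>
    </sources>
    <parameters>
      <file>OUT_PLACEHOLDER.dim</file>
      <formatName>BEAM-DIMAP</formatName>
    </parameters>
  </node>
</graph>"""


def set_graph_io(graph_xml: str, input_path: str, output_path: str) -> str:
    """Return the graph XML with the Read and Write file paths replaced."""
    root = ET.fromstring(graph_xml)
    for node_id, path in (("Read", input_path), ("Write", output_path)):
        # Locate <node id="..."><parameters><file> for this node.
        file_elem = root.find(f"./node[@id='{node_id}']/parameters/file")
        if file_elem is None:
            raise ValueError(f"no <file> parameter in node {node_id!r}")
        file_elem.text = path
    return ET.tostring(root, encoding="unicode")
```

Parsing the edited graph back with the same library before running it is a cheap way to confirm that the edit left the XML well-formed and touched only the two paths.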

gnwiii · September 9, 2021, 10:55am

You should check in task manager to see if the system is processing or idling, and how much RAM is in use. Long run times can be due to issues such as a Java memory configuration that results in excessive garbage collection (processor busy, ample free memory) or to the system waiting for downloaded data that never arrives (internet breakage, server down, URL changed, etc.), in which case the CPU has no work to do.

Java memory for GPT is set in `<snap_install_dir>\bin\gpt.vmoptions`, so it does not track changes made for the GUI.

ABraun · September 9, 2021, 11:20am

This prevents the DEM data from being downloaded from its updated source. SNAP 8.0.0 still looks for the old server and gets stuck.

FAQ: A process related to digital elevation models is taking forever to finish

divething · September 9, 2021, 11:22am

hi gnwiii
thanks for your comments. but i'm not sure they are relevant in this case -
the system is working hard (about 80-30% cpu and 55-42% ram (i have 128g)) and the resulting image appears in the output dir within 2-3 minutes. however, the cpu seems to keep on going hard and the console does not reach the >>> end of process stage.

let me remind you that running the same script within snap took 2+ minutes - so i expect more or less the same. but it has run overnight and - nothing. if i break the process i get an image with nan or 0 as values, although the re-projection seems to have taken place…

divething · September 9, 2021, 11:34am

hi ABraun
many thanks for your reply
i don't understand - if you are right - why does it work via snap and not gpt?
and also - all the previous stages worked fine (including those that need srtm, like backgeocoding, phase to elevation, etc.). i ran the latter 10 min ago…

i am avoiding the update to 8.0.4 as last week i got terribly stuck with another issue which we were not able to fix any other way (Topographic Phase Removal Error - #7 by divething)

have i missed something?

gnwiii · September 9, 2021, 11:45am

The output image file is created before it gets filled with data. Java does calculations “on demand”, so the appearance of a file does not guarantee that the contents have been generated and is not useful as an indication of the run time for a task.

Have you checked that the memory settings used by the GUI match those in gpt.vmoptions?

divething · September 9, 2021, 12:13pm

hi gnwiii
you are right about the image being created first and the data then written into it

i looked at the memory - probably not the issue…

this is what gpt.vmoptions says…

###################
Enter one VM parameter per line
For example, to adjust the maximum memory usage to 512 MB, uncomment the following line:
-Xmx512m
To include another file, uncomment the following line:
-include-options [path to other .vmoption file]
-Xmx89G
########################

i tend to think that it is an issue as ABraun suggested

but - i don't understand - the same xml code worked when run as a graph within snap. so why won't it work in gpt? is it not the same?

divething · September 9, 2021, 12:32pm

FAQ: A process related to digital elevation models is taking forever to finish

regarding this

it is a very important set of advice, and i recommend that everyone go through it too

but:
for solution 1 - thats what i am doing . so its not that.
i edited the auxdata.properties as described in solution 2.
i cleared all data from the dem folder (both srtm 1 and 3sec)
i have run the graph again via snap - no problem… (was much slower - 3.3 min…)

anyway - the same script-graph works perfectly well in snap but not in gpt…

btw - following the links in the dtm file - the srtm 1sec address is working but the srtm 3sec address gives an error.

ABraun · September 9, 2021, 1:14pm

that is strange… @lveci Do you have an idea why the DEM download still struggles with gpt after updating SNAP?

divething · September 9, 2021, 1:44pm

hi ABraun
thanks for forwarding this issue

i will just add that the update described and undertaken
(as per FAQ: A process related to digital elevation models is taking forever to finish )

did not make a difference - it was running fine in snap graphs before as well as after

what puzzles me is that it works fine in snap and not in gpt while i thought snap runs things via gpt too so there should be no difference…

many thanks to all who responded
tamir

gnwiii · September 9, 2021, 6:09pm

The total RAM usage shown for PyCharm and SNAP is less than 20G. Even with 128G RAM you can’t have both SNAP and GPT using 89G. What is Xmx for SNAP? Does gpt run faster when SNAP isn’t running?

Looks like too many lines in gpt.vmoptions were uncommented. Try deleting all but the -Xmx89G line. Is there an Xmx setting in <snap_install_dir>/etc/snap.conf? Mine has -J-Xmx11G on a laptop with 16GB RAM.

divething · September 11, 2021, 6:35pm

hi gnwiii

thanks for your reply…

ok so firstly - you are right about the missing comment marks (#) in what i pasted. in the file itself only the relevant lines are active.
snap allocated memory is 89g and although you say it can’t work etc etc - i want to remind you it works perfectly in snap - so why not in gpt

the process completed eventually but it took days …
i cant believe its a memory problem but i am going to rerun the entire code and delete attributes that are created along the way… see if that helps

gnwiii · September 11, 2021, 7:13pm

If SNAP has reserved 89 of 128G when you start GPT, GPT will use virtual memory, which can make garbage collection very slow (minutes rather than seconds). The accounting gets complicated because modern systems can use compression for objects in RAM. Have you tried running GPT when SNAP is not running?

See multiple Java applications with total heap size set to more than physical RAM

The fact that the process completes eventually points to thrashing rather than waiting for some data from the internet.
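
The over-subscription point above is simple arithmetic: if the heap limits of SNAP and GPT add up to more than physical RAM and both are running, the system has to fall back on virtual memory. A minimal sketch of that check in Python follows. `parse_xmx` and `heaps_oversubscribed` are hypothetical helper names, the sample strings are made up for illustration rather than copied from a real installation, and the sketch assumes that the last active `-Xmx` on a line or in a file is the one that applies.

```python
import re

# Multipliers for the size suffixes accepted after -Xmx.
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_xmx(text):
    """Return the last active -Xmx value in text as bytes, or None.

    Lines starting with '#' are treated as comments. Both the plain
    form (-Xmx89G) and the snap.conf form (-J-Xmx11G) are matched.
    """
    value = None
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        for number, unit in re.findall(r"-Xmx(\d+)([kKmMgGtT]?)", line):
            value = int(number) * _UNITS.get(unit.lower(), 1)
    return value


def heaps_oversubscribed(heap_sizes, physical_ram_bytes):
    """True if the configured heaps together exceed physical RAM."""
    total = sum(h for h in heap_sizes if h is not None)
    return total > physical_ram_bytes


if __name__ == "__main__":
    # Made-up sample contents (assumption, not real config files).
    gpt_vmoptions = "# -Xmx512m\n-Xmx89G\n"
    snap_conf = 'default_options="-J-Xms256M -J-Xmx89G"'
    ram = 128 * 1024**3
    heaps = [parse_xmx(gpt_vmoptions), parse_xmx(snap_conf)]
    print("oversubscribed:", heaps_oversubscribed(heaps, ram))
```

The check only says whether the two limits could collide. It matters in practice only when both programs are open at the same time and both actually fill their heaps.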

divething · September 13, 2021, 2:23pm

hi gnwiii

ok so i usually don’t run the snap - only the gpt via python.
so to check your point i will reduce my gpt allocation to 40g and leave snap closed…

what i see is that the process might be finished (because i can see the resulting image in snap) but python seems to think it is still running… is there a way to report back from gpt to python? it might be good to also get the errors when things don't work…
and in any case - it's slow.
so your comment about thrashing sounds right in a way… what can i do to fix that (or check if you're correct)?

in any case - many thanks
tamir
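
On the question of getting gpt's status and messages back into Python: one common approach is to launch it with the standard `subprocess` module, read its output line by line, and check the exit code. The sketch below is a hedged illustration, not a tested SNAP recipe. `run_command` and `run_gpt` are hypothetical names, `gpt_exe` is assumed to be the path to the gpt launcher, and it is assumed that gpt reports failure through a non-zero exit code. One general pitfall worth ruling out (not confirmed as the cause in this thread) is that a child process whose output is piped but never read can block once the pipe buffer fills.

```python
import subprocess
import sys


def run_command(cmd):
    """Run cmd, echo its output live, return (exit_code, captured_lines)."""
    captured = []
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge error messages into one stream
        text=True,
    ) as proc:
        # Reading continuously keeps the pipe from filling up.
        for line in proc.stdout:
            line = line.rstrip("\n")
            print(line, flush=True)
            captured.append(line)
    # Leaving the with-block waits for the process, so returncode is set.
    return proc.returncode, captured


def run_gpt(graph_path, gpt_exe="gpt"):
    """Run `gpt <graph>`; raise if the exit code is non-zero (assumption)."""
    code, lines = run_command([gpt_exe, graph_path])
    if code != 0:
        tail = "\n".join(lines[-5:])
        raise RuntimeError(f"gpt exited with code {code}:\n{tail}")
    return lines


if __name__ == "__main__" and len(sys.argv) > 1:
    run_gpt(sys.argv[1])
```

Keeping the captured lines makes it possible to write them to a per-pair log file, which helps when a batch of many graphs is run unattended.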

gnwiii · September 13, 2021, 3:02pm

Thrashing is the likely cause if you have excessive run times and high CPU usage. You can add garbage collection log options to the gpt.vmoptions file (note that the link mentions changes for Java 9 – ESA SNAP is using Java 8). It has been years since I last investigated GC problems (and not on Windows), so just hoping this still works.
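
As a rough illustration of what adding garbage collection log options could look like for a Java 8 runtime, the Python sketch below appends the usual Java 8 HotSpot logging flags to the text of a vmoptions file. `with_gc_logging` is a hypothetical helper, the log path is a placeholder, and it is an assumption that the gpt launcher passes every active line of gpt.vmoptions to the JVM unchanged. The flags shown are Java 8 style and were replaced by a different logging option from Java 9 onward.

```python
# Java 8 HotSpot GC logging flags (superseded by -Xlog:gc* in Java 9+).
JAVA8_GC_LOG_FLAGS = (
    "-verbose:gc",
    "-XX:+PrintGCDetails",
    "-XX:+PrintGCDateStamps",
)


def with_gc_logging(vmoptions_text, log_path):
    """Return vmoptions text with GC logging lines appended if missing."""
    lines = vmoptions_text.splitlines()
    wanted = list(JAVA8_GC_LOG_FLAGS) + [f"-Xloggc:{log_path}"]
    # Only uncommented lines count as already active.
    active = {ln.strip() for ln in lines if not ln.lstrip().startswith("#")}
    lines += [flag for flag in wanted if flag not in active]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder content and log path, for illustration only.
    print(with_gc_logging("# one VM parameter per line\n-Xmx40G\n", "gpt_gc.log"))
```

If the resulting log shows long or back-to-back full collections during the terrain correction step, that would support the thrashing explanation. A quiet log would point elsewhere.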

divething · September 16, 2021, 6:02pm

hi gnwiii
firstly - many thanks. its driving me crazy this thing…

so here is what we know so far.
1 the problem is only in the doppler terrain correction - the rest of the steps are ok. and i remind you - within snap it works fine, 2-3 min and it's done - it has now been running 30 hrs in gpt and has not finished…
2 memory allocation is 80-something to snap, 50 to gpt (not running at the same time anyway) and pycharm with only 20.
3 it seems to be working because it's still creating the image and some of the layers are ok. some are not, so i suspect that it is in fact some stupid glitch in the way gpt runs on this machine, caused by some garbage collection problem (to that end i'll check your suggestion in the link)
4 do you think i should try linux? i was going to run it all in linux (server) anyway as soon as everything worked…
if anyone out there has a good suggestion - i really really would like your point of view…
my idea is to run the code on some 30 pairs so i really need this solved - please, anyone…

all the best everyone
tamir

gnwiii · September 16, 2021, 7:49pm

Linux memory management is quite different from Windows. I generally start with linux or macOS and then try to move workflows to Windows. Most workflows need tweaks to run in Windows, some have so many issues that I can’t afford the time and effort needed to work through them all.

There has been considerable work on Java garbage collection. In the past, using a different JVM for gpt has made dramatic improvements in throughput. Linux gpt is a shell script, and starts with:

#!/bin/sh

# Uncomment the following line to override the JVM search sequence
# INSTALL4J_JAVA_HOME_OVERRIDE=
# Uncomment the following line to add additional VM parameters
# INSTALL4J_ADD_VM_PARAMS=

This makes it easy to experiment with different Java versions. Most linux distros offer several versions of OpenJDK runtimes. If you are not comfortable with linux command-line processing I suggest investing a few hours a day for several days just practicing command-line basics from one of the many good tutorials until you feel comfortable. You will also want to become familiar with more advanced documents for your distro so you don’t fall into traps laid by some online linux “support” sites.
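
The two variables in the script header do not necessarily require editing the script or working in bash. A Python driver can put them in the environment of the process it starts. The sketch below assumes, without confirmation in this thread, that the Linux gpt launcher honours `INSTALL4J_JAVA_HOME_OVERRIDE` and `INSTALL4J_ADD_VM_PARAMS` when they are already set in the environment. `gpt_environment` is a hypothetical helper and the paths in the usage note are placeholders.

```python
import os


def gpt_environment(java_home=None, extra_vm_params=None, base_env=None):
    """Build an environment dict for launching the Linux gpt script.

    java_home       -- JVM directory to try first (placeholder value)
    extra_vm_params -- list of additional JVM options, e.g. ["-Xmx40G"]
    base_env        -- environment to copy; defaults to os.environ
    """
    env = dict(os.environ if base_env is None else base_env)
    if java_home:
        env["INSTALL4J_JAVA_HOME_OVERRIDE"] = java_home
    if extra_vm_params:
        env["INSTALL4J_ADD_VM_PARAMS"] = " ".join(extra_vm_params)
    return env
```

A call such as `subprocess.run(["gpt", "graph.xml"], env=gpt_environment(java_home="/path/to/jdk"))` would then affect only that one gpt run. The rest of the Python workflow stays as it is.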

divething · September 17, 2021, 8:23am

hi gnwiii

the dtc process has finished:
Finished process in 133819.80252432823 seconds.
it took 3 min in snap…

i suggest maybe moving to a more private conversation for this issue because its moved from the original problem (that still exists) to different issue… ABraun - do you have any idea how to progress with this?

i want to remind you that even if i work via linux - i still work in a python environment, so i am not sure the bash commands you suggest are still relevant. if they are - can you direct me to some more explanation on what this does (eg - does this affect only the gpt use, or do i need everything to be running in bash?)… i'm not a novice but far from being a techie…

for now, although i started to write the same workflow in bash (i'm not a complete novice but i needed a lot of help), i am a bit reluctant to invest a lot more time (again) in completing it, although i generally agree that fully bash would be much better… the python code is virtually ready and, except for the last two steps (dtc and the calibration), it works flawlessly. most importantly, in python it's much easier to get help with the scripting…

also personally, although time invested in learning bash is definitely worth while, i am running out of time on this project so restarting and spending more time on this may cause ‘putting it on the shelf’ again for few months or letting it go and i have put too much time already… this project is already overdue… sounds silly to give up now…

so the real question is:
would it make any difference running the python code in linux?
why does the problem occur only in doppler terrain correction (dtc)??? there are other components that use the same srtm search, so i don’t understand why i'm getting this hell…

anyway again - million thanks for the time you spend helping me out

all the best
tamir

ABraun · September 17, 2021, 10:17am

I cannot tell why it is like this, sorry. But I would like to wait for a response by the developers.

divething · September 17, 2021, 5:50pm

many thanks
i also would like to get the developers response…
all the best
tamir