GPT coherence graph processing never ends


#1

Dear all,

Graph used: TOPSAR Coreg Interferogram IW All Swaths_test5.xml (13.2 KB)

The problem:
When I process the graph in the SNAP GUI, it finishes in less than 6 hours.

When I run it manually with GPT from the command line (as a “bash process”), it has still not finished after more than 8 days and has written nothing to the output file.

Hardware/software used:
Multiple trials on different software/machines gave the same problem.
Windows Server 2008, 64-bit, 128 GB RAM, no HDD limit.
Linux 64-bit, 64 GB RAM, no HDD limit.
Linux 64-bit, 512 GB RAM, no HDD limit.

Has anyone had the same problem?

Kind regards,

Guillaume


#2

First, make sure you have all the latest updates. In update 5.0.2 or 5.0.3 we introduced a new gpt option, --diag, which prints some diagnostic information.
It will help you check whether the memory settings you have applied are actually taking effect in gpt.

You may also try breaking the graph down into multiple steps, for example coregistering in one graph and terrain correcting in a separate graph. Depending on the processing involved, this sometimes performs better.
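A minimal sketch of running a split graph as consecutive gpt calls. The graph file names are hypothetical, and it assumes each graph has its own Read and Write nodes so that the second graph reads the product written by the first:

```python
import subprocess


def build_steps(gpt, graphs, extra_args=()):
    """Return one gpt command (as an argument list) per graph file, in order."""
    return [[gpt, graph, *extra_args] for graph in graphs]


def run_steps(commands):
    """Run each command in turn and stop at the first one that fails."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Hypothetical graph files: one for coregistration, one for terrain correction.
steps = build_steps("gpt", ["coregister.xml", "terrain_correct.xml"])

# Call run_steps(steps) to execute them; nothing is run at import time.
```

Running the steps separately also makes it easier to see which stage is the slow one.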


#3

It just keeps running without producing anything. On the Linux machine I can’t update from 5.0.0 to 5.0.2 (see: SNAP update error).

I’ll try again on the updated Windows machine, but it will take a few days and may well never finish.

Current log after an 8-day run:

WARNING: org.esa.snap.core.util.ServiceFinder: Can’t search for SPIs, not a directory: XXX/.snap/snap-python
INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Loading external tools…
INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Scanning for external tools adapters: XXX/snap/java
SEVERE: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Failed scan for Tools descriptors: I/O problem: XXX/snap/java
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N51E003.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N51E004.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N51E002.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N51E005.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N51E001.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N50E001.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N50E002.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N50E003.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N50E004.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N50E005.SRTMGL1.hgt.zip
slurmstepd: *** JOB 2365308 CANCELLED AT 2017-03-29T17:48:38 DUE TO TIME LIMIT on node015 ***
-- org.jblas INFO Deleting /tmp/jblas6817606778993571186/libjblas_arch_flavor.so
-- org.jblas INFO Deleting /tmp/jblas6817606778993571186/libjblas.so
-- org.jblas INFO Deleting /tmp/jblas6817606778993571186


#4

I’ve posted a link in the other thread to an issue I’ve found. Maybe it can help.


#5

Now that it is updated (today's version check is OK), I tried the --diag option. It just stops before the processing starts and writes the following to the log:

INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Initializing external tool adapters
SEVERE: org.esa.s2tbx.dataio.gdal.GDALInstaller: The GDAL library is available only on Windows operation system.
SNAP Release version 5.0
SNAP home: PATH/…
SNAP debug: null
SNAP log level: null
Java home: PATH/snap/jre
Java version: 1.8.0_102
Processors: 48
Max memory: 1 GB
Cache size: 1024 MB
Tile parallelism: 48
Tile size: 512 x 512 pixels

To configure your gpt memory usage:
Edit snap/bin/gpt.vmoptions

To configure your gpt cache size and parallelism:
Edit .snap/etc/snap.properties or gpt -c ${cachesize-in-GB}G -q ${parallelism}
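A report like this can be checked by script before launching a long job. The sketch below only assumes the "Key: value" layout shown in the output above; the function names are made up for illustration:

```python
import re

# Keys taken from the diagnostic report shown in this thread.
DIAG_KEYS = ("Processors", "Max memory", "Cache size",
             "Tile parallelism", "Tile size")


def parse_diag(text):
    """Collect the known 'Key: value' lines of a gpt --diag report into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep and key in DIAG_KEYS:
            fields[key] = value.strip()
    return fields


def max_memory_gb(fields):
    """Return 'Max memory' as a whole number of GB, or None if not in that form."""
    match = re.fullmatch(r"(\d+) GB", fields.get("Max memory", ""))
    return int(match.group(1)) if match else None


sample = """Processors: 48
Max memory: 1 GB
Cache size: 1024 MB
Tile parallelism: 48
Tile size: 512 x 512 pixels"""

fields = parse_diag(sample)
```

Comparing max_memory_gb(fields) with the RAM of the machine shows quickly whether the memory settings were picked up.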


#6

What’s strange about your configuration is that it shows only 1 GB of memory assigned to gpt.
It may actually be almost 2 GB: because of a scaling bug, the value is always rounded down to the next lower integer (fixed in the next update).
Either way, I think something is wrong with your configuration. By default the value should be higher on your machine.
Check gpt.vmoptions.
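To illustrate how rounding down can turn almost 2 GB into "1 GB", here is the arithmetic as a small sketch. It is not SNAP's actual code, and the heap size used is a hypothetical value:

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def reported_gb(max_bytes):
    """Whole gigabytes after truncation (integer division drops the fraction)."""
    return max_bytes // GIB


# Hypothetical heap size just under 2 GB (2 GB minus 64 MB).
almost_two_gb = 2 * GIB - 64 * MIB

truncated = reported_gb(almost_two_gb)  # integer part only
exact = almost_two_gb / GIB             # fractional value, for comparison
```

Even with this effect, a value of 1 or 2 GB is far below what a machine with 64 GB or more should be given.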


#7

The diag output after manually editing gpt.vmoptions so that the setting fits on most of the nodes:

INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Initializing external tool adapters
SEVERE: org.esa.s2tbx.dataio.gdal.GDALInstaller: The GDAL library is available only on Windows operation system.
SNAP Release version 5.0
SNAP home: PATH/snap/bin//…
SNAP debug: null
SNAP log level: null
Java home: PATH/snap/jre
Java version: 1.8.0_102
Processors: 48
Max memory: 26 GB
Cache size: 1024 MB
Tile parallelism: 48
Tile size: 512 x 512 pixels

To configure your gpt memory usage:
Edit snap/bin/gpt.vmoptions

To configure your gpt cache size and parallelism:
Edit .snap/etc/snap.properties or gpt -c ${cachesize-in-GB}G -q ${parallelism}

And the log after 2 days of processing, which is still very slow compared to the GUI on Windows:

INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Initializing external tool adapters
SEVERE: org.esa.s2tbx.dataio.gdal.GDALInstaller: The GDAL library is available only on Windows operation system.
Executing processing graph
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N52E003.SRTMGL1.hgt.zip
INFO: org.esa.snap.core.dataop.dem.ElevationFile: http retrieving http://step.esa.int/auxdata/dem/SRTMGL1/N52E002.SRTMGL1.hgt.zip
WARNING: org.esa.snap.core.dataop.dem.ElevationFile: http error:http://step.esa.int/auxdata/dem/SRTMGL1/N52E002.SRTMGL1.hgt.zip on http://step.esa.int/auxdata/dem/SRTMGL1/N52E002.SRTMGL1.hgt.zip
WARNING: org.esa.snap.core.dataop.dem.ElevationFile: http error:http://step.esa.int/auxdata/dem/SRTMGL1/N52E003.SRTMGL1.hgt.zip on http://step.esa.int/auxdata/dem/SRTMGL1/N52E003.SRTMGL1.hgt.zip
…10%…20%…30%…slurmstepd: error: *** JOB 917106 CANCELLED AT 2017-04-26T13:59:42 DUE TO TIME LIMIT on server ***

So I added "-J-Xmx512000M" and "-c 65536M" to my gpt command:

Processors: 48
Max memory: 444 GB
Cache size: 64 GB
Tile parallelism: 48
Tile size: 512 x 512 pixels

The process then ran in roughly 3 h 30 min. Thanks a lot!
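For batch scripts, the working command line can be assembled programmatically. This is a sketch that uses only the options mentioned in this thread (-J-Xmx, -c and -q); the graph file name is hypothetical, and the heap value should stay below the memory of the node the job runs on:

```python
def gpt_command(graph, heap_mb, cache_mb, parallelism=None, gpt="gpt"):
    """Build a gpt argument list with explicit heap and cache sizes in MB."""
    cmd = [gpt, graph, f"-J-Xmx{heap_mb}M", "-c", f"{cache_mb}M"]
    if parallelism is not None:
        # -q sets the tile parallelism, as listed in the --diag hints.
        cmd += ["-q", str(parallelism)]
    return cmd


# Values from the run above; "graph.xml" is a placeholder name.
cmd = gpt_command("graph.xml", heap_mb=512000, cache_mb=65536)

# Pass cmd to subprocess.run(cmd, check=True) to start the processing.
```

Building the list from numbers makes it easy to scale the heap and cache per machine instead of hard-coding them.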