Poor v8.0 Performance

A basic Sentinel-1 SLC to terrain-corrected Gamma0 workflow requires more than twice the time/resources on SNAP v8.0.0 than on v7.0.2.
I tested both gpt and snap for versions 7.0.2 and 8.0.0 on CentOS 7.

The gpt results are representative, in minutes:
v8.0.0: 25 real, 449 user, 28 sys. Peak CPU utilization ~3500%
v7.0.2: 9 real, 120 user, 10 sys. Peak CPU utilization ~2500%
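
The ratios behind "more than twice the time/resources" can be checked from the two rows above. This is only a minimal sketch: the dictionary layout and function names are illustrative and not part of SNAP, and the numbers are the rounded minutes quoted in this post.

```python
# Timings from `time gpt ...`, in minutes, as quoted above.
timings = {
    "v8.0.0": {"real": 25, "user": 449, "sys": 28},
    "v7.0.2": {"real": 9, "user": 120, "sys": 10},
}


def cpu_minutes(t):
    """Total CPU time (user + sys) in minutes."""
    return t["user"] + t["sys"]


def average_cores(t):
    """Average number of cores kept busy: CPU time divided by wall time."""
    return cpu_minutes(t) / t["real"]


wall_ratio = timings["v8.0.0"]["real"] / timings["v7.0.2"]["real"]
cpu_ratio = cpu_minutes(timings["v8.0.0"]) / cpu_minutes(timings["v7.0.2"])

print(f"wall-clock ratio v8/v7: {wall_ratio:.2f}")
print(f"CPU-time ratio v8/v7:   {cpu_ratio:.2f}")
for version, t in timings.items():
    print(f"{version}: ~{average_cores(t):.1f} cores busy on average")
```

With these inputs v8.0.0 needs roughly 2.8x the wall-clock time and roughly 3.7x the CPU time of v7.0.2. The average core count is lower than the peak utilization figures because it is averaged over the whole run.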


Did you verify that the Java memory settings were the same on both systems? You might run the test again with a performance monitor (htop or bpytop) running to see if there are differences in memory usage.
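
One way to compare the memory settings of two installations is to read the heap options from each one's JVM options file. The sketch below assumes the options live in a plain-text `.vmoptions` file with one option per line (for gpt this is typically `bin/gpt.vmoptions` under the install directory, but treat that path as an assumption). The sample contents and heap sizes are hypothetical.

```python
import re


def heap_settings(vmoptions_text):
    """Return the -Xms/-Xmx values found in the text of a .vmoptions file."""
    settings = {}
    for line in vmoptions_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = re.fullmatch(r"-(Xms|Xmx)(\d+[kKmMgG]?)", line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


# Hypothetical file contents for two installations.
v7_options = "# SNAP 7\n-Xmx64G\n"
v8_options = "# SNAP 8\n-Xmx11G\n"

v7_heap = heap_settings(v7_options)
v8_heap = heap_settings(v8_options)
print("v7:", v7_heap)
print("v8:", v8_heap)
print("same heap settings:", v7_heap == v8_heap)
```

On a real system the text would come from the files themselves, for example `pathlib.Path(...).read_text()` on each installation's options file.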

The graph here is not ideal. Rather than having the subset at the end, a TOPSAR-Split early on would be better. Also, the terrain flattening and terrain correction may perform better if done in two steps.


I am having the same problem. After I upgraded to Ver8.0, the program processes images very slowly. It also takes a long time to open an RGB image window.
I uninstalled Ver8.0 and installed Ver7 again. However, SNAP does not seem to work as it did before :frowning:

Does anyone know how to fix this issue?

If something broke you need to do a clean reinstall and delete all user data.

If you adjusted the Java memory settings for Ver7 you should check that the memory settings weren’t reverted to default values.

I have tested a number of different settings with both versions, and verified that I am comparing like configurations.

My personal impression is that a lot of tasks (importing and displaying images, calculation of textures) are now a lot faster with version 8. But I haven’t compared GCP tasks yet.

Thank you Luis. It is not ideal to process full image intermediate products when only a subset is required. We do not necessarily know which subswaths contain the subscene, but we will look at calculating which one or two are required for each subscene. You have previously mentioned splitting up terrain flattening and terrain correction on my graphs. I will test.
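
Working out which subswaths a subscene needs could look roughly like the following. This is a deliberately simplified sketch: it assumes each subswath footprint can be approximated by a longitude/latitude bounding box, which real IW footprints are not, and every coordinate below is hypothetical. In practice the footprints would have to come from the product annotation.

```python
def overlap_fraction(subscene, footprint):
    """Fraction of the subscene box covered by a footprint box.

    Boxes are (min_lon, min_lat, max_lon, max_lat) tuples.
    """
    min_lon = max(subscene[0], footprint[0])
    min_lat = max(subscene[1], footprint[1])
    max_lon = min(subscene[2], footprint[2])
    max_lat = min(subscene[3], footprint[3])
    if max_lon <= min_lon or max_lat <= min_lat:
        return 0.0  # the boxes do not intersect
    subscene_area = (subscene[2] - subscene[0]) * (subscene[3] - subscene[1])
    return (max_lon - min_lon) * (max_lat - min_lat) / subscene_area


def required_subswaths(subscene, footprints, min_fraction=0.01):
    """Names of subswaths covering at least min_fraction of the subscene."""
    return [
        name
        for name, box in footprints.items()
        if overlap_fraction(subscene, box) >= min_fraction
    ]


# Hypothetical footprints and subscene, chosen so ~95% falls on IW3.
footprints = {
    "IW1": (10.0, 50.0, 11.0, 52.0),
    "IW2": (11.0, 50.0, 12.0, 52.0),
    "IW3": (12.0, 50.0, 13.0, 52.0),
}
subscene = (11.95, 50.5, 12.95, 51.0)

print(required_subswaths(subscene, footprints))
```

The `min_fraction` threshold is a design choice: raising it would drop subswaths that contribute only a sliver of the subscene, at the cost of losing those edge pixels.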
It is unfortunate that we were required to switch our production server back to 7.0.2. It is producing ~3x more products than with 8.0.0.

Please report how much improvement you get from splitting the graph in two and using TOPS-split on SNAP 7.x and then if possible do the same test on 8.0.

Removing terrain flattening from the graph resulted in faster processing time in v8.0.0, with slightly more CPU utilization. Minutes shown:

v8.0.0: 1.5 real, 10.5 user, 3.2 sys. Peak CPU utilization ~3000%
v7.0.2: 1.7 real, 8.3 user, 3.5 sys. Peak CPU utilization ~1000%


The subscene used for the first test was 95% on subswath IW3. So I ran the same original graph, adding TOPSAR-Split of IW3, and measured these times:

v8.0.0: 16.2 real, 184.2 user, 25 sys. Peak CPU utilization ~1700%
v7.0.2: 7.1 real, 89.8 user, 7.1 sys. Peak CPU utilization ~1700%

I started testing the graph split in half, but for some reason 7.0.2 never completed the Terrain-Flattening half. I will post additional results soon.

Processing the same-date GRDH product to terrain-corrected Gamma0 also requires more time/resources on SNAP v8.0.0:

v8.0.0: 13 real, 172 user, 14 sys.
v7.0.2: 7 real, 46 user, 7.4 sys

Putting terrain flattening and terrain correction in the same graph is always going to be suboptimal, as the two operators handle the DEM differently, which creates a bottleneck.

Processing SLC IW3 through Terrain-Flattening only:

v8.0.0: 25 real, 1107 user, 157 sys. Peak CPU utilization ~6800%
v7.0.2: 23 real, 419 user, 30 sys. Peak CPU utilization ~3000%

@mengdahl Performance summary:
TC without TF: v8 is slightly faster, but uses more CPU time than v7.0.2.
Processing through TF only: both versions have about the same processing time, but v8 used ~200% more CPU time (above).
Adding TOPSAR-Split to the original graph saved ~40% CPU time, but resulted in about the same processing time as without the split. However, v8 still takes twice the time and resources.
GRDH processing takes about the same time as SLC, but SLC uses >2x the CPU resources. And v8 uses >2x what v7 does.
The biggest cost is TF, at >10x the time and >30x the CPU relative to graphs without it. The v8 doubling of TF processing time/cost is a hit.
@lveci My results do not appear to point to any benefit in splitting up my graph; it might end up costing more. Smaller graphs could be very important on machines with smaller configurations, but with 72 cores and 768 GB of memory we end up running multiple gpt instances.
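
Running several gpt instances side by side on a large machine could be organised along these lines. The sketch caps each instance's parallelism with gpt's `-q` option and passes the input through a `-P` graph parameter; the parameter name `input`, the scene file names, and the choice of four instances are all hypothetical, and the graph would need to declare a matching `${input}` placeholder.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def plan_gpt_runs(graph, scenes, total_cores, instances):
    """Build one gpt command per scene, sharing the cores between instances."""
    cores_each = max(1, total_cores // instances)
    return [
        ["gpt", graph, "-q", str(cores_each), f"-Pinput={scene}"]
        for scene in scenes
    ]


def run_all(commands, instances):
    """Run the commands with at most `instances` gpt processes at a time."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))


# Hypothetical scene list; 72 cores split across 4 concurrent instances.
commands = plan_gpt_runs("my.xml", ["scene_a.zip", "scene_b.zip"], 72, 4)
for cmd in commands:
    print(" ".join(cmd))
# run_all(commands, instances=4)  # uncomment on a machine with gpt on PATH
```

Threads are enough here because each worker only waits on an external process. The memory available to each instance would still need to be set separately in the JVM options.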

If I’m not mistaken, more CPU utilization is better since it implies that less time is wasted waiting for I/O. BTW I have trouble understanding your numbers: what is the total processing time, a.k.a. wall time?

To clarify for all: the real, user, and sys values I reported in this thread are from the Linux/bash time command, which accumulates those resources during execution. For this thread, I rounded all the numbers to minutes.
real is wall-clock time, user is user CPU time, and sys is system CPU time (time the kernel spent on behalf of the process).

I am running “time gpt my.xml” from bash, CentOS 7, command line, using the same Sentinel-1 image, vmoptions, etc. for both SNAP 7.0.2 and 8.0.0. (VH disappeared in 7.0.3, which is why I am using 7.0.2)
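
For anyone who wants the same three numbers from a script rather than from bash, a rough Python equivalent of `time` is sketched below. It assumes a Unix system (the `resource` module is not available on Windows) and uses a stand-in command so that it runs without SNAP installed; swap in `["gpt", "my.xml"]` for a real measurement.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, sys) in seconds, like bash `time`."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time is summed over all threads of the child, so user + sys can
    # exceed real on a multi-core machine.
    user = after.ru_utime - before.ru_utime
    sys_time = after.ru_stime - before.ru_stime
    return real, user, sys_time


# Stand-in workload; replace with ["gpt", "my.xml"] for a real run.
real, user, sys_time = time_command([sys.executable, "-c", "sum(range(10**6))"])
print(f"real {real / 60:.3f} min, user {user / 60:.3f} min, sys {sys_time / 60:.3f} min")
```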

More CPU utilization might be better if the execution (wall) time was proportionally shorter. My results show greater execution (wall) time, as well as CPU time, indicating poorer performance for v8.0.0.

It appears the difference is primarily from TF in the two versions. The culprit could be anti-meridian, polar, or other support if added for 8.0.0.


I have also noticed that version 8.0 is significantly slower than version 7.0. I ran the same interferogram processing on the same data with identical parameters in both v8 and v7 on the same workstation, and used the Linux “time” command to measure the processing time. v8 took about 79 min to finish the interferogram processing, while v7 took only 7 min.

I also noticed that when a single operator is run from gpt (for example “gpt Interferogram -t target.dim”), v8 runs the write step as a separate operator. The default value of the Write operator parameter “writeEntireTileRows” is “false” in v8, whereas it was set to true in v7. I don’t know how this parameter relates to processing time, but there is no doubt that SNAP v8 is much less efficient.
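
To test whether the parameter matters, one option is to run the operator from a graph file and set the flag explicitly on the Write node. The sketch below only edits the XML. It assumes the usual graph layout (a `node` with an `operator` and a `parameters` element) and that the Write operator accepts a `writeEntireTileRows` element under that name. The sample graph is hypothetical and incomplete, and whether this setting explains the slowdown is untested.

```python
import xml.etree.ElementTree as ET


def set_write_entire_tile_rows(graph_xml, value):
    """Return graph XML with writeEntireTileRows set on every Write node."""
    root = ET.fromstring(graph_xml)
    for node in root.iter("node"):
        if node.findtext("operator") != "Write":
            continue  # leave all other operators untouched
        params = node.find("parameters")
        if params is None:
            params = ET.SubElement(node, "parameters")
        flag = params.find("writeEntireTileRows")
        if flag is None:
            flag = ET.SubElement(params, "writeEntireTileRows")
        flag.text = "true" if value else "false"
    return ET.tostring(root, encoding="unicode")


# Hypothetical fragment of a graph; a real graph also needs the upstream nodes.
graph = """<graph id="Graph">
  <version>1.0</version>
  <node id="Write">
    <operator>Write</operator>
    <sources><sourceProduct refid="Interferogram"/></sources>
    <parameters>
      <file>target.dim</file>
      <formatName>BEAM-DIMAP</formatName>
    </parameters>
  </node>
</graph>"""

updated = set_write_entire_tile_rows(graph, True)
print(updated)
```

Timing the same graph with the flag set to true and then to false on v8 would show whether it accounts for any of the difference.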

[Screenshot: timing result for version 8.0]

[Screenshot: timing result for version 7.0]

Your screen dumps are almost unreadable on a laptop screen. It would be much better to cut and paste the output as text, or to attach more complete files. With Java code, the first thing to check is differences in memory settings. Have you checked the impact of using the same “writeEntireTileRows” setting?

Finally, it is worth noting that SNAP 8 uses the OpenJDK runtime, a result of Oracle’s recent license changes. The OpenJDK effort has put priority on correctness over performance. I’m not sure if they are interested in reports of performance regressions at this stage. If your organization has a paid Java license you might be able to try Oracle Java. There are also high-quality Java JDKs from Red Hat and others.

Interesting that v7 is 11x faster, with 1/2 the CPU utilization.

@gnwiii How would you propose switching SNAP to another Java version on Linux to test the performance difference? Maybe SNAP should not include OpenJDK, so users can select a runtime optimized for their systems?

SNAP 8.0 is distributed with: openjdk version “1.8.0_242”
SNAP 7.0 with: java version “1.8.0_202”

I have 4 other versions on my CentOS 7:
java version “1.8.0_121”
openjdk version “1.8.0_275”
java version “14.0.2” 2020-07-14
openjdk version “15.0.1” 2020-10-20

But none of these other Java installations includes a standalone JRE. So I downloaded jre1.8.0_281 from Oracle and used it to replace the JRE bundled with SNAP 8.0. I will post a comparison of the results soon.
mv /opt/snap_8/jre /opt/snap_8/jre_dist
ln -s /opt/jre1.8.0_281/ /opt/snap_8/jre
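
Before comparing timings it is worth confirming which runtime the swapped directory actually provides. The sketch below parses the banner that `java -version` prints. `snap_java_version` assumes the launcher lives at `jre/bin/java` under the install directory, as the symlink above implies, and it is not called here so that the sketch runs without SNAP installed.

```python
import re
import subprocess


def parse_java_version(banner):
    """Return (runtime, version) from the output of `java -version`."""
    match = re.search(r'^(openjdk|java) version "([^"]+)"', banner, re.MULTILINE)
    if match is None:
        raise ValueError("unrecognised java -version output")
    return match.group(1), match.group(2)


def snap_java_version(snap_home):
    """Ask the JRE under snap_home which runtime it is."""
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run(
        [f"{snap_home}/jre/bin/java", "-version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_java_version(result.stderr)


# Banners as quoted earlier in this thread.
print(parse_java_version('openjdk version "1.8.0_242"'))
print(parse_java_version('java version "1.8.0_202"'))
```

Calling `snap_java_version("/opt/snap_8")` after the symlink is in place should report the Oracle 1.8.0_281 runtime. Whether gpt itself picks up that directory, rather than another Java on the system, is an assumption worth checking separately.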