GPT Benchmark

sanjayrana · August 26, 2016, 9:25am

Hi Everyone,

I thought it might be useful for all of us to share our computation times as benchmarks for the benefit of SNAP team and other users.

So, using a standard gpt run (i.e. without any options to increase cache/tile size etc) , it takes my computer (16GB RAM, 2.93 GHz, 64 Bit Win 10) about 20-27 mins. to do the following steps on a single 1SDV* GRD dataset:

Apply Restituted Orbital File
GRD Border Fix
Thermal Noise Removal
Calibration (sigma nought, beta nought)
Terrain Flattening (SRTM 3 ArcSec)
Speckle Filtering (Gamma, 3x3)
Terrain Correction (SRTM 3 ArcSec)

How long does it usually take for your datasets to process?

reznik · September 8, 2016, 12:14pm

Result of my benchmark:

I’m using snap-desktop! (with GUI) not gpt in terminal (can’t do this for some time).
My operating system is OS X El Capitan 10.11.6.
For test I use IW GRDH (VV-VH) product.

Resources used on my macbook by snap:
~50% / 2.8 GHz Intel Core i7
8.5 / 16 GB 1600 MHz DDR3

Here is graph:

And my result is: 273.98334 minutes (366344 B/s 9158 Pixels/s)

Also, many program are running on my computer (like postgresql client, pycharm, chrome, skype:) ). Can’t do clear experiment at this moment. I will try to get better result next time (with no gui and background apps) .

sanjayrana · September 21, 2016, 7:55pm

The time taken looks rather a lot. I have discovered that there a few simple things that can be done to improve the processing time dramatically. No guarantees as to which ones have any real impact but I have been testing all of them. My processing time has now gone to a minute to 5 mins.

Use faster hard drives either SSD or at least 7200 RPM HDD.
Use lots of RAM typically 128GB+
Do not install any library (e.g. S2, S3 etc) other than S1TBX if all you want is to process S1 data. It keeps the java class numbers in memory to minimal and startup of gpf will be quicker.
Use the cache size (-c) and multithread (-q) options if you have lots of RAM and CPU resources.
Use JVM switches to increase heap size, garbage collection in the gpt.options file.
If you have lots of RAM then a single chain of operations in a single xml file is much quicker than breaking the chain into multiple xmls with intermediate files - the latter has too much i/o overhead
Pre-downloading the Orbital files and DEM files and keeping them in the auxdata folder will also save some amount of time.
Use a 64-bit machine.
I am unsure whether this makes an impact but I have replaced the built-int JRE of SNAP with the latest version of the 64-bit JRE client that I downloaded from oracle separately.

reznik · October 10, 2016, 2:41pm

Can I use subset operator before others?
For example, I have batch graph:

Read
Apply Orbit File
ThermalNoiseRemoval
Calibration
Multilook
Terrain-Correction
LinearToFromdB
Subset
Write
So, if I use Subset to make small image before Calibration, Terrain-Correction, etc will it be good ?

marpet · October 10, 2016, 2:46pm

Actually there should be no difference, except some of the operators are doing something special.
GPF is only computing the data which is necessary. In this case what is needed to write the result to disk.
But you can try it. It should be not much work to change the order.

reznik · October 10, 2016, 3:35pm

Here is some results of my investigation:

Subset is last operator before write:
Read, Apply Orbit File, ThermalNoiseRemoval, Calibration, Multilook, Terrain-Correction , LinearToFromdB , Subset, Write
Subset is first operator after Read:
Read, Subset, Apply Orbit File, ThermalNoiseRemoval, Calibration, Multilook, Terrain-Correction , LinearToFromdB, Write

But I got something interesting. Look at result images:
This is for first graph:

And this is for second graph.

Screen Shot 2016-10-10 at 6.30.43 PM.png541×547 17.9 KB

Here is visual image from sentinel-2 for better understanding images.

Screen Shot 2016-10-10 at 6.35.21 PM.png615×644 63.7 KB

ABraun · October 10, 2016, 5:25pm

thanks for the comparison. The one with the subset at the beginning reduces the data size early thus reaving less data to process for the remaining steps.

Could the colors result from different minimum/maximum values in the color manipulation tag?

mengdahl · October 11, 2016, 1:21pm

You need to make sure to do the comparison on a “fresh” instance of SNAP so that it is certain that none of the data is residing in caches (skewing the results).

Your processing-speed seems rather slow based on the numbers…

mengdahl · October 11, 2016, 1:30pm

It does not really work that way - SNAP has a “pull-processing architecture” so the writer at the end starts asking for the data from the preceding operators. The resident developer wizards like @marpet & @lveci can explain how this actually works and how the order of operators might or might not affect performance.