Performance of snap desktop and snappy

Hi,

I have noticed a huge difference between snap desktop and snappy in terms of processing time.
How can I improve the processing using snappy?
Can I configure snappy with the same options that I have in snap desktop?


If so, could you show me an example?

Thank you in advance.

Which function would you like to use? I have implemented some S2TBX and raster functions in Python scripts; the implementation can differ. You can find some useful links in this post:

Some examples are also available within the snappy installation directory.

Cheers,
Andreas

No, these settings are not helpful for improving the performance of Python code.
Please note that Python, by default, is not the fastest language.
But there are ways to improve the speed.
For example, you can use numpy, especially when working with arrays.
numexpr may also be worth a look.

Just to give you an indication of what is possible: recently we improved one algorithm from a runtime of 3:40 h to 5 minutes. This was achieved mainly by using numpy.
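
The usual pattern is to read whole rows (or tiles) of a band into numpy arrays and do the arithmetic there, instead of looping over single pixels in Python. A minimal sketch of that pattern (the file name and band names are just placeholders):

import numpy as np
from snappy import ProductIO

product = ProductIO.readProduct('subset_of_S2.dim')   # placeholder input
b4 = product.getBand('B4')
b8 = product.getBand('B8')
w = product.getSceneRasterWidth()
h = product.getSceneRasterHeight()

r4 = np.zeros(w, dtype=np.float32)
r8 = np.zeros(w, dtype=np.float32)
for y in range(h):
    b4.readPixels(0, y, w, 1, r4)      # read one full row into a numpy buffer
    b8.readPixels(0, y, w, 1, r8)
    ndvi = (r8 - r4) / (r8 + r4)       # vectorised arithmetic, no per-pixel Python loop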

Andreas, I already know how to use snappy; I'm just asking how to improve its performance.

Marpet, I already know numpy and I use it a lot. It’s a good tip.
But I use it for my own functions. My question is about the SNAP functions.
I think I have to be more specific.

Now I’m processing a set of 27 Sentinel-1 images using the following functions:
For the Coregistration:

  • TOPSAR-Split
  • Apply-Orbit-File
  • Back-Geocoding

For the interferometric processing:

  • Interferogram
  • TOPSAR-Deburst
  • TopoPhaseRemoval
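
In snappy the chain looks roughly like this (a sketch: operator parameters are mostly omitted, the parameter names/values shown and the input paths are only illustrative):

from snappy import ProductIO, GPF, HashMap

master = ProductIO.readProduct('S1_master_SLC.zip')   # placeholder paths
slave = ProductIO.readProduct('S1_slave_SLC.zip')

def run_op(name, source, **params):
    # small helper: wrap keyword arguments into a Java HashMap and call the operator
    p = HashMap()
    for key, value in params.items():
        p.put(key, value)
    return GPF.createProduct(name, p, source)

# Coregistration
m = run_op('TOPSAR-Split', master, subswath='IW1')    # parameter name/value are illustrative
s = run_op('TOPSAR-Split', slave, subswath='IW1')
m = run_op('Apply-Orbit-File', m)
s = run_op('Apply-Orbit-File', s)
stack = run_op('Back-Geocoding', [m, s])

# Interferometric processing
ifg = run_op('Interferogram', stack)
deburst = run_op('TOPSAR-Deburst', ifg)
target = run_op('TopoPhaseRemoval', deburst)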

The performance of these functions is much better in SNAP Desktop than in snappy.
I guess I can do something by changing the jpyconfig.py or snappy.ini files. Am I right? Or is there nothing to be done?


Yes, maybe it helps if you set jvm_maxmem in jpyconfig.
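
In jpyconfig.py (inside the snappy installation folder) the relevant line looks roughly like this; the value is only an example and should fit the RAM you have available:

# jpyconfig.py, in the snappy installation folder
jvm_maxmem = '6G'   # maximum Java heap size used by the SNAP engine (example value)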

It might be slower because the parallelisation in Python is not as good as in Java, and there is no way to tune it.

I have already changed that parameter.

It’s a pity!

Thank you!, marpet :+1:

I am very new to this, but isn't snappy just an interface to the SNAP Java functions?

Yes, exactly.
Your Python code is run by the Python interpreter. You can call Java SNAP methods from Python, but you are still in the Python environment.

You can improve performance and use parallelisation by using appropriate libraries, like numexpr.
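
A quick sketch of numexpr on plain numpy arrays (the arrays here are dummy data standing in for band rows read via snappy); numexpr evaluates the expression in multiple native threads and avoids large temporary arrays:

import numpy as np
import numexpr as ne

b4 = np.random.rand(5000000).astype(np.float32)   # dummy data
b8 = np.random.rand(5000000).astype(np.float32)
ndvi = ne.evaluate('(b8 - b4) / (b8 + b4)')       # multithreaded evaluation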

I have been looking into the SNAP configuration options and have tried setting snap.parallelism to the number of CPUs. This seems to be set (by default) in the “snap.properties” file used by the GUI. I have tried copying the setting over to “snappy.properties”, but it doesn’t seem to have any effect.

This seems like a significant problem for process automation. Is the Python interface to SNAP really restricted to a single core, as you suggest? If so, are there any plans to work around this, or is there another approach I could be using?

Any news about that issue?

There is nothing we can do about it. It is Python.
See the ‘Concurrency in Python’ section in this article.

The called Java code should actually be executed multithreaded, at least for the GPF calls and the execution of operators. The number of threads is determined by the number of cores available.
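
If you want to inspect or change this explicitly from Python, one thing that may be worth trying (untested here) is the JAI tile scheduler used for tile computation, which is reachable via jpy:

import jpy

JAI = jpy.get_type('javax.media.jai.JAI')
scheduler = JAI.getDefaultInstance().getTileScheduler()
print(scheduler.getParallelism())   # threads currently used for tile computation
scheduler.setParallelism(8)         # example value, e.g. the number of CPU cores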

@juanesburgo have you followed this thread?


And just to be sure, have you removed the # in front of the java_max_mem setting?

Would it be possible to use the dask package to solve this issue?
I have no experience with this kind of thing and don't know how to apply it to snappy functions, since they are basically a black box to me.

Any advanced Python user around who can help?

(PS: very good introduction to dask on YouTube by Jim Crist of Continuum Analytics)
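
For what it's worth, a very rough, untested sketch of what scene-level parallelism with dask.delayed could look like. The per-scene function, the operator choice, and the file names are made up, and whether this actually scales depends on how the JVM calls interact with the Python GIL:

import dask
from snappy import ProductIO, GPF, HashMap

def process_scene(in_path, out_path):
    # hypothetical per-scene processing: one operator, then write the result
    product = ProductIO.readProduct(in_path)
    target = GPF.createProduct('Apply-Orbit-File', HashMap(), product)
    ProductIO.writeProduct(target, out_path, 'BEAM-DIMAP')
    return out_path

scenes = ['scene_1.zip', 'scene_2.zip', 'scene_3.zip']     # placeholder inputs
tasks = [dask.delayed(process_scene)(p, p + '.dim') for p in scenes]
dask.compute(*tasks)   # runs the independent scenes in parallel (threaded scheduler by default)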

I am processing Sentinel-1 images…
In the end I want to save it as GeoTIFF-BigTIFF (about 14 GB per image).
When I do it in SNAP it takes just 5-10 minutes to export, but when I use Python and snappy and call

ProductIO.writeProduct(x, y, 'GeoTIFF-BigTIFF')

it takes up to 2 hours to create it.
Is there a way to solve this and speed up?

There could be two reasons. One is that you have already worked with the data in the desktop, so it is already computed when you write it. But I guess this is not the full story.

Instead of ProductIO.writeProduct() you can try the Write operator with GPF.createProduct().
Specify the file and the formatName as parameters.

I tried Write, but it didn't write any file to the path set in the parameters; it just carried on. This is part of the code. Did I make a mistake?

…
target_2 = GPF.createProduct("Terrain-Correction", parameters, target_1)
terrain = "D:\\mozaici\\" + date + "corrected"
parameters = HashMap()
parameters.put('file', terrain)
parameters.put('formatName', 'GeoTIFF-BigTIFF')
GPF.createProduct("Write", parameters, target_2)

I forgot, you need to trigger the computation.
Do it like this.

targetProduct = GPF.createProduct('Write', parameters, target_2)
targetProduct.getBandAt(0).getGeophysicalImage().getData()

Because GPF follows the pull model, operators are only executed if the data is pulled.
A bit less weird is the following:

WriteOp = jpy.get_type('org.esa.snap.core.gpf.common.WriteOp')
WriteOp writeOp = WriteOp(target_2, terrain, 'GeoTIFF-BigTIFF');
writeOp.writeProduct(ProgressMonitor.NULL);

The first option took some time to finish, but in the end the image file was not created.

Option two is a mix with Java syntax, right? In the second line, if I leave out the leading WriteOp (a declaration, I guess), I get

writeOp = WriteOp(target_2, terrain, 'GeoTIFF-BigTIFF')
RuntimeError: no matching Java method overloads found

It must be done like this.

WriteOp = jpy.get_type('org.esa.snap.core.gpf.common.WriteOp')
writeOp = WriteOp(target_2, File(terrain), 'GeoTIFF-BigTIFF')
writeOp.writeProduct(ProgressMonitor.NULL)

The variable terrain must be converted from a string path into a Java File object.

And where does File come from? How do I do this conversion from a string path to a Java File?

It is already converted in my second example.
You just need to create a new File object with the string path as the parameter.
jpy.get_type(…) is not needed because File is already defined in snappy by default.

import jpy
from snappy import File, ProgressMonitor  # pre-registered Java types in snappy

WriteOp = jpy.get_type('org.esa.snap.core.gpf.common.WriteOp')
writeOp = WriteOp(target_2, File(terrain), 'GeoTIFF-BigTIFF')
writeOp.writeProduct(ProgressMonitor.NULL)