I have noticed a huge difference in processing time between SNAP Desktop and snappy.
How can I improve the processing using snappy?
Can I configure snappy with the same options that I have in snap desktop?
Which function would you like to use? I have implemented some S2TBX and raster functions as Python scripts; the implementations can differ. You can find some useful links in this post, and some examples are available within the snappy installation directory.
No, these settings do not help improve the performance of Python code.
Please note that Python is, by default, not the fastest language.
But there are ways to improve the speed.
For example, you can use numpy, especially when working with arrays.
numexpr may also be worth a look.
Just to give you an indication of what is possible: recently we improved the performance of one algorithm from 3:40 h down to 5 minutes. This was mainly achieved by using numpy.
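To illustrate the kind of speed-up meant here, a small self-contained sketch (not the algorithm from the post, just the general pattern of replacing a Python loop with a vectorised numpy operation):

```python
import time
import numpy as np

n = 1_000_000

# Pure-Python loop: one multiplication per iteration, interpreted
data = list(range(n))
t0 = time.perf_counter()
squared_loop = [x * x for x in data]
loop_time = time.perf_counter() - t0

# numpy: the same operation vectorised over the whole array in C
arr = np.arange(n)
t0 = time.perf_counter()
squared_np = arr * arr
np_time = time.perf_counter() - t0

print(f"loop: {loop_time:.4f}s, numpy: {np_time:.4f}s")
```

On typical hardware the numpy version is one to two orders of magnitude faster; the same pattern applies when processing band data read from SNAP products as arrays.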
Andreas, I already know how to use snappy; I'm just talking about how to improve its performance.
Marpet, I already know numpy and I use it a lot. It’s a good tip.
But I use it for my own functions. My question is about the SNAP functions.
I think I have to be more specific.
Now I'm processing a set of 27 Sentinel-1 images using the following functions:
For the Coregistration:
TOPSAR-Split
Apply-Orbit-File
Back-Geocoding
For the interferometric processing:
Interferogram
TOPSAR-Deburst
TopoPhaseRemoval
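For reference, a chain like this can be assembled in snappy roughly as follows. This is only a sketch: all file names and parameter values are placeholders, and the real operators take more parameters than shown here.

```python
# Rough sketch of the coregistration + interferometric chain in snappy.
# Requires a SNAP installation with snappy configured; paths and
# parameter values below are placeholders, not working settings.
from snappy import ProductIO, GPF, HashMap

def run_op(name, source, **params):
    """Apply a single SNAP GPF operator to a source product."""
    p = HashMap()
    for key, value in params.items():
        p.put(key, value)
    return GPF.createProduct(name, p, source)

master = ProductIO.readProduct('master_SLC.zip')   # placeholder
slave = ProductIO.readProduct('slave_SLC.zip')     # placeholder

master = run_op('Apply-Orbit-File', run_op('TOPSAR-Split', master, subswath='IW1'))
slave = run_op('Apply-Orbit-File', run_op('TOPSAR-Split', slave, subswath='IW1'))

# Back-Geocoding coregisters master and slave together (list of sources)
stack = run_op('Back-Geocoding', [master, slave])

ifg = run_op('Interferogram', stack)
ifg = run_op('TOPSAR-Deburst', ifg)
ifg = run_op('TopoPhaseRemoval', ifg)
```

Because GPF builds a lazy processing graph, the actual computation only happens when the final product is written or its data is pulled.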
The performance of these functions is much better in SNAP Desktop than in snappy.
I guess I can change something in the jpyconfig.py or snappy.ini files. Am I right, or is there nothing to be done?
Yes, exactly.
Your Python code is run with the Python interpreter. You can call Java SNAP methods from Python, but you are still in the Python environment.
You can improve performance and use parallelisation by using appropriate libraries, like numexpr.
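A minimal sketch of what that looks like, assuming numexpr is installed alongside numpy (the expression here is arbitrary, just for illustration):

```python
import numpy as np
import numexpr as ne

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# numexpr compiles the expression string and evaluates it in
# cache-sized chunks across multiple threads
result = ne.evaluate('2*a + 3*b')

# equivalent plain-numpy computation, for comparison
expected = 2*a + 3*b
```

For large arrays the numexpr version can outperform plain numpy because it avoids temporary arrays and uses all available cores.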
I have been looking into the SNAP configuration options and have tried setting snap.parallelism to the number of CPUs. This seems to be set (by default) in the "snap.properties" file from the GUI. I have tried copying the settings over to "snappy.properties", but it doesn't seem to have any effect.
This seems like a significant problem with process automation. Is the python interface to SNAP really restricted to a single-core as you suggest? If so, are there any plans to work around this, or is there another approach that I could be using?
There is nothing we can do about it. It is Python.
See the ‘Concurrency in Python’ section in this article.
The called Java code should actually be executed multi-threaded, at least for the GPF calls and the execution of operators. The number of threads is determined by the number of cores available.
Would it be possible to use the dask package to solve this issue?
I have no experience with this kind of thing and don't know how to apply it to snappy functions, since they are basically a black box to me.
Any advanced Python user around who can help?
(PS: very good introduction to dask on YouTube by Jim Crist of Continuum Analytics)
I am processing Sentinel-1 images…
In the end I want to save it as GeoTIFF-BigTIFF (14 GB size of an image).
When I do it in SNAP it takes just 5-10 min to export, but when I use python and snappy and call
ProductIO.writeProduct(x, y, 'GeoTIFF-BigTIFF')
it takes up to 2 hours to create it.
Is there a way to solve this and speed up?
There could be two reasons. One is that you have already worked with the data in the desktop, so the data is already computed when you write it. But I guess this is not the full story.
Instead of ProductIO.writeProduct() you can try the Write operator with GPF.createProduct().
Specify the file and the formatName as parameter.
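A minimal sketch of that suggestion; `source_product` stands for the product to be exported, the output path is a placeholder, and the exact execution behaviour may differ between SNAP versions:

```python
# Writing via the GPF Write operator instead of ProductIO.writeProduct().
# Requires a configured snappy; path and source_product are placeholders.
from snappy import GPF, HashMap, File

params = HashMap()
params.put('file', File('/path/to/output.tif'))  # placeholder output path
params.put('formatName', 'GeoTIFF-BigTIFF')

# The Write operator becomes part of the processing graph, so tiles are
# computed and written in a single pull-through rather than a separate pass.
written = GPF.createProduct('Write', params, source_product)
```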
It is already converted in my second example.
You just need to create a new object File with the string as the parameter.
jpy.get_type(…) is not needed because File is defined in snappy by default.
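For completeness, a sketch of that conversion (the path is a placeholder):

```python
from snappy import File  # java.io.File, exposed by snappy by default

out_file = File('/path/to/output.tif')  # wrap the Python string in a Java File
```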