What is the recommended way of working with SNAP in a Python context?
I have used the NetBeans-based GUI for the Sentinel Toolboxes a couple of times now. While it is nice, it does have some bugs and deficiencies. I would like to integrate the data retrieval (downloading of products) and preparation (extracting specific bands from a specific subregion that is part of a product) in Python in order to hopefully create an example for a cloud service that relies on data from Sentinel. So far I have explored the following two options:
- `snappy`: the Python API that comes with the Python setup for SNAP
- `GPT`: the command-line tool for SNAP
I am using the Sentinel-2 Toolbox, but my guess is that this functionality is also available for Sentinel-1 and Sentinel-3.
With `snappy` there appear to be multiple things going on under the hood. For example, I can get a subset in two ways:
- Using JPY (the Java–Python bridge):

```python
SubsetOp = snappy.jpy.get_type('org.esa.snap.core.gpf.common.SubsetOp')
op = SubsetOp()
op.setSourceProduct(s2_product)
op.setCopyMetadata(True)
op.setGeoRegion(region)
subset_from_region = op.getTargetProduct()
```
- Using GPF:

```python
parameters = snappy.HashMap()
parameters.put('copyMetadata', True)
parameters.put('geoRegion', region)
subset_from_region = snappy.GPF.createProduct('Subset', parameters, s2_product)
```
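In both variants, `region` describes the spatial subset: the GPF parameter version accepts a WKT polygon string, while the JPY `setGeoRegion` call expects a parsed geometry object rather than a plain string. As an illustration, here is a small helper (my own, not part of `snappy`; the coordinates are placeholders) that builds such a WKT string from a lon/lat bounding box:

```python
def bbox_to_wkt(lon_min, lat_min, lon_max, lat_max):
    # Build a closed WKT POLYGON ring (first vertex repeated at the end).
    # Hypothetical helper -- shown only to illustrate the geoRegion format.
    return (
        "POLYGON(({x0} {y0}, {x1} {y0}, {x1} {y1}, {x0} {y1}, {x0} {y0}))"
        .format(x0=lon_min, y0=lat_min, x1=lon_max, y1=lat_max)
    )

region = bbox_to_wkt(13.0, 52.3, 13.8, 52.7)
```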
If I go with GPT, I need to call a subprocess to run the command, like this:

```python
subprocess.Popen(
    [gpt_path, '-h', 'Subset'],
    stdout=subprocess.PIPE,
    universal_newlines=True
).communicate()[0]
```
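Beyond `-h`, an actual subset run can be assembled the same way. The sketch below only builds the argument list for GPT's `Subset` operator (`-PgeoRegion` and `-PcopyMetadata` set the operator's parameters, `-t` the target file); the paths and geometry are placeholders, and the actual invocation is left commented out since it requires a SNAP installation:

```python
import subprocess

def build_gpt_subset_cmd(gpt_path, source, target, geo_region, copy_metadata=True):
    # Assemble a `gpt Subset` invocation as an argument list for subprocess.
    return [
        gpt_path, 'Subset',
        '-PgeoRegion={}'.format(geo_region),
        '-PcopyMetadata={}'.format('true' if copy_metadata else 'false'),
        '-t', target,
        source,
    ]

cmd = build_gpt_subset_cmd('/opt/snap/bin/gpt', 'S2_product.zip',
                           'subset.dim', 'POLYGON((...))')
# subprocess.run(cmd, check=True)  # only works where SNAP/gpt is installed
```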
The latter is not exactly Pythonic, and it involves managing multiple processes. The reason I even decided to check out GPT was the lack of `snappy` support for more recent versions of Python. `snappy` becomes increasingly difficult to work with over time, because other popular libraries keep dropping support for the older Python versions it requires. Right now I am creating a separate Docker container just for `snappy`, since the other dependencies I use at a later stage (to process the extracted images for a specific band) require a more current Python interpreter. However, I feel this will fail at some point (probably quite soon). I also tried Miniconda, splitting my workflow into three stages: data acquisition (uses `sentinelsat` to retrieve a product given user-defined parameters, including a polygon describing the location), data preparation (uses `snappy` to process the acquired product, producing images for specific bands), and data application (e.g. a machine learning algorithm that uses the image data for enhancement, segmentation, etc.).
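One way to keep that split manageable is to give each stage a narrow, file-based interface, so that only the preparation stage ever touches `snappy` (and can live in its own container with an older Python). A minimal sketch of the idea; the stage bodies here are placeholders, not real implementations (the real ones would call `sentinelsat`, `snappy`/GPT, and the ML code respectively):

```python
import os

def acquire(aoi_wkt, out_dir='data'):
    # Stage 1: would query and download via sentinelsat; returns a product path.
    # Placeholder: pretend a product was already downloaded.
    return os.path.join(out_dir, 'S2_product.zip')

def prepare(product_path, band):
    # Stage 2: would subset and extract the band via snappy or gpt,
    # possibly inside a dedicated container; returns an image path.
    stem = os.path.splitext(os.path.basename(product_path))[0]
    return '{}_{}.tif'.format(stem, band)

def apply_model(image_path):
    # Stage 3: would run the ML algorithm on the extracted image.
    return 'segmented_' + image_path

result = apply_model(prepare(acquire('POLYGON((...))'), 'B4'))
```

Because each stage only exchanges file paths, the stages can run as separate processes or containers without sharing a Python environment.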
I am looking for a more future-proof, consistent approach that involves minimal dependency management and is easy to set up.