Compute zonal statistics from Sentinel-2 VI's

plitkey · September 26, 2016, 7:48am

Hello,

I’m trying to first compute some vegetation indices in raster and from them output zonal statistics for a vector layer with multiple polygons. I could save the VI rasters as geotiff and then proceed with GDAL. How can I skip the save and have the georeferencing from the snap product to make a GDAL mem raster or is there a way to clip the VI band from a snap product with shapefile polygon and compute stats for the pixels inside?

-Paula

marpet · September 26, 2016, 9:04am

SNAP and GDAL can exchange data only via the hard drive.

To compute statistics you can use for example the StatisticsOp.
Type gpt StatisticsOp -h on the command line to get the help for it.
Maybe this helps you already.

plitkey · September 26, 2016, 9:39am

Thanks! This helps.

-Paula

plitkey · September 26, 2016, 10:39am

Well, I guess more help would be needed. 1) How to call this in python? 2) The smart way to use this operator? 3) Is it possible to use the VI’s listed in SNAP GUI Optical ->Thematic Land Processing -> Vegetation Radiometric Indices from Python?

I was planning to process a set of products in a chain like this

for-loop files in folder: open-> resample -> subset (using bounding box of the shape file) -> compute VI’s -> compute stats in polygons -> save stats in ASCII. The raster subset would be in memory only, not saved as a product.

What I see from the StatisticsOp, it is ready to loop the statistics for the files/products input. Can it be used in the processing chain I was planning?

Thanks,

-Paula

marpet · September 26, 2016, 2:22pm

The statistics operator is called as any other operator from python. The same for the index operators.
On the command line you can typegpt -h to get a list of all available operators.
You can find examples in the following threads:

snappy-where-to-start?
Example script for multiple operations?

If you invoke the statistics operator only with one product it should work as you expect it.

regards
Marco

plitkey · September 28, 2016, 7:17am

Thanks, again very helpful!

Now I am stuck with giving the statisticsOp the band configurations, I can find something about this in Java, but no example in Python.

The error is “Parameter ‘bandConfigurations’ must be specified”.

I tried to give a list -> cannot convert a Python ‘list’ to a Java ‘java.lang.Object’ I tried a dict also with -> cannot convert a Python ‘dict’ to a Java ‘java.lang.Object’ Please give some more guidance?

-Paula

marpet · September 28, 2016, 9:30am

You need to do something similar to this example. But instead of org.esa.snap.core.gpf.common.BandMathsOp$BandDescriptor you need to use org.esa.snap.statistics.BandConfiguration

plitkey · September 28, 2016, 10:32am

Well, I got lost here, this is what I tried

string
string
string

this was from ./gpt StatisticsOp -h sort of gives the idea that sourceBandName, expression and validPixelExpression would be needed
I adapted the example code to this
BandConfDescriptor = jpy.get_type(‘org.esa.snap.statistics.BandConfiguration’)
bandConf1 = BandConfDescriptor()
#bandConf1.sourceBandName = ‘ndviMasked’
#bandConf1.expression = ‘average’ #>string
#BandConf1.validPixelExpression ./gpt StatisticsOp -h

Had an error (indicated that no defaults were used)

statProduct = GPF.createProduct(‘StatisticsOp’, parameters, ndviProduct)
RuntimeError: org.esa.snap.core.gpf.OperatorException: Configuration must contain either a source band name or an expression.

ok, I’ll give the band name:

    BandConfDescriptor = jpy.get_type('org.esa.snap.statistics.BandConfiguration')
    bandConf1 = BandConfDescriptor()
    bandConf1.sourceBandName = 'ndviMasked'
    #bandConf1.expression = 'average' #>string</expression>
    #BandConf1.validPixelExpression

bandConf1.sourceBandName = 'ndviMasked'

AttributeError: ‘org.esa.snap.statistics.BandConfiguration’ object has no attribute ‘sourceBandName’

    BandConfDescriptor = jpy.get_type('org.esa.snap.statistics.BandConfiguration')
    bandConf1 = BandConfDescriptor()
    print(dir(bandConf1))

[‘class’, ‘delattr’, ‘doc’, ‘eq’, ‘format’, ‘ge’, ‘getattribute’, ‘gt’, ‘hash’, ‘init’, ‘jinit’, ‘le’, ‘lt’, ‘ne’, ‘new’, ‘reduce’, ‘reduce_ex’, ‘repr’, ‘setattr’, ‘sizeof’, ‘str’, ‘subclasshook’, ‘equals’, ‘getClass’, ‘hashCode’, ‘jclass’, ‘notify’, ‘notifyAll’, ‘toString’, ‘wait’]

Where did I go wrong?

-Paula

marpet · September 28, 2016, 11:50am

I’m sorry I haven’t seen that sourceBandName, expression and validPixelExpression are note publicly accessible.
This makes this operator unusable via the API. It can only be used via gpt from the command line. This should be fixed with the next release.

But maybe it is anyway easier for you to use the statistical functionalities which are available in python.
There for you would need the data.
You can get the data by using the readPixels method.
Here is an example:

github.com

senbox-org/snap-engine/blob/6e22431ffafde09948b62e74d96f70091d9a4e77/snap-python/src/main/resources/snappy/examples/snappy_flh.py#L42




targetProduct.writeHeader(String('snappy_flh_output.tif'))


r1 = numpy.zeros(width, dtype=numpy.float32)
r2 = numpy.zeros(width, dtype=numpy.float32)
r3 = numpy.zeros(width, dtype=numpy.float32)


print("Writing...")


for y in range(height):
b1.readPixels(0, y, width, 1, r1)
b2.readPixels(0, y, width, 1, r2)
b2.readPixels(0, y, width, 1, r3)
print("processing line ", y, " of ", height)
FLH = r2 - k * (r1 + a * (r3 - r1))
targetBand.writePixels(0, y, width, 1, FLH)


targetProduct.closeIO()


print("Done.")

Instead of reading the data line wise you can also get the whole image.

marpet · September 28, 2016, 1:16pm

For masking the data with the region from the shapefile you can probably use the Import-Vector operator.
The resulting product will have a new Mask with the name of the shapefile.
This mask can then be used to identify the pixels in the ROI.
For example you can the the valid pixel expression of the band you are working with.

atteggiani · January 24, 2018, 1:45pm

Hello everybody there!

I’m running the StatisticsOp over a product with a selected shapefile as ROI.
Is there a way to store the StatisticOp output values (those inside the ASCII file) directly inside a python object?
My problem is that when the operator creates the ASCII file, most of the values in it are truncated and are not completely exact.
The main issue happens for the max_error value, which is supposed to be something around 10^-6 but, due to the truncament to 4 digits after comma, the value in the ASCII file is 0.0000, so 0.
Is there a way to avoid this behaviour and preserve the real Statistics values?

Thank you!
Davide

gnwiii · January 24, 2018, 7:11pm

It is better to start a new topic so it will be easier for others with the same problem to discover in the future – something like: “StatisticsOp fixed output format loses precision for badly scaled data”.

The same problem occurred using BEAM. A workaround is to define a new variable scaled to suit StasticsOp using band maths.

atteggiani · January 26, 2018, 2:06pm

Thank you for the advice @gnwiii, I’ve started a new topic here StatisticsOp output format loses precision.

I think the problem of your workaround (at least in some circumstances when it might not work ) is that some output values don’t depend directly on the values of the single band. I think that this is the case of the max_value I mentioned above, which I think is the maximum error of the statistics and will not change if I simply scale the band (for example by doing [band*10000]).
Correct me if I’m wrong…

geoagr2003 · October 24, 2018, 9:37pm

Hi Marco,

I’m trying to do some zonal statistics over the cloud confidence band by using the StatisticsOp operator. I have the following code:

class CheckClouds:

def __init__(self, sat_img_path):
    self.sat_img_path = sat_img_path
    
def read_cloud_band(self):
    ''' Read the cloud confidence band. '''
    product = ProductIO.readProduct(self.sat_img_path)
    band_names = product.getBandNames()
    # print (list(band_names))
    b_qcc = product.getBand('quality_cloud_confidence')
    # print (b_qcc)
    
    HashMap = jpy.get_type('java.util.HashMap')

    BandConfiguration = jpy.get_type('org.esa.snap.statistics.BandConfiguration')
    bandConfiguration = BandConfiguration()
    # print(dir(bandConfiguration))
    
    bandConfiguration.sourceBandName = 'quality_cloud_confidence'
   
    bandConfigurations = jpy.array('org.esa.snap.statistics.BandConfiguration', 1) 
    bandConfigurations[0] = bandConfiguration
    
    parameters = HashMap()

    parameters.put('bandConfigurations', bandConfigurations)
    parameters.put('sourceProductPaths', str(self.sat_img_path))
    parameters.put('shapefile', '..\\..\\config\\test.shp')
    parameters.put('outputAsciiFile', 'D:\\test_file.csv') 
    # parameters.put('writeDataTypesSeparately', True) 
    
    result = GPF.createProduct('StatisticsOp', parameters, product) 
    
    return result

new_sat_imag = CheckClouds(‘D:\sentinel_2_img_dir\S2B_MSIL2A_20180304_img_name.zip’)

new_sat_imag.read_cloud_band()

I have no output. Do you have an idea about what I’m doing wrong ? Is the parameters setup ok ?
Do you have any info if the previous mentioned issue was fixed ?