Using the StatisticsOp operator with Python

Hi,

I'm trying to compute some zonal statistics over the Sentinel-2 cloud confidence band using the StatisticsOp operator. I have the following code:

from snappy import ProductIO, GPF, jpy

class CheckClouds:

    def __init__(self, sat_img_path):
        self.sat_img_path = sat_img_path

    def read_cloud_band(self):
        ''' Read the cloud confidence band. '''
        product = ProductIO.readProduct(self.sat_img_path)
        band_names = product.getBandNames()
        # print(list(band_names))
        b_qcc = product.getBand('quality_cloud_confidence')
        # print(b_qcc)

        HashMap = jpy.get_type('java.util.HashMap')

        BandConfiguration = jpy.get_type('org.esa.snap.statistics.BandConfiguration')
        bandConfiguration = BandConfiguration()
        # print(dir(bandConfiguration))

        bandConfiguration.sourceBandName = 'quality_cloud_confidence'

        bandConfigurations = jpy.array('org.esa.snap.statistics.BandConfiguration', 1)
        bandConfigurations[0] = bandConfiguration

        parameters = HashMap()

        parameters.put('bandConfigurations', bandConfigurations)
        parameters.put('sourceProductPaths', str(self.sat_img_path))
        parameters.put('shapefile', '..\\..\\config\\test.shp')
        parameters.put('outputAsciiFile', 'D:\\test_file.csv')
        # parameters.put('writeDataTypesSeparately', True)

        result = GPF.createProduct('StatisticsOp', parameters, product)

        return result

new_sat_imag = CheckClouds('D:\\sentinel_2_img_dir\\S2B_MSIL2A_20180304_img_name.zip')

new_sat_imag.read_cloud_band()

I get no output (no file or product is generated) and no error after running the script. Does anyone have an idea of what I'm doing wrong? Is the parameter setup OK for this operator?

Is it always necessary to use ProductIO.writeProduct? How can I get only the values from the exported CSV file?

Regards,

George

Somehow I figured out how to use the StatisticsOp operator, but there are still some issues, I believe, or I am missing something.

If I use the raw image format like 'S2B_MSIL2A_20180304_img_name.zip' (downloaded from the Sentinel-2 data server), the script can't process the 'quality_cloud_confidence' band (I believe it is not a physical band). The band is only picked up if I first create a subset from the raw image; in the subset there is a physical quality_cloud_confidence.img file (the subset contains *.img files for all existing bands). If I use a band name like 'B4', the script works fine in both situations.

Does anyone know anything about this issue?
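In case it helps, this is a minimal sketch of scripting that workaround (subset first, then write to BEAM-DIMAP so the band exists as a physical *.img file). The paths and the WKT region are placeholders, the 'geoRegion' and 'copyMetadata' parameter names should be checked against gpt Subset -h for your SNAP version, and depending on the version the multi-resolution L2A product may need to be resampled before subsetting:

from snappy import ProductIO, GPF, jpy

HashMap = jpy.get_type('java.util.HashMap')

# Open the original L2A product (placeholder path)
product = ProductIO.readProduct('D:\\sentinel_2_img_dir\\S2B_MSIL2A_20180304_img_name.zip')

# Spatial subset over the area of interest (placeholder WKT polygon)
params = HashMap()
params.put('geoRegion', 'POLYGON ((24.0 45.0, 24.2 45.0, 24.2 45.2, 24.0 45.2, 24.0 45.0))')
params.put('copyMetadata', True)
subset = GPF.createProduct('Subset', params, product)

# Writing to BEAM-DIMAP materialises every band as a physical *.img file,
# including quality_cloud_confidence, which StatisticsOp can then read
ProductIO.writeProduct(subset, 'D:\\sentinel_2_img_dir\\subset_for_stats.dim', 'BEAM-DIMAP')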

My loop stops unexpectedly without an error; it doesn't run through all the files in the folder. Does anyone know how this can be fixed?

from snappy import ProductIO, GPF, jpy

import os, glob

class CheckClouds:

    def __init__(self, sat_img_path):
        self.sat_img_path = sat_img_path

    def read_cloud_band(self):
        ''' Read the cloud confidence band. '''

        S2CacheUtils = jpy.get_type('org.esa.s2tbx.dataio.cache.S2CacheUtils')

        product = ProductIO.readProduct(self.sat_img_path)
        band_names = product.getBandNames()
        name = product.getName()
        # print(name)
        # print(list(band_names))
        # b_qcc = product.getBand('quality_cloud_confidence')
        # print(b_qcc)

        # Configuring the operator:
        # build the parameter configuration for StatisticsOp
        BandConfiguration = jpy.get_type('org.esa.snap.statistics.BandConfiguration')
        bandConf_1 = BandConfiguration()
        # print(dir(bandConf_1))

        bandConf_1.sourceBandName = 'quality_cloud_confidence'
        # print(bandConf_1.sourceBandName)

        bandConfigurations = jpy.array(BandConfiguration, 1)
        # print(dir(bandConfigurations))

        bandConfigurations[0] = bandConf_1

        HashMap = jpy.get_type('java.util.HashMap')
        parameters = HashMap()

        parameters.put('bandConfigurations', bandConfigurations)
        parameters.put('sourceProductPaths', self.sat_img_path)
        parameters.put('shapefile', '..\\..\\config\\roi_area.shp')
        parameters.put('outputAsciiFile', '..\\..\\data_output\\quality_check\\' + name + '_test_file.csv')
        parameters.put('writeDataTypesSeparately', True)
        # print(parameters)

        result = GPF.createProduct('StatisticsOp', parameters, product)

        # print(result)

        print("\nWriting...")

        ProductIO.writeProduct(result, '..\\..\\data_output\\quality_check\\' + name + '_qa.dim', 'BEAM-DIMAP')
        # ProductIO.writeProduct(result, '', '')

        # print(dir(result))

        print("\nDone!\n")

        product.dispose()
        S2CacheUtils.deleteCache()

file_list = glob.glob('../../data_output/subsets/*.dim')

for file_name in file_list:
    print(file_name)
    CheckClouds(file_name).read_cloud_band()

The quality_cloud_confidence band is a physical band.
It is located in the QI_DATA folder, not in IMG_DATA. It is named L2A___CLD_60m.jp2.
It is not available at 10 meters.
If the band is present after creating a subset, it must have existed beforehand.
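As a quick check (a minimal sketch using only the calls already shown in this thread; the path is a placeholder), you can list what the reader exposes for a given product and verify that the band is really there before configuring StatisticsOp:

from snappy import ProductIO

# Placeholder path; point this at the product you want to inspect
product = ProductIO.readProduct('D:\\sentinel_2_img_dir\\S2B_MSIL2A_20180304_img_name.zip')

# List every band the reader exposes for this product
print(list(product.getBandNames()))

# getBand returns None (Java null) if the band is not available
band = product.getBand('quality_cloud_confidence')
print('quality_cloud_confidence present:', band is not None)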

Yes, the CSV data can only be accessed via the written file.
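For reading the values back in Python, a small sketch like this should do. The output of StatisticsOp is a plain ASCII table; I'm assuming a tab delimiter and the file path from the first post, so adjust both if needed:

import csv

# Path taken from the first post; adjust to your outputAsciiFile location
with open('D:\\test_file.csv') as f:
    reader = csv.DictReader(f, delimiter='\t')
    for row in reader:
        print(row)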

I don't know why your loop stops. Are the remaining products processed if you start the processing again after the first chunk is done?

Regarding the loop stopping issue:
The way I wrote the script is this: I generate a list of file paths and loop through it, instantiating the class for each *.dim file in the list.

If you skip over the already processed ones, does it fail at the same product?
You could move the already processed data somewhere else. Maybe one product is corrupted?
Or does it stop at different products?

It stops at different products. I don't have problems with the products themselves; I use them to calculate various indices.
I reversed the loop, etc. Usually it runs for about 20 products and then stops unexpectedly with no error message.

I should mention that I have to loop through 47 *.dim files. After splitting the images into 4 different folders, the script worked as it should. I believe that one of the operators used, or some other snappy-instantiated object, is exhausting the memory.
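One way around that kind of memory build-up is to run each product in its own Python process, so the memory held by snappy/the JVM is released when the process exits. A minimal sketch, where process_one.py is a hypothetical helper script that just calls CheckClouds(sys.argv[1]).read_cloud_band():

import glob
import subprocess
import sys

# Run every *.dim file in a separate interpreter so memory is freed between products
for file_name in sorted(glob.glob('../../data_output/subsets/*.dim')):
    print(file_name)
    subprocess.check_call([sys.executable, 'process_one.py', file_name])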

Can 'parameters' be written as a nested dictionary in this version? That way it might be possible to limit class initialization (object creation).

I don’t think so.
If I remember correctly, there is no automatic conversion implemented from a Python dictionary to a Java (Hash)Map.
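That said, a small helper can copy a flat Python dict into a java.util.HashMap, so on the Python side the parameters can still be kept as a dict. A sketch; the parameter values are placeholders, and objects such as the BandConfiguration array still have to be built via jpy:

from snappy import jpy

HashMap = jpy.get_type('java.util.HashMap')

def to_hashmap(py_dict):
    ''' Copy a plain Python dict into a java.util.HashMap for GPF.createProduct. '''
    jmap = HashMap()
    for key, value in py_dict.items():
        jmap.put(key, value)
    return jmap

# Placeholder parameter values
parameters = to_hashmap({
    'sourceProductPaths': 'D:\\some_product.dim',
    'outputAsciiFile': 'D:\\some_stats.csv',
})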