Draw histogram and save it as an image - Python script using ESA SNAP, not matplotlib

kedziorm · November 28, 2016, 11:57am

ESA STEP Desktop provides ability to draw a histogram and output basic descriptive statistics for the selected band in currently opened product.

I would like to do the same from Python script - I would like to save histogram to an image and output statistics to text file.

I can use GDAL library, Gdal Dataset.ReadAsArray() to obtain array of values and plot them using matplotlib library.

However, if possible, I would like to use ESA SNAP API.
How can I do that? Could you provide any samples?
Are there any possibilities in SNAP API to compare two products (in terms of descriptive statistics)?

antonio19812 · November 28, 2016, 5:06pm

Do you already know the snappy module?

kedziorm · November 28, 2016, 5:39pm

Yes, I use snappy, but I am not aware of methods/functions from snappy for drawing histograms or basic descriptive statistics calculations.

kedziorm · November 29, 2016, 1:00pm

@antonio19812 Could you provide any sample code for creating histogram using snappy module, please?

marpet · December 1, 2016, 7:58am

Unfortunately we don’t have a good reusable API for this.
We have it on our agenda for next year to make this and other things better usable via API and also via the command line.
So matplotlib is currently the best option to create such charts within Python.
If you really want to give it a try, you can look at the class HistogramPanel.
There we use JFreeChart for creating the charts. But I think it is not worth the effort.

kedziorm · December 1, 2016, 2:06pm

If I understand properly what’s going on in HistogramPanel I need to provide raw data set.
Before reading your response, I’ve created an histogram object:

but I have no idea how to get raw values of it or draw it.
I asked about it at Stack Overflow.

fabricebrito · December 20, 2016, 3:39pm

Hello!

I wonder if this notebook can help you: https://github.com/ec-everest/e-learning-modules/blob/develop/src/main/resources/e-learning/05%20Flood%20mapping%20with%20Sentinel-1.ipynb

Cheers

kedziorm · December 22, 2016, 3:07pm

Correct me if I am wrong, but you’re referring to the following line of code:
plt.hist(np.asarray(band_data, dtype='float'), bins=2048, range=[0.004, 0.3], normed=True)

Mentioned line of code uses matplotlib.pyplot and takes as an input a band_data.

My issue is that I wished to use histogram object:
Histogram = band_data.getStx().getHistogram()
and draw it uses native SNAP API.

So, I finally gave up and used matplotlib, but it was not my original question.

Cristina_Vrinceanu · August 22, 2018, 12:02pm

Hi there,

I am trying to do the same as user @kedziorm did, use the Histogram object to plot the histogram of a product. I need to mention that the product has been terrain corrected and I am trying to have the histogram of the Sigma0_VV band. I failed in calling the Histogram object and tried to do it in python instead, using matplotlib. The problem is that is that the histogram is not computing even though I get no error. It just runs for hours without any output. I get that the array is quite big (Height seems to be 10255 and Width 15704), but I still think this should not run for this long (more than 24 h).

My code is the following:
TCorr = ProductIO.readProduct(‘S1B_IW_GRDH_1SDV_20161015T165945_subset_TC.dim’)
tc= TCorr.getBand(‘Sigma0_VV_db’)
Width = tc.getRasterWidth()
Height = tc.getRasterHeight()
tc_array = np.zeros(Width*Height, dtype = np.float32)
tc.readPixels(0,0,Width, Height ,tc_array)
tc_array.shape = (Height, Width)
plt.hist(np.asarray(tc_array, dtype=‘float32’), bins=512, normed=True)
plt.xlabel(‘Sigma0_VV_db in intensity_db’)
plt.ylabel(‘Frequency in #pixels’)
plt.title(‘Histogram for Sigma0_VV_db’)
plt.axis([-25, 25, 0, 500000])
plt.grid(True)
plt.show()

I am also using this in Jupyter Notebooks, with Python 2. My computer is not that powerful either, memory is just 8GB. I have read that Jupyter is not very friendly with large datasets and it takes a while to compute and also Python 2 is a bit lazy on that, but do you think this may be the problem? I have also tried to perform the same step in an IDE and it still didn’t compute anything for a long while. Or maybe is simply my code, though. I have tried with other libraries as well (e.g plotly) and it still doesn’t compute anything after a long wait.

Any ideas?

Cheers,
Cristina

jmendrok · August 23, 2018, 11:29am

Hi Cristina,

to test whether it’s your code in general, I suggest to try it out on a subset/slice of your big tc_array. If you find your code is fine and big data is indeed the problem, you could compute histograms separately for individual subsets and sum them up afterwards (that is, catching and keeping the output of plt.hist; see plt.hist doc). Particularly when doing so, I’d also rather setup the histogram bins beforehand and not let plt.hist determine that by itself.

that’s how I would try…

wishes,
Jana

Cristina_Vrinceanu · August 24, 2018, 12:04pm

Hi Jana,

In the end this what I did. I sliced my numpy array by extracting only those values that were falling in the [0, -30] interval (as I have checked the values of my pixels in SNAP and most of them were under 0). I have managed to plot the histogram. I guess the problem was the size of the data, but I still don’t understand if this was for sure. My initial array had 161044520 elements and the new one is 121587004 elements. There is a considerable difference of almost 40 mil elements, but I given the sizes, I don’t see why this was such a problem.

Anyways, thank you! I am still beginner level and learning a lot, so any advice is much appreciated.

Cheers,
Cristina

jmendrok · August 29, 2018, 9:52am

Hi Cristina,

what you did rather sounds like masking instead of slicing. With slicing I mean dividing your big image in a number of smaller parts (e.g. handing over tc_array[i*Height/nslices:(i+1)*Height/nslices,:] to plt.hist and caching its output in a loop over i). but if your way worked, that’s fine, too, I guess

with S1 and S2 data sizes just at the brink of available memory of typical PCs, sometimes a little more or less big arrays to handle tip the memory over the edge in the one or other direction…

No worries about beginner level - we all started once (and even after >10yrs using python, I still google for how to do things quite often. and often rather ‘trivial’ stuff. stackoverflow has helped me innumerable times.) and you seem to find you way pretty well

wishes,
J.