Memory usage, computed bands and databuffers

Hi all,

I have been struggling with some memory issues for a while now. I have followed this related topic closely, but it hasn’t helped me in the way I had hoped. Also, perhaps my question is a bit more fundamental.

My question is the following:

What is the difference between a user computed band and an original one, in terms of memory usage?

Consider the following script in python:

import numpy
from snappy import Product
from snappy import ProductIO
from snappy import ProductData
from snappy import jpy

# Read product
product = ProductIO.readProduct(path)

# Set BandMaths expression
expression = "B2 + B3 + B4"

# Create new band
product.addBand('X', expression, ProductData.TYPE_FLOAT32)

# Get the band
band = product.getBand('X')

# Of course we need the raster size
width = band.getRasterWidth()
height = band.getRasterHeight()

# The following code is an example of one in which the error occurs, but is surely not
# limited to it. It pops up in any operation involving reading the raster data.

array = numpy.zeros((width,height), dtype=numpy.float32)

for row in range(height):
    band.readPixels(0, row, width, 1, array[row])

This returns an error: RuntimeError: java.lang.RuntimeException: Cannot construct DataBuffer

Surely this is related to some memory issue of sorts. However, if I set ‘X’ to be one of the original bands (say, B2), everything works fine. This tells me there is some fundamental difference between the two.

I have tried many different approaches to solving this problem:

  • Use bandMathsOp operators to create new band.
  • Convert computed band to ‘real’ band (see this thread)
  • Set the bits per pixel to a lower number.
  • Read in tile by tile.
  • Use FileTileCache.
  • Set expression to just a single band, e.g. ‘B2’.
  • Many more…
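
For reference, the "read in tile by tile" item could be organised roughly as below. This is only a sketch of one way to split the raster into row blocks, not necessarily what was tried in this thread. The helper name `row_blocks` and the block size are hypothetical, and the snappy calls in the trailing comment are untested assumptions.

```python
def row_blocks(height, block_rows):
    """Yield (y, h) pairs that cover rows 0..height-1 in blocks of at most block_rows rows."""
    if block_rows <= 0:
        raise ValueError("block_rows must be positive")
    for y in range(0, height, block_rows):
        # The last block may be shorter than block_rows.
        yield y, min(block_rows, height - y)


# Example with a small, made-up raster height.
blocks = list(row_blocks(10, 4))
print(blocks)

# Untested usage idea with snappy (assumes `band`, `width`, `height` and a
# (height, width) float32 `array` already exist):
# for y, h in row_blocks(height, 256):
#     buf = numpy.zeros(width * h, dtype=numpy.float32)
#     band.readPixels(0, y, width, h, buf)
#     array[y:y + h] = buf.reshape(h, width)
```

Reading several rows per call reduces the number of calls into the JVM, but it does not by itself reduce how much memory the computed band needs on the Java side.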

I have also tried to solve the lack of memory issues according to all proposed methods in the aforementioned thread.

  • Adjust JVM parameters.
  • Adjust Snappy config files
  • Disable, enable filetilecache
  • Etc.

I’m curious to hear about your ideas.

Cheers,

Danny

Wow, you’ve already tried a lot. I can imagine how frustrating this is.
And I really wonder why nothing helped. Increasing the memory, at least, should have helped.

We know that we have some memory issues, and we want to address them for the next release, but your example should actually work.

Before I come to another possible solution, let me explain the difference between the computed band and the original band.
If you use the original band, it can be read directly from the file.
For the computed band, all three source bands need to be loaded into memory. The data type differs as well: for S2 MSI the raw data is 16-bit integer, while the computed band is float32.
You said you also tried setting the expression to just B2, which didn’t work either. Even in that case the virtual band adds a layer of memory use compared to reading the band directly.
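
To put rough numbers on this, here is a back-of-the-envelope sketch. The raster size of 10980 x 10980 pixels is an assumption for illustration; use the width and height of your own product. It also assumes whole bands are held in memory at once, which tiling and caching may avoid, so treat the figures as an upper bound rather than a measurement.

```python
# Bytes per sample for the two data types discussed above.
BYTES_PER_SAMPLE = {"int16": 2, "float32": 4}


def band_bytes(width, height, dtype):
    """Bytes needed to hold one full band of the given type in memory."""
    return width * height * BYTES_PER_SAMPLE[dtype]


def to_mib(n_bytes):
    """Convert bytes to mebibytes."""
    return n_bytes / (1024.0 * 1024.0)


# Assumed raster size (hypothetical; replace with your product's size).
width, height = 10980, 10980

original = band_bytes(width, height, "int16")     # one raw band read from file
computed = band_bytes(width, height, "float32")   # result of "B2 + B3 + B4"
sources = 3 * band_bytes(width, height, "int16")  # B2, B3 and B4 as inputs

print("one original band : %.0f MiB" % to_mib(original))
print("computed band     : %.0f MiB" % to_mib(computed))
print("computed + sources: %.0f MiB" % to_mib(computed + sources))
```

Under these assumptions the computed band alone takes twice the memory of a raw band, and the three inputs come on top of that.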

However, there may be a problem somewhere in the code. We don’t observe this error in general, though, only when, for example, special operators are used on the command line.

The additional solution mentioned above is to use the image associated with the band.
You can call

band.getGeophysicalImage().getData().getPixels(x, y, w, h, array)

Maybe this is more memory efficient.

Otherwise I’ve no idea what else you can do.

Thanks for the reply!

The differences between a computed band and a real band are indeed clear. This is why I used this topic to convert the computed band to a real band, but to no avail.

Surely most computed bands would need the data type to be float, considering computations will involve either division or multiplication by some floating point number. Still, I would agree that the data type must be the culprit. I have now set the target band data type to int16 (I also tried int8). The error now occurs a bit further into the processing, but the result is the same: Cannot construct DataBuffer. Also, 16 bits would not suffice for any more intricate calculation.

What is especially striking to me is that this does not work even when I set the expression to simply ‘B2’, set the data type to int16 and convert the computed band to a real band. The difference between the two must be incredibly subtle, because for example the ImageInfo shows no differences, yet the error persists.

The suggested command gives the same result. I am using the above example pretty much ‘as is’, so an error in the code seems unlikely to me, but I happily accept any which are pointed out.

Do you have any other suggestions for possible workarounds?

Thanks a lot.

Danny

Something just occurred to me.

Is it possible that it has to do with reading the data from memory? I am no expert on memory, but is it possible that reading from memory is more costly than reading from a file?

No, actually it should be cheaper to read from memory.

Do you have the complete stacktrace for this?
This would allow us to see exactly when the error occurs.

Hi Marco,

I doubt it is very helpful, but all I get is:

Traceback (most recent call last):
  File "bandMath.py", line 53, in <module>
    band.readPixels(0, row, width, 1, array[row])
RuntimeError: java.lang.RuntimeException: Cannot construct DataBuffer.

I also had it print the line number up until it crashes. Curiously though, with the int8 data type it gets less far than with 16-bit integers.

I’m really stumped at this problem. :sweat:

I’ve tried to replicate the problem and I also got the “Cannot construct DataBuffer” error.
After I edited jpyconfig.py I got it running. I set the following:
jvm_max_mem = 8000000
The file is located in the snappy folder.
Considering that this single band is more than 4GB in memory (at least in my use case), it is not astonishing that there was an error before.

It is pretty slow, but at least it runs. Honestly, I haven’t run it to the end; I wasn’t patient enough.

I also had to swap width and height in the numpy array initialisation:

array = numpy.zeros((height, width), dtype=numpy.float32)
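
A small sketch of why the order matters, using made-up stand-in sizes instead of a real band: numpy indexes rows first, so with shape (height, width) each array[row] is one scan line of width samples, which is what a one-row read expects. The assignment inside the loop is only a placeholder for the readPixels call.

```python
import numpy

# Small stand-in sizes (hypothetical); a real band is far larger.
width, height = 6, 4

# Rows first, then columns: array[row] is one scan line of `width` samples.
array = numpy.zeros((height, width), dtype=numpy.float32)

for row in range(height):
    line = array[row]      # a view, so writes end up in `array`
    line[:] = row          # placeholder for band.readPixels(0, row, width, 1, line)

# With the (width, height) shape, each array[row] holds `height` samples
# instead of `width`, so it does not match a one-row read of `width` pixels.
wrong = numpy.zeros((width, height), dtype=numpy.float32)

print(array[0].shape, wrong[0].shape)
```

For a square raster the two shapes coincide, so the mistake only shows up when width and height differ.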

Hi Marco,

Thanks for your suggestions. I have tried everything you have suggested and more, but it still gives me the error about 4/5 of the way through.

I think I will just give up on this approach and try finding workarounds instead.

Thanks again for your help.

Danny