Reprojecting S3 OLCI consumes an insane amount of memory on some scenes

My workflow is to use GPT to subset S3 OLCI products and then reproject them to the local UTM zone at 300 m resolution with a command like

gpt reproject -Presampling=Bilinear -PpixelSizeX=300 -PpixelSizeY=300 -Pcrs=epsg:32719 -Ssource=<infile> -t<outfile>
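As a side note for anyone scripting this: `gpt -h` (SNAP 6) lists a few options that bound memory use per call, notably `-c` (tile cache size), `-q` (maximum parallelism), and `-x` (clear the internal tile cache after writing a row of tiles). A minimal sketch of a batch wrapper that passes them explicitly; the file names and the chosen values are illustrative assumptions, and the exact flag semantics should be confirmed with `gpt -h` on your installation:

```python
import shlex
import subprocess

def build_gpt_command(infile, outfile, cache="2048M", parallelism=2):
    """Build a gpt reproject call with explicit cache/parallelism limits.

    -c, -q and -x are listed by `gpt -h` in SNAP 6; lowering cache size
    and parallelism bounds the JAI tile cache and the number of tiles
    computed concurrently (values here are assumptions, tune for your box).
    """
    return [
        "gpt", "reproject",
        "-c", cache,             # tile cache size, e.g. "2048M"
        "-q", str(parallelism),  # max number of parallel tile computations
        "-x",                    # drop cached tiles after each finished tile row
        "-Presampling=Bilinear",
        "-PpixelSizeX=300", "-PpixelSizeY=300",
        "-Pcrs=epsg:32719",
        f"-Ssource={infile}",
        "-t", outfile,
    ]

cmd = build_gpt_command("scene.SEN3", "scene_utm.dim")
print(shlex.join(cmd))
# subprocess.run(cmd, check=True)  # uncomment where gpt is on the PATH
```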

On some scenes, this works fine. On others, however, my -Xmx32G is not enough and I get this error:

INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Initializing external tool adapters
SEVERE: org.esa.s2tbx.dataio.gdal.activator.GDALDistributionInstaller: The environment variable LD_LIBRARY_PATH is not set. It must contain the current folder '.'.
INFO: org.hsqldb.persist.Logger: dataFileCache open start
INFO: org.esa.snap.core.gpf.common.WriteOp: Start writing product projected_S3A_OL_2_WFR____20180816T143016_20180816T143316_20180817T203805_0179_034_324_3240_MAR_O_NT_002 to /mnt/output/S3A_OL_2_WFR____20180816T143016_20180816T143316_20180817T203805_0179_034_324_3240_MAR_O_NT_002.dim
Writing...
INFO: java.util.prefs.FileSystemPreferences$1: Created user preferences directory.
20%....30%....40%.... done.
Exception in thread "SunTileScheduler0Standard13" java.lang.NullPointerException
        at com.sun.media.jai.util.SunCachedTile.<init>(SunCachedTile.java:80)
        at com.sun.media.jai.util.SunTileCache.add(SunTileCache.java:257)
        at javax.media.jai.OpImage.addTileToCache(OpImage.java:1087)
        at javax.media.jai.OpImage.getTile(OpImage.java:1142)
        at org.esa.snap.core.gpf.internal.OperatorExecutor$OperatorTileComputationListenerStack.tileComputed(OperatorExecutor.java:310)
        at com.sun.media.jai.util.RequestJob.compute(SunTileScheduler.java:278)
        at com.sun.media.jai.util.WorkerThread.run(SunTileScheduler.java:468)
org.esa.snap.core.gpf.OperatorException: Cannot construct DataBuffer.
        at org.esa.snap.core.gpf.internal.OperatorExecutor$GPFImagingListener.errorOccurred(OperatorExecutor.java:376)
        at com.sun.media.jai.util.SunTileScheduler.sendExceptionToListener(SunTileScheduler.java:1646)
        at com.sun.media.jai.util.SunTileScheduler.scheduleTile(SunTileScheduler.java:921)
        at javax.media.jai.OpImage.getTile(OpImage.java:1129)
        at javax.media.jai.PlanarImage.getData(PlanarImage.java:2085)
        at com.bc.ceres.glevel.MultiLevelImage.getData(MultiLevelImage.java:64)
        at org.esa.snap.core.gpf.internal.OperatorContext.getSourceTile(OperatorContext.java:407)
        at org.esa.snap.core.gpf.internal.OperatorContext.getSourceTile(OperatorContext.java:393)
        at org.esa.snap.core.gpf.internal.OperatorImage.computeRect(OperatorImage.java:73)
        at javax.media.jai.SourcelessOpImage.computeTile(SourcelessOpImage.java:137)
        at com.sun.media.jai.util.SunTileScheduler.scheduleTile(SunTileScheduler.java:904)
        at javax.media.jai.OpImage.getTile(OpImage.java:1129)
        at com.sun.media.jai.util.RequestJob.compute(SunTileScheduler.java:247)
        at com.sun.media.jai.util.WorkerThread.run(SunTileScheduler.java:468)
Caused by: java.lang.RuntimeException: Cannot construct DataBuffer.
        at com.sun.media.jai.util.DataBufferUtils.constructDataBuffer(DataBufferUtils.java:132)
        at com.sun.media.jai.util.DataBufferUtils.createDataBufferFloat(DataBufferUtils.java:214)
        at javax.media.jai.ComponentSampleModelJAI.createDataBuffer(ComponentSampleModelJAI.java:271)
        at javax.media.jai.RasterFactory.createWritableRaster(RasterFactory.java:691)
        at javax.media.jai.PlanarImage.createWritableRaster(PlanarImage.java:1982)
        at javax.media.jai.PointOpImage.computeTile(PointOpImage.java:771)
        at com.sun.media.jai.util.SunTileScheduler.scheduleTile(SunTileScheduler.java:904)
        ... 11 more

Error: Cannot construct DataBuffer.

When I then set -Xmx128G, the RAM usage actually goes up to that staggering number and the job succeeds. A product that fails in this way is S3A_OL_2_WFR____20180816T143016_20180816T143316_20180817T203805_0179_034_324_3240_MAR_O_NT_002

This could be related to the issue of reproject creating ghost pixels, which makes the image huge, although I have never seen those pixels in the final output.

How can this task consume so much memory? And how can I fix this? I cannot allocate 128 GB of memory for every reprojection job…
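For context, a back-of-envelope estimate of what "Cannot construct DataBuffer" implies. The numbers below are illustrative assumptions, not read from the actual product: a roughly 1300 km x 1300 km target box at 300 m is about 4333 px per side, and I assume ~30 float32 bands. If a bad geocoding inflates the target bounding box by ~3x per axis (the ghost-pixel scenario), the buffer requests grow ~9x:

```python
def raster_gib(width, height, bands, bytes_per_sample=4):
    """Memory needed to hold full float32 rasters in memory, in GiB."""
    return width * height * bands * bytes_per_sample / 2**30

# Illustrative numbers only (assumptions, not taken from the product):
normal = raster_gib(4333, 4333, 30)          # ~2.1 GiB
inflated = raster_gib(3 * 4333, 3 * 4333, 30)  # ~9x larger
print(f"{normal:.1f} GiB vs {inflated:.1f} GiB")
```

That factor alone would not reach 128 GB, but combined with per-band intermediate images and the tile cache it points in the direction of an exploded target raster rather than a leak.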

By the way, is the "20%....30%....40%.... done." line output from GDAL running behind the scenes?

Could this be related to the problem with GPT multithreading? I do see GPT spawning dozens of processes. But that should only have a negative impact on CPU resources, not on memory, shouldn't it? At least it is not the case that all the memory gets allocated as soon as the processes are up; it still grows gradually.

> see GPT spawning dozens of processes. But that should only have a negative impact on CPU resources

That's how it worked for me.
I've removed some of the INFO logging lines:

>gpt reproject -Presampling=Bilinear -PpixelSizeX=300 -PpixelSizeY=300 -Pcrs=epsg:32719 -Ssource="G:\EOData\SENTINEL3\OLCI\j0blue\S3A_OL_2_WFR____20180816T143016_20180816T143316_20180817T203805_0179_034_324_3240_MAR_O_NT_002.SEN3" -t "G:\EOData\temp\S3A_OL_2_WFR____20180816T143016.dim"
INFO: org.hsqldb.persist.Logger: dataFileCache open start
INFO: org.esa.snap.core.gpf.common.WriteOp: Start writing product projected_S3A_OL_2_WFR____20180816T143016_20180816T143316_20180817T203805_0179_034_324_3240_MAR_O_NT_002.SEN3 to G:\EOData\temp\S3A_OL_2_WFR____20180816T143016.dim
Writing...
....10%....20%....30%....40%....50%....60%....70%....80%....90%.... done.
INFO: org.esa.snap.core.gpf.common.WriteOp: End writing product S3A_OL_2_WFR____20180816T143016 to G:\EOData\temp\S3A_OL_2_WFR____20180816T143016.dim
INFO: org.esa.snap.core.gpf.common.WriteOp: Time: 2764.966 s total, 568.221 ms per line, 0.110120 ms per pixel

And these are my settings:

>gpt --diag
SNAP Release version 6.0
SNAP home: C:\Program Files\snap\bin/..
SNAP debug: null
SNAP log level: null
Java home: c:\program files\snap\jre
Java version: 1.8.0_102
Processors: 12
Max memory: 9.8 GB
Cache size: 1024.0 MB
Tile parallelism: 12
Tile size: 512 x 512 pixels

And the results look quite reasonable.

I'm doing this on Windows and you're on Unix, but that shouldn't make a difference.

The progress percentage numbers are from SNAP, not from GDAL.
No GDAL is involved in your processing request.


Thanks a lot for testing! Hmm, my settings are:

docker run dhigras/esa-snap --diag
INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Initializing external tool adapters
SEVERE: org.esa.s2tbx.dataio.gdal.activator.GDALDistributionInstaller: The environment variable LD_LIBRARY_PATH is not set. It must contain the current folder '.'.
SNAP Release version 6.0
SNAP home: /usr/local/snap/bin//..
SNAP debug: null
SNAP log level: null
Java home: /usr/local/snap/jre
Java version: 1.8.0_102
Processors: 2
Max memory: 39.1 GB
Cache size: 1024.0 MB
Tile parallelism: 2
Tile size: 512 x 512 pixels

It is not obvious that higher parallelism (you use 12 processors, I use 2) should resolve this, but I'll try.

No, actually higher parallelism will cause higher memory consumption.

Just for the record: we never found a way to resolve the issue of exploding memory consumption, and there were more problems with reprojection (e.g. extrapolation outside of the domain). So we coded up our own reprojection algorithm that goes from the NetCDF lat/lon grid of Sentinel-3 directly to UTM coordinates. It covers only a subset of what SNAP's reprojection can do, but it works more reliably.
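The idea can be sketched with numpy alone. This is not our actual implementation: it uses a spherical transverse-Mercator approximation of EPSG:32719 for illustration (production code should use a proper projection library such as pyproj), followed by nearest-neighbor binning of the swath pixels onto a 300 m grid:

```python
import numpy as np

R = 6371007.0                     # spherical Earth radius [m] (approximation)
K0 = 0.9996                       # UTM scale factor at the central meridian
LON0 = np.deg2rad(-69.0)          # central meridian of UTM zone 19
FE, FN = 500_000.0, 10_000_000.0  # false easting / false northing (south)

def lonlat_to_utm19s(lon_deg, lat_deg):
    """Spherical transverse-Mercator approximation of EPSG:32719.
    Good enough for a sketch; use pyproj for the real ellipsoidal case."""
    lon, lat = np.deg2rad(lon_deg), np.deg2rad(lat_deg)
    b = np.cos(lat) * np.sin(lon - LON0)
    x = FE + K0 * R * np.arctanh(b)
    y = FN + K0 * R * np.arctan2(np.tan(lat), np.cos(lon - LON0))
    return x, y

def bin_to_grid(lon, lat, data, res=300.0):
    """Nearest-neighbor binning of swath pixels onto a UTM grid."""
    x, y = lonlat_to_utm19s(lon, lat)
    x0, y0 = x.min(), y.min()
    ix = np.round((x - x0) / res).astype(int)
    iy = np.round((y - y0) / res).astype(int)
    grid = np.full((iy.max() + 1, ix.max() + 1), np.nan)
    grid[iy, ix] = data  # last sample wins; average per cell if needed
    return grid
```

Because the target grid is allocated from the actual min/max of the transformed coordinates, its size is bounded by the data itself, which avoids the exploding-bounding-box behavior.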

Could you share your solution?
I have 128 GB on the production machine, so it works, using up to ~96 GB of memory during a reprojection of an S3 Synergy product, but I would like to see a solution where I could split this up into smaller tasks.
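One way to split the work that stays inside GPT, as far as I can tell from the Subset operator's parameters: Subset accepts a geoRegion given as a WKT polygon in lon/lat, so you can cut the scene into tiles, reproject each small tile, and mosaic afterwards. A sketch that generates the tile polygons and the corresponding gpt calls; the bounding box, tile count, and output names are hypothetical, and `gpt Subset -h` should be checked for the exact parameter name on your version:

```python
def tile_polygons(lon_min, lat_min, lon_max, lat_max, nx=2, ny=2):
    """Split a lon/lat bounding box into nx*ny WKT polygons for gpt Subset."""
    tiles = []
    dlon = (lon_max - lon_min) / nx
    dlat = (lat_max - lat_min) / ny
    for j in range(ny):
        for i in range(nx):
            x0, x1 = lon_min + i * dlon, lon_min + (i + 1) * dlon
            y0, y1 = lat_min + j * dlat, lat_min + (j + 1) * dlat
            tiles.append(
                f"POLYGON(({x0} {y0}, {x1} {y0}, {x1} {y1}, "
                f"{x0} {y1}, {x0} {y0}))"
            )
    return tiles

def subset_commands(infile, tiles):
    """One gpt Subset call per tile; reproject each small output afterwards."""
    return [
        f'gpt Subset -PgeoRegion="{wkt}" -Ssource={infile} -t tile_{k}.dim'
        for k, wkt in enumerate(tiles)
    ]

for cmd in subset_commands("scene.SEN3",
                           tile_polygons(-72.0, -25.0, -66.0, -19.0)):
    print(cmd)
```

Each tile's reprojection then only needs memory proportional to that tile, at the cost of handling the seams in the mosaicking step.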