GPT OperatorException with Remove Thermal Noise

Hi all,

We are experiencing some intermittent issues with the Graph Processing Tool reporting “Error: org.esa.snap.core.gpf.OperatorException” from the Sentinel1RemoveThermalNoiseOp operator.

This problem is not reliably repeatable, and can occur on input files which have previously succeeded. Re-running the same graph on the same input after a failure does not reproduce the error.

We never experienced this problem with S1TBX v6.0.4 and earlier, so the issue was likely introduced after that version.


SNAP v6.0.9
S1TBX v6.0.6
snap.properties: snap.parallelism = 8
gpt.vmoptions: -Xmx30G

Normal output:

INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Initializing external tool adapters
SEVERE: org.esa.s2tbx.dataio.gdal.activator.GDALDistributionInstaller: The environment variable LD_LIBRARY_PATH does not contain the current folder ‘.’. Its value is ‘/usr/local/lib’.
Executing processing graph
INFO: org.hsqldb.persist.Logger: dataFileCache open start
version = 2.91
…10%…20%…30%…40%…50%…60%…70%…80%…90% done.

Error output:

INFO: org.esa.snap.core.gpf.operators.tooladapter.ToolAdapterIO: Initializing external tool adapters
SEVERE: org.esa.s2tbx.dataio.gdal.activator.GDALDistributionInstaller: The environment variable LD_LIBRARY_PATH does not contain the current folder ‘.’. Its value is ‘/usr/local/lib’.
Executing processing graph
INFO: org.hsqldb.persist.Logger: dataFileCache open start
version = 2.91
90% done.
org.esa.snap.core.gpf.OperatorException
at org.esa.snap.core.gpf.graph.GraphProcessor$GPFImagingListener.errorOccurred(GraphProcessor.java:363)
at com.sun.media.jai.util.SunTileScheduler.sendExceptionToListener(SunTileScheduler.java:1646)
at com.sun.media.jai.util.SunTileScheduler.scheduleTile(SunTileScheduler.java:921)
at javax.media.jai.OpImage.getTile(OpImage.java:1129)
at javax.media.jai.PlanarImage.getData(PlanarImage.java:2085)
at com.bc.ceres.glevel.MultiLevelImage.getData(MultiLevelImage.java:64)
at org.esa.snap.core.gpf.internal.OperatorContext.getSourceTile(OperatorContext.java:407)
at org.esa.snap.core.gpf.internal.OperatorContext.getSourceTile(OperatorContext.java:393)
at org.esa.snap.core.gpf.internal.OperatorImage.computeRect(OperatorImage.java:73)
at javax.media.jai.SourcelessOpImage.computeTile(SourcelessOpImage.java:137)
at com.sun.media.jai.util.SunTileScheduler.scheduleTile(SunTileScheduler.java:904)
at javax.media.jai.OpImage.getTile(OpImage.java:1129)
at com.sun.media.jai.util.RequestJob.compute(SunTileScheduler.java:247)
at com.sun.media.jai.util.WorkerThread.run(SunTileScheduler.java:468)
Caused by: org.esa.snap.core.gpf.OperatorException
at org.esa.s1tbx.calibration.gpf.Sentinel1RemoveThermalNoiseOp.computeTile(Sentinel1RemoveThermalNoiseOp.java:630)
at org.esa.snap.core.gpf.internal.OperatorImage.computeRect(OperatorImage.java:80)
at javax.media.jai.SourcelessOpImage.computeTile(SourcelessOpImage.java:137)
at com.sun.media.jai.util.SunTileScheduler.scheduleTile(SunTileScheduler.java:904)
… 11 more

Error: org.esa.snap.core.gpf.OperatorException

I notice that the failing run prints only “90% done.” rather than the full progress bar, and it does not create any output data.

I’ve done a bit of investigating and believe that the Thermal Noise operator is crashing during the populateNoiseMatrixForTOPSGRD function.

Modifying the code and printing the stack trace suggests that the faulty line is:

final double firstLineTime = t0Map.get(imageName);

However, I’m still not sure what causes this to fail so irregularly. Most of the time, it correctly generates a consistent value for all the final variables in this function.
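One plausible mechanism, offered as an assumption rather than something confirmed in this thread: if t0Map.get(imageName) ever returns null, assigning it to a primitive double throws a NullPointerException during auto-unboxing, and an exception with no message would match the bare OperatorException in the trace. The sketch below reproduces only that Java behaviour. The class name, the key strings and the plain HashMap are hypothetical stand-ins, not the actual S1TBX code.

```java
import java.util.HashMap;
import java.util.Map;

public class UnboxingDemo {

    // Returns true if looking up imageName fails the way the suspect line could.
    // t0Map here is a hypothetical stand-in for the operator's map of the same name.
    static boolean lookupThrowsNpe(Map<String, Double> t0Map, String imageName) {
        try {
            // Same shape as the suspect line: Map.get returns a Double (possibly null),
            // and assigning it to a primitive double forces unboxing.
            final double firstLineTime = t0Map.get(imageName);
            return false;
        } catch (NullPointerException e) {
            // Unboxing a null Double throws NullPointerException.
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Double> t0Map = new HashMap<>();
        t0Map.put("image-a", 1.5); // made-up key and value

        System.out.println(lookupThrowsNpe(t0Map, "image-a")); // prints false
        System.out.println(lookupThrowsNpe(t0Map, "image-b")); // prints true
    }
}
```

If this is what happens, the open question becomes why the lookup occasionally comes back empty for a key that is normally present.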


Thank you for the analysis Anthony. @lveci @jun_lu

Thanks. I haven’t been able to reproduce the problem. The map may not be thread safe.

Thanks for looking into this! It only occurs for me infrequently so it is difficult to reproduce.

Yes, you are probably correct. I have just added a print statement to the code, and it did indeed return a few of these lines to the output immediately after the exception occurred, suggesting that it is a particular thread that has failed.

I’ve had the problem occur on both 8 and 36 core CPUs. However, I’ll probably convert this operator to run with a single core for now.

@lveci I’ve edited the code to use instances of java.util.concurrent.ConcurrentHashMap, which is apparently more thread safe.
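A minimal sketch of the kind of change described above, assuming the map is shared between tile-computing threads. Everything except java.util.concurrent itself is hypothetical: the class, the loadT0 placeholder and the key names are for illustration and are not taken from Sentinel1RemoveThermalNoiseOp.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedMapDemo {

    // Hypothetical shared lookup table; ConcurrentHashMap allows concurrent reads and writes.
    private final Map<String, Double> t0Map = new ConcurrentHashMap<>();

    // Placeholder for whatever actually derives the value (an assumption for this sketch).
    static double loadT0(String imageName) {
        return imageName.length();
    }

    // computeIfAbsent is atomic on ConcurrentHashMap, so concurrent callers asking for
    // the same key all receive the same non-null value.
    double firstLineTime(String imageName) {
        return t0Map.computeIfAbsent(imageName, SharedMapDemo::loadT0);
    }

    // Hammers the map from several threads and counts wrong or failed lookups.
    static int countFailures(int threads, int keys) {
        SharedMapDemo demo = new SharedMapDemo();
        AtomicInteger failures = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int k = 0; k < keys; k++) {
                    String name = "image-" + k;
                    try {
                        if (demo.firstLineTime(name) != name.length()) {
                            failures.incrementAndGet();
                        }
                    } catch (RuntimeException e) {
                        failures.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return failures.get();
    }

    public static void main(String[] args) {
        System.out.println(countFailures(8, 1000)); // prints 0
    }
}
```

Swapping the map type only protects individual map operations. If the operator populates several maps together or relies on initialisation order, a synchronized initialisation step may still be needed.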

I’ve tested looping the operator and it has not failed so far. Does this sound like an appropriate change? Should I suggest a pull request?