Missing data in output when using file tile cache on Spark

Hi,

We’re experiencing issues when running GPT with the snap.gpf.useFileTileCache=true option on a Spark cluster. We use it to generate SLC coherence products, but sometimes part of the data is missing (as can be seen in the top-left corner of the picture below). The problem doesn’t occur consistently, even on the same input data, which rules out corrupt input data.

At first we thought this might be related to the location of the file tile cache: perhaps Spark was removing files from the /tmp directory while they were still in use by the processing. So we pointed java.io.tmpdir at a directory inside the current working directory, where other processes can’t interfere with it. But we’re still seeing the same issue.

Here is the full list of JVM parameters we use:

    -XX:MaxHeapFreeRatio=60
    -Dsnap.userdir=.
    -Dsnap.dataio.bigtiff.tiling.height=256
    -Dsnap.dataio.bigtiff.tiling.width=256
    -Dsnap.jai.defaultTileSize=128
    -Dsnap.jai.tileCacheSize=6000
    -Dsnap.dataio.bigtiff.compression.type=LZW
    -Dsnap.parallelism=16
    -Dsnap.gpf.useFileTileCache=true
    -Djava.io.tmpdir=./tmp
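
For context, the options reach the executor JVMs roughly as follows. This is a simplified sketch: the spark-submit wiring, class name and jar below are illustrative, not our exact deployment.

    # Sketch only: one way of passing the SNAP/JVM options to Spark executors.
    # Class name and jar are placeholders for our actual job.
    # Note: java.io.tmpdir=./tmp resolves against each executor's working
    # directory, so that directory must exist there before tiles are written.
    spark-submit \
      --class com.example.CoherenceJob \
      --conf "spark.executor.extraJavaOptions=-XX:MaxHeapFreeRatio=60 \
        -Dsnap.userdir=. -Dsnap.dataio.bigtiff.tiling.height=256 \
        -Dsnap.dataio.bigtiff.tiling.width=256 -Dsnap.jai.defaultTileSize=128 \
        -Dsnap.jai.tileCacheSize=6000 -Dsnap.dataio.bigtiff.compression.type=LZW \
        -Dsnap.parallelism=16 -Dsnap.gpf.useFileTileCache=true \
        -Djava.io.tmpdir=./tmp" \
      coherence-job.jar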

Do we need to set any other parameters, or do you have an idea what could be causing this?

Thanks in advance!

Hi Stijn,

Unfortunately I cannot give you a solution, but I can confirm that I see similar things happening (although not on a Spark cluster). Out of curiosity, do you process burst by burst or the full frame?

Best,
Andreas

Hi Andreas,

I’m new to SNAP and just working on the operational integration of the processing workflow. So I don’t know all the details, but here’s what our processing graph looks like. Hope this answers your question.
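
As far as I can tell it’s roughly the standard full-frame chain; here is a sketch using the stock S1TBX operator names (node parameters omitted, and the gpt call is illustrative, so details may differ from our actual graph.xml):

    # Sketch of the chain (stock S1TBX operator names; parameters omitted):
    #
    #   Read(master) + Read(slave)
    #     -> Apply-Orbit-File (each) -> Back-Geocoding
    #     -> Enhanced-Spectral-Diversity -> Coherence
    #     -> TOPSAR-Deburst -> Write
    #
    # Illustrative invocation (graph and file names are placeholders):
    gpt coherence_graph.xml -t coherence.tif -f GeoTIFF-BigTIFF master.zip slave.zip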

Regards,
Stijn

OK, then you are probably processing the full frame. You might not need the ESD (Enhanced Spectral Diversity) step for coherence only; it is only necessary for interferograms and is quite slow.
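
For coherence only, something like the following reduced chain could work (a sketch with placeholder file names, not a tested graph):

    # Same chain with the Enhanced-Spectral-Diversity node removed,
    # feeding Back-Geocoding output directly into Coherence:
    #
    #   Read(master) + Read(slave)
    #     -> Apply-Orbit-File (each) -> Back-Geocoding
    #     -> Coherence -> TOPSAR-Deburst -> Write
    #
    # Illustrative invocation (graph and file names are placeholders):
    gpt coherence_no_esd.xml -t coherence.tif -f GeoTIFF-BigTIFF master.zip slave.zip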

Best,
Andreas