Mosaic operator fails with a large image

In Forestry TEP we have used the following graph (via gpt) to mosaic and re-project Sentinel-2 imagery:

<graph id="S2_mosaic">
<version>1.0</version>
<node id="Mosaic">
    <operator>Mosaic</operator>
    <sources>
        <sourceProducts>${sourceProducts}</sourceProducts>
    </sources>
    <parameters>
        <variables>
            <variable>
                <name>B2</name>
                <expression>B2</expression>
            </variable>
            <variable>
                <name>B3</name>
                <expression>B3</expression>
            </variable>
            <variable>
                <name>B4</name>
                <expression>B4</expression>
            </variable>
            <variable>
                <name>B8</name>
                <expression>B8</expression>
            </variable>
        </variables>
        <combine>OR</combine>
        <crs>${epsg}</crs>
        <orthorectify>false</orthorectify>
        <resampling>Bilinear</resampling>
        <northBound>${northBound}</northBound>
        <southBound>${southBound}</southBound>
        <eastBound>${eastBound}</eastBound>
        <westBound>${westBound}</westBound>
        <pixelSizeX>${targetResolution}</pixelSizeX>
        <pixelSizeY>${targetResolution}</pixelSizeY>
    </parameters>
</node>
</graph>

With small areas or 20-m pixel spacing, this works. When we apply it to a 100-km tile of Sentinel-2 data (bands B2, B3, B4, B8) with 10-m pixel spacing on Linux computers, it does not complete. On my desktop Linux computer (16 GB of memory), the CPU load approaches 100 percent and the fans become clearly louder. The “java” process allocates slightly over 10 GB of memory, which is somewhat more than the expected size of the output image (a 64-bit, 8-band GeoTIFF-BigTIFF file; the process writes the first 20 MB or so, which is how I know these file details). The operating system does not start swapping. The input is a 4-band GeoTIFF-BigTIFF file. On the F-TEP servers I cannot hear the noise, but the job does not complete within half a day.

Running the equivalent mosaicking operation on the same image data in the GUI on my Windows computer produces the output image in about an hour. gpt refuses to co-operate on this computer, so I cannot test whether the problem is operating-system related.

Does anybody know what the problem could be? The SNAP version is 5.0.8 on my Linux computer and 5.0.0 on the F-TEP servers.

Ciao
Yrjö

I also tested with SNAP 6.0.0: the same result with gpt, no output in a reasonable time (just the TIFF header after about an hour). I then tested with the SNAP GUI (Raster/Geometric Operations/Mosaicing). That run completed in about an hour and produced an 8-band GeoTIFF-BigTIFF file (64 bits per pixel per band) of 5 878 602 329 bytes (almost 6 GB).

Output of tiffdump for the file produced by the SNAP GUI (in the file written by gpt, TileOffsets and TileByteCounts were all zero):

mosaicLin.tif:
Magic: 0x4d4d <big-endian> Version: 0x2b <BigTIFF>
OffsetSize: 0x8 Unused: 0
Directory 0: offset 16 (0x10) next 0 (0)
ImageWidth (256) SHORT (3) 1<9123>
ImageLength (257) SHORT (3) 1<9887>
BitsPerSample (258) SHORT (3) 8<64 64 64 64 64 64 64 64>
Compression (259) SHORT (3) 1<1>
Photometric (262) SHORT (3) 1<1>
SamplesPerPixel (277) SHORT (3) 1<8>
XResolution (282) RATIONAL (5) 1<1>
YResolution (283) RATIONAL (5) 1<1>
ResolutionUnit (296) SHORT (3) 1<1>
TileWidth (322) SHORT (3) 1<368>
TileLength (323) SHORT (3) 1<624>
TileOffsets (324) LONG8 (16) 400<23129 14719577 29416025 44112473
58808921 73505369 88201817 102898265 117594713 132291161 146987609
161684057 176380505 191076953 205773401 220469849 235166297 249862745
264559193 279255641 293952089 308648537 323344985 338041433 ...>
TileByteCounts (325) LONG8 (16) 400<14696448 14696448 14696448
14696448 14696448 14696448 14696448 14696448 14696448 14696448
14696448 14696448 14696448 14696448 14696448 14696448 14696448
14696448 14696448 14696448 14696448 14696448 14696448 14696448 ...>
SampleFormat (339) SHORT (3) 8<3 3 3 3 3 3 3 3>
34264 (0x85d8) DOUBLE (12) 16<10 0 0 305521 0 -10 0 6.8922e+06 0 0 0 0 0 0 0 1>
34735 (0x87af) SHORT (3) 24<1 1 2 5 1024 0 1 1 1025 0 1 1 1026 34737 22 22 3072 0 1 32635 3073 34737 22 0>
34737 (0x87b1) ASCII (2) 45<WGS 84 / UTM zone 35N|WG ...>
65000 (0xfde8) ASCII (2) 16081<<?xml version="1.0" enco ...>
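
As a sanity check on the listing above, the tile layout can be compared with the reported file size. This is only a sketch: every number is copied from the tiffdump output and from the file size quoted earlier, and it assumes pixel-interleaved tiles stored back to back after the first tile offset, with nothing written after the last tile.

```python
import math

# Values copied from the tiffdump listing above.
width, height = 9123, 9887          # ImageWidth, ImageLength
tile_w, tile_h = 368, 624           # TileWidth, TileLength
bands, bytes_per_sample = 8, 8      # SamplesPerPixel, BitsPerSample / 8
first_tile_offset = 23129           # first entry of TileOffsets
file_size = 5_878_602_329           # size reported for the GUI output

# Edge tiles are padded to full size, so round up in both directions.
tiles_across = math.ceil(width / tile_w)
tiles_down = math.ceil(height / tile_h)
tile_count = tiles_across * tiles_down

# Assumption: all 8 bands of a tile are stored together (pixel-interleaved).
tile_bytes = tile_w * tile_h * bands * bytes_per_sample

# Assumption: tiles are contiguous and nothing follows the last tile.
expected_size = first_tile_offset + tile_count * tile_bytes

print(tile_count, tile_bytes, expected_size == file_size)
```

The tile count (400) and tile size (14696448 bytes) agree with TileOffsets and TileByteCounts, and the total matches the file size exactly, so the GUI output looks like a complete, internally consistent BigTIFF.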

Could it be that, despite writing a valid BigTIFF header, the Mosaic module used by gpt is not handling 64-bit variables properly? Is the GUI Mosaicing tool supposed to use the same operator as gpt does when given the graph at the beginning of this thread?

Ciao
Yrjö

The theory about possible 64-bit problems seems to be wrong. Today I ran a test in which I shrank the area of interest so that the output file is 1) about 2.05 GB and 2) about 3.8 GB. Test 1) went smoothly, but test 2) did not complete. So it seems that SNAP/gpt/Mosaic has some internal bottleneck that prevents its use on large images. Maybe we must do the re-projection/mosaicking operations outside SNAP/gpt, e.g. by using gdalwarp and gdal_merge.py (which are not perfect either when fed with large data volumes, but they should work with a dataset of this size).
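
A minimal sketch of what the gdalwarp route could look like, written as a small Python helper that only builds the command line. The file names and bounds below are hypothetical placeholders, not values from our processing; the bounds are assumed to be given in the units of the target CRS. gdalwarp accepts several source files, so for this case it may be able to do the mosaicking and re-projection in one call.

```python
import shlex

def gdalwarp_command(sources, target, epsg, resolution, bounds):
    """Build a gdalwarp argument list.

    bounds = (west, south, east, north), assumed to be in target CRS units.
    """
    west, south, east, north = bounds
    return [
        "gdalwarp",
        "-t_srs", f"EPSG:{epsg}",                    # target CRS
        "-tr", str(resolution), str(resolution),     # pixel size in x and y
        "-te", str(west), str(south), str(east), str(north),
        "-r", "bilinear",                            # same resampling as in the graph
        "-of", "GTiff",
        "-co", "BIGTIFF=YES",                        # allow output larger than 4 GB
        "-co", "TILED=YES",
        *sources,
        target,
    ]

# Hypothetical inputs and bounds, for illustration only.
cmd = gdalwarp_command(
    ["input_a.tif", "input_b.tif"], "mosaic.tif",
    epsg=32635, resolution=10,
    bounds=(300000, 6800000, 400000, 6900000),
)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) once the placeholders are replaced with real paths and bounds.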

Ciao
Yrjö

The cause of this issue might be the GeoTIFF format; we have also seen some problems with it.
Would it be possible for you to try the same with the BEAM-DIMAP format and only change the format to GeoTIFF as the last step?
I know that we have created mosaic images bigger than 10000x10000, so I think the problem is not the Mosaic operation itself.
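
A sketch of the first step of that suggestion: the same graph run through gpt, but writing BEAM-DIMAP instead of GeoTIFF. The helper only builds the command line. The graph file name, source file, target name and all parameter values are hypothetical placeholders; the parameter names are the ones used in the graph at the top of the thread. The later conversion of the .dim product to GeoTIFF is a separate step and is not shown.

```python
import shlex

def gpt_command(graph, sources, target, params, fmt="BEAM-DIMAP"):
    """Build a gpt argument list that runs a graph and writes the given format."""
    args = ["gpt", graph]
    # Each ${name} placeholder in the graph is filled with -Pname=value.
    args += [f"-P{name}={value}" for name, value in params.items()]
    args += ["-t", target, "-f", fmt]
    # Source products are passed as positional arguments.
    args += list(sources)
    return args

# Hypothetical values, for illustration only.
gpt_cmd = gpt_command(
    "S2_mosaic.xml",
    ["input_a.tif"],
    "mosaic.dim",
    {
        "epsg": "EPSG:32635",
        "northBound": 62.0,
        "southBound": 61.0,
        "eastBound": 26.0,
        "westBound": 24.0,
        "targetResolution": 10,
    },
)
print(shlex.join(gpt_cmd))
```

If this run completes where the GeoTIFF run did not, that would point at the writer rather than at the Mosaic operator.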

Processing behaving differently in the GUI and on the command line has been seen before (SNAP-640). Hopefully we can address it in the next dev cycle.