S3 OLCI EFR poor dynamic range

Hi,

I am looking at the S3 OLCI EFR data which Google now ingests into Google Earth Engine (see S3 EFR in Google Earth Engine). Simon told me that the data are ingested “as is” (i.e. no changes are made to the data after download).

I am puzzled by the [very] poor dynamic range of the OaNN_radiance values, which, although they are stored as signed int32 (!), are typically in the range 50-300 (over land). Is there an (instrument) reason for this? Why is the effective range not used, even if one were to use UInt16? Indices (e.g. NDVI) derived from the current radiance (or reflectance) values will show a poor dynamic range as well.
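To illustrate, this is roughly how I check the value range of one band in Earth Engine (sketch only; I believe the collection ID is COPERNICUS/S3/OLCI and the bands are named Oa01_radiance ... Oa21_radiance, but treat those, the point and the dates as my assumptions):

# Untested sketch: inspect the value range of one OLCI radiance band in GEE.
import ee

ee.Initialize()

point = ee.Geometry.Point(9.0, 17.0)   # somewhere in Niger (arbitrary)
img = ee.Image(ee.ImageCollection('COPERNICUS/S3/OLCI')
               .filterBounds(point)
               .filterDate('2017-06-20', '2017-06-22')
               .first())

stats = img.select('Oa08_radiance').reduceRegion(
    reducer=ee.Reducer.minMax(),
    geometry=point.buffer(50000),   # 50 km neighbourhood
    scale=300)                      # OLCI FR pixel size
print(stats.getInfo())              # min/max stay in a narrow range over land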

(NB we had a similar issue with early S2, which was poorly scaled).

Guido

Here’s an example for a pixel in Niger on 21 June (i.e. the summer solstice):

Note that the int32 TIFFs are produced by the toolbox. The original product contains UInt16 NetCDF files.

OK, the s3tbx conversion from uint16 to int32 is a bad choice. But the dynamic range is equally poor in the .nc files (which don’t open with GDAL; you need netcdf-bin). Each band has a radiance scale factor (a float, the same in the .nc and the .tiff) which is apparently set too high, so that everything gets compressed into a limited range. So it looks like a processing or sensor issue.
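To show what I mean, this is roughly how I inspect the scale factor and the (scaled) value range of a single band file (sketch only; Oa08_radiance.nc is just an example, and I assume the variable name matches the file name, as in the conversion script further down):

# Untested sketch: print the packing attributes and the scaled value range
# of one OLCI radiance band (e.g. Oa08_radiance.nc).
from netCDF4 import Dataset

nc = Dataset('Oa08_radiance.nc', 'r')
var = nc.variables['Oa08_radiance']

print(var.dtype, var.scale_factor, var.add_offset, var._FillValue)

data = var[:]                   # auto-scaled (and masked) by netCDF4
print(data.min(), data.max())   # the "50-300" range I see over land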

NB. A lossless uint16 compressed TIFF is half the size of the .nc version (a quarter of the s3tbx output).

Your values look like a very nice TOA radiance spectrum. Maybe the values don’t need to be scaled anymore?
I don’t know how the tiff files are created.

Python code below for .nc to uint16 .tif conversion. I received an answer from EO Helpdesk, which wasn’t very clear. Will post that tomorrow.

from netCDF4 import Dataset
import sys
import numpy as np
from osgeo import gdal

fname = sys.argv[1]
nc_in = Dataset(fname, 'r')

# The variable name is assumed to match the file name (minus the extension)
b = nc_in.variables[fname.replace('.nc', '')][:].astype(np.uint16)

# Fill value used for "no data" pixels
nodata = nc_in.variables[fname.replace('.nc', '')]._FillValue.astype(np.uint16)

print(nodata)
print(b.shape, b.dtype)
print(np.min(b), np.max(b))

driver = gdal.GetDriverByName('GTiff')
ds = driver.Create(fname.replace('.nc', '.tif'),
                   b.shape[1], b.shape[0], 1, gdal.GDT_UInt16)

outband = ds.GetRasterBand(1)
outband.SetStatistics(float(np.min(b)), float(np.max(b)),
                      float(np.average(b[np.where(b < nodata)])),
                      float(np.std(b[np.where(b < nodata)])))
outband.WriteArray(b.astype(np.uint16))

# Flush and close the output file
ds = None

I don’t know the NetCDF Python API well, but from a quick look at the code it could be that in this line

b = nc_in.variables[fname.replace('.nc', '')][:][:].astype(np.uint16)

the scaled float32 radiance values are turned into uint16 by just cutting off the decimal fraction. It could be that the library scales the data automatically.
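If that is the case, reading the variable with the automatic scaling switched off should give you the raw packed integers instead. Roughly like this (just a sketch; the file/variable name Oa08_radiance is only an example):

# Sketch: read the raw packed uint16 values instead of the auto-scaled floats.
from netCDF4 import Dataset

nc_in = Dataset('Oa08_radiance.nc', 'r')   # example band file
var = nc_in.variables['Oa08_radiance']

var.set_auto_maskandscale(False)           # keep the on-disk uint16 values
raw = var[:]
print(raw.dtype, raw.min(), raw.max())     # uint16, full packed range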

Thanks Marco!

This was the answer from EO support:

According to a similar question posted by yourself on the STEP portal (http://forum.step.esa.int/t/s3-olci-efr-poor-dynamic-range/7755) with more details, it seems that there is a confusion between values extracted from products using appropriate interfaces (that includes data unpacking) and internal coding within the netCDF files.
Indeed, data are coded inside the netCDF files as 16-bit integers, i.e. within [0, 65534], with 65535 (= 2^16 − 1) representing the “no data” value.
 
However, the values reported by the user as “between 50 and 300” are radiance values after unpacking, i.e. after having applied the scaling factors embedded in the netCDF files. This interpretation is confirmed by looking at the spectrum of a pixel over Niger provided by the user in his post, in which one can see typical TOA radiance values over bare soils. The scaling factors used in packing the OLCI Level 1 radiances are derived so as to accommodate the instrument’s actual dynamic range.
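In other words, the files use the usual netCDF/CF packing convention, with scale_factor and add_offset stored as attributes on each radiance variable (the numbers in the little example below are made up):

# Unpacking (what netCDF4 does automatically when reading):
#   radiance = packed * scale_factor + add_offset
# Packing (what the ground segment does when writing):
#   packed = round((radiance - add_offset) / scale_factor)
scale_factor = 0.01   # made-up value, for illustration only
add_offset = 0.0
packed = 12345
radiance = packed * scale_factor + add_offset
print(radiance)                                        # 123.45
print(round((radiance - add_offset) / scale_factor))   # back to 12345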

So now I understand that I would have to actually read the data from the .nc file as floats and do some more sophisticated scaling if I want to use the effective range of a uint16. I’ll try that out and let you know if this solves the issue.

Guido

OK, I think I have solved this now. The .nc file is read in as float32 (netCDF4 knows how to multiply the stored uint16 values by the scale factor). So, in order to get back a uint16 GeoTIFF, you need to divide by the scale factor again. The GeoTIFF is slightly larger than the .nc, but that’s OK; it probably depends on the compression settings.

from netCDF4 import Dataset
import sys
import numpy as np
from osgeo import gdal

fname = sys.argv[1]
nc_in = Dataset(fname, 'r')

# netCDF4 applies scale_factor/add_offset on read, so b comes back as float32
b = nc_in.variables[fname.replace('.nc', '')][:]

nodata = nc_in.variables[fname.replace('.nc', '')]._FillValue.astype(np.uint16)
scalef = nc_in.variables[fname.replace('.nc', '')].scale_factor
offset = nc_in.variables[fname.replace('.nc', '')].add_offset

print(nodata, scalef, offset, scalef * nodata)
print(b.shape, b.dtype)
print(np.min(b), np.max(b))

driver = gdal.GetDriverByName('GTiff')
ds = driver.Create(fname.replace('.nc', '.tif'),
                   b.shape[1], b.shape[0], 1, gdal.GDT_UInt16,
                   options=["TILED=YES", "COMPRESS=DEFLATE"])

# Undo the scaling so the values fit back into uint16
# (assuming add_offset is 0, as printed above)
b = b / scalef

outband = ds.GetRasterBand(1)
outband.SetStatistics(float(np.min(b)), float(np.max(b)),
                      float(np.average(b[np.where(b < nodata)])),
                      float(np.std(b[np.where(b < nodata)])))
outband.WriteArray(b.astype(np.uint16))

# Flush and close the output file
ds = None
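One thing I still want to try (untested): adding the standard GDAL PREDICTOR=2 and ZLEVEL=9 creation options to the DEFLATE compression, which often helps for smoothly varying integer rasters and might bring the GeoTIFF below the .nc size. It would just be a drop-in replacement for the Create() call above:

# Same Create() call as above, with extra standard GTiff creation options.
ds = driver.Create(fname.replace('.nc', '.tif'),
                   b.shape[1], b.shape[0], 1, gdal.GDT_UInt16,
                   options=["TILED=YES", "COMPRESS=DEFLATE",
                            "PREDICTOR=2", "ZLEVEL=9"])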