I am processing S2A data to LAI and FVC using the Biophysical Processor within SNAP. I have saved my LAI band and will save the FVC bands to separate GeoTIFFs. One of the outputs will be an LAI and FVC band with flags, and one will have the flagged pixels masked out of the product. Where can I find information on what the flag codes mean? I thought there were 8 flags as described in Index Codings>quality_scene_classification, but when opening an exported LAI-with-flags GeoTIFF I noticed that band 2 had a value of 16 over a few pixels. This makes me question whether I am looking at the correct flag codings, or whether there is a different dataset I need to look at. I would be interested in the same information for the FVC output if the flags are different.
Also, I noticed some datasets contain values, not flagged, outside of the valid swath range. I am wondering what these values represent and why they aren’t flagged as no data or masked out as no data pixels?
I believe I have found what I am looking for. Under Flag Codings there are both the lai_flags and fcover_flags descriptors. This is exactly what I was looking for; however, I am still unsure why pixels outside of the detector swath are being classified as valid and have actual values, rather than being flagged, when it is clear they are not within the actual swath of the sensor. Also, is there somewhere I could find a more detailed description of the flags for fcover and lai? I am considering including areas which have been flagged with code 2 and 4 (as they are within tolerance) in my output, but want to make sure this is best practice before doing so.
The Biophysical Processor uses binary coding to flag pixels that are outside the definition range. Binary coding is explained in the collection of FAQs (Link).
The pixel values in the flag bands have the following meaning:
1 = Input is out of definition domain
2 = Output is lesser than minimum output, but within tolerance
3 = Combination of 1+2
4 = Output is greater than maximum output, but within tolerance
5 = Combination of 1+4
8 = Output is too low
9 = Combination of 1+8
16 = Output is too high
17 = Combination of 1+16
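Since the values above are sums of powers of two, any pixel value can be split back into its individual flags with a bitwise AND. The following is a minimal Python sketch of that decoding. The bit values come from the list above; the dictionary name, the function name and the shortened descriptions are only illustrative and are not part of SNAP.

```python
# Bit values as listed above; the descriptions are shortened for illustration.
FLAG_BITS = {
    1: "input out of definition domain",
    2: "output below minimum, within tolerance",
    4: "output above maximum, within tolerance",
    8: "output too low",
    16: "output too high",
}


def decode_flags(value):
    """Return the descriptions of all flag bits set in one flag-band pixel value."""
    value = int(value)
    return [name for bit, name in FLAG_BITS.items() if value & bit]


# Example: print the decoded flags for a few pixel values.
for pixel_value in (0, 3, 16, 17):
    print(pixel_value, decode_flags(pixel_value))
```

A value of 0 decodes to an empty list, meaning no flag was raised for that pixel.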
I am not aware of any document that discusses when to include or exclude areas that have been flagged by the Processor. I’d also be happy if somebody (maybe @MartinoF?) could contribute to this discussion.
The Algorithm Theoretical Basis Document (Link) explains the definition domains from page 44 onwards.
I appreciate the detailed description of the flag codes, especially the combinations. I did not find that within the documentation. I would also be interested in furthering the discussion on when to exclude these flagged sites.
Regarding the areas outside of the instrument swath, I am wondering why these areas are not flagged and contain valid pixels? It doesn’t occur in every image, but in some, the outer areas of the imagery contain what appears to be valid pixel values for the lai and fcover bands. The values are usually quite low, but are not being flagged and are clearly outside of the viable swath or region which is collecting meaningful data. If it is difficult to understand what I am referring to, I can upload a screen-cap of a processed S2 image if it is permitted. I think one could clip the data, but I don’t understand why these pixels outside the viable swath are not being either flagged or removed as invalid or outside the data range. Any ideas on how to handle this issue would be appreciated.
The issue seems to be caused by the two input bands view_zenith_mean and view_azimuth_mean, as they contain valid pixel values beyond the area of the instrument swath.
Oddly enough, the Biophysical Processor generates numerical output even if all spectral bands contain no numbers (NaN), and only outputs NaN if all spectral bands + view_zenith_mean + view_azimuth_mean were set to NaN.
In my case, however, those pixels were flagged as being out of definition range.
One workaround could be to redefine the NoData area for all bands after resampling the entire scene to a common spatial resolution and before inputting the data into the Processor. This can be done using Masks>Land/Sea Mask, where the appropriate mask should be selected under Use Vector as Mask: the scl_nodata mask could work if relying on Sen2Cor, or the edge_mask_R1 if relying on MAJA correction.
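If the output has already been exported, a similar clean-up could be attempted afterwards outside SNAP. The sketch below uses NumPy and assumes that the LAI band and the resampled spectral bands have been loaded as arrays of identical shape, and that pixels outside the swath are NaN in every spectral band. That assumption should be checked first, as some products store 0 there instead. The function name is hypothetical.

```python
import numpy as np


def mask_outside_swath(lai, spectral_bands):
    """Set LAI to NaN wherever every spectral band is NaN (no measurement)."""
    # Stack the bands into shape (n_bands, rows, cols).
    stack = np.stack([np.asarray(band, dtype=float) for band in spectral_bands])
    # A pixel counts as "no data" only if all spectral bands are NaN there.
    no_data = np.all(np.isnan(stack), axis=0)
    return np.where(no_data, np.nan, np.asarray(lai, dtype=float))


# Tiny synthetic example: the first pixel has no spectral data at all.
lai = np.array([[0.8, 2.0]])
band_a = np.array([[np.nan, 0.10]])
band_b = np.array([[np.nan, np.nan]])
cleaned = mask_outside_swath(lai, [band_a, band_b])
```

In this example only the first pixel is set to NaN, because the second pixel still has a valid value in one spectral band.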
The information you have provided certainly helps.
I took a closer look at the data and found that the area outside the viable instrument swath actually contains very low values, but unfortunately they are not low enough to be isolated from valid pixels; otherwise I was planning to import the GeoTIFF into R and set pixels with values below a threshold to NaN. I have tried your suggestion of using Raster>Masks>Land/Sea Mask with the scl_nodata vector mask (making sure to invert the selection) to mask out pixels outside of the valid detector swath. I am curious why these areas are flagged in your data but not in mine. I am wondering if it has something to do with the level of pre-processing that was done within Sen2Cor between our datasets?
To make sure that we are on the same page, I uploaded a screenshot below. It shows the same scene twice:
The upper window pane contains a MAJA-corrected scene, where the zig-zag patterned, whitish area lies just outside the instrument swath and exhibits LAI values ~7. The lower window pane contains a Sen2Cor-corrected scene, where the zig-zag patterned, light red area exhibits LAI values ~0.8.
In both cases, resampling to 10 m resolution was the only preprocessing step taken.
The red, transparent layer indicates those pixels that were flagged as ‘out of definition domain’. In my case, all pixels outside the instrument’s swath were flagged, independent of the atmospheric correction.
As you can see, the entire area outside the viable detector domain is calculated as valid LAI data; you can see this in the left panel. The top image within the matched image view contains a processed lai dataset, and the image below it is the view_zenith_mean band within the resampled product. The view_zenith_mean within the area outside the viable detector swath is 0.0. As you had mentioned, applying a mask by inverting the scl_nodata band has removed this area from the final product.
I faced the same problem when exporting the LAI to GeoTIFF and opening it in QGIS. For one study area, band 2 contains the values 0 and 1; for the other study area, band 2 contains the values 0, 2, 8, and 16.
My biggest problem is understanding the concept of this band 2.
Basically, what do the different flags mean?
How can these flags affect the results?
Can we still trust this LAI? Or can we enhance the accuracy of the LAI using band 2?
I would appreciate it if someone makes it clear because I searched a lot and still could not find a good answer.
Concerning the meaning of the different pixel values in your flag band, please have a look at my reply above.
To my knowledge, the Biophysical Processor basically accepts any range of pixel values as input, outputs an LAI value for every pixel, and notifies the user in the flag band whether the input/output value(s) were consistent with the values contained in the database used to train the neural network.
Reasons for violation of the quality criteria (and thus flagged pixels) can be manifold, such as:
Input pixel is not a vegetation pixel / is a mixed pixel
Contamination by cloud / cloud shadow / snow / ice / water (I’ve had issues with crop fields that were partly inundated)
Poor atmospheric correction
Vegetation type and/or local soil characteristics were not adequately represented in the variable distribution of the underlying radiative transfer model
I believe that there is no clear answer of how to handle flagged pixels. Know your study area, know the vegetation type that you are investigating, have a rough idea about phenology and the range of feasible LAI values, check the original S2 scene and try to find an explanation for the flag(s), then make a decision.
It’s probably a safe bet to remove flagged pixels from your analysis, especially if they only constitute a minority of your study area.
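Removing flagged pixels can be sketched in a few lines of Python with NumPy, assuming the LAI band and the flag band (band 2 of the exported GeoTIFF) have already been loaded as arrays of the same shape. The function name and the allowed_bits parameter are hypothetical. Passing allowed_bits=2 | 4 keeps the 'within tolerance' pixels, which is a judgment call and not an established best practice.

```python
import numpy as np


def mask_flagged(lai, flags, allowed_bits=0):
    """Set LAI to NaN wherever a flag bit outside allowed_bits is set."""
    # Flag bands are sometimes exported as floats; cast to integers first.
    flags = np.asarray(flags).astype(np.int64)
    # Clear the tolerated bits, then reject any pixel with a bit still set.
    rejected = (flags & ~allowed_bits) != 0
    return np.where(rejected, np.nan, np.asarray(lai, dtype=float))


# Tiny synthetic example (values are made up, not from a real scene).
lai = np.array([[0.5, 3.2], [7.9, 0.1]])
flags = np.array([[0, 2], [16, 1]])

strict = mask_flagged(lai, flags)                        # drop every flagged pixel
tolerant = mask_flagged(lai, flags, allowed_bits=2 | 4)  # keep 'within tolerance'
```

In the strict variant only the unflagged pixel survives; in the tolerant variant the pixel flagged with 2 is kept as well.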
If your image contains a lot of flags that can’t be explained, chances are high that conditions in the study area (vegetation/soil) were not adequately represented in the distribution of the input variables for the radiative transfer model. However, you might still be able to study relative differences.
Keep in mind that the Biophysical Processor has been developed based on a compilation of data that might not realistically represent the actual distribution of values around the globe. It aims to provide reasonably good estimates for most vegetation types and conditions that can be observed worldwide.
The only way you can enhance the accuracy of the LAI estimates is by developing your own model for your vegetation type of interest.