Metadata for Sentinel2

I need access only to the metadata (e.g., cloud cover percentage) of Sentinel-2 for all of Canada since the satellite's launch, WITHOUT downloading the actual datasets.

I have been looking at https://scihub.copernicus.eu/userguide/5APIsAndBatchScripting
but I still have no idea what the most convenient way to do this is. Any ideas are truly appreciated.

Thanks

I think this is not easily doable.
The L1C data does not contain this information; you would need the L2A data. But I think not all data has been processed to this level yet, so you will not get good coverage.
However, if it is possible to download only the metadata file (MTD_MSIL2A.xml), you can analyse the XML.
It has a section like this one:

  <n1:L2A_Quality_Indicators_Info>
    <Image_Content_QI>
      <NODATA_PIXEL_PERCENTAGE>0.000070</NODATA_PIXEL_PERCENTAGE>
      <SATURATED_DEFECTIVE_PIXEL_PERCENTAGE>0.000000</SATURATED_DEFECTIVE_PIXEL_PERCENTAGE>
      <DARK_FEATURES_PERCENTAGE>8.940079</DARK_FEATURES_PERCENTAGE>
      <CLOUD_SHADOW_PERCENTAGE>1.408543</CLOUD_SHADOW_PERCENTAGE>
      <VEGETATION_PERCENTAGE>29.733669</VEGETATION_PERCENTAGE>
      <NOT_VEGETATED_PERCENTAGE>30.563015</NOT_VEGETATED_PERCENTAGE>
      <WATER_PERCENTAGE>21.459070</WATER_PERCENTAGE>
      <UNCLASSIFIED_PERCENTAGE>6.169005</UNCLASSIFIED_PERCENTAGE>
      <MEDIUM_PROBA_CLOUDS_PERCENTAGE>0.636100</MEDIUM_PROBA_CLOUDS_PERCENTAGE>
      <HIGH_PROBA_CLOUDS_PERCENTAGE>0.496362</HIGH_PROBA_CLOUDS_PERCENTAGE>
      <THIN_CIRRUS_PERCENTAGE>0.018557</THIN_CIRRUS_PERCENTAGE>
      <CLOUD_COVERAGE_PERCENTAGE>1.151019</CLOUD_COVERAGE_PERCENTAGE>
      <SNOW_ICE_PERCENTAGE>0.575596</SNOW_ICE_PERCENTAGE>
      <RADIATIVE_TRANSFER_ACCURAY>0.0</RADIATIVE_TRANSFER_ACCURAY>
      <WATER_VAPOUR_RETRIEVAL_ACCURACY>0.0</WATER_VAPOUR_RETRIEVAL_ACCURACY>
      <AOT_RETRIEVAL_ACCURACY>0.0</AOT_RETRIEVAL_ACCURACY>
    </Image_Content_QI>
  </n1:L2A_Quality_Indicators_Info>
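
A minimal Python sketch of how such a section could be read, using only the standard library. The function names and the shortened SAMPLE string are illustrative assumptions; SAMPLE keeps only the cloud-related fields of the excerpt above and starts at Image_Content_QI, because the n1 prefix is declared elsewhere in the real file. In this particular sample the total cloud coverage equals the sum of the medium, high and thin cirrus classes; whether that holds for every product is not confirmed here.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the excerpt above (cloud-related fields only).
SAMPLE = """<Image_Content_QI>
  <MEDIUM_PROBA_CLOUDS_PERCENTAGE>0.636100</MEDIUM_PROBA_CLOUDS_PERCENTAGE>
  <HIGH_PROBA_CLOUDS_PERCENTAGE>0.496362</HIGH_PROBA_CLOUDS_PERCENTAGE>
  <THIN_CIRRUS_PERCENTAGE>0.018557</THIN_CIRRUS_PERCENTAGE>
  <CLOUD_COVERAGE_PERCENTAGE>1.151019</CLOUD_COVERAGE_PERCENTAGE>
</Image_Content_QI>"""


def local_name(tag):
    """Strip an XML namespace such as '{uri}Name' down to 'Name'."""
    return tag.rsplit("}", 1)[-1]


def read_quality_indicators(xml_text):
    """Return {indicator name: float} for every numeric child of Image_Content_QI."""
    root = ET.fromstring(xml_text)
    values = {}
    for elem in root.iter():  # iter() also visits the root element itself
        if local_name(elem.tag) != "Image_Content_QI":
            continue
        for child in elem:
            try:
                values[local_name(child.tag)] = float(child.text)
            except (TypeError, ValueError):
                pass  # skip empty or non-numeric fields
    return values


qi = read_quality_indicators(SAMPLE)
cloud_classes = (
    qi["MEDIUM_PROBA_CLOUDS_PERCENTAGE"]
    + qi["HIGH_PROBA_CLOUDS_PERCENTAGE"]
    + qi["THIN_CIRRUS_PERCENTAGE"]
)
print(qi["CLOUD_COVERAGE_PERCENTAGE"], round(cloud_classes, 6))
```

For a real product, the text of MTD_MSIL2A.xml could be passed to read_quality_indicators() instead of SAMPLE.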

Hi SAR2016
My little Python tool Sentinel-download can retrieve the cloud cover from the Sentinel-2 hub catalogue.

Olivier

Thanks marpet and Olivier,

I've run sentinelsat on Linux, but I see that the entire datasets are still being downloaded.
Olivier, let me try your script. For Canada, only Level-1C is available. We just need the cloud coverage as a time series since Sentinel-2 was launched.

marpet,

Do you think it is doable to download only the metadata for the entire Level-1C dataset of Canada? We are only interested in the cloud coverage percentage. Thanks again.

Olivier,

By the way, do I need to download the entire dataset, or can I access just the metadata for the cloud coverage? We are trying to avoid the data download itself.
Thanks.

I don’t know. I don’t know the scripts or the capabilities of scihub. But the metadata of L1C will not help you: it doesn’t contain cloud statistics. Only L2A does.

marpet,

For testing, I have downloaded a tile in Canada, and it shows "NominalCloudCoverPercentage". Is this something different? This is the first time I am using Sentinel-2 for Canada. Thanks

Consider getting a Google Earth Engine account and then run this:

var s2 = ee.ImageCollection('COPERNICUS/S2');

// Area of interest: the LSIB boundary feature(s) whose country code 'cc' equals 'CA'
var aoi = ee.FeatureCollection('USDOS/LSIB/2013').filterMetadata('cc', 'equals', 'CA');

// Make date selection to keep list reasonably short
var s2_CA = s2.filterBounds(aoi).filterDate('2017-06-01', '2017-06-02');

// Prepare a table with image ID and CLOUDY_PIXEL_PERCENTAGE. Last argument = false drops the geometry
// Change to true if you want image footprint geometry
var list = ee.FeatureCollection(s2_CA).select(['system:index', 'CLOUDY_PIXEL_PERCENTAGE'], null, false);

// Print a sample (first 10 records)
print(list.limit(10));

Export.table.toDrive({collection: list, description: 'CA_cloud_stats', fileFormat: 'CSV'});

// If footprint included (false -> true in the select() call above)
//Export.table.toDrive({collection: list, description: 'CA_cloud_stats', fileFormat: 'KML'});

The export produces this CSV file:
system:index CLOUDY_PIXEL_PERCENTAGE .geo
20170601T143831_20170601T143830_T21TXN 100
20170601T143831_20170601T143830_T21TYM 91.052
20170601T143831_20170601T143830_T21TYN 99.9425
20170601T143831_20170601T143830_T21UXP 100
20170601T143831_20170601T143830_T21UXQ 100
20170601T143831_20170601T143830_T21UYP 99.8121
20170601T143831_20170601T143830_T21UYQ 99.1684
20170601T143831_20170601T143830_T21UYR 100
20170601T143831_20170601T143830_T22TCS 85.8623
20170601T143831_20170601T143830_T22TCT 91.0248
20170601T143831_20170601T143830_T22UCA 99.8204
20170601T143831_20170601T143830_T22UCU 99.6824

(314 records for just one day, so lots for the entire S2 period.) Generate it as KML and you even get a spatial idea.

Guido
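
Since the goal is a time series, the exported table above could be summarised per day with a few lines of Python. This is only a sketch under assumptions: that the file on disk is comma-separated with the header shown above, and that the first eight characters of system:index are the acquisition date (YYYYMMDD). The function name and the three-row SAMPLE are made up for illustration; the rows are copied from the listing above.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def daily_mean_cloud_cover(csv_file):
    """Return {date: mean CLOUDY_PIXEL_PERCENTAGE} for an exported table."""
    totals = defaultdict(lambda: [0.0, 0])  # date -> [sum, count]
    for row in csv.DictReader(csv_file):
        # Assumption: system:index starts with the date, e.g. 20170601T143831_...
        day = datetime.strptime(row["system:index"][:8], "%Y%m%d").date()
        totals[day][0] += float(row["CLOUDY_PIXEL_PERCENTAGE"])
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in sorted(totals.items())}


# Three rows from the listing above, written as comma-separated text (assumed layout).
SAMPLE = """system:index,CLOUDY_PIXEL_PERCENTAGE,.geo
20170601T143831_20170601T143830_T21TXN,100,
20170601T143831_20170601T143830_T21TYM,91.052,
20170601T143831_20170601T143830_T21TYN,99.9425,
"""

result = daily_mean_cloud_cover(io.StringIO(SAMPLE))
for day, mean in result.items():
    print(day, round(mean, 2))
```

For the real export, the function could be given an open file instead, e.g. open("CA_cloud_stats.csv", newline="").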

I’m not aware that this is stored anywhere in the metadata. At least I can’t find it there.
Maybe there is something in the documentation: https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi
It could also be that the application you are using (ArcMap?) is adding this information on the fly.

It is the Cloud_Coverage_Assessment element in MTD_MSIL1C.xml (used to be called METADATA.xml in earlier product versions?)

You can harvest just those MTD_MSIL1C.xml files with the batch scripts, run them through an XML parser, and extract the image ID and Cloud_Coverage_Assessment.
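
The parsing step described above might look like the following Python sketch, assuming the metadata files have already been fetched into a local folder tree. The function names are hypothetical, and using the parent folder name (the .SAFE directory) as the image ID is an assumption made for illustration, not something the product format prescribes.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def cloud_coverage_assessment(xml_path):
    """Return the Cloud_Coverage_Assessment value of one metadata file, or None."""
    for elem in ET.parse(xml_path).iter():
        # Compare the local name only, so a namespace prefix does not matter.
        if elem.tag.rsplit("}", 1)[-1] == "Cloud_Coverage_Assessment":
            return float(elem.text)
    return None


def harvest(root_dir, filename="MTD_MSIL1C.xml"):
    """Collect (product folder name, cloud cover) for every metadata file under root_dir."""
    rows = []
    for xml_path in sorted(Path(root_dir).rglob(filename)):
        # Assumption: the parent folder name identifies the image.
        rows.append((xml_path.parent.name, cloud_coverage_assessment(xml_path)))
    return rows
```

Calling harvest("downloads") on a hypothetical folder of harvested metadata files would return one (ID, cloud cover) pair per product, ready to be written to a CSV file.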

Thanks, now I found it in SNAP too.

With the -n option, the products are not downloaded.

I’ve been using
sentinelsat -u username -p password -s 20150101 -e 20171211 -d --sentinel 2 -g map.geojson --cloud 70 --url "https://scihub.copernicus.eu/dhus"

Which option should I add so that the images are not downloaded? Thanks

Would you please modify the following so that it only produces MTD_MSIL1C.xml?

sentinelsat -u username -p password -s 20150101 -e 20171211 -d --sentinel 2 -g map.geojson --cloud 70 --url "https://scihub.copernicus.eu/dhus"

Hi SARuser,
this is not the tool I recommended (mine is sentinel_download.py; the link is above). It seems you used sentinelsat, which is probably great, but I do not know it.

If you need a subset of product metadata, use the scihub OpenSearch REST API. It is fast, queryable by time, footprint, etc., and returns sufficient metadata in JSON encoding, including cloud coverage, e.g.
{"name":"cloudcoverpercentage","content":"79.8487"}
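
Once such a JSON response has been saved, the cloud cover values could be pulled out with a small recursive search, sketched below in Python. Only the inner name/content object is taken from the example above; the surrounding "entry", "title" and "double" layout in SAMPLE is a hypothetical stand-in, and the function name is illustrative. Searching by name rather than by fixed position avoids depending on the exact response layout.

```python
import json


def find_named_values(node, name):
    """Yield the 'content' of every {"name": name, "content": ...} object in parsed JSON."""
    if isinstance(node, dict):
        if node.get("name") == name and "content" in node:
            yield node["content"]
        for value in node.values():
            yield from find_named_values(value, name)
    elif isinstance(node, list):
        for item in node:
            yield from find_named_values(item, name)


# Hypothetical response fragment; only the name/content pair comes from the post above.
SAMPLE = (
    '{"entry": [{"title": "product-1", '
    '"double": {"name": "cloudcoverpercentage", "content": "79.8487"}}]}'
)

values = [float(v) for v in find_named_values(json.loads(SAMPLE), "cloudcoverpercentage")]
print(values)
```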

Hello, thanks for the information.
I downloaded a Sentinel-2 Level-1C product from 2015 (about 6 GB, with several tiles grouped together), but I only want to know the cloud cover percentage for a specific tile. How do I do that, and how do I open MTD_MSIL2A.xml?

For each granule, there is one folder within the GRANULE directory.
There you will find a file named like S2A_OPER_MTD_L1C_TL_MPS__20160528T125136_A004866_T34SGB.xml. In newer versions of S2 L1C data, this file is named MTD_TL.xml.
This file can be read with a text editor or parsed by software. Within this file, you will find a section like the following.

<n1:Quality_Indicators_Info metadataLevel="Standard">
<Image_Content_QI>
  <CLOUDY_PIXEL_PERCENTAGE>15.1933</CLOUDY_PIXEL_PERCENTAGE>
  <DEGRADED_MSI_DATA_PERCENTAGE>0</DEGRADED_MSI_DATA_PERCENTAGE>
</Image_Content_QI>

The file MTD_MSIL2A.xml does not exist in L1C data. The equivalent of it in 2015 L1C data looks like S2A_OPER_MTD_SAFL1C_PDMC_20160528T200856_R007_V20160528T090826_20160528T090826.xml.
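
As a rough Python sketch of the per-tile lookup described above: walk the GRANULE directory of one unpacked product, pick the tile metadata file in each granule folder (accepting both naming schemes mentioned), and read CLOUDY_PIXEL_PERCENTAGE. The function name and the file name pattern are assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Tile metadata: MTD_TL.xml in newer products, ..._MTD_L1C_TL_... .xml in older ones.
TILE_METADATA = re.compile(r"^(MTD_TL|.*_MTD_L1C_TL_.*)\.xml$")


def tile_cloud_cover(product_dir):
    """Return {granule folder name: CLOUDY_PIXEL_PERCENTAGE} for one L1C product."""
    result = {}
    for granule in sorted(Path(product_dir, "GRANULE").iterdir()):
        if not granule.is_dir():
            continue
        for xml_path in granule.glob("*.xml"):
            if not TILE_METADATA.match(xml_path.name):
                continue  # not a tile metadata file
            for elem in ET.parse(xml_path).iter():
                if elem.tag.rsplit("}", 1)[-1] == "CLOUDY_PIXEL_PERCENTAGE":
                    result[granule.name] = float(elem.text)
                    break
    return result
```

Printing tile_cloud_cover() for the path of an unpacked product folder would list every tile with its cloud percentage, so the tile of interest can be looked up by its folder name.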

Thanks marpet for the information.
Could you explain this message to me? I get it when running Sen2Cor on a specific tile dated 2015, extracted from a product folder that contains several tiles.

[Screenshot - 01_10_2018, 22_21_02]