How can we minimize elevation error between 2 DEMs in an urban area with tall buildings?

The problem

We are trying to compare multiple InSAR DEMs (derived from TerraSAR-X images) to identify topographic changes and measure volumetric change.

Before subtracting 2 InSAR DEMs, they must be calibrated so that they represent the same absolute elevation in areas that did not change. We’re struggling with this calibration step; e.g., tall buildings have very different heights in each InSAR DEM even after calibration (sometimes a positive height in one and a negative height in the other).

The data

We have computed InSAR DEMs from coherent TSX image pairs for an urban area in the Middle East. We then subtracted 2 InSAR DEMs to see where topography has changed and to be able to measure volumetric change.

We selected the images with the smallest temporal baseline and largest perpendicular baseline. The DEMs were created using ESA’s SNAP toolbox following these steps:

  • coregistration
  • interferogram computation
  • Goldstein phase filtering
  • multilooking
  • phase unwrapping (using snaphu + masking low coherence pixels)
  • phase to elevation (using SRTM 1 arcsec)
  • Range-Doppler Terrain Correction (using SRTM 1 arcsec)

NB: the same analysis was done with Iceye data as well. We observed similar calibration issues.

Tested methods

Before comparing 2 InSAR DEMs we aligned them, and calibrated them using SRTM DEM as ground truth. We tried different methods for the calibration:

  • method 1:
      ◦ computed the error between the InSAR DEM and the SRTM DEM
      ◦ fitted a plane to estimate the error (tried linear and quadratic fitting)
      ◦ removed the estimated error from the InSAR DEM
  • method 2: used the 3D coregistration methods of the xDEM Python library (NB: we tested all the 3D coregistration methods of that library)
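A minimal sketch of what the linear variant of method 1 might look like is given here for illustration. It assumes both DEMs are already resampled to the same grid and loaded as NumPy arrays with NaN for masked pixels; the function name `fit_and_remove_plane` and the optional `stable_mask` argument are hypothetical and not taken from SNAP or xDEM.

```python
import numpy as np

def fit_and_remove_plane(insar_dem, ref_dem, stable_mask=None):
    """Fit error = a*col + b*row + c by least squares and subtract it.

    insar_dem, ref_dem: 2D float arrays on the same grid (NaN = no data).
    stable_mask: optional boolean array, True where terrain is assumed stable.
    """
    error = insar_dem - ref_dem
    valid = np.isfinite(error)
    if stable_mask is not None:
        valid = valid & stable_mask.astype(bool)

    rows, cols = np.indices(error.shape)
    # Design matrix for a first-order (planar) error surface.
    A = np.column_stack([cols[valid], rows[valid], np.ones(valid.sum())])
    coeffs, *_ = np.linalg.lstsq(A, error[valid], rcond=None)

    # Evaluate the fitted plane on the full grid and remove it.
    plane = coeffs[0] * cols + coeffs[1] * rows + coeffs[2]
    return insar_dem - plane, coeffs
```

Restricting the fit to pixels believed to be stable (via `stable_mask`) keeps tall buildings and construction sites from biasing the estimated plane. A quadratic fit would add `cols**2`, `rows**2` and `cols*rows` columns to the design matrix.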

Both methods gave similar results.


After calibration we differenced the 2 InSAR DEMs. To assess the quality of the calibration we computed the RMSE of this difference. We expect the values of the difference to be close to 0 in most parts of the image, except where the terrain has changed (e.g., a construction site).
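The RMSE check described above could be sketched as follows, assuming the two calibrated DEMs are NumPy arrays on the same grid with NaN for low-coherence pixels. The function name `difference_stats` is hypothetical. The median and NMAD (normalized median absolute deviation) are an addition not mentioned above; they are included only because they are less sensitive than RMSE to a few large outliers such as tall buildings.

```python
import numpy as np

def difference_stats(dem_a, dem_b):
    """Return RMSE, median and NMAD of dem_a - dem_b over valid pixels."""
    diff = dem_a - dem_b
    valid = diff[np.isfinite(diff)]  # drop NaN (masked) pixels

    rmse = float(np.sqrt(np.mean(valid ** 2)))
    median = float(np.median(valid))
    # NMAD: robust spread estimate, comparable to a standard deviation
    # for normally distributed errors.
    nmad = float(1.4826 * np.median(np.abs(valid - median)))
    return {"rmse": rmse, "median": median, "nmad": nmad}
```

A large gap between RMSE and NMAD would suggest that the remaining error is dominated by a small number of outlier pixels rather than by a residual calibration bias.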

Below is an InSAR DEM difference before (left) and after (right) calibration.


  • The InSAR DEMs used above were derived from TSX images
  • Empty pixels correspond to areas where the InSAR coherence was too low to trust the estimated elevation

Conclusion and remaining challenge

With the calibration applied we were able to bring down the RMSE of InSAR DEM difference from ~10m to ~6m.

However, we keep having issues with tall buildings, as well as roads (maybe due to moving cars that corrupt the phase).

In the images below we have a tall building where the InSAR DEM difference remains above 15m (even after calibration).

These are zooms of the same images presented in the ‘result’ section

Possible cause

  • Phase unwrapping is difficult for urban areas with tall buildings
  • Layover, foreshortening and shadowing result in inaccurate height estimations

Does anyone have any ideas about what we could try next to solve this issue?

A few ideas we haven’t tried yet:

  • Mask out all tall buildings to avoid the problem (this assumes we have a good building mask…)
  • Take an area with no or very little elevation change (a desert area, or another stable non-urban area) and redo the DEM subtraction (to at least validate our calibration approach)
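The masking idea could be tested with something like the sketch below. It assumes a boolean building mask is available on the same grid as the DEM difference; the mask source and the function name `rmse_with_and_without_mask` are hypothetical.

```python
import numpy as np

def rmse_with_and_without_mask(diff, building_mask):
    """Compare RMSE of a DEM difference over all pixels vs. non-building pixels.

    diff: 2D float array (NaN = no data).
    building_mask: 2D boolean array, True where a tall building is present.
    """
    def rmse(values):
        values = values[np.isfinite(values)]
        return float(np.sqrt(np.mean(values ** 2)))

    # Set building pixels to NaN so they are excluded from the statistic.
    stable_only = np.where(building_mask, np.nan, diff)
    return rmse(diff), rmse(stable_only)
```

If the RMSE over non-building pixels is much lower than the RMSE over all pixels, the calibration itself is probably fine and the remaining error is concentrated on the buildings. In practice the mask may need to be grown by a few pixels, since layover displaces building returns away from the footprint.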

I don’t think DEM-generation can be made to work well in cities with tall buildings for the reasons you mention. It would be best to mask them out and do the assessment elsewhere as you suggest.

thanks @mengdahl for your response

@falahfakhri, any suggestions?

First of all, I advise reading this post’s details through to the finish.

The primary thing I want to emphasize is that the outcome shouldn’t be compared to SRTM.

According to my understanding, the answer is:

“We selected the images with the smallest temporal baseline and largest perpendicular baseline.”

Simply picking the largest available perpendicular baseline is not the best choice, because it may also introduce distortion.

I’d suggest the following:

1- Choose a different range of perpendicular baseline (PB): 200 m < PB < 500 m.
2- Two pairs are not enough for a comparison task.
3- Both pairs should have the same orbit direction (both ascending or both descending).
4- Both pairs should have incidence angles that are as close as possible.
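To get a feeling for how the perpendicular baseline interacts with building height, one can compute the height of ambiguity, i.e. the height difference that corresponds to one full 2π phase cycle. For a repeat-pass pair the usual approximation is h_a = λ · R · sin(θ) / (2 · PB). The sketch below uses that formula; the numeric values (X-band wavelength of about 3.1 cm, slant range of 620 km, incidence angle of 35°) are illustrative assumptions, not values taken from this thread.

```python
import math

def height_of_ambiguity(wavelength_m, slant_range_m, incidence_deg, perp_baseline_m):
    """Height change (m) producing one 2*pi phase cycle, repeat-pass geometry."""
    return (wavelength_m * slant_range_m * math.sin(math.radians(incidence_deg))
            / (2.0 * abs(perp_baseline_m)))

# Assumed, illustrative acquisition parameters (not from the original data).
WAVELENGTH = 0.031       # m, X-band
SLANT_RANGE = 620_000.0  # m
INCIDENCE = 35.0         # degrees

for pb in (200.0, 500.0):
    h_a = height_of_ambiguity(WAVELENGTH, SLANT_RANGE, INCIDENCE, pb)
    print(f"PB = {pb:.0f} m -> height of ambiguity ~ {h_a:.1f} m")
```

With these assumed values the height of ambiguity is roughly 28 m at PB = 200 m and roughly 11 m at PB = 500 m. A larger baseline gives better height sensitivity, but buildings taller than the height of ambiguity span more than one phase cycle, which makes unwrapping at their edges harder.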

Please have a look at the following terms:

“When you void vegetation and man-made features from elevation data, you generate a DEM. A bare-earth elevation model is particularly useful in hydrology, soils, and land use planning”

“But height can come from the top of buildings, tree canopy, powerlines, and other features. A DSM captures the natural and built features on the Earth’s surface.”

“(DEM) A digital elevation model is a bare-earth raster grid referenced to a vertical datum. When you filter out non-ground points such as bridges and roads, you get a smooth digital elevation model. The built (power lines, buildings, and towers) and natural (trees and other types of vegetation) aren’t included in a DEM.”

Your attempt to obtain the DSM, or the right building height, via subtraction of DEM is where the error is coming from. Building heights are determined by DSM, not DEM.

In some countries, a DTM is actually synonymous with a DEM. This means that a DTM is simply an elevation surface representing the bare earth referenced to a common vertical datum.
In the United States and other countries, a DTM has a slightly different meaning. A DTM is a vector data set composed of regularly spaced points and natural features such as ridges and breaklines. A DTM augments a DEM by including linear features of the bare-earth terrain.

Note: Car movement doesn’t have the effect you mentioned.

For the unwrapping process, Snaphu is still not too sophisticated, which could be the cause of the errors.

For more reference regarding the terminology:

Hope this helps.

Dear all, I have a DEM product made according to SNAP tutorials.
How do I perform elevation correction?
Can I use ground points obtained through a GNSS device?
I have little experience with the math calculator…
It’s nice to consider more options.
Some tutorials please.
Thank you
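One simple way to use GNSS ground points for a vertical correction is to estimate a constant offset between the DEM and the points and subtract it. The sketch below does this outside SNAP, in Python, and rests on several assumptions: the DEM has been exported to a NumPy array, the GNSS points have already been converted to pixel row/column indices, and both height sets refer to the same vertical datum (GNSS receivers usually report ellipsoidal heights, so a geoid conversion may be needed first). The function name `gnss_vertical_offset` is hypothetical.

```python
import numpy as np

def gnss_vertical_offset(dem, rows, cols, gnss_heights):
    """Median difference (DEM minus GNSS) at the GNSS point locations.

    dem: 2D float array (NaN = no data).
    rows, cols: integer arrays with the pixel indices of the GNSS points.
    gnss_heights: heights measured at those points, same datum as the DEM.
    """
    dem_at_points = dem[np.asarray(rows), np.asarray(cols)]
    residuals = dem_at_points - np.asarray(gnss_heights, dtype=float)
    residuals = residuals[np.isfinite(residuals)]  # skip points on masked pixels
    return float(np.median(residuals))

# Usage: corrected_dem = dem - gnss_vertical_offset(dem, rows, cols, gnss_heights)
```

The median is used rather than the mean so that a single bad point does not shift the whole DEM. If the residuals show a trend across the scene, a tilted plane may be a better model than a constant offset.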