Output format

When I subset or clip a raster using a shapefile in SNAP:
1- the output is one file, like the input.
2- the bands are named (VH and HH).

But when I clip the raster using a shapefile in ArcGIS:
1- the output is five files.
2- the bands are numbered from 1 to 6.

When I use ArcGIS, can I have (1) the output as one file, like the input, and (2) the bands named (VH and HH)?


The difference is that SNAP writes a GeoTIFF (*.tif), which stores the geocoding information inside the file itself.

ArcMap writes a plain TIFF file (also ending in *.tif) and stores the geocoding separately in the tfw file. The ovr file is created by ArcMap to allow faster image display while zooming, and the xml files store the current color coding. This is explained in detail here: https://desktop.arcgis.com/en/arcmap/10.3/manage-data/raster-and-images/auxiliary-files.htm
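
To make the role of the tfw file more concrete, here is a minimal Python sketch of how such a world file can be read. It assumes the usual six-line world file layout (coefficients A, D, B, E, C, F, one per line); the function names `parse_world_file` and `pixel_to_map` and the sample numbers are made up for illustration and are not part of SNAP or ArcGIS.

```python
from typing import NamedTuple, Tuple


class WorldFile(NamedTuple):
    """The six coefficients of a world file, in the order they appear."""
    a: float  # line 1: pixel size in x direction
    d: float  # line 2: rotation term (usually 0)
    b: float  # line 3: rotation term (usually 0)
    e: float  # line 4: pixel size in y direction (normally negative)
    c: float  # line 5: x coordinate of the centre of the upper-left pixel
    f: float  # line 6: y coordinate of the centre of the upper-left pixel


def parse_world_file(text: str) -> WorldFile:
    """Parse the text content of a .tfw file into its six coefficients."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError(f"expected 6 values, found {len(values)}")
    return WorldFile(*values)


def pixel_to_map(wf: WorldFile, col: float, row: float) -> Tuple[float, float]:
    """Convert a pixel position (col, row) to map coordinates (x, y)."""
    x = wf.a * col + wf.b * row + wf.c
    y = wf.d * col + wf.e * row + wf.f
    return x, y


# Hypothetical content of a .tfw file: 10 m pixels, no rotation.
sample = "10.0\n0.0\n0.0\n-10.0\n500005.0\n4999995.0\n"
wf = parse_world_file(sample)
print(pixel_to_map(wf, 2, 3))  # (500025.0, 4999965.0)
```

A GeoTIFF carries the equivalent information in tags inside the tif itself, which is why no separate tfw file is needed there.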

So basically, the tif files contain the same raster information, but their metadata is organized differently.
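
If it helps to see that the extra files all belong to one raster, the following rough Python sketch groups a list of file names by the raster they accompany. The suffix list and the file name `clip` are assumptions for illustration only; the exact set of auxiliary files ArcMap writes may differ on your system.

```python
from typing import Dict, Iterable, List

# Assumed suffixes of a TIFF and its auxiliary files (longest first, so that
# "name.tif.xml" is not matched by the plain ".tif" rule).
RASTER_SUFFIXES = (".tif.aux.xml", ".tif.ovr", ".tif.xml", ".tfw", ".tif")


def group_by_raster(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Group file names by raster base name; unrelated files are skipped."""
    groups: Dict[str, List[str]] = {}
    for name in filenames:
        lower = name.lower()
        for suffix in RASTER_SUFFIXES:
            if lower.endswith(suffix):
                stem = name[: len(name) - len(suffix)]
                groups.setdefault(stem, []).append(name)
                break
    return groups


# Hypothetical folder listing after a clip in ArcMap.
files = [
    "clip.tif",
    "clip.tfw",
    "clip.tif.ovr",
    "clip.tif.aux.xml",
    "clip.tif.xml",
    "notes.txt",
]
for stem, members in group_by_raster(files).items():
    print(stem, len(members))  # clip 5
```

Only the tif holds the pixel values; the other files are metadata and display helpers for that same raster.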