Principal Component Analysis

I cannot get PCA to finish processing, even after letting it run overnight.
I resampled the files to 10 m before processing, and I'm using the latest SNAP.
The files are S2A MSI L1C.

How many bands were used as input for the PCA?

I’ve tried everything from 2 bands to all bands, in many different combinations. I get the same
result with classification, both supervised and unsupervised. Sometimes classification gives me
a solid color or a stair-like color image corresponding to the number of vectors I used,
but 95% of the time it’s a blank image. I always use 7 or more vectors, each with at least 10,000 pixels. Frustrating.
Thanks for any help.

A PCA does not need any vectors, could you please clarify?

Yes, you’re right about the PCA. I thought that explaining the other image problems I am having
with classification might help figure out what I am doing wrong.

So do you get an error, or does it just take very long?

You should first check whether it works on a very small subset. It’s not uncommon for a PCA to take a tremendous amount of time, especially for large rasters and many features.
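
To illustrate the subset-first idea outside SNAP, here is a minimal NumPy sketch that cuts a small window out of a band stack and runs PCA on it. It is not SNAP's implementation: the function name `pca_on_subset`, the `(bands, rows, cols)` array layout, and the synthetic random data are all assumptions made for the example.

```python
import numpy as np

def pca_on_subset(stack, row0, col0, size, n_components=3):
    """Run PCA on a square window of a (bands, rows, cols) array (hypothetical helper)."""
    window = stack[:, row0:row0 + size, col0:col0 + size]
    n_bands = window.shape[0]
    # One row per pixel, one column per band.
    pixels = window.reshape(n_bands, -1).T.astype(np.float64)
    pixels -= pixels.mean(axis=0)
    # The covariance matrix is only n_bands x n_bands, but building it
    # touches every pixel, so cost grows with the window area.
    cov = np.cov(pixels, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = pixels @ eigvecs[:, order]
    explained = eigvals[order] / eigvals.sum()
    return components.T.reshape(n_components, size, size), explained

rng = np.random.default_rng(0)
stack = rng.random((4, 200, 200))  # synthetic stand-in for 4 resampled bands
pcs, explained = pca_on_subset(stack, row0=50, col0=50, size=64)
```

If a small window like this finishes quickly and gives sensible components, the remaining question is mostly one of raster size and computing time.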

No, I do not get errors.

This is the report I get with a small subset.

RandomForest classifier newClassifier

Cross Validation
Number of classes = 4
class 0.0: WATER
accuracy = 0.5580 precision = 0.2600 correlation = 0.4263 errorRate = 0.4420
TruePositives = 520.0000 FalsePositives = 1480.0000 TrueNegatives = 2270.0000 FalseNegatives = 730.0000
class 1.0: GRASS
accuracy = 0.6452 precision = 0.2380 correlation = 0.3423 errorRate = 0.3548
TruePositives = 238.0000 FalsePositives = 762.0000 TrueNegatives = 2988.0000 FalseNegatives = 1012.0000
class 2.0: LAVA ROCK
accuracy = 0.6552 precision = 0.2630 correlation = 0.3510 errorRate = 0.3448
TruePositives = 263.0000 FalsePositives = 737.0000 TrueNegatives = 3013.0000 FalseNegatives = 987.0000
class 3.0: IRON RED
accuracy = 0.6444 precision = 0.2360 correlation = 0.3417 errorRate = 0.3556
TruePositives = 236.0000 FalsePositives = 764.0000 TrueNegatives = 2986.0000 FalseNegatives = 1014.0000

Using Testing dataset, % correct predictions = 25.1400
Total samples = 10000
RMSE = 1.6454786537661312
Bias = -0.30000000000000004

Distribution:
class 0.0: WATER 2500 (25.0000%)
class 1.0: GRASS 2500 (25.0000%)
class 2.0: LAVA ROCK 2500 (25.0000%)
class 3.0: IRON RED 2500 (25.0000%)

Testing feature importance score:
Each feature is perturbed 3 times and the correct predictions are averaged The importance score is the original correct prediction - average
rank 1 feature 13 : B12 score: tp=0.0000 accuracy=0.0000 precision=0.0000 correlation=0.0000 errorRate=0.0000 cost=0.0000 GainRatio = 0.0000
Warning: rank <= featureBandList.length
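
The per-class figures in this report can be cross-checked from the confusion counts. The sketch below assumes the usual definitions (accuracy = (TP + TN) / total, precision = TP / (TP + FP), error rate = 1 - accuracy); it is a plain recomputation, not SNAP's code, and `class_metrics` is a hypothetical helper name.

```python
# (TP, FP, TN, FN) per class, copied from the cross-validation report above.
counts = {
    "WATER": (520, 1480, 2270, 730),
    "GRASS": (238, 762, 2988, 1012),
    "LAVA ROCK": (263, 737, 3013, 987),
    "IRON RED": (236, 764, 2986, 1014),
}

def class_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, error rate) for one class."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    return accuracy, precision, 1 - accuracy

metrics = {name: class_metrics(*c) for name, c in counts.items()}

# Every class row sums to the same number of evaluated samples.
n_test = sum(counts["WATER"])
# Overall share of correct predictions: all true positives over all samples.
overall_correct = 100 * sum(c[0] for c in counts.values()) / n_test
# What random guessing would score with equally sized classes.
chance_level = 100 / len(counts)

for name, (acc, prec, err) in metrics.items():
    print(f"{name}: accuracy={acc:.4f} precision={prec:.4f} errorRate={err:.4f}")
print(f"overall correct = {overall_correct:.2f}%, chance level = {chance_level:.2f}%")
```

The recomputed values match the report, including the 25.14% of correct predictions. With four equally sized classes that is essentially chance level, which suggests the classifier has not learned to separate the classes on this subset.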

Does the result look alright? If so, it is simply a matter of file size and computing time. Maybe you can switch to a more powerful computer for this task.

No. There is no image displayed.

Can you please give the ID of the product you are using?

SNAP version 7.0.3
S2A_MSIL2A_20190627T180921_N0212_R084_T12SUH_20190628T002159_resampled

I downloaded your product, resampled it, made a subset, and then applied PCA (3 components). It is important that you actively select only the reflectance bands as inputs. Otherwise you will introduce the patterns of the mask bands and quality indicators into the PCA.
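
A small synthetic example of why the band selection matters: the sketch below compares the first principal component with and without a categorical quality band among the inputs. The band names, value ranges, and the `first_pc_loadings` helper are assumptions made for illustration, and the data is random, not read from the product.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels = 5000

# Hypothetical inputs: three reflectance bands with small values and one
# categorical quality band holding integer class codes from 0 to 11.
bands = {
    "B2": rng.normal(0.10, 0.02, n_pixels),
    "B3": rng.normal(0.12, 0.02, n_pixels),
    "B4": rng.normal(0.15, 0.03, n_pixels),
    "quality_scene_classification": rng.integers(0, 12, n_pixels).astype(float),
}

def first_pc_loadings(names):
    """Absolute weight of each input band in the first principal component."""
    data = np.column_stack([bands[n] for n in names])
    data = data - data.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(data, rowvar=False))
    # eigh sorts eigenvalues in ascending order, so the last column is PC1.
    return dict(zip(names, np.abs(eigvecs[:, -1])))

# All bands as input: the quality band has by far the largest variance.
all_loadings = first_pc_loadings(list(bands))

# Reflectance bands only (a crude name filter that fits this example dict).
refl_names = [n for n in bands if n.startswith("B")]
refl_loadings = first_pc_loadings(refl_names)

print(all_loadings)
print(refl_loadings)
```

With all four inputs, the first component is almost entirely the quality band, because its integer codes vary far more than the reflectance values. Restricting the inputs to the reflectance bands lets the components describe the spectral signal.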

All good.
Thanks so much for your help.

Did you solve your problem? If yes, how?