Random Forest result

Hello,

I have trouble interpreting my Random Forest result:
RandomForest classifier newClassifier

Cross Validation
Number of classes = 2
class 0.0: 50percent
accuracy = 0.9971 precision = 0.9600 correlation = 0.9783 errorRate = 0.0029
TruePositives = 48.0000 FalsePositives = 2.0000 TrueNegatives = 638.0000 FalseNegatives = 0.0000
class 1.0: geo_50_percent_Polygon
accuracy = 0.9971 precision = 1.0000 correlation = 0.9783 errorRate = 0.0029
TruePositives = 638.0000 FalsePositives = 0.0000 TrueNegatives = 48.0000 FalseNegatives = 2.0000

Using Testing dataset, % correct predictions = 99.7093
Total samples = 1376
RMSE = 0.053916386601719206
Bias = -0.0029069767441860517

Distribution:
class 0.0: 50percent 96 (6.9767%)
class 1.0: geo_50_percent_Polygon 1280 (93.0233%)

Testing feature importance score:
Each feature is perturbed 3 times and the % correct predictions are averaged
The importance score is the original % correct prediction - average
rank 1 feature 1 : multitemporal score: tp=0.2374 accuracy=0.2374 precision=0.4752 correlation=0.7590 errorRate=-0.2374 cost=-5.0326 GainRatio = 0.1709

I get good results for the classes but bad results for the whole image.

The accuracy of 0.2374 is very low.

Can someone explain the result?

Information about the metrics is nicely compiled here:
Accuracy/Precision/Correlation/TruePositive/FalsePositive/TrueNegative/FalseNegative: https://en.wikipedia.org/wiki/Sensitivity_and_specificity
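
As a quick sanity check, the per-class figures in the cross-validation block can be recomputed from the four confusion-matrix counts. This is only a minimal sketch: the function name `binary_metrics` is made up for illustration, and treating "correlation" as the Matthews correlation coefficient is an assumption on my part (it happens to reproduce the 0.9783 in the output, but the tool may define it differently).

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Derive summary metrics for one class from its confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    error_rate = (fp + fn) / total
    # Assumption: "correlation" is the Matthews correlation coefficient (MCC).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else float("nan")
    return {"accuracy": accuracy, "precision": precision,
            "correlation": mcc, "error_rate": error_rate}

# Counts copied from the cross-validation output in the question.
class0 = binary_metrics(tp=48, fp=2, tn=638, fn=0)
class1 = binary_metrics(tp=638, fp=0, tn=48, fn=2)

for name, metrics in (("class 0.0", class0), ("class 1.0", class1)):
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```

Both classes share the same accuracy and error rate because they are computed from the same 688 samples with the same 2 mistakes; only precision differs, depending on which class the mistakes were assigned to.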

RMSE, Bias and MAE: https://medium.com/human-in-a-machine-world/mae-and-rmse-which-metric-is-better-e60ac3bde13d
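
For a two-class problem with labels 0 and 1, RMSE and bias reduce to simple functions of the number of misclassified samples. The sketch below uses hypothetical label lists built to resemble the testing output in the question (1376 samples, 4 of them class 1 predicted as class 0). Which samples were actually wrong is an assumption, and so is the sign convention (predicted minus actual) for bias.

```python
import math

def rmse_and_bias(actual, predicted):
    """Return (RMSE, bias) with errors defined as predicted - actual."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n
    return rmse, bias

# Hypothetical data: 96 samples of class 0 and 1280 of class 1,
# with 4 class-1 samples wrongly predicted as class 0.
actual = [0] * 96 + [1] * 1280
predicted = [0] * 96 + [0] * 4 + [1] * 1276

rmse, bias = rmse_and_bias(actual, predicted)
correct = sum(a == p for a, p in zip(actual, predicted))
percent_correct = 100 * correct / len(actual)

print(f"RMSE = {rmse:.6f}, bias = {bias:.6f}, correct = {percent_correct:.4f}%")
```

Under these assumptions the numbers agree with the reported RMSE, bias and % correct predictions, which suggests all three describe the same handful of misclassified samples rather than separate problems.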

Low accuracy can have many causes:

  • Not enough training samples to effectively make use of the randomization (comments here)
  • Training samples which are not representative of the classes
  • Inhomogeneous training samples (one solution here)
  • Not enough feature rasters (comments here)
  • Not enough trees (comments here)
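
It may also help to look at how the "Testing feature importance score" section of the output describes itself: each feature is perturbed 3 times, the % correct predictions are averaged, and the score is the original % correct minus that average. Read that way, the 0.2374 would be a drop in accuracy when the feature is perturbed, not the accuracy of the whole image, though that is my reading of the log and not something confirmed here. Below is a minimal sketch of the idea. The names `toy_model` and `perturbation_importance` are hypothetical, the toy model merely stands in for a trained forest, and shuffling a column is only one possible way to "perturb" a feature.

```python
import random

def fraction_correct(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def perturbation_importance(model, rows, labels, feature, repeats=3, seed=0):
    """Original fraction correct minus the average after perturbing one feature."""
    rng = random.Random(seed)
    baseline = fraction_correct(model, rows, labels)
    scores = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)  # assumption: perturbation = shuffling the column
        perturbed = [r[:feature] + [v] + r[feature + 1:]
                     for r, v in zip(rows, column)]
        scores.append(fraction_correct(model, perturbed, labels))
    return baseline - sum(scores) / repeats

def toy_model(row):
    """Stand-in for a trained classifier; it only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0

data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(500)]
labels = [toy_model(r) for r in rows]

imp0 = perturbation_importance(toy_model, rows, labels, feature=0)
imp1 = perturbation_importance(toy_model, rows, labels, feature=1)
print(f"feature 0 importance = {imp0:.4f}, feature 1 importance = {imp1:.4f}")
```

In this toy setup the feature the model relies on gets a clearly positive score, while the ignored feature scores zero, so a larger score means the classifier depends more on that feature.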