The area appears to contain fewer than 10 classes, which could be one reason. Also, your training areas may be distributed so that they cover spectrally different pixels of what you consider a single class. Two other questions based on your screenshot: did you resample the S2 image to a single band resolution, and did you subset out the unwanted bands?
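To illustrate what the resampling and subsetting questions are about, here is a minimal sketch in plain NumPy. It is not SNAP's implementation: the function names `upsample_nearest` and `resample_and_subset`, the toy arrays, and the choice of nearest-neighbour upsampling are all assumptions made for illustration. The point is only that Sentinel-2 bands come at 10 m, 20 m and 60 m, so they must be brought to one grid, and the bands you do not need dropped, before they can be stacked for a classifier.

```python
import numpy as np

def upsample_nearest(band, factor):
    """Nearest-neighbour upsampling: repeat every pixel `factor` times on both axes."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

def resample_and_subset(bands, resolutions, keep, target_res=10):
    """Bring the bands listed in `keep` to `target_res` metres and stack them.

    bands       : dict of band name -> 2-D array (hypothetical toy data here)
    resolutions : dict of band name -> pixel size in metres
    keep        : band names to retain; everything else is dropped (the subset step)
    """
    stacked = []
    for name in keep:
        res = resolutions[name]
        if res % target_res != 0:
            raise ValueError(f"{name}: {res} m is not a multiple of {target_res} m")
        factor = res // target_res
        data = bands[name] if factor == 1 else upsample_nearest(bands[name], factor)
        stacked.append(data)
    # All retained bands must now share one grid, otherwise stacking is meaningless.
    if len({a.shape for a in stacked}) != 1:
        raise ValueError("bands do not share a common grid after resampling")
    return np.stack(stacked, axis=0)

# Toy scene covering 600 m x 600 m (assumed values, not real data).
rng = np.random.default_rng(0)
bands = {
    "B2": rng.random((60, 60)),   # 10 m
    "B3": rng.random((60, 60)),   # 10 m
    "B4": rng.random((60, 60)),   # 10 m
    "B8": rng.random((60, 60)),   # 10 m
    "B11": rng.random((30, 30)),  # 20 m
    "B1": rng.random((10, 10)),   # 60 m, dropped below as an unwanted band
}
resolutions = {"B2": 10, "B3": 10, "B4": 10, "B8": 10, "B11": 20, "B1": 60}

cube = resample_and_subset(bands, resolutions, keep=["B2", "B3", "B4", "B8", "B11"])
print(cube.shape)
```

In SNAP itself you would do the same two steps with its own resampling and subset tools rather than with code like this; the sketch just shows why a classifier cannot use the product until every retained band has the same dimensions.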
Take a look at this tutorial
And this one
Deforestation monitoring with S1
Then please take a look at:
This topic is also relevant to you:
Classification based on spectral library
This is also discussed here: