Informativeness of Correlated Features



By now it should be clear that we are not necessarily looking for features that are orthogonal and uncorrelated, but for features that localize the classes in feature space well enough that objects from different classes can be classified with a low error rate. Since each class is described by its probability distribution in feature space, what we actually want is a measure that reliably indicates how far apart the classes are while taking the shape of the class distributions into account. Thinking a bit further, we arrive at the following: the better the separation of the classes, the greater the distance between them in feature space, and hence the greater the informativeness of our features (with respect to the classification task). We can therefore combine everything into a single measure that accounts for the class distributions (their type, shape, and parameter values), the distance between them, and the informativeness of the features. And here is one such measure, used very often...






<< >>
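The measure itself is not reproduced above. As one widely used example of a class-separability criterion of this kind, the sketch below (an illustration, not necessarily the author's exact measure) computes Fisher's discriminant ratio for two one-dimensional class samples: it grows with the distance between class means and shrinks with the within-class spread, so larger values indicate more informative features.

```python
import numpy as np


def fisher_ratio(a, b):
    """Fisher's discriminant ratio for two 1-D class samples.

    Larger values mean the class means are far apart relative to the
    within-class scatter, i.e. the feature separates the classes well.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    between = (a.mean() - b.mean()) ** 2   # squared distance of class means
    within = a.var() + b.var()             # total within-class scatter
    return between / within


rng = np.random.default_rng(0)
well_separated = fisher_ratio(rng.normal(0, 1, 500), rng.normal(5, 1, 500))
overlapping = fisher_ratio(rng.normal(0, 1, 500), rng.normal(1, 1, 500))
print(well_separated > overlapping)  # distant classes score higher
```

Note that the ratio depends on the distribution parameters (means and variances), not only on the raw distance between means, which is exactly the kind of combination the text calls for.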