Quadratic and Linear Discriminant Analysis

We often face the task of pattern classification: deciding into which of several mutually exclusive classes a set of data points should be placed. For example, in medical diagnostics, we might wish to know whether a patient's vital signs indicate one of two classes: health or disease.

Whilst neural networks have become popular methods for doing this, they have the disadvantage that they often require complex training algorithms whose properties are not well understood. For many simpler classification problems, where the classes can be separated by a simple decision boundary, quadratic discriminant analysis (QDA) and linear discriminant analysis (LDA) are sometimes preferable. These techniques require no iterative "training": the parameters of the decision boundary can be estimated simply by computing straightforward statistical quantities such as means and covariances [1]. This is one example of the emerging overlap between the disciplines of artificial intelligence and classical statistical methods.
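As a minimal sketch of what "computing means and covariances" can look like in practice, the Python/NumPy code below fits one Gaussian per class and assigns each point to the class with the highest discriminant score. It assumes Gaussian class-conditional densities, at least two features, and enough points per class for the covariance to be invertible. The function names (fit_gaussian_classes, discriminant_scores, predict) and the synthetic data are hypothetical and chosen for illustration; this is not the code from the software pages.

```python
import numpy as np

def fit_gaussian_classes(X, y, pooled=False):
    """Estimate class priors, means and covariances from labelled data.

    pooled=False keeps one covariance per class (QDA);
    pooled=True replaces them with a single shared covariance (LDA).
    """
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    covs = np.array([np.cov(X[y == c], rowvar=False) for c in classes])
    if pooled:
        # Pooled estimate: class covariances weighted by degrees of freedom.
        weights = np.array([np.sum(y == c) - 1 for c in classes], dtype=float)
        shared = np.tensordot(weights, covs, axes=1) / weights.sum()
        covs = np.array([shared] * len(classes))
    return classes, priors, means, covs

def discriminant_scores(X, priors, means, covs):
    """Gaussian log-posterior (up to a constant), one column per class."""
    scores = []
    for prior, mean, cov in zip(priors, means, covs):
        diff = X - mean
        # Squared Mahalanobis distance of every point to the class mean.
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        # With a shared covariance the quadratic term in x is the same for
        # every class, so differences between scores are linear in x.
        scores.append(-0.5 * maha - 0.5 * logdet + np.log(prior))
    return np.column_stack(scores)

def predict(X, model):
    """Assign each row of X to the class with the largest score."""
    classes, priors, means, covs = model
    return classes[np.argmax(discriminant_scores(X, priors, means, covs), axis=1)]

# Illustrative synthetic data: two Gaussian clouds in two dimensions.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal([0, 0], 1.0, size=(200, 2)),
                    rng.normal([4, 4], 1.0, size=(200, 2))])
y_demo = np.repeat([0, 1], 200)
qda_model = fit_gaussian_classes(X_demo, y_demo, pooled=False)
lda_model = fit_gaussian_classes(X_demo, y_demo, pooled=True)
print("QDA training accuracy:", np.mean(predict(X_demo, qda_model) == y_demo))
print("LDA training accuracy:", np.mean(predict(X_demo, lda_model) == y_demo))
```

The only difference between the two fits is the pooled flag: everything else is a direct calculation from the sample, with no iterative optimisation.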

LDA can only separate the classes with straight lines (in two dimensions) or, more generally, (hyper)planes in higher dimensions. QDA is more flexible: it can separate classes with lines, circles, ellipses, parabolas or hyperbolas (and their higher-dimensional equivalents). The figure on the right shows the separation of the black and blue data points into their appropriate classes using QDA (left panel) and LDA (right panel). The separation is not perfect, but perfect separation may not in fact be desirable in practice, where the position of each data point is uncertain. It is better to have a decision boundary that ignores the noisy aspects of the data and concentrates on the underlying classes. The ability of the classifier to ignore the noise in the data and find the actual, underlying classes can be tested using, for example, bootstrap resampling [1-2].
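One way such a bootstrap test could be set up is sketched below in Python/NumPy. It uses one common variant, out-of-bag evaluation: the classifier is refitted on each resample (drawn with replacement) and scored on the points that the resample left out. This is an assumption made for illustration and not necessarily the procedure used in [2]; the names qda_fit, qda_predict and bootstrap_accuracy and the synthetic data are likewise hypothetical.

```python
import numpy as np

def qda_fit(X, y):
    """Per-class (label, log prior, mean, covariance) estimated from the sample."""
    return [(c, np.log(np.mean(y == c)), X[y == c].mean(axis=0),
             np.cov(X[y == c], rowvar=False)) for c in np.unique(y)]

def qda_predict(model, X):
    """Assign each row of X to the class with the largest Gaussian score."""
    scores = []
    for _, log_prior, mean, cov in model:
        diff = X - mean
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        scores.append(log_prior - 0.5 * maha - 0.5 * np.linalg.slogdet(cov)[1])
    labels = np.array([entry[0] for entry in model])
    return labels[np.argmax(np.column_stack(scores), axis=1)]

def bootstrap_accuracy(X, y, n_boot=200, seed=0):
    """Mean and standard deviation of out-of-bag accuracy over resamples."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_classes = len(np.unique(y))
    accuracies = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)         # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)    # points not drawn this time
        labels, counts = np.unique(y[idx], return_counts=True)
        # Skip degenerate resamples: nothing left out, a class missing, or
        # too few points in a class to estimate its covariance.
        if oob.size == 0 or len(labels) < n_classes or counts.min() <= d:
            continue
        model = qda_fit(X[idx], y[idx])
        accuracies.append(np.mean(qda_predict(model, X[oob]) == y[oob]))
    return float(np.mean(accuracies)), float(np.std(accuracies))

# Illustrative synthetic data: two overlapping Gaussian clouds.
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal([0, 0], 1.0, size=(100, 2)),
                    rng.normal([2, 2], 1.0, size=(100, 2))])
y_demo = np.repeat([0, 1], 100)
mean_acc, std_acc = bootstrap_accuracy(X_demo, y_demo)
print(f"out-of-bag accuracy: {mean_acc:.3f} +/- {std_acc:.3f}")
```

Because each score is computed on points the classifier did not see when it was fitted, a boundary that merely follows the noise in one sample tends to score worse here than on its own training data.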

We have used LDA and QDA in biomedical applications to separate patients with cardiovascular disease or speech disorders from normal control subjects [2]. See the software pages for code that implements these ideas.


[1] C.M. Bishop (1995), Neural Networks for Pattern Recognition. Oxford, New York: Clarendon Press; Oxford University Press. 482 pp.

[2] M. Little, P. McSharry, I. Moroz, S. Roberts (2006), Nonlinear, Biophysically-Informed Speech Pathology Detection, in Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006). Toulouse, France. pp. II-1080-II-1083.