by Nikolai V. Shokhirev, ABC Tutorials
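In science and technology, systems (objects) are characterized by a finite set of parameters $x_i$, $i = 1, \ldots, N$. Consequently, the measurements of these parameters can be arranged into a rectangular matrix in which row $i$ holds the $i$-th parameter and column $m$ holds the $m$-th experiment:

$$X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,M} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N,1} & x_{N,2} & \cdots & x_{N,M} \end{pmatrix} \tag{1}$$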
Here $M$ is the number of experiments. Each experiment corresponds to the measurement of one system (sample, individual, etc.); these terms are used interchangeably.
|System||Parameters|
|Human individuals||Age, sex, education, income, weight, height, etc.|
|Chemical solutions||Spectral intensities at selected wavelengths|
|Microchips in a control sample||Voltage and current at certain pins|
|Clinical test participants||Lab test results|
The sample (population) mean vector of parameters is defined as:

$$\mu_i = \frac{1}{M} \sum_{m=1}^{M} x_{i,m}, \quad i = 1, \ldots, N \tag{2}$$
For each measurement $m$, the vector of deviations can be defined as:

$$d_{i,m} = x_{i,m} - \mu_i, \quad i = 1, \ldots, N \tag{3}$$
In the case of clinical research, one of the components of $\mu$ is the average patient temperature in a hospital. The deviation of an individual patient from this average is obviously more informative than the average itself.
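The vectors of deviations form the matrix $D$, which has the same shape as the initial matrix $X$:

$$D = \begin{pmatrix} d_{1,1} & d_{1,2} & \cdots & d_{1,M} \\ d_{2,1} & d_{2,2} & \cdots & d_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ d_{N,1} & d_{N,2} & \cdots & d_{N,M} \end{pmatrix} \tag{4}$$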
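The mean vector and the deviation matrix can be illustrated with a minimal sketch in plain Python. The data values and the function names `mean_vector` and `deviation_matrix` are hypothetical, and the layout (parameters in rows, experiments in columns) is an assumption based on the index order of $d_{i,m}$.

```python
# Hypothetical data: N = 2 parameters (rows), M = 4 experiments (columns).
X = [
    [1.0, 2.0, 3.0, 4.0],  # parameter 1 across the 4 experiments
    [2.0, 4.0, 6.0, 8.0],  # parameter 2 across the 4 experiments
]


def mean_vector(X):
    """Mean of each parameter (row) over the M experiments."""
    return [sum(row) / len(row) for row in X]


def deviation_matrix(X):
    """Subtract each parameter's mean from its row; the result has the shape of X."""
    mu = mean_vector(X)
    return [[x - mu_i for x in row] for row, mu_i in zip(X, mu)]


mu = mean_vector(X)
D = deviation_matrix(X)
print(mu)  # [2.5, 5.0]
print(D)   # [[-1.5, -0.5, 0.5, 1.5], [-3.0, -1.0, 1.0, 3.0]]
```

Each row of `D` sums to zero, which is a quick sanity check on the mean subtraction.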
The sample covariance matrix is defined as the averaged products of the deviation vector components:

$$c_{i,j} = \frac{1}{M} \sum_{m=1}^{M} d_{i,m} \, d_{j,m} \tag{5}$$

Here $d_{i,m}$ is the deviation of the $i$-th parameter for the $m$-th system.
Eq. (5) can be rewritten in the following matrix form:

$$C = \frac{1}{M} D D^{T} \tag{6}$$

The superscript "T" denotes matrix transposition.
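The matrix form of the covariance can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data are hypothetical, the function name `covariance` is not from the tutorial's programs, parameters are stored in rows, and the averaging factor is taken to be $1/M$.

```python
def covariance(X):
    """Covariance C = D D^T / M for X with parameters in rows, experiments in columns."""
    M = len(X[0])
    mu = [sum(row) / M for row in X]
    D = [[x - m for x in row] for row, m in zip(X, mu)]
    # c[i][j] = (1/M) * sum over experiments of d[i][m] * d[j][m]
    return [[sum(a * b for a, b in zip(di, dj)) / M for dj in D] for di in D]


# Hypothetical data: 2 parameters, 4 experiments.
X = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
]
C = covariance(X)
print(C)  # [[1.25, 2.5], [2.5, 5.0]]
```

The result is an N x N symmetric matrix whose diagonal holds the averaged squared deviations of each parameter.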
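The covariance matrix denoted here by $C_{ML}$ differs from the above definition, which uses the factor $1/M$, by the factor $M/(M-1)$:

$$C_{ML} = \frac{M}{M-1} \, C = \frac{1}{M-1} D D^{T} \tag{7}$$

Strictly speaking, this is the unbiased (Bessel-corrected) estimator; for normally distributed data the maximum likelihood estimator is the form with the factor $1/M$.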
The advantage of this definition is that the $i$-th diagonal element is the unbiased estimate of the variance of the $i$-th parameter:

$$s_i^2 = (C_{ML})_{i,i} = \frac{1}{M-1} \sum_{m=1}^{M} d_{i,m}^2 \tag{8}$$
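The effect of the factor $M/(M-1)$ on a diagonal element can be checked numerically. The sketch below uses hypothetical data for a single parameter and compares both normalizations with `statistics.pvariance` (divides by $M$) and `statistics.variance` (divides by $M-1$) from the Python standard library.

```python
import math
import statistics

x = [1.0, 2.0, 3.0, 4.0]  # hypothetical measurements of one parameter
M = len(x)
mu = sum(x) / M
sum_sq = sum((v - mu) ** 2 for v in x)  # sum of squared deviations

var_over_M = sum_sq / M                  # diagonal element with the 1/M factor
var_over_M_minus_1 = sum_sq / (M - 1)    # diagonal element with the 1/(M-1) factor

# The two estimates differ by the factor M / (M - 1).
ratio = var_over_M_minus_1 / var_over_M
print(math.isclose(ratio, M / (M - 1)))                                # True

# Cross-check against the standard library.
print(math.isclose(var_over_M, statistics.pvariance(x)))               # True
print(math.isclose(var_over_M_minus_1, statistics.variance(x)))        # True
```

For large $M$ the factor approaches 1, so the choice of normalization matters mainly for small samples.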
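Regardless of the covariance definition (the common normalization factor cancels in the ratio), the correlation coefficients are:

$$\sigma_{i,j} = \frac{c_{i,j}}{\sqrt{c_{i,i} \, c_{j,j}}} \tag{9}$$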
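or in matrix form, with $R$ denoting the matrix of the correlation coefficients:

$$R = S^{-1} C S^{-1} \tag{10}$$

Here $S$ is a diagonal matrix with the following matrix elements:

$$S_{i,j} = \delta_{i,j} \sqrt{c_{i,i}} \tag{11}$$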
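The correlation coefficient $\sigma_{i,j}$ is a measure of the quality of a linear least squares fit between the $i$-th and $j$-th parameters of the original data. It lies between -1 and 1; the closer $|\sigma_{i,j}|$ is to 1, the better the linear fit.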
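As a rough illustration of how the correlation coefficient reflects linear fit quality, the following plain-Python sketch computes the correlation matrix for hypothetical data. The function name `correlation_matrix` and the data are assumptions, parameters are stored in rows, and parameters with zero variance are not handled.

```python
import math


def correlation_matrix(X):
    """Correlation matrix for X with parameters in rows, experiments in columns."""
    M = len(X[0])
    n = len(X)
    mu = [sum(row) / M for row in X]
    D = [[x - m for x in row] for row, m in zip(X, mu)]
    C = [[sum(a * b for a, b in zip(di, dj)) / M for dj in D] for di in D]
    s = [math.sqrt(C[i][i]) for i in range(n)]  # square roots of the diagonal of C
    # Assumes every parameter varies; a constant row would give a division by zero.
    return [[C[i][j] / (s[i] * s[j]) for j in range(n)] for i in range(n)]


# Hypothetical data: 3 parameters, 4 experiments.
X = [
    [1.0, 2.0, 3.0, 4.0],    # parameter 1
    [2.0, 4.0, 6.0, 8.0],    # parameter 2 = 2 * parameter 1 (exactly linear)
    [1.0, -1.0, -1.0, 1.0],  # parameter 3, no linear relation to parameter 1
]
R = correlation_matrix(X)
print(round(R[0][1], 6))  # 1.0
print(round(R[0][2], 6))  # 0.0
```

Parameters 1 and 2 lie exactly on a straight line, so their coefficient is 1, while parameter 3 shows no linear dependence on parameter 1 and gives 0.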
This approach is implemented in the program "Correlations", which is available in the Download section. The more general "Stat Analysis" program can also be used.
Remark: In "Correlations" the roles of the columns and rows are opposite to those in this tutorial.
©Nikolai V. Shokhirev, 2001-2008