nan_corrcoef calculates the correlation matrix from pairwise correlations.
[...] = nan_corrcoef(X);
    calculates the (auto-)correlation matrix of X
[...] = nan_corrcoef(X,Y);
    calculates the crosscorrelation between X and Y
[...] = nan_corrcoef(..., Mode);
[...] = nan_corrcoef(..., param1, value1, param2, value2, ... );
[R,p,ci1,ci2,nansig] = nan_corrcoef(...);
    Pearson: gives the (product-moment) correlation coefficient
    Spearman: gives Spearman's rank correlation coefficient
    Rank: gives a nonparametric rank correlation coefficient
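A rank correlation can be illustrated as the ordinary correlation coefficient applied to ranks instead of raw values. The following is a minimal Python sketch of that idea, not the toolbox's code; the helper names (`average_ranks`, `pearson`, `rank_corr`) and the data are hypothetical, and giving tied values the mean of their positions is an assumption about tie handling that may not match either rank mode of nan_corrcoef.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Product-moment correlation of two equal-length lists without NaNs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_corr(x, y):
    """Rank correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y is a monotone but non-linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 8.0, 27.0, 64.0, 125.0]
r_lin = pearson(x, y)      # below 1 because the relationship is not linear
r_rank = rank_corr(x, y)   # 1 because the ordering agrees perfectly
print(r_lin, r_rank)
```

The rank-based value only reflects whether the two variables are ordered the same way, which is why it reaches 1 here while the product-moment coefficient does not.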
Mode: type of correlation
Handling of missing values encoded as NaNs:
    case-wise deletion: remove all rows with at least one NaN
    pairwise deletion [default]
alpha: significance level used to compute the confidence interval [default = 0.01]
R       is the correlation matrix
        R(i,j) is the correlation coefficient r between X(:,i) and Y(:,j)
p       gives the significance of R
        p > alpha: do not reject the null hypothesis 'R is zero'.
        p < alpha: reject the null hypothesis at significance level alpha; R differs significantly from zero.
ci1     lower bound of the (1-alpha) confidence interval
ci2     upper bound of the (1-alpha) confidence interval
nansig  p-value for the null hypothesis H0: 'NaNs are not correlated'
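To make the significance and confidence-interval outputs more concrete, here is a minimal Python sketch of two textbook formulas: the Student t statistic for the null hypothesis of zero correlation, and an approximate confidence interval based on Fisher's z-transform. That nan_corrcoef uses exactly these formulas is an assumption, and the function names (`t_statistic`, `fisher_ci`) and the sample values are hypothetical.

```python
import math
from statistics import NormalDist

def t_statistic(r, n):
    """Student t statistic for H0 'r is zero', with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def fisher_ci(r, n, alpha=0.01):
    """Approximate (1 - alpha) confidence interval via Fisher's z-transform."""
    z = math.atanh(r)
    # Half-width in z-space: normal quantile times the standard error 1/sqrt(n-3).
    half = NormalDist().inv_cdf(1.0 - alpha / 2.0) / math.sqrt(n - 3)
    return math.tanh(z - half), math.tanh(z + half)

# Hypothetical sample correlation and sample size.
r, n = 0.5, 28
t = t_statistic(r, n)
ci1, ci2 = fisher_ci(r, n)
print(t, ci1, ci2)
```

The p-value would follow from comparing t against a Student t distribution with n - 2 degrees of freedom; that step is left out because the Python standard library has no t distribution.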
The input data can contain missing values encoded with NaN. Missing data (NaNs) are handled by pairwise deletion [15]. To avoid possible pitfalls, use case-wise deletion or check the correlation of NaNs with your data (see below). A significance test of the hypothesis 'correlation coefficient R is significantly different from zero' is included.
The result is only valid if the occurrence of NaNs is uncorrelated with the data. To avoid this pitfall, either check the correlation of NaNs or apply case-wise deletion. Case-wise deletion can be implemented with
    ix = ~any(isnan([X,Y]),2);
    [...] = nan_corrcoef(X(ix,:),Y(ix,:),...);
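The difference between pairwise and case-wise deletion only shows up when a third variable has missing values. The following minimal Python sketch illustrates this on made-up data; it is not the toolbox's implementation, and the helper names (`pearson`, `corr_pairwise`, `corr_casewise`) are hypothetical.

```python
import math

NAN = float("nan")

def pearson(x, y):
    """Product-moment correlation of two equal-length lists without NaNs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corr_pairwise(cols, i, j):
    """Pairwise deletion: drop only rows where column i or column j is NaN."""
    pairs = [(a, b) for a, b in zip(cols[i], cols[j])
             if not (math.isnan(a) or math.isnan(b))]
    return pearson([a for a, _ in pairs], [b for _, b in pairs])

def corr_casewise(cols, i, j):
    """Case-wise deletion: drop every row that has a NaN in any column."""
    rows = [r for r in zip(*cols) if not any(math.isnan(v) for v in r)]
    return pearson([r[i] for r in rows], [r[j] for r in rows])

# Hypothetical data: three variables, the third has two missing values.
cols = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    [1.0, NAN, 2.0, NAN, 3.0, 4.0],
]
r_pair = corr_pairwise(cols, 0, 1)   # uses all 6 rows
r_case = corr_casewise(cols, 0, 1)   # uses only the 4 complete rows
print(r_pair, r_case)
```

With pairwise deletion each entry of the correlation matrix may be based on a different subset of rows, whereas case-wise deletion uses one common subset for all entries at the cost of discarding more data.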
Correlation (non-random distribution) of NaNs can be checked with
    [nan_R,nan_sig] = nan_corrcoef(X,isnan(X))
or
    [nan_R,nan_sig] = nan_corrcoef([X,Y],isnan([X,Y]))
or
    [R,p,ci1,ci2,nansig] = nan_corrcoef(...);
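The idea behind this check is to treat the missingness pattern as a 0/1 variable and correlate it with the observed data. Below is a minimal Python sketch of that idea on made-up data; the helper names (`pearson`, `nan_indicator_corr`) are hypothetical and the sketch does not reproduce the significance value returned by nan_corrcoef.

```python
import math

NAN = float("nan")

def pearson(x, y):
    """Product-moment correlation of two equal-length lists without NaNs.
    Fails with a division by zero if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def nan_indicator_corr(x, other):
    """Correlate the missingness of x (1.0 where NaN, else 0.0) with `other`,
    using only rows where `other` itself is observed."""
    pairs = [(1.0 if math.isnan(a) else 0.0, b)
             for a, b in zip(x, other) if not math.isnan(b)]
    return pearson([m for m, _ in pairs], [b for _, b in pairs])

# Hypothetical data: x is missing exactly where y is large.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [0.3, 0.1, 0.4, NAN, NAN, NAN]
r_missing = nan_indicator_corr(x, y)
print(round(r_missing, 3))
```

A value far from zero, as in this constructed example, indicates that values are not missing at random, so results based on pairwise deletion should be treated with caution.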
Further recommendations related to the correlation coefficient:
+ LOOK AT THE SCATTERPLOTS to make sure that the relationship is linear.
+ Correlation is not causation: it is not clear which parameter is 'cause' and which is 'effect', and the observed correlation between two variables might be due to the action of other, unobserved variables.
References on the correlation coefficient
[ 1] http://mathworld.wolfram.com/CorrelationCoefficient.html
[ 2] http://www.geography.btinternet.co.uk/spearman.htm
[ 3] Hogg, R. V. and Craig, A. T. Introduction to Mathematical Statistics, 5th ed. New York: Macmillan, pp. 338 and 400, 1995.
[ 4] Lehmann, E. L. and D'Abrera, H. J. M. Nonparametrics: Statistical Methods Based on Ranks, rev. ed. Englewood Cliffs, NJ: Prentice-Hall, pp. 292, 300, and 323, 1998.
[ 5] Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 634-637, 1992.
[ 6] http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html
on the significance test of the correlation coefficient
[11] http://www.met.rdg.ac.uk/cag/STATS/corr.html
[12] http://www.janda.org/c10/Lectures/topic06/L24-significanceR.htm
[13] http://faculty.vassar.edu/lowry/ch4apx.html
[14] http://davidmlane.com/hyperstat/B134689.html
[15] http://www.statsoft.com/textbook/stbasic.html//Correlations
others
[20] http://www.tufts.edu/~gdallal/corr.htm