Correlation analysis: Difference between revisions

Latest revision as of 01:05, 22 January 2024 (Monday)

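Several sets of (x, y) points, with the Pearson correlation coefficient of each set. The correlation reflects the noisiness and direction of a linear relationship (top row), but not the slope of that relationship (middle row), nor many aspects of nonlinear relationships (bottom row). Note: the figure in the center has a slope of 0, but in that case the correlation coefficient is undefined because the variance of Y is zero.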

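In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense "correlation" may refer to any type of association, in statistics it usually refers to the degree to which a pair of variables are linearly related. Familiar examples of dependent phenomena include the correlation between the heights of parents and their offspring, and the correlation between the price of a good and the quantity consumers are willing to purchase, as depicted in the so-called demand curve.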

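Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For example, an electrical utility may produce less power on a mild day based on the correlation between electricity demand and weather. In this example there is a causal relationship, because extreme weather causes people to use more electricity for heating or cooling. In general, however, the presence of a correlation is not sufficient to infer the presence of a causal relationship (that is, correlation does not imply causation).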

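Formally, random variables are dependent if they do not satisfy the mathematical property of probabilistic independence. In informal usage, correlation is synonymous with dependence. When used in a technical sense, however, correlation refers to any of several specific types of mathematical operations between the tested variables and their respective expected values. Essentially, correlation is a measure of how two or more variables are related to one another. There are several correlation coefficients, often denoted [math]\rho[/math] or [math]r[/math], that measure the degree of correlation. The most common of these is the Pearson correlation coefficient, which is sensitive only to a linear relationship between two variables (a relationship that may be present even when one variable is a nonlinear function of the other). Other correlation coefficients, such as Spearman's rank correlation, have been developed to be more robust than Pearson's, that is, more sensitive to nonlinear relationships.^[1]^[2]^[3] Mutual information can also be used to measure dependence between two variables.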

Pearson's product-moment coefficient

Example scatterplots of various datasets with various correlation coefficients.

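The most familiar measure of dependence between two quantities is the Pearson product-moment correlation coefficient (PPMCC), or "Pearson's correlation coefficient", commonly called simply "the correlation coefficient". It is the ratio of the covariance of the two variables in a numerical dataset to the product of the square roots of their variances. In other words, one divides the covariance of the two variables by the product of their standard deviations. Karl Pearson developed the coefficient from a similar but slightly different idea by Francis Galton.^[4]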

The Pearson product-moment correlation coefficient can be read as establishing a line of best fit through a dataset of two variables, which lays out the expected values; the resulting coefficient then indicates how far the actual dataset lies from those expected values. Depending on the sign of the coefficient, the correlation is negative or positive whenever there is some relationship between the variables in the dataset.^{[citation needed]}

The population correlation coefficient [math]\rho_{X,Y}[/math] between two random variables [math]X[/math] and [math]Y[/math] with expected values [math]\mu_X[/math] and [math]\mu_Y[/math] and standard deviations [math]\sigma_X[/math] and [math]\sigma_Y[/math] is defined as:

[math]\rho_{X,Y} = \operatorname{corr}(X,Y) = {\operatorname{cov}(X,Y) \over \sigma_X \sigma_Y} = {\operatorname{E}[(X-\mu_X)(Y-\mu_Y)] \over \sigma_X\sigma_Y}, \quad \text{if}\ \sigma_{X}\sigma_{Y}>0.[/math]

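where [math]\operatorname{E}[/math] is the expected value operator, [math]\operatorname{cov}[/math] denotes covariance, and [math]\operatorname{corr}[/math] is a widely used alternative notation for the correlation coefficient. The Pearson correlation is defined only if both standard deviations are finite and positive. An alternative formula purely in terms of moments is: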

[math]\rho_{X,Y} = {\operatorname{E}(XY)-\operatorname{E}(X)\operatorname{E}(Y)\over \sqrt{\operatorname{E}(X^2)-\operatorname{E}(X)^2}\cdot \sqrt{\operatorname{E}(Y^2)-\operatorname{E}(Y)^2} }[/math]
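As a rough illustration of the two formulas above, the following Python sketch computes the population coefficient once from the covariance definition and once from raw moments. The function names and the five data pairs are illustrative assumptions, not part of the article; the pairs are treated as equally likely outcomes.

```python
import math

def pearson_population(xs, ys):
    """Population Pearson correlation from the covariance definition."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mu_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mu_y) ** 2 for y in ys) / n)
    if sigma_x * sigma_y == 0:
        raise ValueError("undefined: a standard deviation is zero")
    return cov / (sigma_x * sigma_y)

def pearson_moments(xs, ys):
    """The same quantity from E(XY), E(X), E(Y), E(X^2) and E(Y^2)."""
    n = len(xs)
    e_x = sum(xs) / n
    e_y = sum(ys) / n
    e_xy = sum(x * y for x, y in zip(xs, ys)) / n
    e_x2 = sum(x * x for x in xs) / n
    e_y2 = sum(y * y for y in ys) / n
    return (e_xy - e_x * e_y) / (
        math.sqrt(e_x2 - e_x ** 2) * math.sqrt(e_y2 - e_y ** 2)
    )

# Illustrative data, chosen only for this sketch.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
rho_def = pearson_population(xs, ys)
rho_mom = pearson_moments(xs, ys)
print(rho_def, rho_mom)  # both are approximately 0.8
```

The moment form needs only running sums, but it subtracts nearly equal quantities when the means are large relative to the spread, so the definition form is usually the numerically safer choice.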

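Correlation and independence

It is a corollary of the Cauchy–Schwarz inequality that the absolute value of the Pearson correlation coefficient is not greater than 1. The value of a correlation coefficient therefore ranges between −1 and +1. The correlation coefficient is +1 in the case of a perfect direct (increasing) linear relationship (correlation), −1 in the case of a perfect inverse (decreasing) linear relationship (anti-correlation),^[5] and some value in the open interval [math](-1,1)[/math] in all other cases, indicating the degree of linear dependence between the variables. As the coefficient approaches zero the relationship weakens (closer to uncorrelated). The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.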

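If the variables are independent, Pearson's correlation coefficient is 0, but the converse does not hold, because the correlation coefficient detects only linear dependence between two variables. Put simply, if two random variables X and Y are independent, then they are uncorrelated; but if two random variables are uncorrelated, they may or may not be independent.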

[math]\begin{align} X,Y \text{ independent} \quad & \Rightarrow \quad \rho_{X,Y} = 0 \quad (X,Y \text{ uncorrelated})\\ \rho_{X,Y} = 0 \quad (X,Y \text{ uncorrelated})\quad & \nRightarrow \quad X,Y \text{ independent} \end{align}[/math]

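For example, suppose the random variable [math]X[/math] is symmetrically distributed about zero, and [math]Y=X^2[/math]. Then [math]Y[/math] is completely determined by [math]X[/math], so that [math]X[/math] and [math]Y[/math] are perfectly dependent, but their correlation is zero; they are uncorrelated. However, in the special case when [math]X[/math] and [math]Y[/math] are jointly normal, uncorrelatedness is equivalent to independence.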
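The example can be checked numerically. The sketch below, in which the helper name `pearson` and the five support points are assumptions made for illustration, takes a small distribution that is symmetric about zero and confirms that the correlation between X and X squared vanishes even though Y is a function of X.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of paired values treated as equally likely outcomes."""
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys))
    var_x = sum((x - mu_x) ** 2 for x in xs)
    var_y = sum((y - mu_y) ** 2 for y in ys)
    # The common 1/n factors cancel in the ratio.
    return cov / math.sqrt(var_x * var_y)

xs = [-2, -1, 0, 1, 2]       # symmetric about zero (illustrative support)
ys = [x ** 2 for x in xs]    # Y is completely determined by X
rho = pearson(xs, ys)
print(rho)  # zero: the positive and negative halves cancel in the covariance
```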

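Although uncorrelated data are not necessarily independent, independence can be checked through mutual information: random variables are independent exactly when their mutual information is 0.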
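A minimal sketch of that check for discrete variables follows. The function `mutual_information` and both joint distributions are assumptions introduced for this illustration; the first distribution is X uniform on {−1, 0, 1} with Y equal to X squared (uncorrelated but dependent), and the second is a product of marginals.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """Mutual information in bits of a discrete joint pmf {(x, y): p}."""
    p_x = defaultdict(float)
    p_y = defaultdict(float)
    for (x, y), p in joint.items():
        p_x[x] += p
        p_y[y] += p
    # I(X;Y) = sum over cells of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

# Uncorrelated but dependent: X uniform on {-1, 0, 1}, Y = X**2.
dependent = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}

# Independent by construction: every cell is a product of marginals.
independent = {
    (x, y): px * py
    for x, px in [(0, 0.5), (1, 0.5)]
    for y, py in [(0, 0.25), (1, 0.75)]
}

mi_dep = mutual_information(dependent)
mi_ind = mutual_information(independent)
print(mi_dep, mi_ind)  # strictly positive for the first, zero for the second
```

With real samples the joint distribution has to be estimated first, and estimated mutual information is rarely exactly zero, so in practice a test or threshold is needed.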

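Sample correlation coefficient

Given a series of [math]n[/math] measurements of the pair [math](X_i,Y_i)[/math] indexed by [math]i=1,\ldots,n[/math], the sample correlation coefficient can be used to estimate the population Pearson correlation [math]\rho_{X,Y}[/math] between [math]X[/math] and [math]Y[/math]. The sample correlation coefficient is defined as: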

[math]r_{xy} \stackrel{\text{def}}{=} \frac{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{(n-1) s_{x} s_{y}}=\frac{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2} \sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}}},[/math]

where [math]\overline{x}[/math] and [math]\overline{y}[/math] are the sample arithmetic means of [math]X[/math] and [math]Y[/math], and [math]s_x[/math] and [math]s_y[/math] are the corrected sample standard deviations of [math]X[/math] and [math]Y[/math].

Equivalent expressions for [math]r_{xy}[/math] are:

[math]\begin{aligned} r_{xy} & =\frac{\sum x_{i} y_{i}-n \bar{x} \bar{y}}{n s_{x}^{\prime} s_{y}^{\prime}} \\ & =\frac{n \sum x_{i} y_{i}-\sum x_{i} \sum y_{i}}{\sqrt{n \sum x_{i}^{2}-\left(\sum x_{i}\right)^{2}} \sqrt{n \sum y_{i}^{2}-\left(\sum y_{i}\right)^{2}}},\end{aligned}[/math]

where [math]s'_x[/math] and [math]s'_y[/math] are the uncorrected sample standard deviations of [math]X[/math] and [math]Y[/math].
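The equivalence of these expressions can be sketched in Python with the standard `statistics` module, whose `stdev` is the corrected (n − 1) standard deviation and `pstdev` the uncorrected one. The function names and the sample values are illustrative assumptions.

```python
import math
import statistics

def r_corrected(xs, ys):
    """r from centered products and corrected standard deviations."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return num / ((n - 1) * statistics.stdev(xs) * statistics.stdev(ys))

def r_uncorrected(xs, ys):
    """r from raw products and uncorrected standard deviations."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    num = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    return num / (n * statistics.pstdev(xs) * statistics.pstdev(ys))

def r_sums(xs, ys):
    """r from running sums only (single pass over the data)."""
    n = len(xs)
    s_x, s_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    s_xx = sum(x * x for x in xs)
    s_yy = sum(y * y for y in ys)
    return (n * s_xy - s_x * s_y) / (
        math.sqrt(n * s_xx - s_x ** 2) * math.sqrt(n * s_yy - s_y ** 2)
    )

# Illustrative sample, chosen only for this sketch.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
print(r_corrected(xs, ys), r_uncorrected(xs, ys), r_sums(xs, ys))
```

All three agree up to floating-point rounding, because the factors of n − 1 (or n) in the numerator and denominator cancel.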

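If [math]x[/math] and [math]y[/math] are results of measurements that contain measurement error, the realistic limits on the correlation coefficient are not −1 to +1 but a smaller range.^[6] For a linear model with a single independent variable, the coefficient of determination (R squared) is the square of [math]r_{xy}[/math], Pearson's product-moment coefficient.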
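The relationship between R squared and the sample correlation can be illustrated with an ordinary least-squares fit. This is a sketch under stated assumptions: the helper names and the sample data are invented for the illustration, and the fit is the textbook closed form for one predictor with an intercept.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

def r_squared(xs, ys):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    slope, intercept = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def sample_r(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]   # illustrative sample
ys = [2, 1, 4, 3, 5]
r2 = r_squared(xs, ys)
r = sample_r(xs, ys)
print(r2, r ** 2)  # the two values coincide up to rounding
```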

Example

Consider the joint probability distribution of [math]X[/math] and [math]Y[/math] given in the table below.

[math]\mathrm{P}(X=x,Y=y)[/math]
[math]x[/math] (rows) \ [math]y[/math] (columns)	−1	0	1
0	0	1/3	0
1	1/3	0	1/3

Marginal distributions of the joint distribution

For this joint distribution, the marginal distributions are:

[math]\mathrm{P}(X=x)=\left\{\begin{array}{ll}\frac{1}{3} & \text { for } x=0 \\ \frac{2}{3} & \text { for } x=1\end{array}\right.[/math]

[math]\mathrm{P}(Y=y)=\left\{\begin{array}{ll}\frac{1}{3} & \text { for } y=-1 \\ \frac{1}{3} & \text { for } y=0 \\ \frac{1}{3} & \text { for } y=1\end{array}\right.[/math]

This yields the following expectations and variances: [math]\mu_X = \frac 2 3[/math], [math]\mu_Y = 0[/math], [math]\sigma_X^2 = \frac 2 9[/math], [math]\sigma_Y^2 = \frac 2 3[/math].

Therefore:

[math]\begin{aligned} \rho_{X, Y} & =\frac{1}{\sigma_{X} \sigma_{Y}} \mathrm{E}\left[\left(X-\mu_{X}\right)\left(Y-\mu_{Y}\right)\right] \\ & =\frac{1}{\sigma_{X} \sigma_{Y}} \sum_{x, y}\left(x-\mu_{X}\right)\left(y-\mu_{Y}\right) \mathrm{P}(X=x, Y=y) \\ & =\frac{1}{\sigma_{X} \sigma_{Y}}\left[\left(1-\frac{2}{3}\right)(-1-0) \frac{1}{3}+\left(0-\frac{2}{3}\right)(0-0) \frac{1}{3}+\left(1-\frac{2}{3}\right)(1-0) \frac{1}{3}\right]=0.\end{aligned}[/math]
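The arithmetic of this example can be reproduced exactly with rational numbers. The sketch below encodes the joint table as a dictionary (the variable names are assumptions made for the illustration) and also checks that the variables are not independent even though their covariance is zero.

```python
from fractions import Fraction as F

# Joint pmf from the table: keys are (x, y), values are P(X=x, Y=y).
joint = {
    (0, -1): F(0), (0, 0): F(1, 3), (0, 1): F(0),
    (1, -1): F(1, 3), (1, 0): F(0), (1, 1): F(1, 3),
}

# Marginal distributions, obtained by summing over the other variable.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

mu_x = sum(x * p for x, p in p_x.items())
mu_y = sum(y * p for y, p in p_y.items())
var_x = sum((x - mu_x) ** 2 * p for x, p in p_x.items())
var_y = sum((y - mu_y) ** 2 * p for y, p in p_y.items())

# Covariance; rho is cov / (sigma_x * sigma_y), so rho is 0 when cov is 0.
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Independence would require every cell to equal the product of marginals.
independent = all(joint[(x, y)] == p_x[x] * p_y[y] for (x, y) in joint)

print(mu_x, mu_y, var_x, var_y, cov, independent)
```

Using `Fraction` avoids any floating-point doubt about whether the covariance is exactly zero.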

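Rank correlation coefficients

Rank correlation coefficients, such as Spearman's rank correlation coefficient and Kendall's rank correlation coefficient (τ), measure the extent to which, as one variable increases, the other variable tends to increase, without requiring that increase to follow a linear relationship. If one variable decreases as the other increases, the rank correlation coefficients will be negative. These rank correlation coefficients are commonly regarded as alternatives to Pearson's coefficient, used either to reduce the amount of calculation or to make the coefficient less sensitive to non-normality in distributions. This view has little mathematical basis, however: rank correlation coefficients measure a different type of relationship than the Pearson product-moment correlation coefficient, and are best seen as measures of a different type of association rather than as an alternative measure of the population correlation coefficient.^[7]^[8]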

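To illustrate the nature of rank correlation, and its difference from linear correlation, consider the following four pairs of numbers [math](x,y)[/math]:

(0, 1), (10, 100), (101, 500), (102, 2000).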

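As we go from each pair to the next, [math]x[/math] increases, and so does [math]y[/math]. This relationship is perfect, in the sense that an increase in [math]x[/math] is always accompanied by an increase in [math]y[/math]. This means that we have a perfect rank correlation, and both Spearman's and Kendall's correlation coefficients are 1, whereas in this example the Pearson product-moment correlation coefficient is 0.7544, indicating that the points are far from lying on a straight line. In the same way, if [math]y[/math] always decreases when [math]x[/math] increases, the rank correlation coefficients will be −1, while the Pearson product-moment correlation coefficient may or may not be close to −1, depending on how close the points are to a straight line. Although in the extreme cases of perfect rank correlation the two rank coefficients are equal (both +1 or both −1), this is not generally the case, and so the values of the two coefficients cannot meaningfully be compared.^[7] For example, for the three pairs (1, 1), (2, 3), (3, 2), Spearman's coefficient is 1/2, while Kendall's coefficient is 1/3.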
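The figures quoted for these two small datasets can be reproduced with a short Python sketch. The helper names are assumptions made for illustration, and the implementations are deliberately minimal: they assume there are no tied values, and Kendall's coefficient is computed in its simplest form (concordant minus discordant pairs over all pairs).

```python
import math
from itertools import combinations

def pearson(xs, ys):
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys))
    var_x = sum((x - mu_x) ** 2 for x in xs)
    var_y = sum((y - mu_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank 1 for the smallest value; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(xs, ys):
    """Spearman's coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def kendall_tau(xs, ys):
    """Kendall's tau without tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x2 - x1) * (y2 - y1)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)

four_x, four_y = [0, 10, 101, 102], [1, 100, 500, 2000]
three_x, three_y = [1, 2, 3], [1, 3, 2]

print(pearson(four_x, four_y))        # about 0.7544
print(spearman(four_x, four_y))       # perfect rank correlation
print(kendall_tau(four_x, four_y))    # perfect rank correlation
print(spearman(three_x, three_y))     # 1/2
print(kendall_tau(three_x, three_y))  # 1/3
```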

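Other measures of dependence among random variables

The information given by a correlation coefficient is not enough to define the dependence structure between random variables.^[9] The correlation coefficient completely defines the dependence structure only in very particular cases, for example when the distribution is a multivariate normal distribution. In the case of elliptical distributions it characterizes the (hyper-)ellipses of equal density; however, it does not completely characterize the dependence structure (for example, the degrees of freedom of a multivariate t-distribution determine the level of tail dependence).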

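Distance correlation^[10]^[11] was introduced to address a deficiency of Pearson's correlation, namely that it can be zero for dependent random variables; zero distance correlation implies independence.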
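As a hedged illustration, the sketch below computes the usual sample version of distance correlation for one-dimensional data (double-centered pairwise distance matrices, then a normalized average of their elementwise product) and applies it to a parabola on which Pearson's coefficient vanishes. The function names and the data are assumptions made for this example, and the quadratic-time implementation is meant for small samples only.

```python
import math

def centered_distances(values):
    """Pairwise |a - b| matrix, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]

def distance_correlation(xs, ys):
    n = len(xs)
    a = centered_distances(xs)
    b = centered_distances(ys)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)

def pearson(xs, ys):
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys))
    var_x = sum((x - mu_x) ** 2 for x in xs)
    var_y = sum((y - mu_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [-2, -1, 0, 1, 2]      # illustrative values
ys = [x ** 2 for x in xs]   # dependent on xs, yet linearly uncorrelated
dcor = distance_correlation(xs, ys)
print(pearson(xs, ys), dcor)  # Pearson is zero, distance correlation is not
```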

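The Randomized Dependence Coefficient (RDC)^[12] is a computationally efficient, copula-based measure of dependence between multivariate random variables. RDC is invariant with respect to non-linear scalings of random variables, is capable of discovering a wide range of functional association patterns, and takes the value zero at independence.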

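For two binary variables, the odds ratio measures their dependence and takes non-negative values, possibly infinite: [math][0, +\infty][/math]. Related statistics such as Yule's Y and Yule's Q normalize this to the correlation-like range [math][-1, 1][/math]. The odds ratio is generalized by the logistic model to cases where the dependent variable is discrete and there may be one or more independent variables.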
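For a 2×2 table of counts these statistics have simple closed forms, sketched below. With cell counts a, b (first row) and c, d (second row), the odds ratio is ad/bc, Yule's Q is (ad − bc)/(ad + bc), and Yule's Y is (√(ad) − √(bc))/(√(ad) + √(bc)). The function names and the example counts are hypothetical, chosen only for illustration.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio ad / bc for the 2x2 table [[a, b], [c, d]]."""
    if b * c == 0:
        # Infinite when only the off-diagonal product vanishes;
        # undefined (nan) when both products vanish.
        return math.inf if a * d > 0 else math.nan
    return (a * d) / (b * c)

def yules_q(a, b, c, d):
    """Yule's Q, equal to (OR - 1) / (OR + 1); lies in [-1, 1]."""
    return (a * d - b * c) / (a * d + b * c)

def yules_y(a, b, c, d):
    """Yule's Y, equal to (sqrt(OR) - 1) / (sqrt(OR) + 1); lies in [-1, 1]."""
    s_ad, s_bc = math.sqrt(a * d), math.sqrt(b * c)
    return (s_ad - s_bc) / (s_ad + s_bc)

# Hypothetical counts for two binary variables.
a, b, c, d = 20, 10, 5, 40
print(odds_ratio(a, b, c, d))  # 16.0
print(yules_q(a, b, c, d))     # 15/17
print(yules_y(a, b, c, d))     # 0.6
```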

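The correlation ratio, entropy-based mutual information, total correlation, dual total correlation and polychoric correlation are all also capable of detecting more general dependencies, as is consideration of the copula between the variables, while the coefficient of determination generalizes the correlation coefficient to multiple regression.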

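Sensitivity to the data distribution

The degree of dependence between variables [math]X[/math] and [math]Y[/math] does not depend on the scale on which the variables are expressed. That is, if we are analyzing the relationship between [math]X[/math] and [math]Y[/math], most correlation measures are unaffected by transforming [math]X[/math] to [math]a + bX[/math] and [math]Y[/math] to [math]c + dY[/math], where a, b, c, and d are constants (b and d being positive). This is true of some correlation statistics as well as their population analogues. Some correlation statistics, such as the rank correlation coefficient, are also invariant to monotone transformations of the marginal distributions of [math]X[/math] and/or [math]Y[/math].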
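The invariance under positive affine rescaling is easy to confirm numerically. In the sketch below the data and the constants a, b, c, d are arbitrary illustrative choices (a Celsius to Fahrenheit style conversion for X), and a final check shows that a negative scale factor flips the sign of Pearson's coefficient.

```python
import math

def pearson(xs, ys):
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys))
    var_x = sum((x - mu_x) ** 2 for x in xs)
    var_y = sum((y - mu_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative data
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

a, b = 32.0, 1.8     # X -> a + bX with b > 0
c, d = -10.0, 0.001  # Y -> c + dY with d > 0

r_original = pearson(xs, ys)
r_rescaled = pearson([a + b * x for x in xs], [c + d * y for y in ys])
r_flipped = pearson([a - b * x for x in xs], ys)  # negative scale factor

print(r_original, r_rescaled, r_flipped)
```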

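Figure: Pearson/Spearman correlation coefficients between X and Y, shown when the ranges of the two variables are unrestricted and when the range of X is restricted to the interval (0,1).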

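Most correlation measures are sensitive to the manner in which [math]X[/math] and [math]Y[/math] are sampled. Dependencies tend to be stronger if viewed over a wider range of values. Thus, if we consider the correlation coefficient between the heights of fathers and their sons over all adult males, and compare it to the same correlation coefficient calculated when the fathers are selected to be between 165 cm and 170 cm in height, the correlation will be weaker in the latter case. Several techniques have been developed that attempt to correct for range restriction in one or both variables, and they are commonly used in meta-analysis; the most common are Thorndike's case II and case III equations.^[13]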
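Range restriction can be sketched with simulated data. The model below (Y equal to X plus independent standard normal noise, with X standard normal) and the restriction window are assumptions chosen purely for illustration; they are not the height data mentioned above. A fixed seed keeps the run reproducible.

```python
import math
import random

def pearson(xs, ys):
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys))
    var_x = sum((x - mu_x) ** 2 for x in xs)
    var_y = sum((y - mu_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed for reproducibility
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys = [x + rng.gauss(0, 1) for x in xs]  # population correlation is 1/sqrt(2)

# Correlation over the full range of X.
r_full = pearson(xs, ys)

# Same relationship, but only pairs whose X falls in a narrow window.
window = [(x, y) for x, y in zip(xs, ys) if -0.25 < x < 0.25]
r_restricted = pearson([x for x, _ in window], [y for _, y in window])

print(len(window), r_full, r_restricted)
```

The slope of the relationship is unchanged inside the window; the correlation drops because the spread of X shrinks while the noise in Y does not.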

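For certain joint distributions of [math]X[/math] and [math]Y[/math], various correlation measures in use may be undefined. For example, the Pearson correlation coefficient is defined in terms of moments, and hence is undefined if the moments are undefined. Measures of dependence based on quantiles are always defined. Sample-based statistics intended to estimate population measures of dependence may or may not have desirable statistical properties such as unbiasedness or asymptotic consistency, depending on the spatial structure of the population from which the data were sampled.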

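Sensitivity to the data distribution can be used to advantage. For example, scaled correlation is designed to use the sensitivity to range in order to pick out correlations between fast components of time series.^[14] By reducing the range of values in a controlled manner, the correlations on long time scales are filtered out and only the correlations on short time scales are revealed.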

Anscombe's quartet: four sets of data, each with a correlation coefficient of 0.816

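The Pearson correlation coefficient indicates the strength of a linear relationship between two variables, but its value generally does not completely characterize their relationship.^[21] In particular, if the conditional mean of [math]Y[/math] given [math]X[/math], denoted [math]\operatorname{E}(Y \mid X)[/math], is not a linear function of [math]X[/math], the correlation coefficient will not fully determine the form of [math]\operatorname{E}(Y \mid X)[/math].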

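The adjacent image shows scatter plots of Anscombe's quartet, a set of four different pairs of variables created by Francis Anscombe.^[22] The four [math]y[/math] variables have the same mean (7.5), variance (4.12), correlation (0.816) and regression line ([math]y=3+0.5x[/math]). However, as can be seen in the plots, the distributions of the variables are very different. The first one (top left) seems to be distributed normally, and corresponds to what one would expect of two correlated variables under the assumption of normality. The second one (top right) is not distributed normally; while an obvious relationship between the two variables can be observed, it is not linear. In this case the Pearson correlation coefficient does not indicate that there is an exact functional relationship, only the extent to which that relationship can be approximated by a linear one. In the third case (bottom left), the linear relationship is perfect except for one outlier, which exerts enough influence to lower the correlation coefficient from 1 to 0.816. Finally, the fourth example (bottom right) shows another case in which one outlier is enough to produce a high correlation coefficient, even though the relationship between the two variables is not linear.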
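The shared summary statistics can be recomputed directly. The sketch below uses the quartet's values as they are commonly tabulated from Anscombe's 1973 paper; the variable and function names are assumptions made for this illustration.

```python
import math
import statistics

def pearson(xs, ys):
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys))
    var_x = sum((x - mu_x) ** 2 for x in xs)
    var_y = sum((y - mu_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Datasets I to III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

quartet = [(x123, y1), (x123, y2), (x123, y3), (x4, y4)]
correlations = [pearson(xs, ys) for xs, ys in quartet]
y_means = [statistics.mean(ys) for _, ys in quartet]
y_variances = [statistics.variance(ys) for _, ys in quartet]  # n - 1 divisor

for r, m, v in zip(correlations, y_means, y_variances):
    print(f"r = {r:.3f}, mean(y) = {m:.2f}, var(y) = {v:.2f}")
```

The near-identical numbers printed for four visibly different scatter plots are the reason the quartet is used to argue for plotting data before summarizing it.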

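These examples indicate that the correlation coefficient, as a summary statistic, cannot replace visual examination of the data. The examples are sometimes said to demonstrate that the Pearson correlation assumes that the data follow a normal distribution, but this is only partially correct.^[4] The Pearson correlation can be accurately calculated for any distribution that has a finite covariance matrix, which includes most distributions encountered in practice. However, the Pearson correlation coefficient (taken together with the sample mean and variance) is a sufficient statistic only if the data are drawn from a multivariate normal distribution. As a result, the Pearson correlation coefficient fully characterizes the relationship between variables only when the data are drawn from a multivariate normal distribution.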

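Bivariate normal distribution

If a pair of random variables [math]\ (X,Y)\ [/math] follows a bivariate normal distribution, the conditional mean [math]\mathcal{E}(X \mid Y)[/math] of [math]X[/math] given [math]Y[/math] is a linear function of [math]Y[/math], and the conditional mean [math]\mathcal{E}(Y \mid X)[/math] of [math]Y[/math] given [math]X[/math] is a linear function of [math]X[/math]. The correlation coefficient [math]\ \rho_{X,Y}\ [/math] between [math]X[/math] and [math]Y[/math], together with the marginal means and variances of [math]X[/math] and [math]Y[/math], determines this linear relationship: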

[math]\mathcal{E}(Y \mid X)=\mathcal{E}(Y)+\rho_{X, Y} \cdot \sigma_{Y} \cdot \frac{X-\mathcal{E}(X)}{\sigma_{X}}[/math]

where [math]\mathcal{E}(X)[/math] and [math]\mathcal{E}(Y)[/math] are the expected values of [math]X[/math] and [math]Y[/math], respectively, and [math]\sigma_X[/math] and [math]\sigma_Y[/math] are the standard deviations of [math]X[/math] and [math]Y[/math], respectively.
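The conditional-mean formula translates directly into code. In the sketch below the function name and every numeric parameter are hypothetical, picked only to show how a correlation below 1 pulls the prediction back toward the mean of Y.

```python
def conditional_mean_y_given_x(x, mu_x, mu_y, sigma_x, sigma_y, rho):
    """E(Y | X = x) for a bivariate normal pair with the given parameters."""
    return mu_y + rho * sigma_y * (x - mu_x) / sigma_x

# Hypothetical parameters: X is one standard deviation above its mean.
prediction = conditional_mean_y_given_x(
    x=182.0, mu_x=175.0, mu_y=176.0, sigma_x=7.0, sigma_y=7.0, rho=0.5
)
print(prediction)  # 179.5: half a standard deviation above the mean of Y

# With rho = 0 the observed x carries no information about Y.
baseline = conditional_mean_y_given_x(
    x=182.0, mu_x=175.0, mu_y=176.0, sigma_x=7.0, sigma_y=7.0, rho=0.0
)
print(baseline)  # 176.0
```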

The empirical correlation [math]r[/math] is an estimate of the correlation coefficient [math]\rho[/math]. A distribution estimate for [math]\rho[/math] is given by:

[math]\pi(\rho \mid r)=\frac{\Gamma(N)}{\sqrt{2 \pi} \cdot \Gamma\left(N-\frac{1}{2}\right)} \cdot\left(1-r^{2}\right)^{\frac{N-2}{2}} \cdot\left(1-\rho^{2}\right)^{\frac{N-3}{2}} \cdot(1-r \rho)^{-N+\frac{3}{2}} \cdot F_\mathsf{Hyp}\left(\frac{3}{2},-\frac{1}{2} ; N-\frac{1}{2} ; \frac{1+r \rho}{2}\right)[/math]

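where [math]F_\mathsf{Hyp}[/math] is the Gaussian hypergeometric function.

This density is both a Bayesian posterior density and an exact optimal confidence distribution density.^[23]^[24]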

References

[1] Croxton, Frederick Emory; Cowden, Dudley Johnstone; Klein, Sidney (1968). Applied General Statistics. Pitman. ISBN 9780273403159. p. 625.
[2] Dietrich, Cornelius Frank (1991). Uncertainty, Calibration and Probability: The Statistics of Scientific and Industrial Measurement (2nd ed.). A. Hilger. ISBN 9780750300605. p. 331.
[3] Aitken, Alexander Craig (1957). Statistical Mathematics (8th ed.). Oliver & Boyd. ISBN 9780050013007. p. 95.
[4] Rodgers, J. L.; Nicewander, W. A. (1988). "Thirteen ways to look at the correlation coefficient". The American Statistician. 42 (1): 59–66. doi:10.1080/00031305.1988.10475524. JSTOR 2685263.
[5] Dowdy, S.; Wearden, S. (1983). Statistics for Research. Wiley. ISBN 0-471-08602-9. p. 230.
[6] Francis, DP; Coats, AJ; Gibson, D (1999). "How high can a correlation coefficient be?". Int J Cardiol. 69 (2): 185–199. doi:10.1016/S0167-5273(99)00028-5. PMID 10549842.
[7] Yule, G. U.; Kendall, M. G. (1950). An Introduction to the Theory of Statistics (14th ed., 5th impression 1968). Charles Griffin & Co. pp. 258–270.
[8] Kendall, M. G. (1955). Rank Correlation Methods. Charles Griffin & Co.
[9] Mahdavi Damghani, B. (2013). "The Non-Misleading Value of Inferred Correlation: An Introduction to the Cointelation Model". Wilmott Magazine. 2013 (67): 50–61. doi:10.1002/wilm.10252.
[10] Székely, G. J.; Rizzo, M. L.; Bakirov, N. K. (2007). "Measuring and testing independence by correlation of distances". Annals of Statistics. 35 (6): 2769–2794. arXiv:0803.4101. doi:10.1214/009053607000000505. S2CID 5661488.
[11] Székely, G. J.; Rizzo, M. L. (2009). "Brownian distance covariance". Annals of Applied Statistics. 3 (4): 1233–1303. arXiv:1010.0297. doi:10.1214/09-AOAS312. PMC 2889501. PMID 20574547.
[12] Lopez-Paz, D.; Hennig, P.; Schölkopf, B. (2013). "The Randomized Dependence Coefficient". Conference on Neural Information Processing Systems.
[13] Thorndike, Robert Ladd (1947). Research problems and techniques (Report No. 3). Washington DC: US Govt. print. off.
[14] Nikolić, D; Muresan, RC; Feng, W; Singer, W (2012). "Scaled correlation analysis: a better way to compute a cross-correlogram". European Journal of Neuroscience. 35 (5): 1–21. doi:10.1111/j.1460-9568.2011.07987.x. PMID 22324876. S2CID 4694570.
[15] Higham, Nicholas J. (2002). "Computing the nearest correlation matrix—a problem from finance". IMA Journal of Numerical Analysis. 22 (3): 329–343. CiteSeerX 10.1.1.661.2180. doi:10.1093/imanum/22.3.329.
[16] "Portfolio Optimizer". portfoliooptimizer.io. Retrieved 2021-01-30.
[17] Borsdorf, Rudiger; Higham, Nicholas J.; Raydan, Marcos (2010). "Computing a Nearest Correlation Matrix with Factor Structure" (PDF). SIAM J. Matrix Anal. Appl. 31 (5): 2603–2622. doi:10.1137/090776718.
[18] Qi, Houduo; Sun, Defeng (2006). "A quadratically convergent Newton method for computing the nearest correlation matrix". SIAM J. Matrix Anal. Appl. 28 (2): 360–385. doi:10.1137/050624509.
[19] Park, Kun Il (2018). Fundamentals of Probability and Stochastic Processes with Applications to Communications. Springer. ISBN 978-3-319-68074-3.
[20] Aldrich, John (1995). "Correlations Genuine and Spurious in Pearson and Yule". Statistical Science. 10 (4): 364–376. doi:10.1214/ss/1177009870. JSTOR 2246135.
[21] Mahdavi Damghani, Babak (2012). "The Misleading Value of Measured Correlation". Wilmott Magazine. 2012 (1): 64–73. doi:10.1002/wilm.10167. S2CID 154550363.
[22] Anscombe, Francis J. (1973). "Graphs in statistical analysis". The American Statistician. 27 (1): 17–21. doi:10.2307/2682899. JSTOR 2682899.
[23] Taraldsen, Gunnar (2021). "The Confidence Density for Correlation". Sankhya A. 85: 600–616. doi:10.1007/s13171-021-00267-y. ISSN 0976-8378. S2CID 244594067.
[24] Taraldsen, Gunnar (2020). Confidence in Correlation. researchgate.net (preprint). doi:10.13140/RG.2.2.23673.49769.

Further reading

Cohen, J.; Cohen P.; West, S.G. & Aiken, L.S. (2002). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Psychology Press. ISBN 978-0-8058-2223-6.
"Correlation (in statistics)", Encyclopedia of Mathematics, EMS Press, 2001 [1994]
Oestreicher, J. & D. R. (February 26, 2015). Plague of Equals: A science thriller of international disease, politics and drug discovery. California: Omega Cat Press. p. 408. ISBN 978-0963175540.

External links

MathWorld page on the (cross-)correlation coefficient/s of a sample
Compute significance between two correlations, for the comparison of two correlation values.
"A MATLAB Toolbox for computing Weighted Correlation Coefficients". Archived from the original on 24 April 2021.
Proof that the Sample Bivariate Correlation has limits plus or minus 1
Interactive Flash simulation on the correlation of two normally distributed variables by Juha Puranen.
Correlation analysis. Biomedical Statistics
R-Psychologist Correlation visualization of correlation between two numeric variables


