Conditional Expectation and Conditional Variance Formulas for the Joint Normal Distribution
Setup and Notation
Let
[ x = \begin{bmatrix} r \\ g \end{bmatrix} ] follow a joint normal distribution with mean vector
[ \mu = \begin{bmatrix} \mu_r \\ \mu_g \end{bmatrix} ] and covariance matrix
[ \Sigma = \begin{bmatrix} \Sigma_{rr} & \Sigma_{rg} \\ \Sigma_{gr} & \Sigma_{gg} \end{bmatrix} ] where
[ \Sigma_{rr} = \text{Var}[r], \quad \Sigma_{rg} = \text{Cov}[r,g], \quad \Sigma_{gr} = \Sigma_{rg}^T, \quad \Sigma_{gg} = \text{Var}[g] ]
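The Joint Distribution and the Precision Matrix
Assume (\Sigma) is positive definite, and let the joint precision matrix be
[ \Lambda = \Sigma^{-1} = \begin{bmatrix} \Lambda_{rr} & \Lambda_{rg} \\ \Lambda_{gr} & \Lambda_{gg} \end{bmatrix} ]
The joint probability density function (ignoring the normalizing constant) is: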
[ p(r, g) \propto \exp\left[ -\frac{1}{2} (x - \mu)^T \Lambda (x - \mu) \right] ]
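As a minimal numerical sketch of the objects defined so far, the NumPy snippet below inverts a covariance matrix to obtain the precision matrix and slices it into blocks. The values of `mu` and `Sigma`, the block size `n_r`, and the helper name `log_density_unnormalized` are illustrative assumptions, not part of the derivation.

```python
import numpy as np

# Hypothetical example: r is 2-dimensional, g is 1-dimensional.
mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.5],
                  [0.6, 1.0, 0.3],
                  [0.5, 0.3, 1.5]])
n_r = 2  # assumed dimension of r

# Joint precision matrix and its blocks.
Lambda = np.linalg.inv(Sigma)
Lambda_rr = Lambda[:n_r, :n_r]
Lambda_rg = Lambda[:n_r, n_r:]
Lambda_gr = Lambda[n_r:, :n_r]
Lambda_gg = Lambda[n_r:, n_r:]

def log_density_unnormalized(x, mu, Lambda):
    """Return -1/2 (x - mu)^T Lambda (x - mu), the log density up to a constant."""
    d = x - mu
    return -0.5 * d @ Lambda @ d
```

The exponent is largest (zero) at `x = mu` and negative elsewhere, which is what positive definiteness of the precision matrix guarantees.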
Expanding the Quadratic Form
Let
[ y = r - \mu_r, \quad z = g - \mu_g ] so that
[ (x - \mu) = \begin{bmatrix} y \\ z \end{bmatrix} ] The quadratic form is:
[ (x - \mu)^T \Lambda (x - \mu) = \begin{bmatrix} y^T & z^T \end{bmatrix} \begin{bmatrix} \Lambda_{rr} & \Lambda_{rg} \\ \Lambda_{gr} & \Lambda_{gg} \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} ]
[ = y^T \Lambda_{rr} y + y^T \Lambda_{rg} z + z^T \Lambda_{gr} y + z^T \Lambda_{gg} z ] Because (\Lambda_{gr} = \Lambda_{rg}^T) and each term is a scalar (so it equals its own transpose), the two middle terms are equal:
[ y^T \Lambda_{rg} z + z^T \Lambda_{gr} y = 2 y^T \Lambda_{rg} z ] Therefore:
[ (x - \mu)^T \Lambda (x - \mu) = y^T \Lambda_{rr} y + 2 y^T \Lambda_{rg} z + z^T \Lambda_{gg} z ]
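The expansion can be spot-checked numerically. The sketch below evaluates the full quadratic form and the three-term expansion at a random point; the covariance values, the block size `n_r`, and the random seed are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical covariance; r is 2-dimensional, g is 1-dimensional.
Sigma = np.array([[2.0, 0.6, 0.5],
                  [0.6, 1.0, 0.3],
                  [0.5, 0.3, 1.5]])
n_r = 2
Lambda = np.linalg.inv(Sigma)
L_rr = Lambda[:n_r, :n_r]
L_rg = Lambda[:n_r, n_r:]
L_gg = Lambda[n_r:, n_r:]

# Random centered vectors y = r - mu_r and z = g - mu_g.
y = rng.normal(size=n_r)
z = rng.normal(size=1)
d = np.concatenate([y, z])

full = d @ Lambda @ d
expanded = y @ L_rr @ y + 2 * (y @ L_rg @ z) + z @ L_gg @ z
```

Up to floating-point error, `full` and `expanded` agree for any choice of `y` and `z`.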
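The Conditional Distribution for Fixed (g)
Fixing (g) means that (z) is fixed. Viewing the expression above as a function of (y):
[ p(r \mid g) \propto \exp\left[ -\frac12 \left( y^T \Lambda_{rr} y + 2 y^T \Lambda_{rg} z + z^T \Lambda_{gg} z \right) \right] ] The term (z^T \Lambda_{gg} z) does not depend on (y), so it can be absorbed into the proportionality constant: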
[ p(r \mid g) \propto \exp\left[ -\frac12 \left( y^T \Lambda_{rr} y + 2 y^T \Lambda_{rg} z \right) \right] ]
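Completing the Square to Find the Conditional Mean and Precision
Complete the square in (y) for the quadratic term plus the linear term:
[ y^T \Lambda_{rr} y + 2 y^T \Lambda_{rg} z = (y - m)^T \Lambda_{rr} (y - m) - m^T \Lambda_{rr} m ] where (m) satisfies:
[ \Lambda_{rr} m = - \Lambda_{rg} z ] that is:
[ m = - \Lambda_{rr}^{-1} \Lambda_{rg} z ] (Note: (m) is the conditional mean of (y) given (g).)
Hence:
[ p(r \mid g) \propto \exp\left[ -\frac12 (y - m)^T \Lambda_{rr} (y - m) \right] ] since (- m^T \Lambda_{rr} m) does not depend on (y) and is absorbed into the normalizing constant.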
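The completing-the-square identity can be verified numerically as a sanity check. In this sketch the covariance values, the block size `n_r`, and the test vectors `y` and `z` are assumptions made for illustration; `m` is obtained by solving the linear system rather than inverting the (r, r) precision block explicitly.

```python
import numpy as np

# Hypothetical covariance; r is 2-dimensional, g is 1-dimensional.
Sigma = np.array([[2.0, 0.6, 0.5],
                  [0.6, 1.0, 0.3],
                  [0.5, 0.3, 1.5]])
n_r = 2
Lambda = np.linalg.inv(Sigma)
L_rr = Lambda[:n_r, :n_r]
L_rg = Lambda[:n_r, n_r:]

y = np.array([0.4, -1.2])   # arbitrary value of r - mu_r
z = np.array([0.7])         # arbitrary fixed value of g - mu_g

# m solves L_rr m = -L_rg z.
m = -np.linalg.solve(L_rr, L_rg @ z)

lhs = y @ L_rr @ y + 2 * (y @ L_rg @ z)
rhs = (y - m) @ L_rr @ (y - m) - m @ L_rr @ m
```

Because `m` depends only on `z`, the subtracted term `m @ L_rr @ m` is constant in `y`, which is why it drops into the normalizing constant.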
Conditional Expectation
Since (y = r - \mu_r) and (z = g - \mu_g), and the conditional distribution is
[ p(r \mid g) \propto \exp\left[ -\frac12 (y - m)^T \Lambda_{rr} (y - m) \right] ] it follows that (r \mid g \sim N(\mu_r + m, \Lambda_{rr}^{-1})).
Therefore:
[ E[r \mid g] = \mu_r + m = \mu_r - \Lambda_{rr}^{-1} \Lambda_{rg} (g - \mu_g) ]
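Simplifying with the Block Matrix Inversion Formula
By the block matrix inversion formula:
[ \Lambda_{rg} = - \Lambda_{rr} \Sigma_{rg} \Sigma_{gg}^{-1} ] Substituting into the expression above: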
[ E[r \mid g] = \mu_r - \Lambda_{rr}^{-1} ( - \Lambda_{rr} \Sigma_{rg} \Sigma_{gg}^{-1} ) (g - \mu_g) ]
[ = \mu_r + \Sigma_{rg} \Sigma_{gg}^{-1} (g - \mu_g) ]
The Conditional Expectation Formula
[ E[r \mid g] = \mu_r + \Sigma_{rg} \Sigma_{gg}^{-1} (g - \mu_g) ] that is:
[ E[r \mid g] = E[r] + \text{Cov}[r,g] \cdot \text{Var}^{-1}[g] \cdot (g - E[g]) ]
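A small sketch of the conditional expectation formula in code, checked against the precision-matrix form derived earlier. The function name `conditional_mean`, the numbers in `mu` and `Sigma`, and the observed value `g_obs` are all hypothetical.

```python
import numpy as np

def conditional_mean(mu_r, mu_g, Sigma_rg, Sigma_gg, g):
    """E[r | g] = mu_r + Sigma_rg Sigma_gg^{-1} (g - mu_g)."""
    # Solve a linear system instead of forming Sigma_gg^{-1} explicitly.
    return mu_r + Sigma_rg @ np.linalg.solve(Sigma_gg, g - mu_g)

# Hypothetical parameters; r is 2-dimensional, g is 1-dimensional.
mu = np.array([1.0, -1.0, 0.5])
Sigma = np.array([[2.0, 0.6, 0.5],
                  [0.6, 1.0, 0.3],
                  [0.5, 0.3, 1.5]])
n_r = 2
mu_r, mu_g = mu[:n_r], mu[n_r:]
Sigma_rg = Sigma[:n_r, n_r:]
Sigma_gg = Sigma[n_r:, n_r:]
g_obs = np.array([2.0])  # assumed observed value of g

# Covariance form of the conditional mean.
mean_cov_form = conditional_mean(mu_r, mu_g, Sigma_rg, Sigma_gg, g_obs)

# Precision form: mu_r - Lambda_rr^{-1} Lambda_rg (g - mu_g).
Lambda = np.linalg.inv(Sigma)
mean_prec_form = mu_r - np.linalg.solve(
    Lambda[:n_r, :n_r], Lambda[:n_r, n_r:] @ (g_obs - mu_g)
)
```

With these numbers, (g - \mu_g) equals 1.5 and (\Sigma_{gg}) equals 1.5, so the correction term reduces to (\Sigma_{rg}) itself and both forms give approximately [1.5, -0.7].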
Conditional Variance
From
[ p(r \mid g) \propto \exp\left[ -\frac12 (y - m)^T \Lambda_{rr} (y - m) \right] ]
we can read off:
- Conditional precision matrix = (\Lambda_{rr})
- Conditional covariance matrix = (\Lambda_{rr}^{-1})
Therefore:
[ \text{Var}[r \mid g] = \Lambda_{rr}^{-1} ]
The Block Matrix Inversion Formula
The block matrix inversion formula (valid when (\Sigma_{gg}) is invertible) gives:
[ \Lambda_{rr} = (\Sigma_{rr} - \Sigma_{rg} \Sigma_{gg}^{-1} \Sigma_{gr})^{-1} ]
[ \Lambda_{rg} = - \Lambda_{rr} \, \Sigma_{rg} \, \Sigma_{gg}^{-1} ]
[ \Lambda_{gr} = - \Sigma_{gg}^{-1} \Sigma_{gr} \, \Lambda_{rr} ]
[ \Lambda_{gg} = \Sigma_{gg}^{-1} + \Sigma_{gg}^{-1} \Sigma_{gr} \, \Lambda_{rr} \, \Sigma_{rg} \, \Sigma_{gg}^{-1} ]
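The four block formulas can be checked against a direct inverse. The sketch below assembles the precision matrix from the blocks with `np.block` and compares it to `np.linalg.inv(Sigma)`; the covariance values and the block size `n_r` are assumptions for illustration.

```python
import numpy as np

# Hypothetical covariance; r is 2-dimensional, g is 1-dimensional.
Sigma = np.array([[2.0, 0.6, 0.5],
                  [0.6, 1.0, 0.3],
                  [0.5, 0.3, 1.5]])
n_r = 2
Sigma_rr = Sigma[:n_r, :n_r]
Sigma_rg = Sigma[:n_r, n_r:]
Sigma_gr = Sigma[n_r:, :n_r]
Sigma_gg = Sigma[n_r:, n_r:]

Sgg_inv = np.linalg.inv(Sigma_gg)

# Schur complement of Sigma_gg; its inverse is the (r, r) precision block.
schur = Sigma_rr - Sigma_rg @ Sgg_inv @ Sigma_gr
Lambda_rr = np.linalg.inv(schur)
Lambda_rg = -Lambda_rr @ Sigma_rg @ Sgg_inv
Lambda_gr = -Sgg_inv @ Sigma_gr @ Lambda_rr
Lambda_gg = Sgg_inv + Sgg_inv @ Sigma_gr @ Lambda_rr @ Sigma_rg @ Sgg_inv

Lambda_from_blocks = np.block([[Lambda_rr, Lambda_rg],
                               [Lambda_gr, Lambda_gg]])
Lambda_direct = np.linalg.inv(Sigma)
```

Explicit inverses are used here only to mirror the formulas term by term; in numerical work a linear solve is usually preferable.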
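The Final Conditional Variance Formula
Combining the conditional variance result (\text{Var}[r \mid g] = \Lambda_{rr}^{-1}) with the block matrix inversion formula for (\Lambda_{rr}):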
[ \text{Var}[r \mid g] = \Lambda_{rr}^{-1} = \Sigma_{rr} - \Sigma_{rg} \Sigma_{gg}^{-1} \Sigma_{gr} ]
In the original notation:
[ \Sigma_{rr} = \text{Var}[r], \quad \Sigma_{rg} = \text{Cov}[r,g], \quad \Sigma_{gr} = \text{Cov}[g,r], \quad \Sigma_{gg} = \text{Var}[g] ] Therefore:
[ \text{Var}[r \mid g] = \text{Var}[r] - \text{Cov}[r,g] \cdot \text{Var}^{-1}[g] \cdot \text{Cov}[g,r] ]
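For a concrete check, consider the bivariate case with unit variances and correlation (\rho), where the formula reduces to (\text{Var}[r \mid g] = 1 - \rho^2). The sketch below evaluates the covariance form and compares it with the inverse of the (r, r) precision block; the function name `conditional_cov` and the value `rho = 0.8` are assumptions for illustration.

```python
import numpy as np

def conditional_cov(Sigma_rr, Sigma_rg, Sigma_gg):
    """Var[r | g] = Sigma_rr - Sigma_rg Sigma_gg^{-1} Sigma_gr, with Sigma_gr = Sigma_rg^T."""
    return Sigma_rr - Sigma_rg @ np.linalg.solve(Sigma_gg, Sigma_rg.T)

rho = 0.8  # assumed correlation between scalar r and scalar g
Sigma = np.array([[1.0, rho],
                  [rho, 1.0]])

# Covariance form; blocks are kept 2-D so matrix products work unchanged.
cond_cov = conditional_cov(Sigma[:1, :1], Sigma[:1, 1:], Sigma[1:, 1:])

# Precision form: the inverse of the (r, r) block of Sigma^{-1}.
Lambda = np.linalg.inv(Sigma)
cond_cov_from_precision = np.linalg.inv(Lambda[:1, :1])
```

Both routes give 0.36 up to floating-point error, matching (1 - 0.8^2). The conditional variance does not depend on the observed value of (g), only on the covariance structure.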