Programming Practice: Gamarnik, Tsitsiklis. Fundamentals of Probability, Day 07: Expectations of Discrete Random Variables

David Gamarnik, and John Tsitsiklis. 6.436J Fundamentals of Probability. Fall 2008. Massachusetts Institute of Technology: MIT OpenCourseWare, https://ocw.mit.edu. License: Creative Commons BY-NC-SA.

Lecture 6. Discrete random variables and their expectations

4. Expected Values

4.1 Preliminaries: infinite sums

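Given a series $a_1 + a_2 + \cdots$, if every term is nonnegative, then reordering the terms does not change the sum. When the terms are not necessarily nonnegative, the condition under which reordering leaves the sum unchanged is absolute convergence: writing $S_{+}$ and $S_{-}$ for the sum of the nonnegative terms only and the sum of the negative terms only, it suffices that $S_{+}$ and $S_{-}$ are both finite.
Likewise, for a doubly indexed sequence $\{a_{ij}\}_{i, j}$, if every term is nonnegative or the sum converges absolutely, then
$\sum_{i}\sum_j a_{ij} = \sum_j \sum_i a_{ij} = \sum_{i,j} a_{ij}$
holds.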
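To see why absolute convergence matters, the following sketch compares the alternating harmonic series (which converges, but not absolutely) with a reordering of the same terms. The function names and the number of terms are choices made here for illustration only.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in the usual order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Same terms reordered: two positive terms, then one negative term.

    Block m (m = 1, 2, ...) is 1/(4m-3) + 1/(4m-1) - 1/(2m).
    """
    return sum(1 / (4 * m - 3) + 1 / (4 * m - 1) - 1 / (2 * m)
               for m in range(1, n_blocks + 1))

original = alternating_harmonic(200_000)
reordered = rearranged(200_000)
print(original, math.log(2))         # usual order approaches ln 2
print(reordered, 1.5 * math.log(2))  # reordered series approaches (3/2) ln 2
```

Here $S_{+}$ and $S_{-}$ are both infinite, so the reordering is free to change the limit.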

4.2 Definition of the expectation

The expectation (expected value) is one of the quantities that summarize the PMF of a random variable $X$.

Definition 6-2(Expectation)

For a discrete random variable $X$ with PMF $p_X$, the expected value (expectation, or mean) of $X$ is defined as
$\mathbb{E}[X] = \sum_x xp_X(x)$
As already noted, this sum is not always well-defined.
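As a minimal sketch of Definition 6-2 for a PMF with finitely many support points (the dictionary representation and the name `expectation` are assumptions made for this example):

```python
from fractions import Fraction

def expectation(pmf):
    """E[X] = sum_x x * p_X(x) for a PMF given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Fair six-sided die: p_X(x) = 1/6 for x = 1, ..., 6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expectation(die))  # 7/2
```

Exact fractions are used so that the result can be compared with the hand computation without rounding.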

4.3 Properties of the expectation

If $X$ takes only nonnegative integer values, the expectation can also be written as
$\mathbb{E}[X] = \sum_{n \geq 0} P(X > n)$
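The tail-sum formula can be checked on a small example. The PMF below is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical PMF on {0, 1, 2, 3}.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def tail(pmf, n):
    """P(X > n)."""
    return sum(p for x, p in pmf.items() if x > n)

mean_direct = sum(x * p for x, p in pmf.items())        # sum_x x p_X(x)
mean_tail = sum(tail(pmf, n) for n in range(max(pmf)))  # sum_n P(X > n)
print(mean_direct, mean_tail)
```

The sum over $n$ stops at the largest value of $X$ because $P(X > n) = 0$ beyond it.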

Proposition 6-3

For a discrete random variable $X$ and a function $g: \mathbb{R} \rightarrow \mathbb{R}$,
$\mathbb{E}[g(X)] = \sum_{\{x| p_X(x) > 0\}} g(x)p_X(x)$

Taking $g(x)=x^2$ in this proposition gives the expectation of $Y=X^2$ as $\mathbb{E}[Y]=\sum_x x^2p_X(x)$.
$\mathbb{E}[Y]$ is also written $\mathbb{E}[X^2]$ and is called the second moment of $X$. More generally, $\mathbb{E}[X^r]$ is called the $r$th moment of $X$, and $\mathbb{E}[(X-\mathbb{E}[X])^r]$ the $r$th central moment of $X$. In particular, the 2nd central moment $\mathbb{E}[(X - \mathbb{E}[X])^2]$ is called the variance of $X$ and is written $\text{V}[X]$ or $\text{var}[X]$.
The square root of the variance of $X$ is called the standard deviation and is written $\sigma_X$ or simply $\sigma$.
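A short sketch of Proposition 6-3 and the moment definitions, again for a finite PMF stored as a dictionary (the helper names `expect` and `central_moment` are hypothetical):

```python
from fractions import Fraction
import math

def expect(pmf, g=lambda x: x):
    """E[g(X)] = sum over the support of g(x) * p_X(x) (Proposition 6-3)."""
    return sum(g(x) * p for x, p in pmf.items() if p > 0)

def central_moment(pmf, r):
    """E[(X - E[X])^r]."""
    mu = expect(pmf)
    return expect(pmf, lambda x: (x - mu) ** r)

die = {x: Fraction(1, 6) for x in range(1, 7)}
second_moment = expect(die, lambda x: x ** 2)   # E[X^2]
variance = central_moment(die, 2)               # V[X]
sigma = math.sqrt(variance)                     # standard deviation (float)
print(second_moment, variance, sigma)
```

For the fair die this gives $\mathbb{E}[X^2] = 91/6$ and $\text{V}[X] = 35/12$.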

Proposition 6-4

Let $X, Y$ be discrete random variables on the same probability space. The following hold whenever the quantities on both sides are well-defined.

(a) $X \geq 0$ a.s. $\Rightarrow \mathbb{E}[X] \geq 0$
(b) $X = c$ a.s. $\Rightarrow \mathbb{E}[X] = c$
(c) For $a, b \in \mathbb{R}$, $E[aX+bY] = aE[X] + bE[Y]$
(d) If $E[X]$ is finite, then $V(X) = E[X^2]-E[X]^2$
(e) $V(aX) = a^2 V(X)$
(f) If $X, Y$ are independent, then $E[XY]=E[X]E[Y], V(X+Y) = V(X)+V(Y)$
(g) If $X_1, ..., X_n$ are independent, then
$E[\prod_i X_i] = \prod_i E[X_i],\ \ V(\sum_i X_i) = \sum_i V(X_i)$
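Properties (c), (e) and (f) can be checked exactly on a small joint PMF. The sketch below assumes an independent pair (a fair die and a hypothetical biased coin); the constants `a`, `b` are arbitrary.

```python
from fractions import Fraction
from itertools import product

p_x = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die
p_y = {0: Fraction(1, 3), 1: Fraction(2, 3)}     # biased coin (hypothetical)

# Independence: the joint PMF is the product of the marginals.
joint = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}

def E(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def V(g):
    """Variance via property (d): E[g^2] - E[g]^2."""
    return E(lambda x, y: g(x, y) ** 2) - E(g) ** 2

X = lambda x, y: x
Y = lambda x, y: y
a, b = 2, -3

linear_ok = E(lambda x, y: a * x + b * y) == a * E(X) + b * E(Y)        # (c)
scale_ok = V(lambda x, y: a * x) == a ** 2 * V(X)                       # (e)
product_ok = E(lambda x, y: x * y) == E(X) * E(Y)                       # (f)
var_sum_ok = V(lambda x, y: x + y) == V(X) + V(Y)                       # (f)
print(linear_ok, scale_ok, product_ok, var_sum_ok)
```

Property (c) does not need independence; only the two checks labelled (f) rely on the product form of `joint`.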

Lecture 7. Discrete random variables and their expectations (cont)

1. Comments on Expected Values

(a) $E[X]$ has a finite, well-defined value only when $\sum_{x: x<0} xp_X(x)$ and $\sum_{x: x>0} xp_X(x)$ are both finite. This is equivalent to $E[|X|] = \sum_x |x| p_X(x) < \infty$. A random variable satisfying this condition is called integrable.
(b) For any $X$, $E[X^2]$ is always defined if the value $\infty$ is allowed. When $E[X^2] < \infty$, $X$ is called square integrable.
(c) Since $|x| \leq 1 + x^2$, we have $E[|X|] \leq 1 + E[X^2]$. Hence a square integrable random variable is integrable.
(d) Since $V(X)=E[X^2]-E[X]^2$: (i) if $X$ is square integrable, then $V[X]<\infty$; (ii) if $X$ is integrable but not square integrable, then $V[X]=\infty$; (iii) if $X$ is not integrable, then $V[X]$ is undefined.

2. Expected values of Some Common Random Variables

(a) Bernoulli
Let $X \sim Ber(p)$. Then
$\begin{aligned} &E[X] = 1 \cdot p + 0 \cdot (1-p) = p \\ &V[X] = E[X^2]-E[X]^2 = 1^2 \cdot p + 0^2 \cdot (1-p) - p^2 = p(1-p)\end{aligned}$
(b) Binomial
Let $X \sim Bin(n, p)$. Then $X$ can be written as $X=\sum_{i=1}^n X_i$ with independent $X_i \sim Ber(p)$.
Hence
$\begin{aligned} & E[X]=\sum_{i=1}^n E[X_i] = np \\ &V[X]=\sum_{i=1}^n V[X_i] = np(1-p) \end{aligned}$
(c) Geometric
Let $X \sim Geo(p)$. We use $E[X] = \sum_{n\geq 0} P(X>n)$.
Since $P(X > n) = \sum_{j=n+1}^\infty (1-p)^{j-1}p = (1-p)^n$,
$\begin{aligned} &E[X] = \sum_{n \geq 0} (1-p)^n = 1/p \\ & V[X] = \frac{1-p}{p^2}\end{aligned}$
(d) Poisson
Let $X \sim Poi(\lambda)$.
$\begin{aligned} E[X] &= e^{-\lambda}\sum_{n \geq 0} n\frac{\lambda^n}{n!}\\ &=e^{-\lambda} \sum_{n \geq 1} \frac{\lambda^n}{(n-1)!} \\&=\lambda e^{-\lambda} \sum_{n \geq 0} \frac{\lambda^n}{n!} = \lambda \end{aligned}$
Also,
$V[X] = \lambda$
Both values are consistent with the Poisson distribution being the limit of the Binomial distribution as $n \rightarrow \infty, p \rightarrow 0$ with $\lambda =np$ held fixed: the Binomial mean is $np = \lambda$ and the Binomial variance $np(1-p)$ tends to $\lambda$.
(e) Power
Let $X \sim Pow(\alpha)$. Then
$E[X] = \sum_{k \geq 0} \frac{1}{(k+1)^\alpha}$
This sum, which is finite for $\alpha > 1$, is the Riemann $\zeta$ function and is written $\zeta(\alpha)$.
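The closed forms for the Binomial, Geometric and Poisson cases can be checked numerically. This is a rough sketch: the parameter values are hypothetical, and the infinite supports are truncated at `N`, which is an approximation whose error is negligible for these parameters.

```python
import math

def mean_var(pmf):
    """Mean and variance of a PMF given as {value: probability}."""
    m = sum(x * p for x, p in pmf.items())
    v = sum((x - m) ** 2 * p for x, p in pmf.items())
    return m, v

n, p, lam = 10, 0.3, 2.5   # hypothetical parameter values
N = 100                    # truncation point for the infinite supports

binomial = {k: math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
geometric = {k: (1 - p) ** (k - 1) * p for k in range(1, N)}
poisson = {k: math.exp(-lam) * lam**k / math.factorial(k) for k in range(N)}

print(mean_var(binomial), (n * p, n * p * (1 - p)))
print(mean_var(geometric), (1 / p, (1 - p) / p**2))
print(mean_var(poisson), (lam, lam))
```

Each printed pair should agree with the formula next to it up to floating-point and truncation error.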

3. Covariance and Correlation

3.1 Covariance

Definition

For square integrable random variables $X, Y$, the covariance is defined as
$cov(X, Y):= E[(X-E[X])(Y-E[Y])]$
Since $|XY| \leq \frac{X^2+Y^2}{2}$, $cov(X, Y)$ is finite under the assumption that $X,Y$ are square integrable.

Roughly speaking, $cov(X, Y) > 0$ when $X-E[X]$ and $Y-E[Y]$ tend to have the same sign, and $cov(X, Y)<0$ when they tend to have opposite signs. The sign of $cov(X, Y)$ therefore summarizes the relationship between $X$ and $Y$.
Some important properties of the covariance are listed below.
(a) $cov(X, X) = V(X)$
(b) $cov(X, Y+a) = cov(X, Y)$
(c) $cov (X, Y) = cov(Y, X)$
(d) $cov(X, aY+bZ) = a\cdot cov(X, Y) + b \cdot cov(X, Z)$
Also,
$cov(X, Y) = E[XY]-E[X]E[Y]$
holds.
If $X, Y$ are independent, then $E[XY]=E[X]E[Y]$ and hence $cov(X, Y)=0$. The converse does not hold in general.
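A standard counterexample for the converse, written as a sketch: take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The covariance is zero, yet $Y$ is a function of $X$, so the two are not independent.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X^2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def E(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Independence would require P(X=0, Y=0) = P(X=0) P(Y=0).
p_joint = joint[(0, 0)]
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
print(cov, p_joint, p_x0 * p_y0)
```

The covariance comes out as 0 while $P(X=0, Y=0) = 1/3$ differs from $P(X=0)P(Y=0) = 1/9$.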

3.2 Variance of the sum of random variables

With $\tilde{X_i} = X_i - E[X_i]$,
$\begin{aligned} V(\sum_{i=1}^n X_i) &= E\left[\sum_{i=1}^n \sum_{j=1}^n \tilde{X_i} \tilde{X_j} \right] \\ &= \sum_{i=1}^n \sum_{j=1}^n E[\tilde{X_i} \tilde{X_j}] \\ &= \sum_{i=1}^n E[\tilde{X_i}^2] + 2 \sum_{i=1}^{n-1}\sum_{j=i+1}^n E[\tilde{X_i} \tilde{X_j}] \\ &= \sum_{i=1}^n V(X_i) + 2 \sum_{i=1}^{n-1}\sum_{j=i+1}^n cov(X_i, X_j)\end{aligned}$
In particular,
$V(X_1 + X_2) = V(X_1)+V(X_2) + 2cov(X_1, X_2)$
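The two-variable case can be verified exactly on a dependent pair. The joint PMF below is hypothetical, chosen so that the covariance is nonzero.

```python
from fractions import Fraction

# Hypothetical joint PMF of (X1, X2) with positive dependence.
joint = {(0, 0): Fraction(2, 5), (0, 1): Fraction(1, 10),
         (1, 0): Fraction(1, 10), (1, 1): Fraction(2, 5)}

def E(g):
    """E[g(X1, X2)] under the joint PMF."""
    return sum(g(x1, x2) * p for (x1, x2), p in joint.items())

def cov(g, h):
    """cov(g, h) = E[gh] - E[g]E[h]; cov(g, g) is the variance of g."""
    return E(lambda a, b: g(a, b) * h(a, b)) - E(g) * E(h)

X1 = lambda a, b: a
X2 = lambda a, b: b
S = lambda a, b: a + b

var_sum = cov(S, S)                                          # V(X1 + X2)
rhs = cov(X1, X1) + cov(X2, X2) + 2 * cov(X1, X2)            # V + V + 2 cov
print(var_sum, rhs, cov(X1, X2))
```

Dropping the covariance term would give $1/2$ instead of the correct $4/5$ here.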

Correlation coefficient

The correlation coefficient of $X, Y$ is defined as
$\rho(X, Y) := \frac{cov(X, Y)}{\sqrt{V(X)V(Y)}}$
It can be viewed as a normalized covariance.

Theorem 7-1

Let $X, Y$ be discrete random variables with positive variance, and write $\rho$ for $\rho(X, Y)$. Then
(a) $-1 \leq \rho \leq 1$
(b) $\rho=\pm 1$ if and only if there is a constant $a$ such that $Y-E[Y]= a(X-E[X])$ with probability 1.

proof.

(a) Let $\tilde{X} = X -E[X], \tilde{Y} = Y -E[Y]$. By the Cauchy-Schwarz inequality,
$(\rho (X, Y))^2 = \frac{(E[\tilde{X} \tilde{Y}])^2}{E[\tilde{X}^2]E[\tilde{Y}^2]} \leq 1$
$\because cov(X, Y) = cov(\tilde{X}, \tilde{Y}) = E[\tilde{X}\tilde{Y}]$
(b) If $\tilde{Y} = a \tilde{X}$, then
$\rho (X, Y) = \frac{E[\tilde{X}a\tilde{X}]}{\sqrt{E[\tilde{X}^2] E[(a\tilde{X})^2]}} = \frac{aV(X)}{|a|V(X)} = \frac{a}{|a|} = \pm 1$
Conversely, suppose $(\rho(X, Y))^2 = 1$, so that $E[\tilde{X}^2] E[\tilde{Y}^2]=(E[\tilde{X}\tilde{Y}])^2$.
Then
$E\left[ \left(\tilde{X} - \frac{E[\tilde{X}\tilde{Y}]}{E[\tilde{Y}^2]}\tilde{Y}\right)^2 \right] = E[\tilde{X}^2] - \frac{(E[\tilde{X}\tilde{Y}])^2}{E[\tilde{Y}^2]} = 0$
so the random variable $\tilde{X} - \frac{E[\tilde{X}\tilde{Y}]}{E[\tilde{Y}^2]}\tilde{Y}$ equals 0 with probability 1. Since $\rho^2 = 1$ forces $E[\tilde{X}\tilde{Y}] \neq 0$, this can be rearranged to $\tilde{Y} = a\tilde{X}$ with $a = E[\tilde{Y}^2]/E[\tilde{X}\tilde{Y}]$, which proves the claim.
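Theorem 7-1 can be illustrated numerically. In this sketch the PMF of $X$ and the three joint distributions are hypothetical: two are exact linear relations (so $\rho = \pm 1$ is expected) and one is a noisy relation.

```python
import math

def corr(joint):
    """rho(X, Y) for a joint PMF given as {(x, y): probability}."""
    ex = sum(x * p for (x, y), p in joint.items())
    ey = sum(y * p for (x, y), p in joint.items())
    cov = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())
    vx = sum((x - ex) ** 2 * p for (x, y), p in joint.items())
    vy = sum((y - ey) ** 2 * p for (x, y), p in joint.items())
    return cov / math.sqrt(vx * vy)

p_x = {1: 0.2, 2: 0.5, 4: 0.3}                             # hypothetical PMF of X
linear_up = {(x, 3 * x + 1): p for x, p in p_x.items()}    # Y = 3X + 1
linear_down = {(x, -2 * x): p for x, p in p_x.items()}     # Y = -2X
noisy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

print(corr(linear_up), corr(linear_down), corr(noisy))
```

The additive constant in $Y = 3X + 1$ does not affect $\rho$, in line with $cov(X, Y+a) = cov(X, Y)$.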

4. Indicator Variables and the Inclusion-Exclusion Formula

For an event $A$, the indicator function is defined by $I_A: \Omega \ni \omega \mapsto \begin{cases} 1 \ \ \ & \omega \in A \\ 0 & \omega \notin A \end{cases}$
and satisfies $E[I_A]=P(A)$. Indicator functions allow many later theorems and proofs to be written concisely.

4.1 The inclusion-exclusion formula

$I_{A\cap B} = I_A I_B$, and $I_{A \cup B} = I_A +I_B -I_AI_B$.
Taking expectations of both sides,
$P(A\cup B) = P(A) +P(B) - P(A \cap B)$
We now generalize this. Let $\{A_j\}_{j=1}^n \subset \mathcal{F}$ and $B=\bigcup_{j=1}^n A_j$. Then
$I_B = 1 - \prod_{j=1}^n(1-I_{A_j})$
holds. Expanding the product and taking expectations of both sides,
$P(B) = \sum_{1 \leq j \leq n} P(A_j) - \sum_{1 \leq i < j \leq n} P(A_i \cap A_j) + \sum_{1\leq i<j<k\leq n} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P(A_1 \cap \cdots \cap A_n)$

This is called the inclusion-exclusion theorem.
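The alternating sum can be evaluated directly on a finite sample space and compared with the probability of the union. The sample space and events below (multiples of 2, 3 and 5 among the integers 1 to 30, with uniform probability) are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations

omega = range(1, 31)                       # uniform sample space {1, ..., 30}
events = [{w for w in omega if w % d == 0} for d in (2, 3, 5)]

def prob(event):
    """P(event) under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))

def inclusion_exclusion(events):
    """Sum over nonempty subfamilies, with sign (-1)^(r-1) for r events."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = (-1) ** (r - 1)
        for subset in combinations(events, r):
            total += sign * prob(set.intersection(*subset))
    return total

direct = prob(set.union(*events))
print(inclusion_exclusion(events), direct)
```

Both values equal $22/30 = 11/15$, since 8 of the 30 integers are divisible by none of 2, 3, 5.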

5. Conditional Expectations

Setting $p_{X|A}(x) = P(X=x | A)$ defines the conditional PMF $p_{X|A}$ of $X$ given $A \in \mathcal{F}$, and a conditional expectation can in turn be defined from $p_{X|A}$.

Definition 7-2

For $A \in \mathcal{F}$ with $P(A) >0$ and a discrete random variable $X$, the conditional expectation of $X$ given $A$ is defined as
$E[X|A]:= \sum_x xp_{X|A}(x)$

A conditional expectation of the form $E[X|Y=y]$ is the case $A=\{Y=y\}$, that is,
$E[X|Y=y] = \sum_x xp_{X|Y} (x|y)$
If $X$ is nonnegative or integrable, these conditional expectations are well-defined.

5.1 The total expectation theorem

Let $\{A_i\}\subset \mathcal{F}$ be a partition of $\Omega$ with $P(A_i) > 0$ for every $i$, and define the random variable $Y$ by $Y(\omega) = \begin{cases} i \ \ \ &(\omega \in A_i) \\ 0 &(\text{otherwise})\end{cases}$
Then $p_Y(i) =P(A_i)$ and $E[X|Y =i] =E[X|A_i]$. Therefore, using $p_X(x) = \sum_i p_{X|A_i}(x)P(A_i)$,
$E[X] = \sum_i E[X|A_i]P(A_i)$
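A small exact check of the total expectation theorem, assuming $X$ is the outcome of a fair die and using a hypothetical partition of the six outcomes:

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}
partition = [{1, 2}, {3, 4, 5}, {6}]      # hypothetical partition of the outcomes

def prob(A):
    """P(A) for an event A given as a set of die outcomes."""
    return sum(die[x] for x in A)

def cond_expectation(A):
    """E[X | A] = sum_x x * P(X = x | A)."""
    return sum(x * die[x] / prob(A) for x in A)

total = sum(cond_expectation(A) * prob(A) for A in partition)
print([cond_expectation(A) for A in partition], total)
```

The weighted average of the conditional means $3/2$, $4$ and $6$ recovers $E[X] = 7/2$.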

Example(The mean of the geometric)

Let $X \sim geo(p)$, that is, $p_X(k)=(1-p)^{k-1}p$ for $k = 1, 2, \ldots$. Then

$\begin{aligned} P(X-1=k | X>1) &= \frac{P(X=k+1, X>1)}{P(X>1)} \\ &= \frac{P(X=k+1)}{P(X>1)} \\ &= \frac{(1-p)^kp}{1-p}=(1-p)^{k-1}p \\ &=P(X=k)\end{aligned}$
holds. In terms of coin tosses: given that the first toss was not heads, the number of further tosses needed until the first heads has the same distribution as $X$ itself. A distribution with this property is called memoryless.

By the total expectation theorem and memorylessness,
$E[X] = E[X|X>1]P(X>1)+E[X|X=1]P(X=1) = (1+E[X])(1-p) + 1 \cdot p$
Solving for $E[X]$ gives $E[X]=1/p$.
Similarly,
$E[X^2]=E[X^2|X>1]P(X>1)+E[X^2|X=1]P(X=1)$
Since $E[X^2|X>1] = E[(X-1)^2|X>1]+E[2(X-1)+1|X>1]=E[X^2]+2/p+1$,
$E[X^2]=(1-p)(E[X^2]+2/p+1)+p$
Solving this gives
$E[X^2] = 2/p^2 - 1/p$
Therefore
$V(X) = E[X^2]-(E[X])^2 = \frac{1-p}{p^2}$
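The memoryless property and the resulting moments can be checked numerically. This is only a sketch: the value of `p` is hypothetical and the infinite sums are truncated at `N`, where the remaining tail is negligible.

```python
p = 0.25          # hypothetical success probability
N = 400           # truncation point for the infinite sums

def pmf(k):
    """p_X(k) = (1 - p)^(k - 1) p for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

tail_1 = 1 - pmf(1)                                   # P(X > 1) = 1 - p
# Memorylessness: P(X - 1 = k | X > 1) should equal P(X = k).
gap = max(abs(pmf(k + 1) / tail_1 - pmf(k)) for k in range(1, 50))

mean = sum(k * pmf(k) for k in range(1, N))           # approx 1/p
second = sum(k * k * pmf(k) for k in range(1, N))     # approx 2/p^2 - 1/p
variance = second - mean ** 2                         # approx (1 - p)/p^2
print(gap, mean, second, variance)
```

With $p = 0.25$ the formulas give $E[X] = 4$, $E[X^2] = 28$ and $V(X) = 12$.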

5.2 The conditional expectation as a random variable

Let $X, Y$ be discrete random variables. For fixed $y$, $E[X|Y=y]$ is a real number, so it can be regarded as a function of $y$. Composing this function with $Y$ gives a random variable, written $E[X|Y]$, which takes the value $E[X|Y=y]$ on the event $\{Y=y\}$.

Theorem 7-2

If $g: \mathbb{R} \rightarrow \mathbb{R}$ is measurable and $Xg(Y)$ is nonnegative or integrable, then
$E[E[X|Y]g(Y)] = E[Xg(Y)]$
In particular, taking $g=1$ gives $E[E[X|Y]]=E[X]$.

proof.

$\begin{aligned}E[E[X|Y]g(Y)] &= \sum_y E[X|Y=y]g(y)p_Y(y) \\ &= \sum_y \sum_x xp_{X|Y}(x|y)g(y)p_Y(y) \\&=\sum_{x, y} xg(y)p_{X,Y}(x, y) = E[Xg(Y)] \end{aligned}$

Corollary: $E[(E[X|Y]-X)g(Y)]=0$
$E[X|Y]$ can be viewed as an estimate of $X$ based on $Y$, and $E[X|Y]-X$ as the estimation error. The theorem says that the estimation error is uncorrelated with every function $g(Y)$ of $Y$.
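The identity in Theorem 7-2 and its special case $g=1$ can be checked exactly on a small example. The joint PMF and the test function `g` below are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint PMF p_{X,Y}(x, y).
joint = {(0, 1): Fraction(1, 8), (2, 1): Fraction(3, 8),
         (1, 2): Fraction(1, 4), (5, 2): Fraction(1, 4)}

def p_y(y):
    """Marginal PMF of Y."""
    return sum(p for (x, yy), p in joint.items() if yy == y)

def cond_exp(y):
    """E[X | Y = y] = sum_x x * p_{X|Y}(x | y)."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y(y)

def g(y):
    return y ** 2 + 1          # arbitrary test function

lhs = sum(cond_exp(y) * g(y) * p for (x, y), p in joint.items())   # E[E[X|Y] g(Y)]
rhs = sum(x * g(y) * p for (x, y), p in joint.items())             # E[X g(Y)]
tower = sum(cond_exp(y) * p for (x, y), p in joint.items())        # E[E[X|Y]]
mean_x = sum(x * p for (x, y), p in joint.items())                 # E[X]
print(lhs, rhs, tower, mean_x)
```

Since `lhs` equals `rhs`, the estimation error $E[X|Y]-X$ has zero expected product with this particular $g(Y)$, as the corollary states.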

Wednesday, July 12, 2017
