Iterated Expectation for Continuous Random Variables

The proposition in probability theory known as the law of total expectation,[1] the law of iterated expectations[2] (LIE), Adam's law,[3] the tower rule,[4] and the smoothing theorem,[5] among other names, states that if $X$ is a random variable whose expected value $\operatorname{E}(X)$ is defined, and $Y$ is any random variable on the same probability space, then

$$\operatorname{E}(X) = \operatorname{E}(\operatorname{E}(X \mid Y)),$$

i.e., the expected value of the conditional expected value of $X$ given $Y$ is the same as the expected value of $X$.

One special case states that if $\{A_i\}_i$ is a finite or countable partition of the sample space, then

$$\operatorname{E}(X) = \sum_i \operatorname{E}(X \mid A_i)\,\operatorname{P}(A_i).$$

Note: The conditional expected value E(X | Z) is a random variable whose value depends on the value of Z. The conditional expected value of X given the event Z = z is a function of z; if we write E(X | Z = z) = g(z), then the random variable E(X | Z) is g(Z). Similar comments apply to the conditional covariance.
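The following minimal Python sketch (not part of the original article; the joint distribution of Z and X is invented purely for illustration) estimates g(z) = E(X | Z = z) empirically and checks that averaging it against the distribution of Z reproduces E(X):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Z takes the values 0, 1, 2 with equal probability; X depends on Z plus noise.
z = rng.integers(0, 3, size=n)
x = 2.0 * z + rng.normal(0.0, 1.0, size=n)

# g(z) = E(X | Z = z), estimated by averaging X over each event {Z = z}.
g = {v: x[z == v].mean() for v in (0, 1, 2)}

# E(E(X | Z)) = sum over z of g(z) * P(Z = z), which should match E(X).
lhs = sum(g[v] * (z == v).mean() for v in (0, 1, 2))
print(lhs, x.mean())  # the two estimates agree up to Monte Carlo error
```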

Example

Suppose that only two factories supply light bulbs to the market. Factory $X$'s bulbs work for an average of 5000 hours, whereas factory $Y$'s bulbs work for an average of 4000 hours. It is known that factory $X$ supplies 60% of the total bulbs available. What is the expected length of time that a purchased bulb will work for?

Applying the law of total expectation, we have:

$$\begin{aligned}\operatorname{E}(L) &= \operatorname{E}(L \mid X)\operatorname{P}(X) + \operatorname{E}(L \mid Y)\operatorname{P}(Y) \\ &= 5000(0.6) + 4000(0.4) \\ &= 4600\end{aligned}$$

where

  • E(L) is the expected life of the bulb;
  • P(X) = 0.6 is the probability that the purchased bulb was manufactured by factory X;
  • P(Y) = 0.4 is the probability that the purchased bulb was manufactured by factory Y;
  • E(L | X) = 5000 is the expected lifetime of a bulb manufactured by factory X;
  • E(L | Y) = 4000 is the expected lifetime of a bulb manufactured by factory Y.

Thus each purchased light bulb has an expected lifetime of 4600 hours.
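For completeness, here is the same weighted-average calculation written as a few lines of Python (the variable names are ours, not the article's):

```python
# Weighted average of the two conditional means, as in the calculation above.
p_x, p_y = 0.6, 0.4            # market shares of factories X and Y
mean_x, mean_y = 5000, 4000    # conditional mean lifetimes in hours

expected_lifetime = mean_x * p_x + mean_y * p_y
print(expected_lifetime)       # 4600.0
```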

Proof in the finite and countable cases

Let the random variables $X$ and $Y$, defined on the same probability space, assume a finite or countably infinite set of finite values. Assume that $\operatorname{E}[X]$ is defined, i.e. $\min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty$. If $\{A_i\}$ is a partition of the probability space $\Omega$, then

$$\operatorname{E}(X) = \sum_i \operatorname{E}(X \mid A_i)\,\operatorname{P}(A_i).$$

Proof.

$$\begin{aligned}\operatorname{E}\left(\operatorname{E}(X \mid Y)\right) &= \operatorname{E}\Bigg[\sum_x x \cdot \operatorname{P}(X = x \mid Y)\Bigg] \\ &= \sum_y \Bigg[\sum_x x \cdot \operatorname{P}(X = x \mid Y = y)\Bigg] \cdot \operatorname{P}(Y = y) \\ &= \sum_y \sum_x x \cdot \operatorname{P}(X = x, Y = y).\end{aligned}$$

If the series is finite, then we can switch the summations around, and the previous expression will become

$$\begin{aligned}\sum_x \sum_y x \cdot \operatorname{P}(X = x, Y = y) &= \sum_x x \sum_y \operatorname{P}(X = x, Y = y) \\ &= \sum_x x \cdot \operatorname{P}(X = x) \\ &= \operatorname{E}(X).\end{aligned}$$

If, on the other hand, the series is infinite, then its convergence cannot be conditional, due to the assumption that $\min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty$. The series converges absolutely if both $\operatorname{E}[X_+]$ and $\operatorname{E}[X_-]$ are finite, and diverges to infinity when either $\operatorname{E}[X_+]$ or $\operatorname{E}[X_-]$ is infinite. In both scenarios, the above summations may be exchanged without affecting the sum.
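As an illustration of exchanging the two summations, the following Python sketch (the joint pmf is made up for the example) computes E(X) by summing over y first and by summing over x first, and confirms that both orders give the same value:

```python
import numpy as np

# A made-up joint pmf P(X = x, Y = y): rows index x, columns index y.
x_vals = np.array([0.0, 1.0, 3.0])
joint = np.array([[0.10, 0.20],
                  [0.30, 0.15],
                  [0.05, 0.20]])

# Summing over y first gives E(X) directly.
e_x = (x_vals * joint.sum(axis=1)).sum()

# Summing over x first gives E(X | Y = y) for each y, then weighting by P(Y = y).
p_y = joint.sum(axis=0)
e_x_given_y = (x_vals[:, None] * joint).sum(axis=0) / p_y
e_iterated = (e_x_given_y * p_y).sum()

print(e_x, e_iterated)  # both orders of summation give the same value, 1.2
```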

Proof in the general case

Let $(\Omega, \mathcal{F}, \operatorname{P})$ be a probability space on which two sub-σ-algebras $\mathcal{G}_1 \subseteq \mathcal{G}_2 \subseteq \mathcal{F}$ are defined. For a random variable $X$ on such a space, the smoothing law states that if $\operatorname{E}[X]$ is defined, i.e. $\min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty$, then

$$\operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] = \operatorname{E}[X \mid \mathcal{G}_1] \quad \text{(a.s.)}$$

Proof. Since a conditional expectation is a Radon–Nikodym derivative, verifying the following two properties establishes the smoothing law:

  • $\operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1]$ is $\mathcal{G}_1$-measurable;
  • $\int_{G_1} \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] \, d\operatorname{P} = \int_{G_1} X \, d\operatorname{P}$ for all $G_1 \in \mathcal{G}_1$.

The first of these properties holds by definition of the conditional expectation. To prove the second one, note that

$$\begin{aligned}\min\left(\int_{G_1} X_+ \, d\operatorname{P}, \int_{G_1} X_- \, d\operatorname{P}\right) &\leq \min\left(\int_\Omega X_+ \, d\operatorname{P}, \int_\Omega X_- \, d\operatorname{P}\right) \\ &= \min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty,\end{aligned}$$

so the integral $\int_{G_1} X \, d\operatorname{P}$ is defined (i.e. it is not of the form $\infty - \infty$).

The second property thus holds since $G_1 \in \mathcal{G}_1 \subseteq \mathcal{G}_2$ implies

$$\int_{G_1} \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] \, d\operatorname{P} = \int_{G_1} \operatorname{E}[X \mid \mathcal{G}_2] \, d\operatorname{P} = \int_{G_1} X \, d\operatorname{P}.$$

Corollary. In the special case when $\mathcal{G}_1 = \{\emptyset, \Omega\}$ and $\mathcal{G}_2 = \sigma(Y)$, the smoothing law reduces to

$$\operatorname{E}[\operatorname{E}[X \mid Y]] = \operatorname{E}[X].$$

Alternative proof for $\operatorname{E}[\operatorname{E}[X \mid Y]] = \operatorname{E}[X]$

This is a simple consequence of the measure-theoretic definition of conditional expectation. By definition, $\operatorname{E}[X \mid Y] := \operatorname{E}[X \mid \sigma(Y)]$ is a $\sigma(Y)$-measurable random variable that satisfies

$$\int_A \operatorname{E}[X \mid Y] \, d\operatorname{P} = \int_A X \, d\operatorname{P},$$

for every measurable set $A \in \sigma(Y)$. Taking $A = \Omega$ proves the claim.
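Although the worked example above is discrete, the corollary holds just as well for a continuous conditioning variable. The sketch below uses an assumed model (Y exponential, X | Y normal with mean Y, chosen only so that E[X | Y] = Y in closed form) to check E[E[X | Y]] = E[X] by simulation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

y = rng.exponential(1.0, size=n)   # Y ~ Exp(1), a continuous random variable
x = rng.normal(loc=y, scale=1.0)   # X | Y = y ~ N(y, 1), so E[X | Y] = Y

# E[E[X | Y]] is estimated by the sample mean of Y; E[X] by the sample mean of X.
print(y.mean(), x.mean())          # both approach E[X] = E[Y] = 1
```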

Proof of partition formula

$$\begin{aligned}\sum_i \operatorname{E}(X \mid A_i)\operatorname{P}(A_i) &= \sum_i \int_\Omega X(\omega)\,\operatorname{P}(d\omega \mid A_i) \cdot \operatorname{P}(A_i) \\ &= \sum_i \int_\Omega X(\omega)\,\operatorname{P}(d\omega \cap A_i) \\ &= \sum_i \int_\Omega X(\omega) I_{A_i}(\omega)\,\operatorname{P}(d\omega) \\ &= \sum_i \operatorname{E}(X I_{A_i}),\end{aligned}$$

where $I_{A_i}$ is the indicator function of the set $A_i$.

If the partition $\{A_i\}_{i=0}^n$ is finite, then, by linearity, the previous expression becomes

$$\operatorname{E}\left(\sum_{i=0}^n X I_{A_i}\right) = \operatorname{E}(X),$$

and we are done.

If, however, the partition $\{A_i\}_{i=0}^\infty$ is infinite, then we use the dominated convergence theorem to show that

$$\operatorname{E}\left(\sum_{i=0}^n X I_{A_i}\right) \to \operatorname{E}(X).$$

Indeed, for every $n \geq 0$,

$$\left|\sum_{i=0}^n X I_{A_i}\right| \leq |X| \, I_{\bigcup_{i=0}^n A_i} \leq |X|.$$

Since every element of $\Omega$ falls into exactly one cell $A_i$ of the partition, it is straightforward to verify that the sequence $\left\{\sum_{i=0}^n X I_{A_i}\right\}_{n=0}^\infty$ converges pointwise to $X$. By the initial assumption, $\operatorname{E}|X| < \infty$. Applying the dominated convergence theorem yields the desired result.
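To see the dominated-convergence step numerically, the following sketch (with an invented countably infinite partition A_i = {N = i} for a geometric N) shows the truncated expectations E(Σ_{i≤n} X I_{A_i}) approaching E(X) as n grows:

```python
import numpy as np

rng = np.random.default_rng(2)
m = 1_000_000

# The cells of the partition are the events A_i = {N = i} for a geometric N.
n_var = rng.geometric(0.3, size=m)
x = 1.0 / n_var + rng.normal(0.0, 0.1, size=m)

e_x = x.mean()
for n in (1, 2, 5, 10, 20):
    # E(sum_{i <= n} X * 1_{A_i}) estimated by zeroing X outside {N <= n}.
    partial = np.where(n_var <= n, x, 0.0).mean()
    print(n, partial, e_x)  # the truncated expectations approach E(X)
```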

See also

  • The fundamental theorem of poker for one practical application.
  • Law of total probability
  • Law of total variance
  • Law of total covariance
  • Law of total cumulance
  • Product distribution#expectation (application of the Law for proving that the product expectation is the product of expectations)

References

  1. ^ Weiss, Neil A. (2005). A Course in Probability. Boston: Addison–Wesley. pp. 380–383. ISBN 0-321-18954-X.
  2. ^ "Law of Iterated Expectation | Brilliant Math & Science Wiki". brilliant.org. Retrieved 2018-03-28.
  3. ^ "Adam's and Eve's Laws". Retrieved 2022-04-19.
  4. ^ Rhee, Chang-han (Sep 20, 2011). "Probability and Statistics" (PDF).
  5. ^ Wolpert, Robert (November 18, 2010). "Conditional Expectation" (PDF).
  • Billingsley, Patrick (1995). Probability and Measure. New York: John Wiley & Sons. ISBN 0-471-00710-2. (Theorem 34.4)
  • Christopher Sims, "Notes on Random Variables, Expectations, Probability Densities, and Martingales", especially equations (16) through (18).

Source: https://en.wikipedia.org/wiki/Law_of_total_expectation
