Intuition for the conditional expectation with respect to a σ-algebra


20

Let (Ω, F, μ) be a probability space and ξ: Ω → ℝ a random variable. Given a sub-σ-algebra G ⊆ F, we can construct a new random variable E[ξ|G], the conditional expectation of ξ given G.


What is the intuition for thinking about E[ξ|G]? I understand the intuition behind the following:

(i) E[ξ|A], where A is an event (of positive probability).

(ii) E[ξ|η], where η is a discrete random variable.

However, I cannot visualize E[ξ|G]. I understand the mathematics of it, and I understand that it is defined in a way that generalizes the simpler cases I can visualize. But even so, I do not find this way of thinking helpful; it remains a mysterious object to me.


For example, let A be an event with μ(A) > 0. Form the σ-algebra G = {∅, A, A^c, Ω} generated by A. Then E[ξ|G](ω) equals (1/μ(A)) ∫_A ξ dμ if ω ∈ A, and equals (1/μ(A^c)) ∫_{A^c} ξ dμ if ω ∈ A^c. In other words, E[ξ|G](ω) = E[ξ|A] if ω ∈ A, and E[ξ|G](ω) = E[ξ|A^c] if ω ∈ A^c.
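To make the two-block formula concrete, here is a small numerical sketch (a hypothetical four-point sample space with uniform measure; the names `omega`, `xi`, and `A` are invented for illustration):

```python
# Finite sample space with uniform measure (a hypothetical example).
# G = sigma-algebra generated by A, i.e. {∅, A, A^c, Ω}.
# E[xi|G] is constant on A (the average of xi over A) and constant on A^c.

omega = ["a", "b", "c", "d"]           # sample points, each with measure 1/4
xi = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 7.0}
A = {"a", "b"}                         # an event with mu(A) = 1/2

def cond_exp_given_A(w):
    """Value of E[xi|G] at the sample point w."""
    block = A if w in A else set(omega) - A
    return sum(xi[v] for v in block) / len(block)   # average under uniform measure

print([cond_exp_given_A(w) for w in omega])  # [2.0, 2.0, 6.0, 6.0]
```

Note that averaging E[ξ|G] over all of Ω recovers E[ξ] (the tower property), which is consistent with the question below about collapsing to E[ξ].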

The confusing part is this: ω ∈ Ω as well, so why isn't E[ξ|G](ω) = E[ξ|Ω] = E[ξ]? Why are we allowed to replace E[ξ|G] with E[ξ|A] or E[ξ|A^c] depending on whether ω ∈ A, but not allowed to replace E[ξ|G] with E[ξ]?


Note: in answering this question, please do not explain this using the rigorous definition of conditional expectation. I already know it. What I want to understand is what the conditional expectation is supposed to compute, and why we reject one candidate in favor of another.

Answer:


16

One way to think about conditional expectation is as a projection onto the σ-algebra G.

[Image: orthogonal projection diagram, from Wikimedia Commons]

This is actually rigorously true when we talk about square-integrable random variables. In that case, E[ξ|G] is literally the orthogonal projection of the random variable ξ onto the subspace of L²(Ω) consisting of the random variables that are measurable with respect to G. And in fact this turns out to be true in some sense for L¹ random variables as well, via approximation by L² random variables.
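On a finite sample space this orthogonal-projection picture can be verified directly. Below is a minimal sketch, assuming a uniform four-point space and the inner product ⟨X, Y⟩ = E[XY]; the subspace of G-measurable variables is spanned by the indicators of A and A^c, and projecting onto it reproduces the block averages:

```python
# Uniform measure on four points; inner product <X, Y> = E[X·Y].
p = [0.25, 0.25, 0.25, 0.25]
xi = [1.0, 3.0, 5.0, 7.0]

# G-measurable L^2 variables are spanned by the indicators of A and A^c,
# which are orthogonal to each other, so we can project onto each separately.
ind_A = [1.0, 1.0, 0.0, 0.0]
ind_Ac = [1.0 - b for b in ind_A]

def inner(u, v):
    """<u, v> = E[u·v] under the measure p."""
    return sum(pi * ui * vi for pi, ui, vi in zip(p, u, v))

def project(x, basis):
    """Orthogonal projection of x onto span(basis); basis assumed orthogonal."""
    out = [0.0] * len(x)
    for b in basis:
        coeff = inner(x, b) / inner(b, b)
        out = [o + coeff * bi for o, bi in zip(out, b)]
    return out

print(project(xi, [ind_A, ind_Ac]))  # [2.0, 2.0, 6.0, 6.0] — the block averages
```

The residual ξ − E[ξ|G] is orthogonal to every G-measurable variable, which is exactly the defining "partial averaging" property in geometric form.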

(See the comments for references.)

One way to think of a σ-algebra is as representing how much information we have available (an interpretation that is absolutely essential in the theory of stochastic processes): a larger σ-algebra means more possible events, and thus more information about the possible outcomes, while a smaller σ-algebra means fewer possible events, and thus less information about the possible outcomes.

Therefore, projecting an F-measurable random variable ξ onto a smaller σ-algebra G means taking the best guess for the value of ξ given the more limited information available from G.

In other words, given only the information from G rather than all of the information from F, E[ξ|G] is, in a rigorous sense, the best possible guess for what the random variable ξ is.


As for your example, I think you may be conflating a random variable with its values. A random variable X is a function whose domain is the sample space; it is not a number. In other words, X: Ω → ℝ, i.e. X ∈ {f | f: Ω → ℝ}, whereas for ω ∈ Ω we have X(ω) ∈ ℝ.

In my opinion, the notation for conditional expectation is really bad, because the conditional expectation is itself a random variable, i.e. a function. In contrast, the (ordinary) expectation of a random variable is a number. The conditional expectation of a random variable is an entirely different kind of quantity from the expectation of that same random variable: E[ξ|G] does not even "type check" against E[ξ].

In other words, using the symbol E to denote both ordinary expectation and conditional expectation is a severe abuse of notation, one which causes a great deal of unnecessary confusion.

All of the above is to say: E[ξ|G](ω) is a number (the value of the random variable E[ξ|G] evaluated at ω), whereas E[ξ|Ω] is a random variable. However, the σ-algebra {∅, Ω} generated by Ω is trivial/degenerate, so E[ξ|Ω] is a constant random variable, and, technically speaking, the constant value of this constant random variable is E[ξ], where E denotes the ordinary expectation, which is not a conditional expectation and hence not a random variable.

Also, here is what E[ξ|A] means. Technically, one can only condition on σ-algebras, not on individual events, because the probability measure is only fully defined on a σ-algebra, not on individual events. Therefore E[ξ|A] means E[ξ|σ(A)], where σ(A) denotes the σ-algebra generated by the event A, namely {∅, A, A^c, Ω}. Note that σ(A) = G = σ(A^c). In other words, E[ξ|A], E[ξ|G], and E[ξ|A^c] are all different ways of denoting exactly the same object.

Finally, the constant value of the random variable E[ξ|Ω] = E[ξ|σ(Ω)] = E[ξ|{∅, Ω}] is just the number E[ξ]: the σ-algebra {∅, Ω} represents the least possible amount of information we could have, in fact essentially no information, so under this extreme circumstance the best possible guess we could have for which random variable ξ is, is the constant random variable whose constant value is E[ξ].

Note that all constant random variables are L² random variables, and they are all trivially measurable with respect to the σ-algebra {∅, Ω}; so indeed the constant random variable E[ξ] is the orthogonal projection of ξ onto the subspace of L²(Ω) consisting of the random variables measurable with respect to {∅, Ω}, as claimed.


2
@William I disagree with you about the use of E[ξ|A] as a ran var. Many books define E[ξ|A] to be a number, not a ran var. It is the best possible estimate of ξ|A. This is a useful notion and highly intuitive. Disregarding it completely, just because you have a generalized notion of cond exp as a ran var, is wrong from a pedagogical point-of-view. I am not confused about what a r.v. is, nor do I see how anything I wrote would lead you to thinking like that.
Nicolas Bourbaki

1
@William Thinking of cond exp as an estimate of the ran var, with G representing information, is something I have seen said before, but I never gave it that much thought and tried to find a different way of visualizing cond expec. Using your suggestion, I am going to write up a simple example and post it as an answer, for myself and for other people. Perhaps some people can then elaborate on my example and give a more exotic one.
Nicolas Bourbaki

1
@NicolasBourbaki I recommend that you look at p. 221 of the 4th edition of Durrett's Probability: Theory and Examples. I can refer you to other sources discussing this as well. In any case, it is not really a matter of opinion -- in the most general case, a conditional expectation is a random variable, and conditioning is only done with respect to σ-algebras; conditioning with respect to an event is conditioning with respect to the σ-algebra generated by the event, and conditioning with respect to a random variable is conditioning w.r.t. the σ-algebra generated by the RV.
Chill2Macht

3
@William And I can refer you to sources which do define the cond. exp. of an event to be a real number. I do not know why you are so stuck on this point. One can define it either way, as long as the notions are not mixed up. For pedagogical reasons, teaching a class on prob. theory and instantly jumping into the most general def. is not illuminating. In either case, it really does not matter in this discussion, and your complaint is about notation/semantics.
Nicolas Bourbaki

1
@NicolasBourbaki Chapter 5 of Whittle's Probability via Expectation gives a very good account (in my opinion) of both characterizations of conditional expectation, and explains well how each definition relates to and is motivated by the other definition. You are right that the distinction is one more of semantics. My enthusiasm for the more general definition stems (I think) from reading this chapter (5 of Whittle's Probability via Expectation), which made (I believe) good arguments about how the more general definition is in some ways easier to understand.
Chill2Macht

3

I am going to try to elaborate what William suggested.

Let Ω be the sample space of tossing a coin twice. Define the ran. var. ξ to be the number of heads that occur in the experiment. Clearly, E[ξ] = 1. One way of thinking about what 1, as an expected value, represents is as the best possible estimate for ξ: if we had to guess what value ξ would take, we would guess 1. This is because E[(ξ − 1)²] ≤ E[(ξ − a)²] for any real number a.

Denote by A = {HT, HH} the event that the first outcome is a head. Let G = {∅, A, A^c, Ω} be the σ-alg. generated by A. We think of G as representing what we know after the first toss. After the first toss, either heads occurred or it did not. Hence, after the first toss we are either in the event A or in the event A^c.

If we are in the event A, then the best possible estimate for ξ would be E[ξ|A] = 1.5, and if we are in the event A^c, then the best possible estimate for ξ would be E[ξ|A^c] = 0.5.

Now define the ran. var. η(ω) to be either 1.5 or 0.5 depending on whether or not ω ∈ A. This ran. var. η is a better approximation than 1 = E[ξ], since E[(ξ − η)²] ≤ E[(ξ − 1)²].
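These numbers can be checked with a few lines of code (a sketch; the outcome labels and the helper `mse` are invented for illustration):

```python
# Two fair coin tosses, uniform measure on four outcomes.
omega = ["HH", "HT", "TH", "TT"]
xi = {w: w.count("H") for w in omega}          # number of heads
eta = {w: 1.5 if w[0] == "H" else 0.5 for w in omega}   # E[xi|G]

def mse(est):
    """E[(xi - est)^2] for an estimator given as {outcome: value}."""
    return sum((xi[w] - est[w]) ** 2 for w in omega) / len(omega)

const1 = {w: 1.0 for w in omega}               # the constant guess E[xi] = 1
print(mse(eta), mse(const1))                   # 0.25 0.5 — eta is the better guess
```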

What η is doing is answering the question: what is the best estimate of ξ after the first toss? Since we do not yet know the information from the first toss, η depends on A. Once the first toss reveals which event in G we are in, the value of η is determined and provides the best possible estimate for ξ.

The problem with using ξ as its own estimate, i.e. 0 = E[(ξ − ξ)²] ≤ E[(ξ − η)²], is as follows. ξ is not well-defined after the first toss. Say the outcome of the experiment is ω with the first toss being heads; then we are in the event A, but what is ξ(ω)? We do not know from the first toss alone; that value is ambiguous to us, and so ξ is not well-defined. More formally, we say that ξ is not G-measurable, i.e. its value is not well-defined after the first toss. Thus, η is the best possible estimate of ξ after the first toss.

Perhaps, somebody here can come up with a more sophisticated example using the sample space [0,1], with ξ(ω)=ω, and G some non-trivial σ-algebra.


1

Although you request not to use the formal definition, I think that the formal definition is probably the best way of explaining it.

Wikipedia - conditional expectation:

Then a conditional expectation of X given 𝓗, denoted as E(X ∣ 𝓗), is any 𝓗-measurable function (Ω → ℝⁿ) which satisfies:

∫_H E(X ∣ 𝓗) dP = ∫_H X dP   for each H ∈ 𝓗

Firstly, it is an 𝓗-measurable function. Secondly, it has to match the expectation over every measurable (sub)set in 𝓗. So for an event A, the σ-algebra is {A, A^c, ∅, Ω}, so clearly it is set as you specified in your question for ω ∈ A / A^c. Similarly, for any discrete random variable (and combinations of them), we list out all primitive events and assign the expectation given that primitive event.

Now consider tossing a coin an infinite number of times, where at each toss i you win 1/2^i if your coin is tails, so your total winnings are X = Σ_{i=1}^∞ c_i/2^i, where c_i = 1 for tails and 0 for heads. Then X is a real random variable on [0, 1]. After n coin tosses, you know the value of X to precision 1/2^n; e.g. after 2 coin tosses it is in [0, 1/4], [1/4, 1/2], [1/2, 3/4] or [3/4, 1]. After every coin toss, your associated σ-algebra is getting finer and finer, and similarly the conditional expectation of X is getting more and more precise.
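A small sketch of this refinement (helper names invented; tosses are encoded as c_i ∈ {0, 1} with 1 = tails, matching the winnings formula above):

```python
# After n tosses, X is pinned down to a dyadic interval of length 1/2^n,
# and E[X | F_n] is the midpoint of that interval (fair coin).

def dyadic_interval(tosses):
    """Interval known to contain X after the given tosses (1 = tails)."""
    lo = sum(c / 2 ** (i + 1) for i, c in enumerate(tosses))
    return lo, lo + 2 ** (-len(tosses))

def cond_exp(tosses):
    """E[X | first len(tosses) tosses]: each remaining toss contributes its mean."""
    lo, hi = dyadic_interval(tosses)
    return (lo + hi) / 2

print(dyadic_interval([1, 0]))  # (0.5, 0.75)
print(cond_exp([1, 0]))         # 0.625
```

With no tosses observed the interval is all of [0, 1] and the conditional expectation collapses to E[X] = 1/2, illustrating the trivial σ-algebra case discussed above.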

Hopefully this example of a real-valued random variable with a sequence of σ-algebras getting finer and finer (a filtration) gets you away from the purely event-based intuition you are used to, and clarifies its purpose.


I apologize, but I downvoted this answer. It does not answer what I originally asked. Nor does it provide any new information that I did not know before.
Nicolas Bourbaki

What I am trying to suggest to you is you do not understand the formal definition as well as you think you do (as the other answer also suggested), so unless you work through what is unintuitive with the formal definition you will not progress.
seanv507

I understand the formal definition just fine. The questions that I asked, I know how to answer them when working from the formal definitions. The 'other answer', was trying to explain my question without using the definition of con. exp.
Nicolas Bourbaki
Licensed under cc by-sa 3.0 with attribution required.