58.9 Conditional Probability
Here I will consider the concept of conditional probability depending on the theory of differentiation of general Radon measures. This leads to a different way of thinking about independence.

If X, Y are two random vectors defined on a probability space having values in ℝ^{p_1} and ℝ^{p_2} respectively, and if E is a Borel set in the appropriate space, then (X, Y) is a random vector with values in ℝ^{p_1} × ℝ^{p_2}. Thus, by the theorem on Page 3514, there exist probability measures, denoted here by λ_{X|y}, such that whenever E is a Borel set in ℝ^{p_1} × ℝ^{p_2},

    λ_{(X,Y)}(E) = ∫_{ℝ^{p_2}} ∫_{ℝ^{p_1}} X_E(x, y) dλ_{X|y}(x) dλ_Y(y).

Similarly, there exist probability measures λ_{Y|x} such that

    λ_{(X,Y)}(E) = ∫_{ℝ^{p_1}} ∫_{ℝ^{p_2}} X_E(x, y) dλ_{Y|x}(y) dλ_X(x).
Definition 58.9.1 Let X and Y be two random vectors defined on a probability space. The conditional probability measure of Y given X is the measure λ_{Y|x} in the above. Similarly, the conditional probability measure of X given Y is the measure λ_{X|y}.
More generally, one can use the theory of slicing measures to consider any finite list of random vectors X_1, ⋯, X_n, defined on a probability space with X_i ∈ ℝ^{p_i}, and write the following for E a Borel set in ∏_{i=1}^{n} ℝ^{p_i}:

    λ_{(X_1,⋯,X_n)}(E) = ∫_{ℝ^{p_1}} ∫_{ℝ^{p_2}} ⋯ ∫_{ℝ^{p_n}} X_E(x_1, ⋯, x_n) dλ_{X_n|x_1,⋯,x_{n−1}} ⋯ dλ_{X_2|x_1} dλ_{X_1}.    (58.9.13)

Obviously, this could have been done in any order in the iterated integrals by simply modifying the "given" variables, those occurring after the symbol |, to be those which have been integrated in an outer level of the iterated integral. For simplicity, the order displayed above will be used.
Definition 58.9.2 Let X_1, ⋯, X_n be random vectors defined on a probability space having values in ℝ^{p_1}, ⋯, ℝ^{p_n} respectively. The random vectors are independent if for every E a Borel set in ℝ^{p_1} × ⋯ × ℝ^{p_n},

    λ_{(X_1,⋯,X_n)}(E) = ∫_{ℝ^{p_1}} ∫_{ℝ^{p_2}} ⋯ ∫_{ℝ^{p_n}} X_E(x_1, ⋯, x_n) dλ_{X_n} ⋯ dλ_{X_2} dλ_{X_1}    (58.9.14)

and the iterated integration may be taken in any order. If A is any set of random vectors defined on a probability space, A is independent if every finite set of random vectors from A is independent.

Thus, the random vectors are independent exactly when the dependence on the givens in 58.9.13 can be dropped.
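As a numerical aside (not part of the text), the defining property 58.9.14 can be checked empirically on a product Borel set: for independent random variables, λ_{(X,Y)}(E × F) should factor as λ_X(E)·λ_Y(F). The distributions and sets below are illustrative choices only.

```python
import numpy as np

# Monte Carlo sketch: for independent X, Y the joint probability of a
# product Borel set factors into the product of the marginal probabilities.
rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(size=n)        # lambda_X = N(0, 1)
Y = rng.exponential(size=n)   # lambda_Y = Exp(1), drawn independently of X

# Borel sets E = (-inf, 0.3], F = [0.5, inf)
joint = np.mean((X <= 0.3) & (Y >= 0.5))         # ~ lambda_{(X,Y)}(E x F)
product = np.mean(X <= 0.3) * np.mean(Y >= 0.5)  # ~ lambda_X(E) * lambda_Y(F)
print(joint, product)   # the two estimates agree up to Monte Carlo error
```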
Does this amount to the same thing as discussed earlier? Suppose you have three random variables X, Y, Z. Let A = X⁻¹(E), B = Y⁻¹(F), C = Z⁻¹(G) where E, F, G are Borel sets. Thus these inverse images are typical sets in σ(X), σ(Y), σ(Z) respectively. First suppose that the random variables are independent in the earlier sense, so that

    P(A ∩ B ∩ C) = P(A) P(B) P(C) = λ_X(E) λ_Y(F) λ_Z(G).

Now letting G be an arbitrary Borel set while E and F are fixed, it follows that

    ∫_{E×F} λ_Z(G) dλ_{(X,Y)} = λ_{(X,Y,Z)}(E × F × G) = ∫_{E×F} λ_{Z|(x,y)}(G) dλ_{(X,Y)}.

By uniqueness of the slicing measures or an application of the Besicovitch differentiation theorem, it follows that for λ_{(X,Y)} a.e. (x, y),

    λ_{Z|(x,y)} = λ_Z.

Thus, using this in the above,

    λ_{(X,Y,Z)}(E × F × G) = ∫_E ∫_F ∫_G dλ_Z dλ_{Y|x} dλ_X = λ_Z(G) ∫_E λ_{Y|x}(F) dλ_X(x)

and also it reduces to λ_X(E) λ_Y(F) λ_Z(G). Now by uniqueness of the slicing measures again, for λ_X a.e. x,

    λ_{Y|x} = λ_Y.

Similar conclusions hold for λ_X, λ_Y. In each case, off a set of measure zero the distribution measures equal the slicing measures.

Conversely, if the distribution measures equal the slicing measures off sets of measure zero as described above, then it is obvious that the random variables are independent. The same reasoning applies for any number of random variables.
Thus this gives a different and more analytical way to think of independence of
finitely many random variables. Clearly, the argument given above will apply to any finite
set of random variables.
Proposition 58.9.3 Equations 58.9.14 and 58.9.13 hold with X_E replaced by any nonnegative Borel measurable function and for any bounded continuous function or for any function in L^1.
Proof: The two equations hold for simple functions in place of X_E and so an application of the monotone convergence theorem applied to an increasing sequence of simple functions converging pointwise to a given nonnegative Borel measurable function yields the conclusion of the proposition in the case of the nonnegative Borel function. For a bounded continuous function or one in L^1, one can apply the result just established to the positive and negative parts of the real and imaginary parts of the function.
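A quick numerical instance of the proposition (illustrative assumptions: X, Y independent and uniform on (0,1), and h a nonnegative Borel function): with X_E replaced by h(x, y) = x² + y², the iterated integral in 58.9.14 evaluates exactly to 1/3 + 1/3 = 2/3, which should match the joint expectation E[h(X, Y)].

```python
import numpy as np

# Check that the joint expectation of a nonnegative Borel function equals
# the iterated integral against the (independent) marginals.
rng = np.random.default_rng(2)
n = 400_000
X = rng.uniform(size=n)
Y = rng.uniform(size=n)

joint_expectation = np.mean(X**2 + Y**2)   # E[h(X, Y)] by Monte Carlo
iterated_integral = 2.0 / 3.0              # exact value of the iterated integral
print(joint_expectation, iterated_integral)
```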
Lemma 58.9.4 Let X_1, ⋯, X_n be random vectors with values in ℝ^{p_1}, ⋯, ℝ^{p_n} respectively and let g : ℝ^{p_1} × ⋯ × ℝ^{p_n} → ℝ^k be Borel measurable. Then g(X_1, ⋯, X_n) is a random vector with values in ℝ^k and if h : ℝ^k → [0, ∞) is Borel measurable, then

    ∫_Ω h(g(X_1, ⋯, X_n)) dP = ∫_{ℝ^{p_1}×⋯×ℝ^{p_n}} h(g(x_1, ⋯, x_n)) dλ_{(X_1,⋯,X_n)}.    (58.9.15)

If X_i is a random vector with values in ℝ^{p_i}, i = 1, 2, ⋯, n, and if g_i : ℝ^{p_i} → ℝ^{k_i}, where g_i is Borel measurable, then the random vectors g_i(X_i) are also independent whenever the X_i are independent.
Proof: First let E be a Borel set in ℝ^k. From the definition,

    λ_{g(X_1,⋯,X_n)}(E) = P(g(X_1, ⋯, X_n) ∈ E) = P((X_1, ⋯, X_n) ∈ g⁻¹(E)) = λ_{(X_1,⋯,X_n)}(g⁻¹(E)).

This proves 58.9.15 in the case when h = X_E. To prove it in the general case, approximate the nonnegative Borel measurable function with simple functions for which the formula is true, and use the monotone convergence theorem.

It remains to prove the last assertion that functions of independent random vectors are also independent random vectors. Let E be a Borel set in ℝ^{k_1} × ⋯ × ℝ^{k_n}. Then, since the X_i are independent,

    λ_{(g_1(X_1),⋯,g_n(X_n))}(E) = λ_{(X_1,⋯,X_n)}((g_1 × ⋯ × g_n)⁻¹(E))
        = ∫_{ℝ^{p_1}} ⋯ ∫_{ℝ^{p_n}} X_E(g_1(x_1), ⋯, g_n(x_n)) dλ_{X_n} ⋯ dλ_{X_1}
        = ∫_{ℝ^{k_1}} ⋯ ∫_{ℝ^{k_n}} X_E(y_1, ⋯, y_n) dλ_{g_n(X_n)} ⋯ dλ_{g_1(X_1)}

and this proves the last assertion.
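The last assertion can be illustrated numerically (distributions and functions below are illustrative choices): if X and Y are independent, then so are the Borel images g_1(X) = X² and g_2(Y) = sin(Y), checked here by factorization on a product Borel set.

```python
import numpy as np

# Sketch: Borel functions of independent random variables remain independent.
rng = np.random.default_rng(3)
n = 300_000
X = rng.normal(size=n)
Y = rng.uniform(-np.pi, np.pi, size=n)   # independent of X

G1 = X**2         # g_1(X)
G2 = np.sin(Y)    # g_2(Y)

# Joint probability of a product set factors into marginal probabilities.
joint = np.mean((G1 <= 1.0) & (G2 >= 0.0))
product = np.mean(G1 <= 1.0) * np.mean(G2 >= 0.0)
print(joint, product)
```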
Proposition 58.9.5 Let ν_1, ⋯, ν_n be Radon probability measures defined on ℝ^p. Then there exists a probability space and independent random vectors X_1, ⋯, X_n defined on this probability space such that λ_{X_i} = ν_i.

Proof: Let (Ω, ℱ, P) ≡ (∏_{i=1}^{n} ℝ^p, ℱ, ν_1 × ⋯ × ν_n) where this is just the product σ algebra and product measure which satisfies the following for measurable rectangles:

    P(E_1 × ⋯ × E_n) = ∏_{i=1}^{n} ν_i(E_i).

Now let X_i(ω) ≡ ω_i, the i-th coordinate of ω ∈ Ω. Then from the definition, if E is a Borel set in ℝ^p,

    λ_{X_i}(E) = P(X_i ∈ E) = P(ℝ^p × ⋯ × E × ⋯ × ℝ^p) = ν_i(E).

Let ℳ consist of all Borel sets E of ∏_{i=1}^{n} ℝ^p for which

    λ_{(X_1,⋯,X_n)}(E) = ∫_{ℝ^p} ⋯ ∫_{ℝ^p} X_E(x_1, ⋯, x_n) dλ_{X_n} ⋯ dλ_{X_1}.

From what was just shown and the definition of P, ℳ contains all sets of the form ∏_{i=1}^{n} E_i where each E_i belongs to the Borel sets of ℝ^p. Therefore, ℳ contains the algebra of all finite disjoint unions of such sets. It is also clear that ℳ is a monotone class and so by the theorem on monotone classes, ℳ equals the Borel sets. You could also note that ℳ is closed with respect to complements and countable disjoint unions and apply Lemma 10.12.3. Therefore, the given random vectors are independent and this proves the proposition.
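The construction in the proof can be sketched numerically (the marginals ν_i below are hypothetical choices): drawing the coordinates of ω independently realizes the product measure, and the coordinate projections X_i then have the prescribed distributions.

```python
import numpy as np

# Product-space construction sketch: independent coordinate draws realize
# P = nu_1 x nu_2 x nu_3, and X_i(omega) = omega_i has distribution nu_i.
rng = np.random.default_rng(4)
n = 200_000
# nu_1 = N(0,1), nu_2 = Exp(1), nu_3 = U(0,1)  -- illustrative marginals
omega = np.column_stack([rng.normal(size=n),
                         rng.exponential(size=n),
                         rng.uniform(size=n)])
X = [omega[:, i] for i in range(3)]   # X_i = i-th coordinate projection

# Marginal check for X_2: lambda_{X_2}((-inf, 1]) should be nu_2((-inf, 1]).
estimate = np.mean(X[1] <= 1.0)
exact = 1.0 - np.exp(-1.0)
print(estimate, exact)
```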
The following lemma was proved earlier in a different way.

Lemma 58.9.6 If {X_i}_{i=1}^{n} are independent random variables having values in ℂ with E(|X_i|) < ∞, then

    E(∏_{i=1}^{n} X_i) = ∏_{i=1}^{n} E(X_i).

Proof: By Lemma 58.9.4 and denoting by P the product, ∏_{i=1}^{n} X_i,

    E(P) = ∫_{ℂ^n} ∏_{i=1}^{n} z_i dλ_{(X_1,⋯,X_n)} = ∫_ℂ ⋯ ∫_ℂ ∏_{i=1}^{n} z_i dλ_{X_n} ⋯ dλ_{X_1} = ∏_{i=1}^{n} ∫_ℂ z dλ_{X_i} = ∏_{i=1}^{n} E(X_i),

the use of Proposition 58.9.3 being justified because E(|X_i|) < ∞.
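The lemma admits a direct Monte Carlo check (the distributions below are illustrative choices, not from the text): for independent integrable random variables, the expectation of the product equals the product of the expectations.

```python
import numpy as np

# Monte Carlo check: E[X1 * X2 * X3] = E[X1] * E[X2] * E[X3] when the X_i
# are independent and integrable.
rng = np.random.default_rng(5)
n = 1_000_000
X1 = rng.normal(loc=2.0, size=n)       # mean 2
X2 = rng.exponential(size=n)           # mean 1
X3 = rng.uniform(0.0, 4.0, size=n)     # mean 2

lhs = np.mean(X1 * X2 * X3)            # expectation of the product
rhs = X1.mean() * X2.mean() * X3.mean()
print(lhs, rhs)    # both close to 2 * 1 * 2 = 4 up to Monte Carlo error
```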