MTH 135 / STA 104: Probability                                        Week 5
Read: Pitman, sections 3.1-3.3

        Discrete Random Variables * Introduction to Joint Distributions
                             of Random Variables

Roll a fair die until an ace (1) appears; how many non-aces do you see first?
This is an example of a *RANDOM VARIABLE*, a number that depends on chance.

a) What *is* a random variable?

   One answer:   A function from the sample space to the real numbers |R
   Another:      A number that depends on chance
   Secret:       Usually upper-case letters from the end of the alphabet are
                 used... so if you see X, Y, or Z, it's probably a RV

   Let's call the number of non-aces X.

b) What questions can we ask & answer about random variables?

   One:          P[ X < 3 ]  = 1 - (5/6)^3 = 1 - 125/216 = 91/216 = .4213

   Another:      P[ X = 2 ]  = P[ X >= 2 ] - P[ X >= 3 ]
                             = (5/6)^2 - (5/6)^3
                             = 25/36 - 125/216 = 25/216 = .1157
                 OR
                             = P[~A ~A A] = (5/6)(5/6)(1/6) = 25/216 = .1157

   Yet Another:  What would X be, on average, in lots of repeated trials?

   Variation:    Instead of P[Ace]=1/6, count the # of failures before the 1st
                 success if successes have probability p, 0 < p < 1.
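For the Variation, the same counting gives P[ X = k ] = (1-p)^k * p.  A minimal
Python sketch (the function and variable names are just illustrative) reproduces
the two numbers computed above for p = 1/6:

    from fractions import Fraction

    def pmf_failures(k, p=Fraction(1, 6)):
        """P[X = k] for X = number of failures before the first success."""
        return (1 - p) ** k * p

    print(pmf_failures(2))                             # 25/216, as above
    print(sum(pmf_failures(k) for k in range(3)))      # P[X < 3] = 91/216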
Example: Draw three numbers at random, without replacement, from {1, 2, ..., 20}.
What is the chance that the largest number drawn is at least 17?

P[ max >= 17 ] = 1 - P[ all three numbers are <= 16 ]
               = 1 - (16/20)*(15/19)*(14/18) = 29/57 = .5088
X = max number selected; what are the possible values of X
and their probabilities?
P[X=20] = 3/20 = .1500
P[X=19] = 3 * (18/20) * (17/19) * (1/18) = 51/380 = .1342
P[X=18] = 3 * (17/20) * (16/19) * (1/18) = 34/285 = .1193
P[X=17] = 3 * (16/20) * (15/19) * (1/18) = 2/19 = .1053
P[X>=17]= .5088
P[X=x] = 3 * (x-1)*(x-2)/(20*19*18) = (x-1)(x-2)/2280,
x = 3,4,...,20
Another way: P[X=x] = (x-1:2) / (20:3) (also correct)
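To double-check the table, a minimal Python sketch (names are illustrative only)
tabulates P[X=x] = (x-1)(x-2)/2280 and recovers the numbers above:

    from fractions import Fraction

    pmf = {x: Fraction((x - 1) * (x - 2), 2280) for x in range(3, 21)}

    print(sum(pmf.values()))                       # 1, so it is a genuine pmf
    print(pmf[20], pmf[19], pmf[18], pmf[17])      # 3/20, 51/380, 34/285, 2/19
    print(sum(pmf[x] for x in range(17, 21)))      # P[X >= 17] = 29/57 = .5088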
DEF:
A (real-valued) RANDOM VARIABLE is a (real-valued) function on the sample
space Omega.
Example: if Omega is the usual 36-point space for two rolls of a fair die,
say, { (r,g) : 1 <= r,g <= 6 } all equally-likely, then
X(r,g) = r
Y(r,g) = |r-g|
Z(r,g) = r+g
are all random variables. What is the probability that Y=1? What is that
EVENT?
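One way to answer both: list the sample points where Y takes the value 1.  A
minimal Python sketch over the 36-point space (names are illustrative):

    from fractions import Fraction

    omega = [(r, g) for r in range(1, 7) for g in range(1, 7)]   # 36 equally likely points
    event = [(r, g) for (r, g) in omega if abs(r - g) == 1]      # the EVENT {Y = 1}

    print(event)                             # (1,2), (2,1), (2,3), ... : ten sample points
    print(Fraction(len(event), len(omega)))  # P[Y = 1] = 10/36 = 5/18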
DEF: The *RANGE* of a random variable is just the set of its possible
values.
The *DISTRIBUTION* of a random variable is any specification of
P[ X in A ]
for every set A... if X has only finitely-many (or countably-many)
values, the DISTRIBUTION can be specified by giving the probability
of each outcome in the range,
f(x) = P[ X = x ]
and then P[ X in A ] = sum { f(x) : x in A } is specified for every
A.
For other random variables, like "uniform" and "normal" among others,
we'll have to do something else--- we start that just after Fall
Break. It's always good enough to specify
F(x) = P [ X <= x ]
for every x; then we can work out the probability that X is in any
interval, any union of intervals, etc; more later.
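For a discrete X the two descriptions fit together easily: F is just a running
sum of f.  A minimal Python sketch (fair-die pmf, chosen purely as an example):

    from fractions import Fraction

    f = {x: Fraction(1, 6) for x in range(1, 7)}     # pmf of one fair-die roll

    def F(x):
        """cumulative distribution function F(x) = P[X <= x]"""
        return sum(p for v, p in f.items() if v <= x)

    print(F(5) - F(2))    # P[ 2 < X <= 5 ] = 5/6 - 2/6 = 1/2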
If X is any random variable and g is any function, then Y = g(X) is another
random variable:
X g
Omega -----> |R ---------> |R
Actually, X could be a function from Omega to any set at all (say, "E")
and g could be a function from E to the real numbers, and we'd still be
okay.
If X is discrete with pmf f(x) = P[X=x],
What is the DISTRIBUTION of Y = g(X) ???
P[ Y = y ] = P[ g(X) = y ] = SUM { P[ X=x ]: g(x)=y }
= P[ X in g^{-1}(y) ]
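As a concrete illustration, a short Python sketch that pushes a pmf through g
(here g(x) = |x-3| for one fair-die roll, chosen purely as an example):

    from fractions import Fraction
    from collections import defaultdict

    f = {x: Fraction(1, 6) for x in range(1, 7)}     # pmf of X: one fair-die roll

    def pmf_of_g(f, g):
        """pmf of Y = g(X):  P[Y=y] = SUM of P[X=x] over all x with g(x) = y"""
        fy = defaultdict(Fraction)
        for x, px in f.items():
            fy[g(x)] += px
        return dict(fy)

    print(pmf_of_g(f, lambda x: abs(x - 3)))
    # P[Y=0] = 1/6,  P[Y=1] = 1/3,  P[Y=2] = 1/3,  P[Y=3] = 1/6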
-------------
Random Vectors and Joint Distributions
Draw two socks at random, without replacement, from a drawer full of
twelve colored socks:
6 black 4 white 2 purple
Let B be the number of Black socks, W the number of White socks drawn.
The *DISTRIBUTIONS* of B and W are easy to write down; each has only 3
values in its range, with the probability tables below (why?).  To make it
easier to compare & add numbers, I'll put everything over the same denominator
instead of our usual convention of "lowest terms":

                 0       1       2
       B       15/66   36/66   15/66     (6:b)(6:2-b)/(12:2)      (**)
       W       28/66   32/66    6/66     (4:w)(8:2-w)/(12:2)
This table doesn't let us know everything--- for example, what is the
probability that we draw a matching pair? What's the probability that
we have one each of black and white socks? We don't have enough to
tell (e.g., we can't tell about the probability of a purple pair).
The *JOINT* distribution of B and W tells us the probability of every
possible PAIR (b,w) of numbers... we can present it in a formula
P(b,w) = (6:b)(4:w)(2:2-b-w)/(12:2) or in a table:
W
0 1 2
+------------------------------
0 | 1/66 8/66 6/66 || 15/66
| ||
B 1 | 12/66 24/66 0 || 36/66
| ||
2 | 15/66 0 0 || 15/66
| ||
===============================
28/66 32/66 6/66 66/66
Note that the MARGINAL SUMS are the same numbers we had before in (**);
they are called the *MARGINAL DISTRIBUTIONs* of B and W.
Now we can see the probability of a matching pair:
          Black   White   Purple
          15/66 + 6/66  + 1/66   =  22/66 = 1/3,
or the probability of a black-and-white pair, 24/66 = 4/11.
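The whole table, its margins, and the matching-pair probability can be checked
mechanically from the formula P(b,w) above.  A minimal Python sketch (the helper
name joint() is just for illustration):

    from fractions import Fraction
    from math import comb

    def joint(b, w):
        """P[B=b, W=w]: 2 socks drawn from 6 black, 4 white, 2 purple."""
        purple = 2 - b - w                     # number of purple socks drawn
        if purple < 0:
            return Fraction(0)
        return Fraction(comb(6, b) * comb(4, w) * comb(2, purple), comb(12, 2))

    table = {(b, w): joint(b, w) for b in range(3) for w in range(3)}

    print([sum(table[(b, w)] for w in range(3)) for b in range(3)])  # marginal of B: 15/66, 36/66, 15/66
    print(table[(2, 0)] + table[(0, 2)] + table[(0, 0)])             # matching pair: 22/66 = 1/3
    print(table[(1, 1)])                                             # one black & one white: 24/66 = 4/11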
-----------------------------------------------------------------------------
* EXPECTATIONS *
We can use the JOINT distribution P[ X=x, Y=y ] to find expectations
of functions of any two discrete random variables X and Y :
E[ g(X,Y) ] = SUM{ g(x,y) P[ X=x, Y=y ] }
For example, above the expectation of the PRODUCT
G(B,W) = B * W
of the numbers of Black and White socks is
E[ B * W ] = 0*0* 1/66 + 0*1* 8/66 + 0*2*6/66
+ 1*0*12/66 + 1*1*24/66 + 1*2*0
             + 2*0*15/66 + 2*1*0     + 2*2*0
= 24/66 = 4/11.
Why was that obvious already????
Note this cannot be calculated from the *marginal* distributions of
B and W--- and (in particular) it is NOT THE SAME as
E[ B ] * E[ W ] = { (36+30)/66 = 66/66 } * { (32+12)/66 = 44/66 }
= { 1 } * { 2/3 } = 2/3
DEFINITION: The *COVARIANCE* of two RVs is:
Cov(X, Y) = E[ (X-mu_X) * (Y-mu_Y) ]
= E[ X*Y ] - mu_X * mu_Y
so, here, Cov(B, W) = 4/11 - 2/3 = (12-22)/33 = -10/33 = -0.30303
Let Z = a*X + b*Y + c; what are the MEAN and VARIANCE of Z ?
E[ Z ] = a*mu_X + b*mu_Y + c
VAR[ Z ] = E{ [a*(X-mu_X) + b*(Y-mu_Y) ]^2 }
= a^2 E[ (X-mu_X)^2 ]
+ 2*a*b E[ (X-mu_X) (Y-mu_Y) ]
+ b^2 E[ (Y-mu_Y)^2 ]
= a^2 sigma^2_X + b^2 sigma^2_Y + 2 a b Cov(X,Y)
For example:
VAR[ X + Y ] = sig_X^2 + sig_Y^2 + 2 Cov(X,Y)
VAR[ X - Y ] = sig_X^2 + sig_Y^2 - 2 Cov(X,Y)
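A quick numeric check of both Cov(B,W) and the Var[a*X + b*Y + c] formula on the
sock example (a minimal, self-contained Python sketch; the helper E() is made up
for this note):

    from fractions import Fraction
    from math import comb

    # joint pmf of (B, W); cells with b+w > 2 have probability 0 and are omitted
    table = {(b, w): Fraction(comb(6, b) * comb(4, w) * comb(2, 2 - b - w), comb(12, 2))
             for b in range(3) for w in range(3) if b + w <= 2}

    def E(g):
        """expectation of g(B, W) under the joint pmf"""
        return sum(g(b, w) * p for (b, w), p in table.items())

    mu_B, mu_W = E(lambda b, w: b), E(lambda b, w: w)
    var_B = E(lambda b, w: (b - mu_B) ** 2)
    var_W = E(lambda b, w: (w - mu_W) ** 2)
    cov   = E(lambda b, w: (b - mu_B) * (w - mu_W))

    print(cov)                                          # -10/33
    print(E(lambda b, w: (b + w - mu_B - mu_W) ** 2))   # Var[B+W] directly: 25/99
    print(var_B + var_W + 2 * cov)                      # same answer via the formula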
======================================================================
* Conditional Distributions *
The marginal probability of w white sox in the draw is:
28/66 = 14/33 for w=0;
P[ W = w ] = 32/66 = 16/33 for w=1;
6/66 = 3/33 for w=2.
But what if we KNOW that we drew ZERO BLACK socks?
*Then* the *conditional distribution* of W would be:
1/15 for w=0;
P[ W = w | B = 0 ] = 8/15 for w=1;
6/15 for w=2.
(note it's much more likely now for W=2). More generally, for any
two discrete random variables X and Y, the *CONDITIONAL DISTRIBUTION* is
P[ X=x, Y=y ] joint pmf
P[ X = x | Y = y ] = ------------------- = ---------------
P[ Y=y ] marginal pmf
These can be used just like any other distribution to calculate, for
example, the *conditional* mean and variance (see below):
E[ X | Y=y ] = SUM { x * P[ X=x | Y=y ] }
For example,
E[ W | B=0 ] = 0*(1/15) + 1*(8/15) + 2*(6/15) = 20/15 = 4/3
E[ W^2 | B=0 ] = 0*(1/15) + 1*(8/15) + 4*(6/15) = 32/15
96 - 80
Var[ W | B=0 ] = 32/15 - (4/3)^2 = --------- = 16/45 = 0.3555556
45
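The same conditional numbers can be read off mechanically as joint/marginal.  A
minimal, self-contained Python sketch (names are just illustrative):

    from fractions import Fraction
    from math import comb

    # joint pmf of (B, W) from the sock example, and the marginal P[B=0]
    joint = {(b, w): Fraction(comb(6, b) * comb(4, w) * comb(2, 2 - b - w), comb(12, 2))
             for b in range(3) for w in range(3) if b + w <= 2}
    P_B0 = sum(p for (b, w), p in joint.items() if b == 0)        # 15/66

    cond = {w: joint[(0, w)] / P_B0 for w in range(3)}            # P[W=w | B=0]
    print(cond)                                                   # 1/15, 8/15, 6/15

    EW  = sum(w * p for w, p in cond.items())                     # E[W | B=0]   = 4/3
    EW2 = sum(w * w * p for w, p in cond.items())                 # E[W^2 | B=0] = 32/15
    print(EW, EW2 - EW ** 2)                                      # 4/3 and Var = 16/45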
-----------------------------------------------------------------------------
* INDEPENDENCE *
Two random variables X and Y are *INDEPENDENT* if their joint pmf factors:
P[ X=x , Y=y ] = P[ X=x ] * P[ Y=y ]
as the product of the marginal pmfs
( IF it factors at all as ANY product f(x) * g(y),
THEN it factors as the product of marginals. Why?)
For independent X,Y, the covariance vanishes:
Cov[ X,Y ] = E[ (X-mu_X) (Y-mu_Y) ]
= SUM { (x-mu_X) * (y-mu_Y) * P[ X=x, Y=y ] }
= SUM { (x-mu_X) * (y-mu_Y) * P[ X=x ] * P[ Y=y ] }
= SUM { (x-mu_X) * P[ X=x ] }  *  SUM { (y-mu_Y) * P[ Y=y ] }
= (mu_X - mu_X) * (mu_Y - mu_Y) = 0 * 0 = 0
and so the variance is simply
Var[ a*X + b*Y ] = a^2 Var[X] + b^2 Var[Y]
For a=1 and b=1 or b=-1,
Var[ X + Y ] = Var[X] + Var[Y] ***AND*** Var[ X - Y ] = Var[X] + Var[Y]
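A small self-contained Python check with two independent fair dice (chosen purely
as an example): build the joint pmf as the product of the marginals, then watch
the covariance come out 0 and Var[X-Y] come out Var[X] + Var[Y]:

    from fractions import Fraction

    fX = {x: Fraction(1, 6) for x in range(1, 7)}
    fY = {y: Fraction(1, 6) for y in range(1, 7)}
    joint = {(x, y): fX[x] * fY[y] for x in fX for y in fY}      # independence

    EX = sum(x * p for x, p in fX.items())
    EY = sum(y * p for y, p in fY.items())
    print(sum((x - EX) * (y - EY) * p for (x, y), p in joint.items()))   # Cov = 0

    varX = sum((x - EX) ** 2 * p for x, p in fX.items())
    varY = sum((y - EY) ** 2 * p for y, p in fY.items())
    var_diff = sum(((x - y) - (EX - EY)) ** 2 * p for (x, y), p in joint.items())
    print(var_diff, varX + varY)        # both 35/6: Var[X-Y] = Var[X] + Var[Y]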
-----------------------------------------------------------------------------
* Expectations *
If we draw some RV X repeatedly & independently, what will be its AVERAGE VALUE?
For example, if we roll a fair die 600 times, what will the average be? If
we denote the outcome on the i'th roll by X_i, this looks like:
X_1 + X_2 + X_3 + ... + X_600
Avg = ---------------------------------
600
and it's a little hard to tell. BUT--- if instead we think of how many 1's
we will find, and how many 2's, and how many 3's and 4's and so forth, we
see the sum should be exactly
X_1 + X_2 + X_3 + ... + X_600 = 1 * (# of 1's in 600 rolls)
                              + 2 * (# of 2's in 600 rolls)
                              + 3 * (# of 3's in 600 rolls)
                              + 4 * (# of 4's in 600 rolls)
                              + 5 * (# of 5's in 600 rolls)
                              + 6 * (# of 6's in 600 rolls)
which should be about (why?) ~~ 1 * 100 + 2 * 100 + 3 * 100
+ 4 * 100 + 5 * 100 + 6 * 100 = 2100
so the average should be about
2100
Avg ~ ----------- = 3.5
600
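To see this happen, a tiny simulation sketch (the seed and the 600 rolls are
arbitrary choices, just for illustration):

    import random

    random.seed(1)
    rolls = [random.randint(1, 6) for _ in range(600)]

    # same sum, grouped by face: 1*(# of 1's) + 2*(# of 2's) + ... + 6*(# of 6's)
    grouped = sum(face * rolls.count(face) for face in range(1, 7))
    print(sum(rolls) == grouped)        # True
    print(sum(rolls) / 600)             # typically close to 3.5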
More generally, if we have any function g() and want to know the average
value of g(X) for a random variable X that takes each value x with
probability f(x), then in a large number N of tries the average will be
about
Sum [ g(x) * N * f(x) ]
Avg[ g(X) ] ~~ ----------------------------------- = Sum g(x) * f(x)
N
(note the N cancels top-and-bottom, so we can take the limit N->oo easily)
This is a *weighted average of g(x)*, weighted by the PROBABILITY that X=x;
the fair-die example had
g(x) = x        and        f(x) = 1/6   for  x = 1,2,3,4,5,6
----------------
DEFINITION:
The MEAN of X is E[X] = Sum { x * f(x) } (usually denoted "mu")
The EXPECTATION of g(X) is E[g(X)] = Sum { g(x) * f(x) }
Nobody will get upset if you mix up the words MEAN and EXPECTATION.
Note that mu has the same UNITS as X does--- if X is measured in feet,
meters, seconds, or fortnights then so is mu.
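For the fair die the weighted-average recipe takes only a couple of lines of
Python (a sketch, with illustrative names):

    from fractions import Fraction

    f = {x: Fraction(1, 6) for x in range(1, 7)}          # fair-die pmf

    mu = sum(x * p for x, p in f.items())
    print(mu)                                             # E[X] = 7/2 = 3.5
    print(sum((x - mu) ** 2 * p for x, p in f.items()))   # E[(X-mu)^2] = 35/12, another weighted average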
-----------------
On average, any RV X will be equal to its mean "mu"... but how far from mu
will X be?
* Can't measure this by the average of ( X - mu )
(that average is always zero); the points where X>mu are balanced out
by the points where X < mu.