Okay, see the stuff below, but really this week should concentrate on
the idea of expressing two normal random variables (X1, X2) in the form

    X1 = mu1 + s1 * Z1
    X2 = mu2 + s2 * (rho*Z1 + a*Z2),    where a^2 = 1 - rho^2,

SO

    X2 | X1  ~  N( mu2 + s2*rho*(X1-mu1)/s1 ,  (s2*a)^2 ).

Probably should take mu1 = mu2 = 0 first.  (A simulation check of this
conditional distribution is the first sketch at the end of these notes.)

============================================================================

Multivariate Normal Variables:

Last week we saw that the variance of Xi and the covariance of Xi and Xj
are the diagonal and off-diagonal entries in the matrix

    E[ (X-mu) (X-mu)' ]        (mu = E[X]; ' denotes transpose)

This is especially interesting for normally distributed random variables.
If Z is a zero-mean, unit-variance normal random variable

    Z ~ N(0,1)

(we call such a thing a "standard normal" random variable) and if a, b are
real numbers, then X = a*Z + b is also normally distributed, with mean
mu = b and variance sigma^2 = a^2.  If we take Z to be a p-dimensional
VECTOR of independent zero-mean, unit-variance normal random variables,
take B to be a p-dimensional VECTOR, and A a pxp MATRIX, then the same
thing happens: the random variables

    X1 = B1 + A11*Z1 + A12*Z2 + ... + A1p*Zp
    X2 = B2 + A21*Z1 + A22*Z2 + ... + A2p*Zp
    ...
    Xp = Bp + Ap1*Z1 + Ap2*Z2 + ... + App*Zp

or, in vector notation, the components of the vector

    X = B + A Z,

are all normally distributed, with mean vector E[X] = B and covariance
matrix

    E[ (X-B) (X-B)' ] = E[ A Z Z' A' ] = A A'.

MORALS:

1. If you'd like to generate normal random variables with means mu_i,
   variances sigma_i^2, and covariances Cij, set Cii = sigma_i^2 and find
   any matrix A with AA' = C (a kind of square root of C); generate p
   independent standard normals Z; and set X = mu + A Z.  (See the second
   sketch at the end.)

2. For example, with p = 2 and covariance r = Cov(X1, X2), we can take

            [ a  0 ]                       [ sigma1^2     r      ]
        A = [      ]    and solve   AA' =  [                     ]
            [ b  c ]                       [    r      sigma2^2  ]

                        [ a^2      ab     ]
        to find  AA' =  [                 ]
                        [ ab    b^2+c^2   ]

   and hence a = sigma1, b = r/sigma1, c = sqrt(sigma2^2 - r^2/sigma1^2); so

        X1 = mu1 + Z1 * sigma1
        X2 = mu2 + Z1 * (r/sigma1) + Z2 * sqrt(sigma2^2 - r^2/sigma1^2)

3. Want to PREDICT something?  Let's find:

        E[ X2 | X1 ] = mu2 + (r/sigma1) * Z1,  where Z1 = (X1-mu1)/sigma1,
                     = mu2 + (r/sigma1^2) * (X1 - mu1)

   "Linear Regression"; note sometimes we write the COVARIANCE r in terms
   of the CORRELATION COEFFICIENT rho = r/(sigma1*sigma2), whence the
   formula is

        E[ X2 | X1 ] = mu2 + (rho * sigma2/sigma1) * (X1 - mu1)

   (see p. 349).  (A simulation check is the third sketch at the end.)

4. Want an even EASIER way?  For most random variables

        INDEPENDENT  ======>  UNCORRELATED

   but NOT the other way around; for NORMAL ONLY, INDEP <==> UNCORR, AND
   conditional expectations are always LINEAR, SO

        E[ X2 | X1 ] = a + b * X1

   for SOME numbers a, b; to find them, just make sure that the prediction
   error Y = (X2 - a - b*X1) is uncorrelated with X1 (and hence, by
   normality again, independent of it):

        0 = E[ (X2 - a - b*X1) * (X1-mu1) ] = r - a*0 - b*sigma1^2
            ==>  b = r/sigma1^2

   (and requiring E[Y] = 0 gives a = mu2 - b*mu1).  ALSO, the conditional
   VARIANCE is just

        Var[ X2 | X1 ] = E[ Y^2 ] = sigma2^2 - r^2/sigma1^2.

   The "explained variation" fraction of X2 is r^2/(sigma1^2*sigma2^2) =
   rho^2, a number between 0 and 1 that tells how much of X2's varying can
   be attributed to its relationship with X1.

5. Want the joint density function for X1...Xp?  Maybe not... but if so,
   change variables:

        f(Z1...Zp) = (2 pi)^(-p/2) * exp[ - Sum (Zi)^2 / 2 ]
                   = (2 pi)^(-p/2) * exp[ - Z'Z / 2 ]

        f(X1...Xp) = (2 pi)^(-p/2) * det(C)^(-1/2)
                                   * exp[ - (X-mu)' C^(-1) (X-mu) / 2 ]

   (the constant in front comes from the Jacobian: |det A| = det(C)^(1/2);
   the last sketch at the end evaluates this density numerically).
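
============================================================================

SKETCHES (Python with numpy -- my choice of language, and every parameter
value below is made up for illustration; these are sanity checks under
those assumptions, not part of the derivations above).

First sketch: check the conditional distribution X2 | X1 from the top of
these notes by simulating many pairs and keeping only those with X1 in a
thin slice around a point x0:

    import numpy as np

    rng = np.random.default_rng(2)
    mu1, mu2, s1, s2, rho = 0.0, 0.0, 1.0, 1.0, 0.6   # start with mu=0, s=1
    a = np.sqrt(1 - rho**2)                           # a^2 = 1 - rho^2

    n = 2_000_000
    Z1, Z2 = rng.standard_normal(n), rng.standard_normal(n)
    X1 = mu1 + s1 * Z1
    X2 = mu2 + s2 * (rho * Z1 + a * Z2)

    # condition on X1 ~ x0 by keeping a thin slice of the sample
    x0 = 0.8
    keep = np.abs(X1 - x0) < 0.01
    print(X2[keep].mean(), mu2 + s2 * rho * (x0 - mu1) / s1)  # cond. mean
    print(X2[keep].std(),  s2 * a)                            # cond. sd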
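Second sketch: Morals 1 and 2.  Build the lower-triangular A by hand as in
Moral 2, verify AA' = C, and generate X = mu + A Z; for general p, numpy's
np.linalg.cholesky(C) produces such an A directly.

    import numpy as np

    rng = np.random.default_rng(0)

    # illustrative (made-up) parameters; need |r| <= sigma1*sigma2
    mu = np.array([1.0, -2.0])
    sigma1, sigma2, r = 2.0, 3.0, 4.0

    # Moral 2: solve AA' = C by hand for a lower-triangular A
    a = sigma1
    b = r / sigma1
    c = np.sqrt(sigma2**2 - r**2 / sigma1**2)
    A = np.array([[a, 0.0],
                  [b, c  ]])
    C = np.array([[sigma1**2, r        ],
                  [r,         sigma2**2]])
    assert np.allclose(A @ A.T, C)        # AA' = C, as claimed

    # Moral 1: X = mu + A Z, Z a vector of independent standard normals
    n = 200_000
    Z = rng.standard_normal((2, n))
    X = mu[:, None] + A @ Z

    print(X.mean(axis=1))                 # should be close to mu
    print(np.cov(X))                      # should be close to C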
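Third sketch: Morals 3 and 4.  A least-squares fit of X2 on X1 should
recover slope b = r/sigma1^2 and intercept a = mu2 - b*mu1; the residual
variance should be near sigma2^2 - r^2/sigma1^2, and the explained
fraction near rho^2:

    import numpy as np

    rng = np.random.default_rng(1)
    mu1, mu2, sigma1, sigma2, r = 1.0, -2.0, 2.0, 3.0, 4.0   # made up
    rho = r / (sigma1 * sigma2)

    # generate (X1, X2) exactly as in Moral 2
    n = 500_000
    Z1, Z2 = rng.standard_normal(n), rng.standard_normal(n)
    X1 = mu1 + sigma1 * Z1
    X2 = mu2 + (r / sigma1) * Z1 + np.sqrt(sigma2**2 - r**2 / sigma1**2) * Z2

    # least-squares slope and intercept of X2 on X1
    b_hat = np.cov(X1, X2)[0, 1] / np.var(X1)
    a_hat = X2.mean() - b_hat * X1.mean()
    print(b_hat, r / sigma1**2)                       # slope, ~1.0 here
    print(a_hat, mu2 - (r / sigma1**2) * mu1)         # intercept, ~-3.0 here

    # conditional variance = variance of the prediction error Y
    Y = X2 - a_hat - b_hat * X1
    print(np.var(Y), sigma2**2 - r**2 / sigma1**2)    # ~5.0 here

    # explained-variation fraction should be near rho^2
    print(1 - np.var(Y) / np.var(X2), rho**2)         # ~4/9 here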
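Last sketch: Moral 5, with the normalizing constant filled in, checked
against scipy.stats.multivariate_normal (assuming scipy is available):

    import numpy as np
    from scipy.stats import multivariate_normal

    # made-up mean vector and covariance matrix
    mu = np.array([1.0, -2.0])
    C = np.array([[4.0, 4.0],
                  [4.0, 9.0]])

    def mvn_pdf(x, mu, C):
        # (2 pi)^(-p/2) det(C)^(-1/2) exp[ -(x-mu)' C^(-1) (x-mu) / 2 ]
        p = len(mu)
        d = x - mu
        quad = d @ np.linalg.solve(C, d)              # (x-mu)' C^(-1) (x-mu)
        return (2*np.pi)**(-p/2) * np.linalg.det(C)**(-0.5) * np.exp(-quad/2)

    x = np.array([0.5, 0.0])
    print(mvn_pdf(x, mu, C))
    print(multivariate_normal(mean=mu, cov=C).pdf(x))  # should agree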