Pitman MTH 135 / STA 104: Probability, Week 4
Read: Pitman sections 2.4-2.5, Poisson & Hypergeometric Distributions

* Poisson Approximation to the Binomial
-----------
Shortcomings of the CLT: if p is close to 0 or 1, then \sqrt{n p q} is
small.  The normal is symmetric, for example, while the binomial for
small p is packed up against 0.

The text looks at the binomial with n large and mean mu = n p = 1,
i.e., p = 1/n; I can do the fishing example, and sneak in the
exponential distribution.

Recall the exponential series:   sum_{k=0:oo} a^k / k!  =  exp(a)
---------------
What's the mode?  Is there ever a double mode?

                       mu^(k-1)      mu^k
  P(k-1) < P(k)  <==>  --------  <  ------
                        (k-1)!        k!

                            mu
                 <==>  1 < ----
                            k

                 <==>  k < mu

SO, the mode is the largest integer <= mu; if mu is an integer, then
(mu-1, mu) is a double mode.
----------------
What happens when mu gets big?  The Poisson looks like the normal,
with mean mu and variance sigma^2 = mu.
=========================================================================
* Random Sampling: WITH and WITHOUT Replacement

WITH replacement: Binomial, with p = G/(G+B) = G/N ("good" and "bad"
in a population of size N = G + B), so

  P[g good, b bad in n = g+b tries]  =  (n:g) G^g B^b / N^n

(here "(n:k)" denotes the binomial coefficient "n choose k")

WITHOUT replacement: two approaches.

One-at-a-time:

         G   G-1       G+1-g      B     B-1        B+1-b
  (n:g) ---  ---  ...  -----  x  ---   -----  ...  -----
         N   N-1       N+1-g     N-g   N-g-1       N+1-n

      n!      G!        B!      (N-n)!
  = ------  --------  --------  ------
    g! b!   (G-g)!    (B-b)!      N!

and the "numbers of subsets" approach:

   (G:g) (B:b)        G!          B!        (N-n)! n!
  ------------  =  ---------  ----------  -----------
      (N:n)        g! (G-g)!   b! (B-b)!       N!
=========================================================================
GEOMETRIC DIST'N:

Count the failures X before the first success, in independent trials
with probability p of success.  Find the probability of exactly k
failures; also find the average number of failures before the 1st
success.

For k failures before the 1st success, the first k+1 trials must be
EXACTLY:

      F F F ... F S

whose probability is

      f(k) = p (1-p)^k,    k = 0, 1, 2, ...
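The pmf above is easy to check by simulation.  A minimal sketch,
assuming a die-rolling setup with success probability p = 1/6 (the
value of p, the seed, and the sample size are illustrative choices,
not from the notes):

```python
import random

p = 1/6          # hypothetical success probability (one face of a die)
q = 1 - p

def failures_before_success(rng):
    """Run independent trials until the first success; count the failures."""
    k = 0
    while rng.random() >= p:   # a failure occurs with probability q
        k += 1
    return k

rng = random.Random(0)
trials = 100_000
counts = [failures_before_success(rng) for _ in range(trials)]

for k in range(4):
    empirical = counts.count(k) / trials
    exact = p * q**k
    print(f"k={k}: empirical {empirical:.4f}  vs  exact p*q^k = {exact:.4f}")
```

The empirical frequencies should track f(k) = p q^k closely at this
sample size.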
Average ("mean") value of this is   sum_{k=0:oo} k p q^k  =  ???

Trick:  = p * q * sum_{k=0:oo} k q^(k-1)

        = p * q * (d/dq) sum_{k=0:oo} q^k

        = p * q * (d/dq) [ 1 / (1-q) ]

        = p * q * [ 1 / (1-q)^2 ]

        = p * q / p^2   =   q/p

Example:  Roll a die until the first ace (one spot) shows; p = 1/6, so

      E[ Non-aces before 1st Ace ]  =  q/p  =  (5/6)/(1/6)  =  5
========================================================================
NEGATIVE BINOMIAL DIST'N:

Count the failures Y before the r'th success, otherwise as above.  For
k failures before the rth success, the first k+r trials must be EXACTLY:

      F S S F F S ... F S
      \ r-1 S's and k F's among the first k+r-1, then a final S /

whose probability is

      f(k) = (k+r-1 : k) p^r q^k

May view Y as a sum of r Geometrics:

      failures before 1st succ
    + add'l failures before 2nd succ
    + ...
    + failures before rth succ.

SO,  E[ Y ] = r * q/p.

Note the dist'n makes sense for ALL real r > 0, not only integers:

           r (r+1) (r+2) ... (r+k-1)                  Gamma(r+k)
    f(k) = -------------------------  p^r (1-p)^k  =  ----------- p^r q^k
                      k!                              Gamma(r) k!

The mean is still  mu = r q / p.
========================================================================
Indicators:

Binomial:  X = I1 + I2 + ... + In
           Each Ij = 1 w/prob p, 0 w/prob q  ==>  mean p; the sum has
           mean n*p.

Hypergeo:  X = I1 + I2 + ... + In
           Each Ij = 1 w/prob p, 0 w/prob q  ==>  mean p; the sum has
           mean n*p.

How are they different?  VARIANCE:

  sig^2 = E | X - mu |^2  =  E ( I1 + ... + In - n*p )^2

        = E [ (I1 - p)^2 + (I2 - p)^2 + ... + (In - p)^2 ]
          + E [ (I1 - p)(I2 - p) + ... ]        (n(n-1) cross terms)

        = n * p * (1-p)  +  n*(n-1) * [ E Ii Ij - p^2 ]

What IS  E[ Ii Ij ] - p^2  ???

  Binomial:  p^2 - p^2 = 0,  so  var = n p q.

  HyperGeo:  (G/N) ( (G-1)/(N-1) ) - (G/N)^2
                =   (G/N) [ (G-1)/(N-1) - G/N ]
                = - (G/N) [ (N-G) / N(N-1) ]
                = - (G/N)(B/N) / (N-1)

  so  var = n (G/N)(B/N) - n(n-1) (G/N)(B/N) / (N-1)
          = n P Q [ 1 - (n-1)/(N-1) ]
          = n P Q [ (N-n) / (N-1) ]

which is pretty much the same as the Binomial's IF N >> n.
=========================================================================
Poi Mean:   sum_{k=0:oo} k (lam^k)/k! exp(-lam)

          = sum_{k=1:oo} k (lam^k)/k! exp(-lam)

          = sum_{k=1:oo} lam * (lam^(k-1))/(k-1)! exp(-lam)

          = lam * sum_{j=0:oo} lam^j / j!
                exp(-lam)

          = lam * exp(lam) * exp( -lam )   =   lam.

Poi Var:  E[ (X-lam)^2 ] is hard; so is E[ X^2 ].  BUT, easy is:

      E[ X*(X-1) ]  =  E[ X^2 ] - E[ X ]  =  lam^2

  so  Var(X) = E[ X^2 ] - (E[ X ])^2
             = (lam^2 + lam) - (lam^2)  =  lam,   same as the mean.
==========================================================================
St. Petersburg:

Start with $1 on the table and a coin.  At each step:

      Toss the coin; if it shows Heads, take the money.
      If it shows Tails, I'll double what's on the table.

Let X be the amount you win.  What is the pmf for X?  What is E[ X ] ???
What does that MEAN in terms of averages?  How much would you pay me to
play this game?
--------
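The questions above can be explored by simulation before answering them
in closed form.  A minimal sketch (the seed and the sample sizes are
hypothetical choices); watch how the running average keeps creeping
upward as the number of games grows, instead of settling down to a
limit -- a hint about E[ X ]:

```python
import random

def play_once(rng):
    """One St. Petersburg game: $1 on the table, doubled on each Tail;
    you take whatever is on the table at the first Head."""
    pot = 1
    while rng.random() < 0.5:   # Tails with probability 1/2
        pot *= 2
    return pot

rng = random.Random(1)
for n in [100, 10_000, 1_000_000]:   # hypothetical sample sizes
    avg = sum(play_once(rng) for _ in range(n)) / n
    print(f"average winnings over {n:>9} games: ${avg:.2f}")
```

Every payoff is a power of 2, and half the games end immediately with
$1 -- yet the sample averages refuse to converge as n grows.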