November 24, 2015
Conditional probabilities
Bayes' theorem and simple calculations
Introduction to Bayesian inference
You have 100 emails in your inbox: 60 are spam, 40 are not. Of the 60 spam emails, 35 contain the word "free". Of the rest, 3 contain the word "free". If an email contains the word "free", what is the probability that it is spam?
\[ P(spam~|~free) = \frac{\#~spam~\&~free}{\#~free} = \frac{35}{35+3} \approx 0.92 \]
\[ P(A~|~B) = \frac{P(A~and~B)}{P(B)} \]
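The count-based answer and the general formula should agree. The following is a minimal Python sketch that checks this for the inbox example; the variable names are illustrative and not part of the original slides.

```python
from fractions import Fraction

# Counts from the inbox example
n_total = 100
n_spam_and_free = 35      # spam emails containing "free"
n_not_spam_and_free = 3   # non-spam emails containing "free"

# Count-based version: restrict attention to emails containing "free"
n_free = n_spam_and_free + n_not_spam_and_free
p_counts = Fraction(n_spam_and_free, n_free)

# Formula version: P(A | B) = P(A and B) / P(B)
p_spam_and_free = Fraction(n_spam_and_free, n_total)
p_free = Fraction(n_free, n_total)
p_formula = p_spam_and_free / p_free

print(p_counts, p_formula, round(float(p_formula), 2))
```

Both routes give 35/38, or about 0.92, because dividing numerator and denominator by the inbox size does not change the ratio.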
What's the chance of winning? What is the probability of getting an outcome \(\ge\) 4 when rolling a 6-sided die? What is the probability when rolling a 12-sided die?
6-sided: \(\frac{1}{2}\), 12-sided: \(\frac{3}{4}\)
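These two answers can be reproduced by counting winning faces. The sketch below assumes a fair die with faces numbered 1 through the number of sides; the helper name `p_win` is made up for illustration.

```python
from fractions import Fraction

def p_win(sides, threshold=4):
    """Probability that a fair die with `sides` faces shows at least `threshold`."""
    winning_faces = [face for face in range(1, sides + 1) if face >= threshold]
    return Fraction(len(winning_faces), sides)

print(p_win(6))   # 1/2
print(p_win(12))  # 3/4
```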
Pick the "good" die. You're playing a game where you win if the die roll is \(\ge\) 4. If you could get your pick, which die would you prefer to play this game with, 6 or 12-sided?
12-sided (the "good" die)
| Decision | Truth: R good, L bad | Truth: R bad, L good |
|---|---|---|
| Pick R | You get the candy! | You lose the candy :( |
| Pick L | You lose the candy :( | You get the candy! |
You have no idea if I have chosen the die on the left (L) to be the good die (12-sided) or bad die (6-sided). Then, before we collect any data, what are the probabilities associated with the following hypotheses?
| | Choice (L or R) | Result (W or L) |
|---|---|---|
| Roll 1 | | |
| Roll 2 | | |
| Roll 3 | | |
| … | | |
What is your decision? How did you make this decision?
What is the probability, based on the outcome of the first roll, that R is the good die (and L is the bad die)?
What is the probability, based on the outcome of the first two rolls, that R is the good die (and L is the bad die)?
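One way to answer these questions is to apply Bayes' theorem once per roll, using each posterior as the prior for the next roll. The sketch below assumes an equal prior of 1/2 on each hypothesis (consistent with having no idea which die is good) and uses a hypothetical sequence of rolls; the function name `update` and the example rolls are illustrative, not data from the slides.

```python
from fractions import Fraction

P_WIN_GOOD = Fraction(3, 4)  # 12-sided die, outcome >= 4
P_WIN_BAD = Fraction(1, 2)   # 6-sided die, outcome >= 4

def update(p_r_good, choice, won):
    """Return P(R is good | this roll), given the current P(R is good)."""
    # Probability of winning with the chosen die under each hypothesis
    if choice == "R":
        p_win_if_r_good, p_win_if_l_good = P_WIN_GOOD, P_WIN_BAD
    else:
        p_win_if_r_good, p_win_if_l_good = P_WIN_BAD, P_WIN_GOOD
    # Likelihood of the observed result under each hypothesis
    like_r_good = p_win_if_r_good if won else 1 - p_win_if_r_good
    like_l_good = p_win_if_l_good if won else 1 - p_win_if_l_good
    numerator = like_r_good * p_r_good
    return numerator / (numerator + like_l_good * (1 - p_r_good))

# Hypothetical rolls (choice, won); replace with the rolls observed in class
rolls = [("R", True), ("R", True)]
p = Fraction(1, 2)  # assumed equal prior
for choice, won in rolls:
    p = update(p, choice, won)
    print(choice, won, p)
# R True 3/5
# R True 9/13
```

With these made-up rolls, two wins on R raise the probability that R is the good die from 1/2 to 3/5 and then to 9/13. A loss on R would instead lower it.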
http://www.cancer.org/cancer/cancerbasics/cancer-prevalence
http://ww5.komen.org/BreastCancer/AccuracyofMammograms.html
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1360940
Note: These percentages are approximate, and very difficult to estimate.
Prior to any testing and any information exchange between the patient and the doctor, what probability should a doctor assign to a female patient having breast cancer?
When a patient goes through breast cancer screening there are two competing claims: the patient has cancer and the patient doesn't have cancer. If a mammogram yields a positive result, what is the probability that the patient has cancer, i.e. what is the posterior probability of having cancer if the mammogram yields a positive result?
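In the notation of Bayes' theorem, the posterior can be written as follows, where \(C\) (patient has cancer) and \(+\) (positive mammogram) are labels introduced here for illustration:

\[ P(C~|~+) = \frac{P(+~|~C)~P(C)}{P(+~|~C)~P(C) + P(+~|~no~C)~P(no~C)} \]

Here \(P(C)\) is the prior, \(P(+~|~C)\) is the chance that a mammogram detects an existing cancer, and \(P(+~|~no~C)\) is the false positive rate.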
Suppose this patient who got a positive result in the first test wants to get tested again. What should the new prior probability that this patient has cancer be? Is this probability smaller than, larger than, or equal to the prior probability in the first test? Why?
If this patient tests positive in the second test as well, will the posterior probability of her having cancer be higher than, lower than, or equal to the earlier posterior probability we calculated?
What is the posterior probability of having cancer if this second mammogram also yielded a positive result?
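The repeated-testing calculation can be sketched as a function applied twice, with the first posterior serving as the second prior. The input values below are placeholders chosen only to make the code run, not the estimates from the cited sources, and the sketch assumes the two test results are independent given the patient's true status. The names `posterior`, `sensitivity`, and `false_positive_rate` are illustrative.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(cancer | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prior                  # P(+ and cancer)
    false_pos = false_positive_rate * (1 - prior)   # P(+ and no cancer)
    return true_pos / (true_pos + false_pos)

# Placeholder inputs for illustration only; substitute the estimates
# from the sources listed above.
prior = 0.02
sensitivity = 0.80
false_positive_rate = 0.10

after_first = posterior(prior, sensitivity, false_positive_rate)
# The first posterior becomes the prior for the second test
after_second = posterior(after_first, sensitivity, false_positive_rate)
print(round(after_first, 3), round(after_second, 3))
```

Whenever the test is more likely to be positive for a patient with cancer than for one without, each additional positive result moves the probability upward.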
We have done a bunch of hand calculations so far. How can we use computation in this paradigm?
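One possible answer is simulation: play the game many times on a computer and see how often each hypothesis is true among the games that match what was observed. The sketch below does this for the dice game, estimating the probability that R is the good die given a win on a single roll of R. It assumes an equal prior on the two hypotheses and fair dice; the function name `simulate` and its parameters are made up for illustration.

```python
import random

def simulate(n_games=100_000, seed=0):
    """Estimate P(R is good | win when rolling R) by simulation."""
    rng = random.Random(seed)
    wins = 0
    wins_and_r_good = 0
    for _ in range(n_games):
        r_good = rng.random() < 0.5       # assumed equal prior on the hypotheses
        sides = 12 if r_good else 6       # number of sides on the right-hand die
        if rng.randint(1, sides) >= 4:    # roll R once; win if outcome >= 4
            wins += 1
            wins_and_r_good += r_good     # True counts as 1
    return wins_and_r_good / wins

estimate = simulate()
print(estimate)  # should land near the exact value 3/5
```

The exact value from Bayes' theorem under the same assumptions is 0.375 / 0.625 = 3/5, so the simulated proportion offers a check on the hand calculation.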