STA113: Lab 2 (Friday, September 10, 2004)
In this lab, we will use Matlab to calculate and plot probabilities of various discrete probability distributions we discussed in class.
At the beginning of the lab period, Floyd will go over two very short topics not covered in lecture: (1) determining a pdf from a cdf for a generic discrete random variable, and (2) reading binomial cdf tables such as those on pp. 736-8 of your text. This will take about five minutes. If you have questions on either of these topics after that short presentation, ask one of the TAs in the lab for help.
Binomial Probabilities
- Let's look at the binomial random variable with n=10 and p=.3, written
Bin(10, .3). In Matlab, to find P(y=3), type
>> binopdf(3, 10, .3)
Does it agree with
>> nchoosek(10,3)*0.3^3*(1-0.3)^7
where the command "nchoosek(n,k)" returns the binomial coefficient
"n choose k", n!/(k!(n-k)!)?
To find P(y <= 3), which is the cumulative distribution function
evaluated at 3, type
>> binocdf(3, 10, .3)
Does it agree with the following?
>> sum(binopdf(0:3, 10, .3))
Try the following two commands and explain the results.
>> binocdf(3.5, 10, .3)
>> binopdf(3.5, 10, .3)
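One way to check your explanation, as a minimal sketch: a binomial random variable only takes integer values, so the cdf at 3.5 should match the cdf at 3, and the pdf at 3.5 should be zero. The variable names below are illustrative and not part of the lab.

```matlab
n = 10; p = 0.3;
cdf_at_3p5 = binocdf(3.5, n, p);         % P(y <= 3.5)
cdf_at_3   = binocdf(floor(3.5), n, p);  % P(y <= 3)
pdf_at_3p5 = binopdf(3.5, n, p);         % P(y = 3.5)
same_cdf    = abs(cdf_at_3p5 - cdf_at_3) < 1e-12   % cdf is flat between integers
pdf_is_zero = (pdf_at_3p5 == 0)                    % no mass at non-integers
```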
- Ex 50(b) on page 127:
Find the mean and variance of a binomial random variable with n=10 and
p=0.6, and compute P(mean - sd <= y <= mean + sd), where "sd"
stands for "standard deviation".
[mu, v] = binostat(10, .6); % mean and variance; the names mu and v avoid
                            % hiding Matlab's built-in mean and var functions
sd = sqrt(v);
mu + sd
mu - sd
binocdf(mu + sd, 10, .6) - binocdf(mu - sd, 10, .6)
If "mean - sd" happens to be an integer, will the commands above give
you the right answer? (The answer is no.) What adjustment do you need
to make?
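One possible adjustment, sketched under the assumption that summing the pdf over the integers inside the interval is acceptable: every integer from ceil(mean - sd) through floor(mean + sd) is then counted exactly once, whether or not an endpoint is an integer. The names mu, v, lo, and hi are illustrative.

```matlab
n = 10; p = 0.6;
[mu, v] = binostat(n, p);   % mean n*p and variance n*p*(1-p)
sd = sqrt(v);
lo = ceil(mu - sd);         % smallest integer >= mean - sd
hi = floor(mu + sd);        % largest integer <= mean + sd
prob = sum(binopdf(lo:hi, n, p))   % P(mean - sd <= y <= mean + sd)
% Equivalent cdf form: subtract the cdf just below the lower endpoint.
prob_cdf = binocdf(hi, n, p) - binocdf(lo - 1, n, p)
```

The cdf form differs from the original commands only in evaluating the lower cdf at lo - 1, so that P(y = lo) is kept in the result.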
- Next let's draw binomial probability distributions for n = 10
and p = 0.3, 0.5, 0.7. We will draw them in the same
Figure window. Note the symmetry of the distribution for p=0.5
and the skewness for p=.3 and p=.7.
clf % erase the contents of a Figure window without closing it.
y = binopdf(0:10, 10, 0.3);
subplot(2,2,1); % subdivide the current Figure window into a 2-by-2
% array of plotting areas and choose the 1st area to
% be active.
stem(0:10, y);
legend('Bino n=10, p=0.3')
y = binopdf(0:10, 10, 0.5);
subplot(2,2,2); % choose the 2nd area to be active.
stem(0:10, y);
legend('Bino n=10, p=0.5')
Now draw the binomial probability distribution
with p=0.7 in the 3rd plotting area.
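A compact alternative to repeating the commands for each value of p, offered only as a sketch: loop over the three values and let the loop index pick the plotting area. The names x, pvals, and k are illustrative, and title is used here in place of legend to label each panel.

```matlab
x = 0:10;
pvals = [0.3 0.5 0.7];
clf                                   % clear the current Figure window
for k = 1:numel(pvals)
    y = binopdf(x, 10, pvals(k));     % pdf of Bin(10, pvals(k)) at 0..10
    subplot(2, 2, k);                 % make the k-th plotting area active
    stem(x, y);
    title(sprintf('Bino n=10, p=%.1f', pvals(k)));
end
```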
Draw the cumulative distribution function for Bin(10, .3):
x=0:10;
y = binocdf(x, 10, 0.3);
stairs(x, y);
legend('Cumulative Distribution Function for Bin(10, .3)', 'Location', 'SouthEast'); % lower right corner
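The cdf values plotted above also determine the pdf: for a discrete random variable, P(y = k) is the size of the jump in the cdf at k, that is F(k) - F(k-1). A minimal sketch of that differencing, with illustrative variable names:

```matlab
x = 0:10;
F = binocdf(x, 10, 0.3);          % cdf at 0, 1, ..., 10
f_from_cdf = diff([0 F]);         % jumps: F(0) - 0, F(1) - F(0), ...
f_direct = binopdf(x, 10, 0.3);   % pdf computed directly
max_gap = max(abs(f_from_cdf - f_direct))   % should be near zero
```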
Geometric Distribution
Let's draw Geometric probability distributions for p=0.3 and p=0.7.
y1 = geopdf(0:9, 0.3);
y2 = geopdf(0:9, 0.7);
h1 = stem(1:10, y1, 'o');
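% In older Matlab versions, h = stem(...) returns handles to three
% line graphics objects: h(1) is the marker symbol at the top of each
% stem, h(2) the stem line, and h(3) the base line. Newer versions
% return a single stem object, and h(1) then refers to that object.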
hold on; % add new plot to the existing axes
h2 = stem(1:10, y2, 'x');
set(gca, 'XLim', [0.5 10.5]);
legend([h1(1),h2(1)], {'p=0.3','p=0.7'})
title('Geometric Probability Distributions');
hold off; % release the current Figure window for new plots
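As a check on what geopdf computes: in Matlab, geopdf(k, p) is the probability of k failures before the first success, (1-p)^k * p, which is why the values for k = 0:9 are plotted against trial numbers 1:10 above. A short sketch comparing the command with the formula (names are illustrative):

```matlab
p = 0.3;
k = 0:9;                          % number of failures before the first success
y_cmd = geopdf(k, p);
y_formula = (1 - p).^k * p;
max_gap = max(abs(y_cmd - y_formula))   % should be near zero
prob_within_10 = sum(y_cmd)             % P(first success within 10 trials)
```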
Summary of Commands for Discrete Distributions
Today we learned three kinds of Matlab commands related to discrete
distributions: commands for probability density functions (ending with
pdf), for cumulative distribution functions (ending with cdf), and for
the mean and variance (ending with stat). The corresponding commands
for each distribution are listed below:
- Binomial
binopdf, binocdf, binostat
- Geometric
geopdf, geocdf, geostat
(The following three distributions will be covered in future classes and labs.)
- Poisson
poisspdf, poisscdf, poisstat
- Hypergeometric
hygepdf, hygecdf, hygestat
- Negative Binomial
nbinpdf, nbincdf, nbinstat
See the Matlab help pages for more details about these commands.
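The stat commands return the mean and variance together. As a small sketch for the two distributions used in this lab, the results can be compared with the formulas n*p and n*p*(1-p) for the binomial, and (1-p)/p and (1-p)/p^2 for the geometric counted as failures before the first success (Matlab's convention). Names such as mb and vb are illustrative.

```matlab
n = 10; p = 0.3;
[mb, vb] = binostat(n, p);   % binomial mean and variance
[mg, vg] = geostat(p);       % geometric mean and variance
bino_ok = abs(mb - n*p) < 1e-12 && abs(vb - n*p*(1-p)) < 1e-12
geo_ok  = abs(mg - (1-p)/p) < 1e-12 && abs(vg - (1-p)/p^2) < 1e-12
```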
Check Your Mastery of These Distributions
Work out solutions to the following problems using Matlab, referring back to the earlier sections in this web page as little as possible. They are designed to check your understanding of the two distributions covered in this lab, to check your skills in using Matlab to work with the distributions, and also to foreshadow future topics in STA113. You do not need to turn your answers in. These problems are for you to check your own mastery and understanding. If you have difficulty answering the questions, ask one of the TAs for help.
- A class of students is interested in whether anyone in the class can tell the difference between Coke and Pepsi by taste. They conduct a study in which each student is presented with three identical-looking cups labeled A, B, and C. One of the cups contains one drink and the other two contain the other drink. The drink that occurs only once is selected at random, as is the cup it goes in. The students are to taste the three drinks and identify which cup contains a different drink from the other two. They do not need to name the drinks. There are 30 students in the class. Let X (a random variable) be the number of students who correctly identify the cup containing the different drink.
- What is the set of all possible values that X might take?
- Assuming that in fact no one in the class can really tell the difference between Coke and Pepsi, what would the distribution of X be? (Identify it by its name along with its defining parameters.)
- Use Matlab to make a graph of the distribution (pdf) you just identified.
- If, after conducting the study, X turned out to be an atypically high value for this distribution, what would you conclude? Based on the graph, what values of X would you consider to be atypically high for this distribution?
- Suppose that after conducting the study, 18 students correctly identified the different drink. What would you conclude? How many standard deviations away from the mean is 18 (under the distribution you've been working with)? What is the probability that at least 18 students would correctly identify the different drink if everyone was really just guessing randomly?
- A young, healthy couple who are trying to have a baby have a probability of about 1 in 6 of getting pregnant during any given month in which pregnancy has not already occurred. Let X (a random variable) be the number of months that have passed before the woman gets pregnant. (So X = 0 would mean she got pregnant during the first month.)
- What is the distribution of X? (Identify it by its name along with its defining parameter.)
- Use Matlab to make a graph of the cumulative distribution (cdf) of the distribution you just identified over a period of 48 months. What do the axes of this graph represent? How would you interpret the graph?
- A young, healthy couple who are sexually active but are trying not to have a baby might use a condom, which the Family Planning Council estimates to have approximately a 90% effectiveness rate. This means that the probability of pregnancy in any given month is reduced by 90%, from 1 in 6 to 1 in 60. Use Matlab to estimate the probability that a young couple using a condom regularly would have conception occur within the first two years (24 months) of becoming sexually active.
- Make a cumulative distribution graph of X (again, the number of months before pregnancy occurs) for such a couple who are regularly using a condom, again over a period of 48 months.
- Based on your cdf plot, what do you estimate to be the median time to pregnancy for a sexually active young couple using a condom regularly? (The median time to pregnancy would be the earliest time at which there is at least a 50% chance for pregnancy to have occurred.) You might find the Matlab command "grid on" useful in locating the median.
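Two techniques that recur in these problems, sketched on the distributions from earlier in the lab rather than on the problems themselves: an upper-tail probability P(y >= k) equals 1 - P(y <= k-1), and a median can be located as the first value at which the cdf reaches 0.5. The variable names and the parameter values below are illustrative only.

```matlab
% Upper tail for Bin(10, .3): P(y >= 6) = 1 - P(y <= 5)
tail = 1 - binocdf(5, 10, 0.3)
tail_check = sum(binopdf(6:10, 10, 0.3));   % same probability from the pdf

% Median of the geometric distribution with p = 0.3
x = 0:47;                          % failures before the first success
F = geocdf(x, 0.3);                % cdf values, 1 - (1-p)^(x+1)
median_x = x(find(F >= 0.5, 1))    % earliest x with at least a 50% chance
```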