Stat 11
March 5, 2006
Homework #5 - SOLUTIONS
Problems from Chapter 4:
4.22, 4.28, 4.46, 4.58, 4.60, 4.64a. (Do 4.64b, if you like, for extra credit.)
For 4.60 and 4.64, a sensible approach is to determine μ, σ², and σ separately
for each of the two variables and then combine them.
4.22 (table of eight racial categories)
a. These probabilities are legitimate because each is between 0 and 1 and they add to one.
b. P(A) = 0.000 + 0.003 + 0.060 + 0.062 = 0.125
c. P(B^c) = 1 – (0.060 + 0.691) = 0.249. B^c is the event that the person chosen is not white.
d. This event is “A^c and B” and its probability is P(A^c and B) = 0.691.
4.28. (probability that among 10, one is a universal donor)
P (at least 1 universal donor) = 1 – P (not at least 1 universal donor)
= 1 – P (10 non-universal donors)
= 1 – 0.93^10
= 1 – 0.4840
= 0.516
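The complement-rule arithmetic above is easy to check numerically. The following is a minimal sketch; the variable names are my own, and the value 0.93 for the chance that one person is not a universal donor is taken from the solution above.

```python
# Complement rule for 4.28: P(at least one) = 1 - P(none).
p_non_donor = 0.93   # P(one person is NOT a universal donor), from the solution
n_people = 10        # people are assumed to be chosen independently

p_none = p_non_donor ** n_people      # all ten are non-universal donors
p_at_least_one = 1 - p_none

print(f"P(10 non-universal donors)        = {p_none:.4f}")
print(f"P(at least one universal donor)   = {p_at_least_one:.4f}")
```

Changing n_people shows how quickly the chance of finding at least one universal donor grows with the size of the group.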
4.46 (household size distributions)
a. P( Y > 1) = 0.33 + 0.16 + 0.14 + 0.06 + 0.03 + 0.01 = 0.73
b. P( 2 < Y ≤ 4 ) = 0.16 + 0.14 = 0.3
c. P( Y ≠ 2) = 1 – P( Y = 2) = 1 – 0.33 = 0.67
4.58 (registered voters who said they actually voted)
a. Z1 = (0.60 – 0.56) / 0.019 = 2.10
p1 = 0.9821
Z2 = (0.52 – 0.56) / 0.019 = -2.10
p2 = 0.0179
P(0.52 ≤ p̂ ≤ 0.60) = p1 – p2 = 0.9642
b. Z = (0.72 – 0.56) / 0.019 = 8.421
The probability that p̂ ≥ 0.72 is 1 minus the cumulative probability for the Z-score 8.421, which is very close to 0.
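The table lookups in 4.58 can be reproduced without a table. This is a sketch only: the helper normal_cdf is my own name, built on the standard error function, and the mean 0.56 and standard deviation 0.019 are the values used in the solution. Because the code does not round Z to two decimals, its answer for part (a) differs slightly from the table-based 0.9642.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean, sd = 0.56, 0.019   # mean and std dev of the sample proportion, from the solution

# Part (a): probability that the sample proportion falls between 0.52 and 0.60.
z_hi = (0.60 - mean) / sd
z_lo = (0.52 - mean) / sd
prob_between = normal_cdf(z_hi) - normal_cdf(z_lo)

# Part (b): probability that the sample proportion is 0.72 or more.
z_b = (0.72 - mean) / sd
prob_b = 1.0 - normal_cdf(z_b)

print(f"(a) z = {z_lo:.3f} to {z_hi:.3f}, probability = {prob_between:.4f}")
print(f"(b) z = {z_b:.3f}, probability = {prob_b:.2e}")
```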
4.60
nonword errors
| xi | pi | xi pi | (xi – μ) | (xi – μ)² | (xi – μ)² pi |
|---|---|---|---|---|---|
| 0 | 0.1 | 0 | -2.1 | 4.41 | 0.441 |
| 1 | 0.2 | 0.2 | -1.1 | 1.21 | 0.242 |
| 2 | 0.3 | 0.6 | -0.1 | 0.01 | 0.003 |
| 3 | 0.3 | 0.9 | +0.9 | 0.81 | 0.243 |
| 4 | 0.1 | 0.4 | +1.9 | 3.61 | 0.361 |
| sum | | 2.1 | | | 1.290 |

Mean = 2.1
Standard Deviation ≈ 1.136
Variance = 1.290
word errors
| xi | pi | xi pi | (xi – μ) | (xi – μ)² | (xi – μ)² pi |
|---|---|---|---|---|---|
| 0 | 0.4 | 0.0 | -1 | 1 | 0.4 |
| 1 | 0.3 | 0.3 | 0 | 0 | 0.0 |
| 2 | 0.2 | 0.4 | +1 | 1 | 0.2 |
| 3 | 0.1 | 0.3 | +2 | 4 | 0.4 |
| sum | | 1.0 | | | 1.0 |

Mean = 1.0
Standard Deviation = 1.0
Variance = 1.0
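The two tables for 4.60 follow the same recipe: multiply each value by its probability and add to get the mean, then weight each squared deviation by its probability and add to get the variance. A short sketch of that recipe follows; the function name summarize is my own, and the values and probabilities are copied from the tables.

```python
import math

def summarize(values, probs):
    """Return (mean, variance, std dev) of a discrete random variable."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, variance, math.sqrt(variance)

# Distributions copied from the two tables in 4.60.
nonword = summarize([0, 1, 2, 3, 4], [0.1, 0.2, 0.3, 0.3, 0.1])
word = summarize([0, 1, 2, 3], [0.4, 0.3, 0.2, 0.1])

for label, (mean, variance, sd) in (("nonword", nonword), ("word", word)):
    print(f"{label:8s} mean = {mean:.3f}  variance = {variance:.3f}  std dev = {sd:.3f}")
```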
4.64. (statistics of the total number of word and non-word errors)
a. (word errors and non-word errors are independent)
mean = 2.1 + 1.0 = 3.1
variance = 1.290 + 1.000 = 2.290
std dev = sqrt(2.290) = 1.513
b. (correlation = 0.50…see box on page 302)
mean = 3.1 as in (a)
variance = 1.290 + 1.000 + 2 [ 0.5 (1.136) (1.000) ] = 3.426
std dev = sqrt(3.426) = 1.851
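Parts (a) and (b) of 4.64 differ only in the correlation term, so one small function covers both. This is a sketch under the rule used above, Var(X + Y) = Var(X) + Var(Y) + 2ρσxσy; the function name variance_of_sum is hypothetical, and the variances come from 4.60.

```python
import math

def variance_of_sum(var_x, var_y, rho=0.0):
    """Variance of X + Y when X and Y have correlation rho."""
    return var_x + var_y + 2 * rho * math.sqrt(var_x) * math.sqrt(var_y)

var_nonword, var_word = 1.290, 1.000   # variances found in 4.60

var_a = variance_of_sum(var_nonword, var_word)            # independent: rho = 0
var_b = variance_of_sum(var_nonword, var_word, rho=0.5)   # correlation 0.50

print(f"(a) variance = {var_a:.3f}, std dev = {math.sqrt(var_a):.3f}")
print(f"(b) variance = {var_b:.3f}, std dev = {math.sqrt(var_b):.3f}")
```

The mean of the total is 3.1 in both parts; only the spread depends on the correlation.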
Problems related to estimating a proportion (section 5.1):
Some of these problems use the formula from class:
SE(p̂) = sqrt( p̂ (1 – p̂) / n )
1. A simple random sample of 100 people is selected from the 14,000 adult residents of Fort Smith, Arkansas, and those in the sample are asked whether they favor a proposed highway project. It turns out that 55 of those in the sample say yes, they favor the project. (The other 45 said no. Assume that the sampling was done perfectly and that everyone selected gave an honest answer.)
Let p represent the (unknown) fraction of adult residents that favor the project.
a. In this problem, what are p̂ and n ?
p̂ = 0.55, n = 100
b. From the information given, what is the best estimate of p ?
Best estimate = p̂ = 0.55
c. What would you use for the standard error of that estimate?
SE = sqrt( 0.55 × 0.45 / 100 ) ≈ 0.0497
2. Continuing the situation from problem 1…
a. Suppose the actual value of p is 0.50. (That is, exactly 7000 of the 14000 residents would say yes if asked. The difference in this problem is that you know the actual value of p.) If thousands of samples of size 100 were done, and a value of p̂ were computed for each sample, what would be the mean of all the values of p̂ ? What would be the standard deviation ?
The mean would be p = 0.50; the standard deviation would be
sqrt( 0.50 × 0.50 / 100 ) = 0.050
b. Same questions, but now suppose that the actual value of p is 0.80.
The mean would be p = 0.80; the standard deviation would be
sqrt( 0.80 × 0.20 / 100 ) = 0.040
c. The standard deviations you gave in parts a and b are different from the standard error you gave in problem 1c. Are you still comfortable with the answer you gave to problem 1c ? That is, do you think it’s a good way to estimate the standard error in the situation given?
It’s reasonable not to be troubled by the answer in 2a, since the “correct” value of 0.0500 is very close to the value of 0.0497 that you used in problem 1.
It’s reasonable not to be troubled by the answer to 2b, even though there’s a noticeable difference between the “correct” answer of 0.0400 and the value of 0.0497 that you used, because, given the result of the survey in problem 1, it’s very unlikely that p is really equal to 0.80.
In either case, the use of p̂ in the formula for the standard error (in place of the true value of p, which you don’t know) seems very reasonable.
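To see how little the standard deviation of p̂ changes across plausible values of p, the quantity sqrt( p (1 – p) / n ) can be tabulated directly. This is a sketch that assumes that formula; the function name sd_of_p_hat is my own.

```python
import math

def sd_of_p_hat(p, n):
    """Standard deviation of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

n = 100  # sample size in problems 1 and 2

# 0.55 is the sample value from problem 1; 0.50 and 0.80 are the
# hypothetical true values from problems 2a and 2b.
results = {p: sd_of_p_hat(p, n) for p in (0.55, 0.50, 0.80)}

for p, sd in results.items():
    print(f"p = {p:.2f}  ->  std dev of p-hat = {sd:.4f}")
```

The values for 0.50 and 0.55 nearly coincide, which is why plugging the sample proportion into the formula does little harm unless the true p is far from what the sample suggests.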
3. A certain university has declared that historically, 70% of its football players get degrees. You’re skeptical, so in August, 2006, you undertake a brief investigation. You obtain from the university an official list of the players that entered the program during calendar years 1996-2002, carefully select a random sample of 40 of these players, and determine p̂, the fraction of the sample that got degrees.
a. Suppose that the university’s claim is true. If you took lots of samples of size 40 and computed p̂ for each sample, what would be the mean of these values?
0.70
b. What would be the standard deviation?
std dev = sqrt( 0.70 × 0.30 / 40 ) ≈ 0.072 (use p = 0.70, n = 40)
c. What would be the shape of the distribution of p̂ values?
approximately normal
d. What fraction of the p̂ values would be 55% or below?
Z = (0.55 – 0.70) / 0.072 = -2.083, so the fraction is 0.0188
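The calculation in part (d) can be checked in two ways. The sketch below first repeats the normal approximation without rounding the standard deviation, so its result differs slightly from the table value 0.0188. It then adds an exact binomial count of samples with 22 or fewer graduates out of 40; that cross-check is my own addition and is not part of the solution. The helper name normal_cdf is likewise an assumption.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_claimed, n = 0.70, 40   # university's claim and the sample size
cutoff = 0.55             # 22 out of 40

# Normal approximation, as in the solution.
sd = math.sqrt(p_claimed * (1 - p_claimed) / n)
z = (cutoff - p_claimed) / sd
normal_fraction = normal_cdf(z)

# Exact binomial cross-check (not in the solution): P(X <= 22) for X ~ Binomial(40, 0.70).
exact_fraction = sum(
    math.comb(n, k) * p_claimed**k * (1 - p_claimed) ** (n - k)
    for k in range(22 + 1)
)

print(f"std dev = {sd:.4f}, z = {z:.3f}")
print(f"normal approximation: {normal_fraction:.4f}")
print(f"exact binomial:       {exact_fraction:.4f}")
```

Either way the fraction is only a few percent, which supports the "barely plausible" verdict in part (e).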
e. In fact, in your sample, 55% (that is, 22 out of 40) obtained degrees. Is the university’s claim plausible?
Barely plausible. If their value is correct, you have obtained a very rare result.
f. Can you think of a source of bias in your survey?
Many possibilities. Unless we’re careful, a few false reports by respondents could have a large effect on our results. The issue I was thinking of was that some of the players who entered the program in, say, 2001 or 2002 and haven’t graduated may still graduate in 2006; we’re not counting them, but we should.
(end)
Thanks to Katie Altynova for providing these answers in electronic form. I have added my own comments. -WRS