Stat 11

March 5, 2006

Homework #5 - SOLUTIONS

 

Problems from Chapter 4:

 

            4.22,  4.28,  4.46,  4.58,  4.60,  4.64a.  (Do 4.64b, if you like, for extra credit.)

 

            For 4.60 and 4.64, a sensible approach is to determine  μ,  σ²,  and  σ  separately

            for each of the two variables and then combine them.

 

4.22 (table of eight racial categories)

a. These probabilities are legitimate because each is between 0 and 1 and they add to one.

b. P(A) = 0.000 + 0.003 + 0.060 + 0.062 = 0.125

c. P(B^c) = 1 – (0.060 + 0.691) = 0.249.  B^c is the event that the person chosen is not white.

d. This event is “A^c and B” and its probability is  P(A^c and B) = 0.691.

 

4.28. (probability that among 10, one is a universal donor)

P (at least 1 universal donor) = 1 – P (not at least 1 universal donor)

           = 1 – P (10 non-universal donors)

           = 1 – 0.93^10

           = 1 – 0.4840

           = 0.516
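The complement-rule arithmetic above can be checked with a short Python sketch. It assumes, as the solution does, that each of the 10 people is independently a non-universal donor with probability 0.93; the variable names are illustrative only.

```python
# Complement rule for problem 4.28 (a sketch, not the textbook's method).
p_non_universal = 0.93   # chance one person is NOT a universal donor (from the solution)
n_people = 10            # people chosen, assumed independent

p_none = p_non_universal ** n_people   # all 10 are non-universal donors
p_at_least_one = 1 - p_none            # complement: at least one universal donor

print(f"P(no universal donors)       = {p_none:.4f}")
print(f"P(at least 1 universal donor) = {p_at_least_one:.3f}")
```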

 

4.46 (household size distributions)

            a.  P( Y > 1)   =    0.33 + 0.16 + 0.14 + 0.06 + 0.03 + 0.01 = 0.73

            b.  P( 2 < Y ≤ 4 )  =  0.16 + 0.14 = 0.3

            c.  P( Y ≠ 2)    =    0.67

 

4.58 (registered voters who said they actually voted)

a. Z1 = (0.60 – 0.56) / 0.019 = 2.10

          p1 = 0.9821

    Z2 = (0.52 – 0.56) / 0.019 = -2.10

          p2 = 0.0179

    p1 - p2 = .9642

b. Z = (.72 – 0.56) / 0.019 = 8.421

The probability that p ≥ 0.72 is 1 minus the cumulative probability at Z = 8.421, which is very close to 0.
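For anyone who wants to check the table lookups, here is a minimal Python sketch. It assumes the quantity in 4.58 is treated as normal with mean 0.56 and standard deviation 0.019 (the values used above) and builds the standard normal cumulative probability from `math.erf`; the helper name `normal_cdf` is my own.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mean, sd = 0.56, 0.019   # values used in the solution to 4.58

# Part a: probability of falling between 0.52 and 0.60.
z_low = (0.52 - mean) / sd
z_high = (0.60 - mean) / sd
p_between = normal_cdf(z_high) - normal_cdf(z_low)

# Part b: probability of 0.72 or more.
z_b = (0.72 - mean) / sd
p_above = 1 - normal_cdf(z_b)

print(f"z_low = {z_low:.3f}, z_high = {z_high:.3f}, P(between) = {p_between:.4f}")
print(f"z_b = {z_b:.3f}, P(above) = {p_above:.2e}")
```

Because the code keeps the unrounded Z of about 2.105 instead of the table value 2.10, part a comes out near 0.965 rather than exactly 0.9642. The difference is only rounding.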

 

 


4.60

          nonword errors

            xi     pi    xi pi   (xi − μ)   (xi − μ)²   (xi − μ)² pi
             0    0.1     0.0      -2.1       4.41         0.441
             1    0.2     0.2      -1.1       1.21         0.242
             2    0.3     0.6      -0.1       0.01         0.003
             3    0.3     0.9      +0.9       0.81         0.243
             4    0.1     0.4      +1.9       3.61         0.361
           sum            2.1                              1.290

 

Mean = 2.1

Standard Deviation ≈ 1.136

Variance = 1.290

 

          word errors

            xi     pi    xi pi   (xi − μ)   (xi − μ)²   (xi − μ)² pi
             0    0.4     0.0       -1          1           0.4
             1    0.3     0.3        0          0           0.0
             2    0.2     0.4       +1          1           0.2
             3    0.1     0.3       +2          4           0.4
           sum            1.0                               1.0

 

Mean = 1.0

Standard Deviation = 1.0

Variance = 1.0
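The two tables for 4.60 follow the same recipe: weight each value by its probability to get the mean, then weight each squared deviation by its probability to get the variance. A small Python sketch of that recipe, using the probabilities from the tables (the function name `summarize` is just for illustration):

```python
from math import sqrt

def summarize(values, probs):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, variance, sqrt(variance)

# Probabilities copied from the two tables in 4.60.
nonword = summarize([0, 1, 2, 3, 4], [0.1, 0.2, 0.3, 0.3, 0.1])
word = summarize([0, 1, 2, 3], [0.4, 0.3, 0.2, 0.1])

for label, (mean, variance, sd) in [("nonword", nonword), ("word", word)]:
    print(f"{label:8s} mean = {mean:.1f}, variance = {variance:.3f}, sd = {sd:.3f}")
```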

 

4.64.  (statistics of the total number of word and non-word errors) 

            a. (word errors and non-word errors are independent)

                   mean = 2.1 + 1.0 = 3.1

                    variance = 1.290 + 1.000 = 2.290

                    std dev = sqrt(2.290) = 1.513

            b. (correlation = 0.50…see box on page 302)

                   mean = 3.1 as in (a)

                    variance = 1.290 + 1.000 + 2 [ 0.5 (1.136) (1.000) ]  = 3.426

                    std dev = sqrt(3.426) = 1.851
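Both parts of 4.64 use one rule for the variance of a sum, Var(X + Y) = Var(X) + Var(Y) + 2ρ·SD(X)·SD(Y), with ρ = 0 in the independent case. A short Python sketch under that rule, using the variances found in 4.60 (the function name `sum_stats` is hypothetical):

```python
from math import sqrt

def sum_stats(var_x, var_y, rho=0.0):
    """Variance and standard deviation of X + Y given the correlation rho."""
    variance = var_x + var_y + 2 * rho * sqrt(var_x) * sqrt(var_y)
    return variance, sqrt(variance)

var_nonword, var_word = 1.290, 1.000   # variances from 4.60

var_a, sd_a = sum_stats(var_nonword, var_word)            # part a: independent
var_b, sd_b = sum_stats(var_nonword, var_word, rho=0.5)   # part b: correlation 0.50

print(f"a. variance = {var_a:.3f}, std dev = {sd_a:.3f}")
print(f"b. variance = {var_b:.3f}, std dev = {sd_b:.3f}")
```

The mean of the sum is 3.1 in both parts; correlation changes only the spread.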

Problems related to estimating a proportion (section 5.1):

 

Some of these problems use the formula from class:

            standard error of  p̂  =  sqrt( p̂ (1 – p̂) / n )

1.  A simple random sample of 100 people is selected from the 14,000 adult residents of Fort Smith, Arkansas, and those in the sample are asked whether they favor a proposed highway project.  It turns out that 55 of those in the sample say yes, they favor the project.  (The other 45 said no.  Assume that the sampling was done perfectly and that everyone selected gave an honest answer.)

 

            Let  p  represent the (unknown) fraction of adult residents that favor the project.

 

            a.  In this problem, what are  p̂  and  n ?

                   p̂ = 0.55,     n = 100

            b.  From the information given, what is the best estimate of  p ?

                   Best estimate = p̂ = 0.55

            c.  What would you use for the standard error of that estimate?

                   SE = sqrt( p̂ (1 – p̂) / n ) = sqrt( 0.55 × 0.45 / 100 ) = 0.497 / 10 ≈ 0.0497

2.  Continuing the situation from problem 1…

 

            a.  Suppose the actual value of  p is 0.50.  (That is, exactly 7,000 of the 14,000 residents would say yes if asked.  The difference in this problem is that you know the actual value of  p.)   If thousands of samples of size 100 were taken, and a value of  p̂  were computed for each sample, what would be the mean of all the values of  p̂ ?  What would be the standard deviation?

                   The mean would be  p = 0.50;  the standard deviation would be

                   sqrt( 0.50 × 0.50 / 100 ) = 0.500 / 10 = 0.050

            b.  Same questions, but now suppose that the actual value of p is 0.80.

                   The mean would be  p = 0.80;  the standard deviation would be

                   sqrt( 0.80 × 0.20 / 100 ) = 0.400 / 10 = 0.040

 

            c.  The standard deviations you gave in parts a and b are different from the standard error you gave in problem 1c.  Are you still comfortable with the answer you gave to problem 1c ?  That is, do you think it’s a good way to estimate the standard error in the situation given?

                   It’s reasonable not to be troubled by the answer in 2a, since the “correct” value sqrt( p (1 – p) ) = 0.500 is very close to the value sqrt( p̂ (1 – p̂) ) = 0.497 that you used in problem 1.  (These are the numerators of the formula; dividing each by sqrt(100) = 10 gives 0.0500 and 0.0497.)

                   It’s reasonable not to be troubled by the answer to 2b, even though there’s a noticeable difference between the “correct” value of 0.400 and the value of 0.497 that you used, because, given the result of the survey in problem 1, it’s very unlikely that p is really equal to 0.80.

                   In either case, the use of  p̂  in the formula for the standard error (in place of the true value of p, which you don’t know) seems very reasonable.
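To see the comparison in problems 1 and 2 side by side, here is a minimal Python sketch. It assumes the standard deviation of a sample proportion is sqrt( p (1 – p) / n ), with the sample value 0.55 standing in for p when p is unknown; the function name `proportion_sd` is my own.

```python
from math import sqrt

def proportion_sd(p, n):
    """Standard deviation of a sample proportion, assuming sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)

n = 100
se_from_sample = proportion_sd(0.55, n)   # problem 1c: plug in the sample value
sd_if_p_050 = proportion_sd(0.50, n)      # problem 2a: true p = 0.50
sd_if_p_080 = proportion_sd(0.80, n)      # problem 2b: true p = 0.80

print(f"1c. standard error using 0.55:  {se_from_sample:.4f}")
print(f"2a. standard deviation, p=0.50: {sd_if_p_050:.4f}")
print(f"2b. standard deviation, p=0.80: {sd_if_p_080:.4f}")
```

The first two values differ only in the fourth decimal place, which is why plugging in the sample value is a sensible substitute.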

 

 

 

3.  A certain university has declared that historically, 70% of its football players get degrees.  You’re skeptical, so in August, 2006, you undertake a brief investigation.  You obtain from the university an official list of the players that entered the program during calendar years 1996-2002, carefully select a random sample of 40 of these players, and determine  p̂, the fraction of the sample that got degrees.

 

            a.  Suppose that the university’s claim is true.  If you took lots of samples of size 40 and computed  p̂  for each sample, what would be the mean of these values?

                   0.70

            b. What would be the standard deviation?

                   std dev = 0.072  (use p = 0.70, n = 40)

            c.  What would be the shape of the distribution of  p̂  values?

                   approximately normal

            d.  What fraction of the  p̂  values would be 55% or below?

                   Z = (0.55 – 0.70) / 0.072 = -2.083, so the fraction is 0.0188

            e.  In fact, in your sample, 55%  (that is, 22 out of 40) obtained degrees.  Is the university’s claim plausible?

                   Barely plausible.  If their value is correct, you have obtained a very rare result.

            f.  Can you think of a source of bias in your survey?

                   Many possibilities.  Unless we’re careful, a few false reports by respondents could have a large effect on our results.  The issue I was thinking of was that some of the players who entered the program in, say, 2001 or 2002 and haven’t graduated may still graduate in 2006; we’re not counting them, but we should.
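Parts b through d of problem 3 can be checked numerically. The Python sketch below repeats the normal approximation and, as an added check that is not part of the original solution, sums the exact binomial probabilities for 22 or fewer degrees out of 40. It assumes the university's 70% claim is true and that the 40 players are sampled independently; the helper name `normal_cdf` is my own.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

p_claim = 0.70   # the university's claimed graduation rate
n = 40           # sample size
cutoff = 22      # 22 of 40 is 55%

# Normal approximation, as in parts b through d.
sd = sqrt(p_claim * (1 - p_claim) / n)
z = (cutoff / n - p_claim) / sd
normal_fraction = normal_cdf(z)

# Exact binomial probability of 22 or fewer degrees
# (an added check, not part of the original solution).
exact_fraction = sum(
    comb(n, k) * p_claim**k * (1 - p_claim) ** (n - k)
    for k in range(cutoff + 1)
)

print(f"sd = {sd:.4f}, z = {z:.3f}")
print(f"normal approximation: {normal_fraction:.4f}")
print(f"exact binomial:       {exact_fraction:.4f}")
```

With the unrounded standard deviation, Z is about -2.07 and the normal fraction is about 0.019, slightly different from the 0.0188 obtained with the rounded value 0.072. Either way the result is rare if the claim is true.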

 

(end)

 

Thanks to Katie Altynova for providing these answers in electronic form.  I have added my own comments.  -WRS