**Stat 61 - Homework #7 (11/5/07, due 11/12/07) **

**1. Estimating a Proportion.** Each time we operate this machine, it either works (success) or it doesn’t (failure). Each attempt is independent. __We want to estimate the single-trial success probability p.__

Our data are the results of n trials. Of these, k are hits. One could think of the data as the values of n independent random variables X_{1}, X_{2}, …, X_{n}, each having a Bernoulli (zero-one) distribution with parameter p.

If actual data helps, use n = 100 and k = 30. But I don’t remember the sequence of successes and failures, so I can’t tell you the observed values of the X_{i}’s.

a. Express this as a method-of-moments problem. That is, express p as an unknown parameter of some probability distribution, and compare the theoretical mean of the distribution to the observed mean of the data. What is the method-of-moments estimator for p ?

b. Set this up, somehow, as a least-squares problem. (Maybe this way: If you estimate p, you’re saying that all of the X_{i}’s “should” have value p. But they actually have values of 0 and 1. Treat the differences as errors.) What is the least-squares estimator for p ?

c. What is the maximum-likelihood estimator for p ?

d. Do any of your estimators depend on the (forgotten) sequence of the hits and misses?

NOTE: Some of you, observing a coin that produced 3 heads in 5 flips, insisted on estimating 0.50 instead of 0.60 for the coin’s long-term rate of heads. Apparently you were rejecting method-of-moments, least squares, and maximum likelihood all at once! So there must be more to estimation than these three approaches.
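If you want a numeric sanity check on part (c), a grid search over the likelihood will locate the same maximum that your calculus should give. This is just a check, not a derivation; the grid and its step size are arbitrary choices, not part of the problem. (A sketch in Python, using the n = 100, k = 30 data:)

```python
import math

n, k = 100, 30  # trials and hits from the problem

def log_likelihood(p):
    """Log-likelihood of k hits in n independent Bernoulli(p) trials.
    The binomial coefficient is constant in p, so it is omitted."""
    return k * math.log(p) + (n - k) * math.log(1 - p)

# Brute-force search over a fine grid of candidate p values in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_likelihood)
```

Whatever closed-form estimator you derive should agree with `p_hat` to within the grid spacing.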

**2. Least Squares.** Engineering students take 3 measurements of the distance between two benchmarks on the Swarthmore campus. In meters, the measurements are:

75.005 75.011 75.014

The parameter we care about is D, the true distance.

Our model for the measurements is that each measurement y_{i} is given by

y_{i} = D + e_{i}

where the e_{i}’s are independent measurement errors. From experience with other engineering students (and other benchmarks) we know that measurement errors are normally distributed with mean 0 and standard deviation 0.009, so their density function is

f_{e}(e) = (1/(0.009 √(2π))) e^{−e^{2}/(2·0.009^{2})}. (1)

Equivalently, the measurements themselves are normal with mean D and σ = 0.009, giving density

f_{y}(y) = (1/(0.009 √(2π))) e^{−(y−D)^{2}/(2·0.009^{2})}. (2)

(Note that (1) and (2) are really the same model, looked at in slightly different ways. The density for the measurements depends on D.)

Note that if you make an estimate for D, you’re also implicitly estimating the values of all the measurement errors. For example, if your estimate is D=75.0, you’re implying that the measurement errors are, respectively,

+0.005, +0.011, +0.014.

The measurement errors are unknown parameters too, and in this example, knowing one of the parameters means knowing all of them.

a. The *least-squares* estimator for D is the estimate that minimizes the sum of the squares of the implied measurement errors. That is, choose D to minimize

E = E(D) = (y_{1}-D)^{2} + (y_{2}-D)^{2} + (y_{3}-D)^{2}. (3)

What is the least-squares estimator for D ? What is the estimate in this case? (The *estimator* should be a function of y_{1}, y_{2}, y_{3}. The *estimate* should be a number.)

b. What is the maximum-likelihood estimator for D ?
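Whatever formula you derive in part (a), a brute-force minimization of equation (3) gives a numeric check. A sketch in Python; the grid range and step size (1 micrometer) are arbitrary choices:

```python
measurements = [75.005, 75.011, 75.014]  # the three distances, in meters

def sse(D):
    """Equation (3): sum of squared implied measurement errors."""
    return sum((y - D) ** 2 for y in measurements)

# Brute-force minimization over a fine grid from 75.000 m to 75.020 m.
grid = [75.000 + i * 1e-6 for i in range(20001)]
D_hat = min(grid, key=sse)
```

The minimizer `D_hat` should match your part (a) estimate to within the grid spacing.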

**3. Weighted least squares.** Some people like to use “weighted least squares,” which in the context of the previous problem means minimizing

E = w_{1}(y_{1}-D)^{2} + w_{2}(y_{2}-D)^{2} + w_{3}(y_{3}-D)^{2} (4)

instead of (3) above. Here the weights are all positive, and somehow express the importance of the various measurements. (The sum of the w_{i}’s doesn’t matter.)

What is the weighted least squares estimator for D? (Your answer will be a function of the w_{i}’s and the y_{i}’s.)
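You can test a conjectured formula numerically by minimizing equation (4) for some particular choice of weights. A sketch; the weights here are made up purely for illustration:

```python
measurements = [75.005, 75.011, 75.014]
weights = [1.0, 1.0, 2.0]  # made-up weights, just for illustration

def weighted_sse(D):
    """Equation (4): weighted sum of squared implied errors."""
    return sum(w * (y - D) ** 2 for w, y in zip(weights, measurements))

# Brute-force minimization over a fine grid from 75.000 m to 75.020 m.
grid = [75.000 + i * 1e-6 for i in range(20001)]
D_hat = min(grid, key=weighted_sse)
```

Try a few different weight vectors and confirm that your formula reproduces `D_hat` each time.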

**4. Different instruments.** Again, assume the situation of problem 2. But this time, the third measurement was made with a better instrument, so it has a standard deviation of 0.006 instead of 0.009.

a. Write the revised likelihood function.

b. Now, what is the maximum-likelihood estimator for D ?

c. If this can be expressed as a weighted least-squares estimator, what are the weights? How do they depend on the standard deviations of the measurements?
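For part (b), a numeric maximization of the revised likelihood can confirm whatever formula you derive. A sketch; as before, the grid is an arbitrary choice, not part of the problem:

```python
measurements = [75.005, 75.011, 75.014]
sigmas = [0.009, 0.009, 0.006]  # the third instrument is more precise

def log_likelihood(D):
    """Log of the product of the three normal densities; terms that
    don't involve D are dropped, since they don't affect the maximizer."""
    return -sum((y - D) ** 2 / (2 * s ** 2)
                for y, s in zip(measurements, sigmas))

# Brute-force maximization over a fine grid from 75.000 m to 75.020 m.
grid = [75.000 + i * 1e-6 for i in range(20001)]
D_hat = max(grid, key=log_likelihood)
```

Comparing this `D_hat` with the unweighted answer from problem 2 should also hint at part (c).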

**5. Estimating the parameters of a uniform distribution.** Assume that b > a. A random variable Y has a *uniform distribution with limits **a** and **b**.* That is, it has a density function given by

f_{Y}(y) = 1/(b-a) when a ≤ y ≤ b

(zero otherwise).

This distribution has mean m = (a + b)/2 and standard deviation (b - a)/√12 (which means that the variance is (b - a)^{2}/12).

Suppose we make random draws y_{1}, y_{2}, y_{3}, y_{4} from this distribution. Assume that these have been rearranged so that y_{1} ≤ y_{2} ≤ y_{3} ≤ y_{4}. If you like, assume that they are:

4.9, 8.2, 9.1, 10.6.

a. Think of a reasonable estimator for the mean, m. Try to find one that is unbiased and has as small a standard error as possible. In any case, say what you can about this estimator’s properties.

b. Think of a reasonable estimator for the upper limit, b. Same goals; say what you can.
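Simulation doesn’t replace the reasoning parts (a) and (b) ask for, but it is a handy way to compare candidate estimators before analyzing them exactly. A sketch, with made-up “true” limits a = 4, b = 11 (in the problem, of course, a and b are unknown):

```python
import random

random.seed(61)            # reproducible runs
a, b = 4.0, 11.0           # made-up true limits for the experiment
m = (a + b) / 2            # true mean
trials = 20000

sq_err_mean = sq_err_mid = 0.0
for _ in range(trials):
    ys = sorted(random.uniform(a, b) for _ in range(4))
    sq_err_mean += (sum(ys) / 4 - m) ** 2         # candidate 1: sample mean
    sq_err_mid += ((ys[0] + ys[3]) / 2 - m) ** 2  # candidate 2: midrange

mse_mean = sq_err_mean / trials  # mean squared error of each candidate
mse_mid = sq_err_mid / trials
```

Comparing `mse_mean` and `mse_mid` (and trying the same idea with candidate estimators of b) should suggest which candidates are worth analyzing exactly for bias and standard error.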

(end)