Stat 61 - Homework #7   (11/5/07,   due 11/12/07)

1.  Estimating a Proportion.  Each time we operate this machine, it either works (success) or it doesn’t (failure).  Each attempt is independent.  We want to estimate the single-trial success probability  p.

Our data are the results of  n  trials.  Of these,  k  are hits.  One could think of the data as the values of  n  independent random variables  X1, X2, …, Xn,  each having a Bernoulli (zero-one) distribution with parameter  p.

If actual data helps, use  n = 100  and  k = 30.  But I don’t remember the sequence of successes and failures, so I can’t tell you the observed values of the Xi’s.

a.  Express this as a method-of-moments problem.  That is, express  p  as an unknown parameter of some probability distribution, and compare the theoretical mean of the distribution to the observed mean of the data.  What is the method-of-moments estimator for  p ?

b.  Set this up, somehow, as a least-squares problem.  (Maybe this way:  If you estimate  p,  you’re saying that all of the Xi’s “should” have value  p.  But they actually have values of 0 and 1.  Treat the differences as errors.)  What is the least-squares estimator for  p ?

c.  What is the maximum-likelihood estimator for  p ?

d.  Do any of your estimators depend on the (forgotten) sequence of the hits and misses?
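
If you want a numerical check on your algebra for part (c), the sketch below evaluates the Bernoulli log-likelihood on a grid of candidate values of  p  and reports the grid point where it is largest. The helper names (bernoulli_log_likelihood, grid_argmax) and the grid resolution are my own choices for illustration; this is a sanity check, not a derivation.

```python
import math

def bernoulli_log_likelihood(p, n, k):
    """Log-likelihood of k successes in n independent Bernoulli(p) trials.

    Only the counts n and k appear here; the order of the successes
    and failures never enters the formula.
    """
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def grid_argmax(func, steps):
    """Return the interior grid point i/steps (0 < i < steps) maximizing func."""
    best_x, best_val = None, -math.inf
    for i in range(1, steps):          # skip p = 0 and p = 1 to avoid log(0)
        x = i / steps
        val = func(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

n, k = 100, 30                          # the data suggested in the problem
p_hat = grid_argmax(lambda p: bernoulli_log_likelihood(p, n, k), 1000)
print(p_hat)
```

Compare the printed value with the estimators you found in parts (a) through (c).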

NOTE:  Some of you, observing a coin that produced 3 heads in 5 flips, insisted on estimating 0.50 instead of 0.60 for the coin’s long-term rate of heads.  Apparently you were rejecting method-of-moments, least squares, and maximum likelihood all at once!  So there must be more to estimation than these three approaches.

2.  Least Squares.  Engineering students take  3  measurements of the distance between two benchmarks on the Swarthmore campus.  In meters, the measurements are:

75.005             75.011             75.014

The parameter we care about is D, the true distance.

Our model for the measurements is that each measurement  yi   is given by

yi = D + ei

where  ei’s  are independent measurement errors.  From experience with other engineering students (and other benchmarks) we know that measurement errors are normally distributed with mean 0 and standard deviation 0.009, so their density function is

$f_e(e) = \frac{1}{0.009\sqrt{2\pi}} \exp\left(-\frac{e^2}{2(0.009)^2}\right)$.                    (1)

Equivalently, the measurements themselves are normal with mean D and σ = 0.009, giving density

$f_y(y) = \frac{1}{0.009\sqrt{2\pi}} \exp\left(-\frac{(y-D)^2}{2(0.009)^2}\right)$.                    (2)

(Note that (1) and (2) are really the same model, looked at in slightly different ways.  The density for the measurements depends on D.)

Note that if you make an estimate for D, you’re also implicitly estimating the values of all the measurement errors.  For example, if your estimate is D=75.0, you’re implying that the measurement errors are, respectively,

+0.005,   +0.011,   +0.014.

The measurement errors are unknown parameters too, and in this example, knowing one of the parameters means knowing all of them.
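
The arithmetic in the example above can be sketched in a few lines. The function name implied_errors is hypothetical; it simply applies  ei = yi - D  to each measurement for a candidate value of D.

```python
measurements = [75.005, 75.011, 75.014]   # the three readings, in meters

def implied_errors(d_estimate, ys):
    """Errors e_i = y_i - D implied by a candidate value of D."""
    return [y - d_estimate for y in ys]

errors = implied_errors(75.0, measurements)
# Round to the precision of the data before printing.
print([round(e, 3) for e in errors])
```

Changing the candidate D shifts all three implied errors by the same amount, which is why fixing one of them fixes the others.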

a.  The least-squares estimator for  D  is the estimate that minimizes the sum of the squares of the implied measurement errors.  That is, choose D to minimize

$E = E(D) = (y_1 - D)^2 + (y_2 - D)^2 + (y_3 - D)^2$.             (3)

What is the least-squares estimator for  D ?  What is the estimate in this case?  (The estimator should be a function of y1, y2, y3.  The estimate should be a number.)

b.  What is the maximum-likelihood estimator for D ?
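
Once you have answers for (a) and (b), you may want to check them numerically. The sketch below scans a grid of candidate values of D, picks the one minimizing the sum of squared implied errors, and separately picks the one maximizing the normal log-likelihood with standard deviation 0.009. The grid limits and spacing are arbitrary choices of mine, and a grid scan is only a check, not a proof.

```python
import math

ys = [75.005, 75.011, 75.014]   # measurements, in meters
SIGMA = 0.009                   # known standard deviation of each error

def sum_sq(d):
    """Sum of squared implied errors for candidate distance d."""
    return sum((y - d) ** 2 for y in ys)

def log_likelihood(d):
    """Log of the joint normal density of the measurements, given d."""
    const = -math.log(SIGMA * math.sqrt(2.0 * math.pi))
    return sum(const - (y - d) ** 2 / (2.0 * SIGMA ** 2) for y in ys)

# Candidate values 75.0000, 75.0001, ..., 75.0200.
grid = [75.0 + i * 0.0001 for i in range(201)]

d_ls = min(grid, key=sum_sq)            # least-squares choice on the grid
d_ml = max(grid, key=log_likelihood)    # maximum-likelihood choice on the grid
print(d_ls, d_ml)
```

Look at how D enters log_likelihood and ask yourself why the two printed values agree or disagree.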

3.  Weighted least squares.  Some people like to use “weighted least squares,” which in the context of the previous problem means minimizing

$E = w_1 (y_1 - D)^2 + w_2 (y_2 - D)^2 + w_3 (y_3 - D)^2$                         (4)

instead of (3) above.  Here the weights are all positive, and somehow express the importance of the various measurements.  (The sum of the wi’s doesn’t matter.)

What is the weighted least squares estimator for D?  (Your answer will be a function of the wi’s and the yi’s.)
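
To check a formula for the weighted estimator, one option is to minimize the weighted sum of squares on a grid. The weights below (1, 1, 2) are made up for illustration and are not part of the problem; the second run doubles every weight to test the remark that the sum of the wi's doesn't matter.

```python
ys = [75.005, 75.011, 75.014]   # measurements from problem 2, in meters

def weighted_sum_sq(d, ws):
    """Weighted sum of squared implied errors for candidate distance d."""
    return sum(w * (y - d) ** 2 for w, y in zip(ws, ys))

def grid_argmin(func, lo, hi, steps):
    """Return the point of an evenly spaced grid on [lo, hi] minimizing func."""
    best = min(range(steps + 1), key=lambda i: func(lo + (hi - lo) * i / steps))
    return lo + (hi - lo) * best / steps

weights = [1.0, 1.0, 2.0]               # hypothetical weights, for illustration
doubled = [2.0 * w for w in weights]    # same relative weights, twice the sum

d_w = grid_argmin(lambda d: weighted_sum_sq(d, weights), 75.0, 75.02, 2000)
d_w2 = grid_argmin(lambda d: weighted_sum_sq(d, doubled), 75.0, 75.02, 2000)
print(d_w, d_w2)
```

Plug the same weights into your own formula and compare the result with the printed values.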

4.  Different instruments.  Again, assume the situation of problem 2.  But this time, the third measurement was made with a better instrument, so it has a standard deviation of    0.006   instead of   0.009.

a.  Write the revised likelihood function.

b.  Now, what is the maximum-likelihood estimator for D ?

c.  If this can be expressed as a weighted least-squares estimator, what are the weights?  How do they depend on the standard deviations of the measurements?
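
A numerical check for part (b): the sketch below writes the log-likelihood with a separate standard deviation for each measurement (0.009, 0.009, 0.006) and maximizes it over a grid of candidate values of D. The grid and the function names are my own assumptions; the code does not tell you the weights in part (c), but you can use its output to test your answer.

```python
import math

ys = [75.005, 75.011, 75.014]       # measurements, in meters
sigmas = [0.009, 0.009, 0.006]      # third reading uses the better instrument

def log_likelihood(d):
    """Log of the joint density of independent normal measurements given d."""
    total = 0.0
    for y, s in zip(ys, sigmas):
        total += -math.log(s * math.sqrt(2.0 * math.pi))
        total += -((y - d) ** 2) / (2.0 * s ** 2)
    return total

# Candidate values from 75.00000 to 75.02000 in steps of 0.00001.
grid = [75.0 + i * 0.00001 for i in range(2001)]
d_ml = max(grid, key=log_likelihood)
print(round(d_ml, 5))
```

If your weighted least-squares answer from part (c) is right, it should land within one grid step of the printed value.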

5.  Estimating the parameters of a uniform distribution.  Assume that b > a.  A random variable  Y  has a uniform distribution with limits a and b.  That is, it has a density function given by    fY(y) = 1/(b-a)    when   a ≤ y ≤ b   (zero otherwise).

This distribution has mean   μ = (a + b)/2  and standard deviation  (b-a)/√12  (which means that the variance is  (b-a)^2/12 ).

Suppose we make random draws y1, y2, y3, y4 from this distribution.  Assume that these have been rearranged so that  y1 ≤ y2 ≤ y3 ≤ y4.  If you like, assume that they are:

4.9,  8.2,  9.1,  10.6.

a.  Think of a reasonable estimator for the mean,  μ = (a + b)/2.  Try to find one that is unbiased and has as small a standard error as possible.  In any case, say what you can about this estimator’s properties.

b.  Think of a reasonable estimator for the upper limit, b.  Same goals; say what you can.
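
If you can't work out an estimator's bias and standard error by hand, simulation can at least suggest what they are. The sketch below draws many samples of size 4 from a uniform distribution with known limits and records the empirical bias and standard deviation of a few candidate estimators. Everything specific here is an assumption of mine: the limits a = 0 and b = 10, the number of trials, and the four candidates (sample mean, midrange, sample maximum, and the maximum pushed outward by range/(n-1)). Swap in your own candidates.

```python
import math
import random

def simulate(estimators, a, b, n, trials, seed=0):
    """Apply each estimator to `trials` sorted samples of size n from U(a, b)."""
    rng = random.Random(seed)
    results = {name: [] for name in estimators}
    for _ in range(trials):
        sample = sorted(rng.uniform(a, b) for _ in range(n))
        for name, est in estimators.items():
            results[name].append(est(sample))
    return results

def summarize(values, target):
    """Return (empirical bias, empirical standard deviation) of the estimates."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean - target, math.sqrt(var)

A, B, N, TRIALS = 0.0, 10.0, 4, 20000   # hypothetical true limits and settings

mean_candidates = {
    "sample mean": lambda s: sum(s) / len(s),
    "midrange": lambda s: (s[0] + s[-1]) / 2.0,
}
upper_candidates = {
    "sample max": lambda s: s[-1],
    "max + range/(n-1)": lambda s: s[-1] + (s[-1] - s[0]) / (len(s) - 1),
}

mean_runs = simulate(mean_candidates, A, B, N, TRIALS)
upper_runs = simulate(upper_candidates, A, B, N, TRIALS)
mean_summary = {k: summarize(v, (A + B) / 2.0) for k, v in mean_runs.items()}
upper_summary = {k: summarize(v, B) for k, v in upper_runs.items()}

for name, (bias, sd) in {**mean_summary, **upper_summary}.items():
    print(f"{name:20s} bias = {bias:+.3f}   sd = {sd:.3f}")
```

A simulation like this only describes one choice of a and b, so treat the output as a hint about what to prove rather than as the answer.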

(end)