Williams College Department of Mathematics

Math 342, Spring 1999

Visual Investigations:

Strategies for Exploring Data with Statistical Graphics

Class meetings

Bronfman 103, MWF 12:00–12:50 pm

Instructor

Steve C. Wang, Visiting Assistant Professor

Bronfman 208, x4517

*http://www.williams.edu/Mathematics/swang*

*swang@williams.edu*

Prerequisites

Mathematics 143, 243, or 244, or permission of the instructor.

Course description

Take a complex dataset with many variables – perhaps dealing with science, finance, psychology, or even baseball. What can you learn from this dataset? How can you reveal the patterns and discover the stories within it? Often, the best way to start is to make a picture – a statistical graphic.

Statistical graphics are essential to many aspects of analyzing data: exploring relationships, visualizing trends, and formulating and investigating hypotheses. In this course, we will discuss graphical methods in statistics and apply them to real datasets. We will also address related questions: Which types of graphics convey information most effectively in different situations? When can pictures mislead us, as in the proverbial "lying with statistics"? How can we use graphics in statistical "detective work" to reveal errors in data collection and analysis?

The course will emphasize student analyses and presentations, and is suitable for mathematics majors, social and natural science majors, or anyone interested in the subject.

Course texts

[EGD] William S. Cleveland: *The Elements of Graphing Data*

[VDQI] Edward R. Tufte: *The Visual Display of Quantitative Information*

The first text has been ordered at Water Street Books. To purchase the second text, you can order on the web at *http://www.amazon.com* for a discounted price.

Additional references

Freedman, Pisani, Purves: *Statistics* and David S. Moore: *The Basic Practice of Statistics* – good references for introductory statistical concepts and methods

Neter, Wasserman, Kutner: *Applied Linear Statistical Models* and Sanford Weisberg: *Applied Linear Regression* – good references for regression

William S. Cleveland: *Visualizing Data* and Chambers, Cleveland, Kleiner, Tukey: *Graphical Methods for Data Analysis* – similar in spirit to [EGD] but more focus on statistical methods

Edward R. Tufte: *Envisioning Information* and *Visual Explanations* – similar to [VDQI], but more artistic and less statistical

Howard Wainer: *Visual Revelations* – many nice examples and case studies

Stephen M. Kosslyn: *Elements of Graph Design* – addresses graphics from the viewpoint of a cognitive psychologist

Course Requirements and Grading

Your final letter grade is based on the following criteria, with a possible half-letter adjustment based on class participation and other factors.

**Homework** (50%) will be assigned approximately weekly and is due at 12:00 noon on the specified date. Because homework is presented and discussed in class (see below), late homework will not be accepted, with one exception: you may hand in an assignment late (by 12:00 noon on the next day of class) *once* during the semester if you are not scheduled to present or discuss that week’s homework. If you plan to hand in an assignment late, you must notify me by 12:00 noon on the original due date.

Your homework assignments must be typed (e.g., laser-printed). Try to integrate text and graphics as much as possible; a picture should be close to the text discussing it. Please do *not* put all your graphics and tables in an appendix at the end of your written material. Homework grades will be based roughly equally on statistical content and clarity of presentation.

For some homework assignments, we will have **class presentations and discussions** (10%). A nice aspect of this material is that there is often no unique "right answer" (although, of course, some answers are better than others!). The class presentations are intended to give you a chance to show your approaches and analyses to the class, and to get some feedback. The format will be similar to that at a professional conference: the presenter will talk about his or her analyses for 15–20 minutes, and then a discussant will respond for 5–10 minutes, with some time afterwards for questions from the audience. Presentation grades will be given by the class and by me.

There will be two **take-home exams** (10% each). These open-book exams will be similar to the homework assignments, except that collaboration will not be allowed.

The **final project** (20%) will be a paper (roughly 10 pages) applying methods presented in this course to any topic or dataset of your choice. You may work alone or with one other student; in the latter case, both students will receive the same grade.

Honor Code

I encourage you to discuss homework problems with other students (and with me, of course), but your final answer must be written by yourself, in your own words. Copying or paraphrasing someone else’s paper is not acceptable. All computer output you submit must come from work that you have done yourself; handing in output from someone else’s computer session is not acceptable. Collaboration of any sort on the exams is prohibited. Please refer to the statement on "Academic Honesty and the Honor Code" in the *Student Handbook* for further information. If you have any questions about how the Honor Code applies to your work in this course, please feel free to ask me.

Your comments and suggestions

I always welcome your comments or suggestions. Please feel free to tell me your opinions about any aspect of the course. I also have a suggestion box on my office door so you can drop me an anonymous note. Don’t hesitate to let me know how I can make the course better for you!

Email: scwang@swarthmore.edu
