**E27: Computer Vision**
**ENGR 027/CPSC 072**
**Spring 2019**
**[Matt Zucker](http://swarthmore.edu/NatSci/mzucker1)**

| Lecture:      | Tue/Thu 9:55-11:10AM, Hicks 211      |
|---------------|--------------------------------------|
| Office Hours: | Mon 10-11AM, Wed 2:30-4PM, Hicks 202 |
| Lab help:     | Tue 4-5PM, Hicks 212                 |

This class is about applying mathematical theory to endow computers with the ability to understand and interact with the real world through physical imaging sensors. Although there will be plenty of programming, coding is not the main focus. **If you are looking for a class about programming particular APIs (e.g. OpenCV, TensorFlow), you may be disappointed!**

The course is divided into three broad areas of investigation:

* *Appearance based methods* including filtering, morphological operators, convolutions, frequency domain methods, edge and feature detection, correlation, and template tracking.
* *Probabilistic and learning based* methods such as classification, object recognition, and clustering.
* *3D geometry based methods* including multiple view geometry, structure from motion, visual odometry, stereo and structured light, and shape from shading.

# Requirements

**Prerequisites:** Either ENGR 19 or CPSC 35. MATH 27 or 28 is strongly recommended.

**Skills:** In practice, I expect you to understand elementary programming concepts, including basic loops, functions, and array processing. I also expect you to be comfortable with [linear algebra concepts](http://www.swarthmore.edu/NatSci/mzucker1/linalg-reintroduction.pdf) such as solving linear systems, matrix inverses, rank, and eigenvalues/vectors. We will also be using related geometric concepts such as the dot product and vector norms as well as rotations and translations.

**Time:** I expect students to spend approximately 8 hours per week on this class (4 classes × 8 hours per class + [8 hours for paid student work](https://www.swarthmore.edu/student-employment/employment-faqs) = 40 hours).
Although this figure will vary from individual to individual and week to week, you should plan to commit several hours outside of class to homework, reading, and projects each week.

# Resources

**Textbook**: Richard Szeliski, *Computer Vision: Algorithms and Applications,* Springer 2010-11. [Available free online from the author.](http://szeliski.org/Book/)

**OpenCV Documentation**: We will be using OpenCV version 3, and it can be hard to find Python documentation for this. Here is the [OpenCV 3.0 beta reference manual](https://docs.opencv.org/3.0-beta/modules/refman.html), with Python functions included.

**Piazza**: We will use [this Piazza group](http://piazza.com/swarthmore/spring2019/engr027cpsc072) throughout the semester to communicate course announcements and answer questions. Please use Piazza (instead of just emailing me) for all course-related communications, so that students can see common problems and engage in discussions about course material.

**Wizards**: The class will have a weekly Wizard session to discuss homework (primarily) and projects. Details TBA on Piazza.

# Assignments

Homework consisting of math, short answer questions, and small programming exercises will be assigned weekly. There will be several larger projects/labs which are both more open ended and more programming intensive, as well as a self-directed final project. Projects and labs are self-scheduled. The course has a midterm exam and a final exam (cumulative, but biased towards the second half of the course).

Grading will follow approximately the divisions shown below:

* Homework: 30%
* Projects/labs: 35%
* Midterm exam: 15%
* Final exam: 15%
* Participation: 5%

## Collaboration and attribution

* Feel free to collaborate with your classmates on homework; however, you must submit your own work. Duplicating others’ assignments verbatim (especially code!) is prohibited.
* If you do discuss homework with your classmates, I expect you to disclose any such collaboration clearly in your submitted work. Err on the side of caution – it’s the best way to avoid awkward conversations about suspicious similarities between assignments.
* Cite any external sources used, including the textbook, internet, discussions with other professors, etc.
* Aside from raising technical and procedural questions on the course Piazza, do not collaborate on projects with others outside your group.
* Do not post homework or project solutions on Piazza. Questions or answers that discuss solutions too closely will be deleted.

Aside from the course-specific policies above, you are expected to understand and abide by the college's [policy on academic misconduct](https://www.swarthmore.edu/student-handbook/academic-policies#academic_misconduct).

## Late policy

Homework will generally be assigned on Thursday, and due at the start of class the following Thursday. Homework assignments may be turned in up to a week late for half credit. Students get one free late homework turn-in without penalty.

Late projects or absences from quizzes which have not been excused in advance may be strongly penalized. I will try to accommodate you in extraordinary circumstances, *especially if you contact me ahead of time*.

# Accommodations

If you believe you need accommodations for a disability or a chronic medical condition, please email Student Disability Services at studentdisabilityservices@swarthmore.edu to arrange an appointment to discuss your needs. As appropriate, the office will issue students with documented disabilities or medical conditions a formal Accommodations Letter. Since accommodations require early planning and are not retroactive, please contact Student Disability Services as soon as possible.
For details about the accommodations process, [visit the Student Disability Services website](http://www.swarthmore.edu/academic-advising-support/welcome-to-student-disability-service). You are also welcome to contact me privately to discuss your academic needs. However, all disability-related accommodations must be arranged, in advance, through Student Disability Services.

# Schedule

The topics below are subject to change. Please check this page regularly for updates.

January 22, 2019: Intro; fundamentals

Topics:

* Introduction
* Linear algebra review
* Image formation
* Image representations
* Homogeneous coordinates

Reading/resources:

* Chapter 1
* Sections 2.1, 2.3
* [Linear algebra basics](http://www.swarthmore.edu/NatSci/mzucker1/linalg-reintroduction.pdf)
* [Installing OpenCV](install_opencv.html)

Assignments:

* [Homework 1](homework1.pdf)
* [Tutorial code](tutorial.zip)

January 29, 2019: Points and lines

Topics:

* Lines in 2D
* Review: ordinary least squares
* Homographies & homogeneous least squares

Reading/resources:

* Sections 3.1, 3.6.1
* [Application: unprojecting text](https://mzucker.github.io/2016/10/11/unprojecting-text-with-ellipses.html)

Assignments:

* [Homework 2](homework2.pdf)
* [Starter code](hw2_starter.zip)

February 5, 2019: Background subtraction, filtering

Topics:

* Thresholding & color segmentation
* Project 1 briefing
* Morphological operators
* Convolution & cross-correlation

Reading/resources:

* Sections 3.2, 3.3, 3.5.1, 3.5.2

Assignments:

* [Project 1](project1.pdf)
* [Starter code](project1.zip)
* [Homework 3](homework3.pdf)

February 12, 2019: Edge detection, frequency domain

Topics:

* Edge detection
* Fourier transform

Reading/resources:

* Sections 4.2, 4.3
* [In-class filtering demo](filtering_demo.zip)
* [In-class Fourier Transform demo](fourier.zip)

Assignments:

* [Homework 4](homework4.pdf)
* [Starter code](convolution.zip)

February 19, 2019: Laplacians, ML basics for Vision

Topics:

* Laplacians
* Linear classifiers
* Nearest neighbor
* Intro to neural networks

Reading/resources:

* Section 3.5.3

Assignments:

* [Project 2](project2.pdf)
* [Homework 5](homework5.pdf)

February 26, 2019: Learning methods

Topics:

* Neural networks, cont'd.
* AdaBoost
* Cascade classification
* Viola-Jones object detection

Reading/resources:

* [Neural networks handout](neural-networks.pdf)
* [Viola-Jones paper](../papers/violaJones_IJCV.pdf)
* Section 14.1

Assignments:

* [Homework 6](homework6.pdf)
* [`xor_nnet.py`](xor_nnet.py)

March 5, 2019: Viola-Jones, cont'd

March 7, 2019: Midterm exam (in-class)

(March 12, 2019): Spring break

March 19, 2019: Dimensionality reduction

Topics:

* PCA
* Eigenfaces
* Clustering & $k$-means

Reading/resources:

* [`kmeans.zip`](kmeans.zip)
* [`vbow.zip`](vbow.zip)
* Sections 14.2, 14.4

Assignments:

* [Homework 7](homework7.pdf)
* [Project 3](project3.pdf)
* [`mnist_project.zip`](mnist_project.zip)

March 26, 2019: Deep learning

Topics:

* Convolutional networks
* Resnet

Reading/resources:

* [He et al 2015: Deep residual learning](he2015resnet.pdf)
* [Keras implementation for CIFAR](https://github.com/keras-team/keras/blob/master/examples/cifar10_resnet.py)
* [Pre-trained resnets in Keras](https://github.com/keras-team/keras/blob/master/docs/templates/applications.md#resnet)
* [TensorFlow implementation (gross)](https://github.com/tensorflow/models/blob/master/official/resnet/resnet_model.py)

Assignments:

* [Homework 8](homework8.pdf)
* [`hw8.zip`](hw8.zip)

April 2, 2019: Deep learning, cont'd

Topics:

* Transfer learning
* Feature visualization
* Generative Adversarial Networks
* Manifold learning

Reading/resources:

* [Feature visualization](https://distill.pub/2017/feature-visualization/)
* [Image-to-image translation demo (e.g. edges2cats)](https://affinelayer.com/pixsrv/)
* Papers (see citations in links above, too):
    * [Goodfellow et al 2014: GANs](https://arxiv.org/abs/1406.2661)
    * [Isola et al 2017: Conditional Adversarial Nets](https://phillipi.github.io/pix2pix/)
    * [Schroff et al 2015: FaceNet](https://arxiv.org/abs/1503.03832)

April 4, 2019: Algorithm/ML bias

***Special guest lecture by [Ameet Soni](https://www.cs.swarthmore.edu/~soni/) and [Krista Thomason](https://www.swarthmore.edu/profile/krista-thomason)***

Reading/Resources:

* [Lecture slides (Soni & Thomason)](Algorithmic%20Bias%20in%20Computer%20Vision.pdf)
* [How Big Data is Unfair](https://medium.com/@mrtz/how-big-data-is-unfair-9aa544d739de)
* [Amazon needs to come clean about racial bias in its algorithms](https://www.theverge.com/2018/5/23/17384632/amazon-rekognition-facial-recognition-racial-bias-audit-data)
* [*Optional: Big Data’s Disparate Impact by Solon Barocas & Andrew D. Selbst*](http://www.californialawreview.org/wp-content/uploads/2016/06/2Barocas-Selbst.pdf)

Assignments:

* [Final project](finalproject.html)

April 9, 2019: 3D geometric fundamentals

Topics:

* Image formation redux
* Camera calibration
* Intrinsic & extrinsic parameters
* Algebraic vs. geometric error

Reading/resources:

* Chapter 6

Assignments:

* [Homework 9](homework9.pdf)
* [`box3d.zip`](box3d.zip)

April 16, 2019: Stereo & multiple view geometry

Topics:

* Stereo
* Essential matrix

Reading/resources:

* Chapter 7 up to but not including 7.2.1
* Sections 11-11.3
* [`fundamental.zip`](fundamental.zip)

Assignments:

* [Homework 10](homework10.pdf)
* [`stereo_hw.zip`](stereo_hw.zip)

April 23, 2019: Keypoints; structured light

Topics:

* Keypoints
* Feature detection
* Feature matching
* Structured light

Reading/resources:

* [FAST: Rosten et al. 2006](rosten2006fast.pdf)
* [BRIEF: Calonder et al. 2010](calonder2010brief.pdf)
* [ORB: Rublee et al. 2011](rublee2011orb.pdf)
* [ROS - Kinect technical specs](http://wiki.ros.org/kinect_calibration/technical)

April 30, 2019: Structure from motion

Topics:

* Singular value decomposition
* Affine SFM

Reading/resources:

* Chapter 7

May 12, 2019: Final exam 9AM-12PM @ SCI 181