CAS CS 585 Image and Video Computing - Spring 2024


Lectures: Tuesday, Thursday 5 pm - 6:15 pm in CAS 224
Labs: Friday 1:25-2:15 (CAS 222), 2:30-3:20 (CAS 228), 3:35-4:25 (CAS 228), and 4:40-5:30 (CAS 225)
Instructor: Prof. Margrit Betke
Head Teaching Fellow: Hao Yu
Additional Teaching Fellows: Wenda Qin and Kaihong Wang
Class web page:
http://www.cs.bu.edu/faculty/betke/cs585
Piazza page: https://piazza.com/bu/spring2024/cs585
Gradescope entry code:
DPRY54
Contact Information:

Margrit Betke
Best way to contact: betke@bu.edu
Office hours: Mon 4-6 pm on Zoom, https://bostonu.zoom.us/my/margritbetke; Tue, Thu 6:15 pm in CAS 224

Teaching Fellows
Best way to contact: Piazza. Note that there is a 24-hour timer on your Piazza questions to encourage other students to answer before the TFs are able to see your question. Only send an email to a TF if your question is very urgent and requires an answer in less than 24 hours.
Office hours: Wed 1-3 pm, CDS 701

Seeing Me on Zoom:
Please feel free to stop by my Zoom office hours. I'm happy to talk with you about the course, research in AI, computer vision, machine learning, your plans for the future, or anything else. Check out my personal web page (in need of updating...) or my reading list to get to know me a little.

Responsibilities of the Teaching Fellows:
The TFs are responsible for teaching the four laboratory sections and helping you out during their office hours. They will also help me design the written homework and programming projects, and they will manage the graders. Please contact the TFs first if you have questions about your homework grades.


Course Objectives

Our goal is to build computer systems that analyze images automatically and determine what the computer "sees" or "recognizes." The course gives you a fundamental introduction to computer vision methods. Applications include human-computer interfaces, face detection, medical image analysis, infrared image analysis of animals, vision systems for intelligent vehicles, and deep learning models.

Prerequisites: Strong background in linear algebra or geometric algorithms, and calculus. Programming experience (e.g., Python, C++ or Java).


Course Materials

The syllabus below will be updated throughout the semester with lecture notes and reading materials. Links to homework assignments will come alive when the assignment is ready for you to work on. The first assignment will give instructions on how you must submit your solution using Gradescope.

Handouts: The updated course syllabus and most handouts are made available online. Check our course web page at least once a week for homework assignments and other information.

Textbook: I recommend some chapters in Computer Vision: Algorithms and Applications, Second Edition, by Richard Szeliski (free electronic download) and Robot Vision by BKP Horn, MIT Press. The books are not required. I will propose alternative reading material.


Requirements

Class Engagement: Come to class and participate regularly. Reading the assigned texts and listening in lectures and labs will only give you a "passive understanding" of the material. We encourage discussions during lectures and labs to help you acquire an "active understanding" of the material so that you can evaluate existing computer systems critically and learn to develop your own creative solutions. Another way of interacting with the instructors and your classmates is to answer students' questions on Piazza.

Homework: Guidelines for submission will be provided with each assignment. We allow teams of two students for the programming assignments. Each student must prepare their own written homework. Late solutions will incur a penalty of 20% per day for up to three days. After three days, no credit will be given.

Course Team Project: The project involves developing a computer vision system in a team. More details will be announced shortly. Please read the instructions for the course project carefully. They will be published on Piazza.

Colloquia: You are encouraged to attend the weekly seminar series, Wednesdays, 11-12, of the BU AI Research Initiative. You may also check out AI talks in other departments and at other local universities. A partial list includes: BU College of Engineering Seminar Calendar, CSAIL lab at MIT, Northeastern College of Computer and Information Science, Tufts Department of Computer Science Colloquium.

Talk Reports: You must attend at least three talks on subjects related to Computer Vision and write a summary on each talk. One of the talks must be among the two AIR Seminar talks:

Your writeup should (1) define the problem, (2) summarize the methods and results, (3) discuss the work critically, and (4) briefly explain how the work relates to material discussed in class. One page is sufficient; do not submit more than two pages. You may not work in a team; you must produce your own independent writeup. Check your text for typographical and grammatical errors, especially if English is not your native language. Use tools to check for errors, e.g., grammarly.com or ChatGPT. You must acknowledge use of these tools. You will lose points if you simply copy the speaker's abstract, or if your review is late, contains typographical or grammatical errors, lacks a discussion, or lacks a statement of how the talk relates to class material.

Exams: There will be two exams on the material discussed in class and practiced in the homework. The exams will be quite easy for students who attend class, participate in our discussions, and keep up with homework assignments and programming projects. The midterm exam will be on February 29, 2024. The final exam will be during the finals period in May, at a time determined by the University.

Grading Policy: Your final grade will be determined as follows:

Midterm quiz: 15%
Final exam: 30%
5 Homework Sets: 30%
Class Participation/Engagement (Lecture, Lab, Piazza): 10%
Team Programming Project and its Presentation: 10%
3 AI Talk Reports: 5%
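
To see how the weights above combine, here is a minimal sketch in Python. It assumes each component is scored on a 0-100 scale. The names `WEIGHTS`, `weighted_total`, and the example scores are hypothetical and only for illustration. It does not model late penalties or any adjustments the instructor may apply.

```python
# Weights taken from the grading policy above (percent of the final grade).
WEIGHTS = {
    "midterm": 15,
    "final_exam": 30,
    "homework": 30,
    "engagement": 10,
    "project": 10,
    "talk_reports": 5,
}


def weighted_total(scores):
    """Combine per-component scores (each assumed to be 0-100) into a 0-100 total."""
    # Sanity check: the listed weights must cover the whole grade.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores for one student.
example = {
    "midterm": 80,
    "final_exam": 90,
    "homework": 100,
    "engagement": 100,
    "project": 90,
    "talk_reports": 100,
}
print(weighted_total(example))
```

The two exams together carry 45% of the total, so a weak midterm can be partly offset by the final exam and the homework sets.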


Collaboration and Academic Integrity

You are encouraged to collaborate on the solution of the homework. If you do, you must acknowledge your collaborators. Each student must submit his or her own electronic version of the solutions, except for the final project. If you use algorithms or code that are not your own original work and that were not provided in class or discussed in the textbook, you must give a detailed acknowledgment of your source.

You are not allowed to collaborate on the solution of the exams. Sources must be acknowledged.

Cheating and plagiarism are not worthy of Boston University students. I expect you to abide by the rule stated above and the standards of academic honesty and computer ethics policy described in http://www.bu.edu/computing/ethics/ and http://www.bu.edu/academics/policies/academic-conduct-code


Help

Image and Video Computing is an elective course that will introduce you to an exciting topic in computer science. It should be fun and not too much of a struggle for you. Make sure that you have had the prerequisites. Depending on your level of programming experience and/or mathematics background, the course may be challenging for you. If you do not understand the material, ask for help immediately. Ask questions in class. If one student is confused about something, then maybe others are also confused and grateful that someone asked. Come and see me or the TFs for help or send us email. Our task is to help you learn a very interesting topic!

Terms of Use of Course Materials

All materials provided on this website are freely available for not-for-profit educational use. We request that (i) attribution is retained as academically appropriate, and that (ii) you send me a short email (betke@bu.edu) to let me know that you are using some of the materials.

Course Schedule

Columns: Dates; Topics (links will become active with lecture); Readings (links will become active); Assignments
Thu 1/18/2024 Course Introduction: Why study IVC? Industry successes and current needs. Research in computer vision. Camera Mouse. Image Formats. Wiki Intro or Horn Ch. 1 or Szeliski Ch. 1.
Camera Mouse 2018. New Camera Mouse 2024.
1/19/24: A0 out (after lab, simple I/O programming)
Tu 1/23, Th 1/25 Binary Image Analysis: Moments, centroid, object orientation, circularity measures (blackboard, no slides). Image Projections, Object Localization, Flood Fill Algorithm, Sequential Multi-object Labeling Algorithm. Skin-color based face detection algorithm. Lecture notes. Image Moments, Binary Image Analysis. Horn Ch. 3. Wed 1/24: A0 due
Mo 1/22: A1 out (paper-pencil proofs, no programming)
Tu 1/30, Th 2/1 Programming with images: Pitfalls. Template matching, background differencing. Skin-based face detection. Similarity Functions (SSD, NCC). Motion: Template-based Tracking. Image Pyramids. Motion energy. Neighborhoods. Segmentation algorithms. Border following algorithm. Hausdorff Distance. Curvature of Boundaries. Confusion matrix analysis. Tumor Detection in Computed Tomography Images. Lecture notes 1, Lecture notes 2, Lecture notes 3, Lecture notes 4, Lecture notes 5. Video 1. Wiki on template matching, normalized correlation. Image Moments, Binary Image Analysis. Horn Ch. 3. Border following algorithm. Segmentation (many additional algorithms), Hausdorff distance. Thresholding. Freeman et al., Computer Vision for Interactive Computer Graphics. Fawcett (ROC). Wed 1/31: A1 due. Fr 2/2: A2 out (programming assignment on hand interaction)
Tu 2/6 Image Formation: Pinhole Model, Binocular Stereo. Pinhole model, pinhole camera.
Th 2/8 Image Formation: Vanishing Points, Lens Equation. Computer vision and analysis of paintings. Vanishing_point. Thin lens equation for image formation.
Th 2/15, Tu 2/20 Tu 2/13: Snow day, no lecture.
Tracking Methods and Applications: Tracking with Alpha-beta Filter, Kalman Filter, Tracking Groups of Animals, Multiple Object Tracking and Data Association. Kalman Filter lecture notes
Alpha beta filter, Kalman filter. Hungarian algorithm. Wed 2/14: A2 due. A3 out (programming assignment on tracking)
Th 2/22/24 Last day to drop course without a W grade, February 22.
Edge Masks, Canny Edge Detection. Active Contours.
Sobel, Prewitt, Roberts, Mexican Hat, Difference of Gaussians, Canny Edge Detector. Williams and Shah, 1992: paper and figures.  
Tue 2/27/24 Optical Flow. Optical Flow, Horn 81, Lucas-Kanade 81. Tuesday 2/27: A3 due (use Wed to study for the exam)
Th 2/29/24 In-class Midterm Exam: NOT in regular classroom but in LAW AUD. You may bring a 1-page crib sheet (handwritten, not printed or xeroxed)  
Tu 3/5 Deep learning and computer vision. ImageNet website. Start your project. Project Ideas
Th 3/7 Face Recognition. ImageNet website. LFW website. MegaFace. FaceScrub. IARPA Janus Benchmark A, CFP Dataset, AFLW 2000 Dataset. Cao et al. 2018, Zhu et al. 2019, Zhu et al. 2020, Zhu et al. 2022, Best-Rowden and Jain 2017, Banerjee et al. 2013. IAB IIT Jodhpur. West Virginia biometrics. Start your project. Project Ideas
3/9/24 - 3/17/24 Spring Break  
Tu 3/19 Face Recognition under pose variation and aging. Facial Expression Recognition. Biometrics. See papers linked in slides. Facial Expressions, Black/Yacoob 97, Yacoob1 avi, Yacoob2 avi Wed 3/20: A4 out (programming assignment on computer vision and deep learning)
Th 3/21, Tu 3/26, Th 3/28 Binocular Stereo, Multiview Stereo, Epipolar Geometry, Active Stereo with Structured Light, Structure-from-Motion Problem.
Deep Learning and Segmentation.
Last day to drop course with a W grade, March 29.
See papers linked in slides.  
Tu 4/2 Generative Models: GANs and Diffusion Models. Hao's slides, extra slides. See papers linked in slides. Wed 4/3: A4 due (extended until 4/5)
Thu 4/4 Surface reflectance, BRDF, Lambertian model, reflectance maps, photometric stereo algorithm, shape-from-shading algorithm. See papers linked in slides. Wiki on Lambert's law, Lambertian reflectance.
Tu 4/9 Supervised, Semisupervised, and Unsupervised Learning in Computer Vision. Wed 4/10: A5 out (paper-pencil only, link on Piazza)
Th 4/11 Image Registration. Medical Image Analysis. Lung Image Analysis of Cancer and COVID Patients. Absolute Orientation in 2D and 3D, Quaternions, Iterative Closest Point Algorithm. Absolute Orientation, Horn 89 See papers linked in slides.  
Tu 4/16 Transformers. See papers linked in slides. Wed 4/17: A5 due
Th 4/18 Vision-and-language Navigation    
Tu 4/23 Multiobject Multiview Tracking, Computer Vision and Ethics
Review for Final Exam (Thin-Lens Model, Optical Flow, all topics after midterm exam)
See papers linked in slides.  
Th 4/25/24, Tu 4/30/24 Student Projects. See Schedule on Piazza.   Thursday 4/25: Project Deadline (Code, Slides, Report)
Monday, May 6, 6-8 pm Final Exam    

Assignments

The assignments will have programming tasks, paper-and-pencil exercises, or both. The links will become active when the assignment is announced.

Computer Vision Links

Check out http://www.cs.bu.edu/faculty/betke/links.html if you need additional ideas for your class project, if you are looking for a job in computer vision (list of companies), or if you are interested in computer vision research. You will find a list of links to computer vision conferences, journals, research groups, and companies.

Calculus Background

I do not expect you to have a background in multivariate calculus. I will introduce the tools we will need. You may find the first few chapters of these notes by Cain and Herod useful, in particular the sections on partial derivatives, the Taylor polynomial, and the multivariate Taylor polynomial.

Margrit Betke, Professor
Computer Science Department
Boston University
URL: http://www.cs.bu.edu/faculty/betke

Last updated: April 24, 2024