CAS CS 585 - Spring 2024

Assignment 2

The assignment is due at 11:59 PM EST on Wednesday, February 14, 2024. You may work in teams of two students on this programming assignment.

In this assignment, you will build upon the ideas you learned in class and in the labs. Your team will design and implement algorithms that recognize hand shapes or gestures, and create a graphical display that responds to the recognition of the hand shapes or gestures.

Learning Objectives
  1. Read and display video frames from a webcam
  2. Learn about tracking by template matching
  3. Learn about analyzing properties of objects in an image, e.g. object centroid, axis of least inertia, shape (circularity)
  4. Create interesting and interactive graphical applications

Requirements

Design and implement algorithms that recognize hand shapes (such as making a fist, thumbs up, thumbs down, or pointing with an index finger) or gestures (such as waving with one or both hands, swinging, or drawing something in the air), and create a graphical display that responds to the recognition of the hand shapes or gestures. For your system, you are encouraged to try out some of the following computer vision techniques that were discussed in class. Use at least two of them (in particular, binary object shape analysis):

  1. horizontal and vertical projections to find bounding boxes of "movement blobs" or "skin-color blobs"
  2. size, position, and orientation of object of interest
  3. circularity of object of interest
  4. template matching (e.g., create templates of a closed hand and an open hand)
  5. background differencing: D(x,y,t) = |I(x,y,t)-I(x,y,0)|
  6. frame-to-frame differencing: D'(x,y,t) = |I(x,y,t)-I(x,y,t-1)|
  7. motion energy templates (union of binary difference images over a window of time)
  8. skin-color detection (e.g., thresholding red and green pixel values)
  9. tracking the position and orientation of moving objects
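As an illustration of items 1, 2, 3, and 5 above, here is a minimal Python/NumPy sketch. It thresholds a background difference image, derives a bounding box from the horizontal and vertical projections, and computes the centroid, orientation, and circularity of a binary object from its second moments. The function names, the default threshold of 30, and the choice of circularity as E_min/E_max (the ratio of the smallest to the largest second moment) are assumptions made for this example and are not prescribed by the assignment; other definitions of circularity exist.

```python
import numpy as np


def difference_mask(frame, reference, threshold=30):
    """Binary mask of pixels where |frame - reference| exceeds threshold.

    Both inputs are assumed to be grayscale arrays of the same shape.
    Casting to a signed type avoids uint8 wrap-around in the subtraction.
    """
    diff = np.abs(frame.astype(np.int32) - reference.astype(np.int32))
    return diff > threshold


def bounding_box(mask):
    """Return (top, bottom, left, right) from the projections, or None."""
    row_projection = mask.sum(axis=1)  # number of object pixels in each row
    col_projection = mask.sum(axis=0)  # number of object pixels in each column
    rows = np.flatnonzero(row_projection)
    cols = np.flatnonzero(col_projection)
    if rows.size == 0:
        return None  # empty mask: no object found
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])


def shape_properties(mask):
    """Area, centroid, orientation, and circularity of a binary object."""
    ys, xs = np.nonzero(mask)
    area = xs.size
    if area == 0:
        return None
    cx, cy = xs.mean(), ys.mean()
    dx, dy = xs - cx, ys - cy
    # Second moments about the centroid.
    a = np.sum(dx * dx)
    b = 2.0 * np.sum(dx * dy)
    c = np.sum(dy * dy)
    # Angle of the axis of least inertia (image coordinates, y points down).
    theta = 0.5 * np.arctan2(b, a - c)
    root = np.sqrt(b * b + (a - c) ** 2)
    e_min = 0.5 * (a + c) - 0.5 * root
    e_max = 0.5 * (a + c) + 0.5 * root
    # Assumed definition: 1.0 for a symmetric blob, near 0 for a thin line.
    circularity = e_min / e_max if e_max > 0 else 1.0
    return {
        "area": area,
        "centroid": (float(cx), float(cy)),
        "orientation": float(theta),
        "circularity": float(circularity),
    }
```

Passing the previous frame instead of the first frame as `reference` turns the same function into frame-to-frame differencing (item 6). In practice the threshold would need tuning for the camera and lighting at hand.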

You may use OpenCV library functions in your solution. If you do so, you must understand the OpenCV function in detail -- both the mathematical formulation and the algorithm. In particular, you must be able to explain the function on a whiteboard without access to the OpenCV help pages.
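To give a sense of the level of detail meant here, the sketch below spells out one common formulation of template matching, zero-mean normalized cross-correlation, as an explicit brute-force loop in Python/NumPy. The function name `match_template_ncc` is hypothetical, the inputs are assumed to be grayscale, and this is not OpenCV's implementation; it only shows the kind of computation you should be able to explain for any library routine you call.

```python
import numpy as np


def match_template_ncc(image, template):
    """Find the best match of template in image.

    Returns (row, col, score), where (row, col) is the top-left corner of
    the best-matching window and score is the zero-mean normalized
    cross-correlation, which lies in [-1, 1].
    """
    image = image.astype(np.float64)
    template = template.astype(np.float64)
    th, tw = template.shape
    ih, iw = image.shape

    # Subtract the mean so that uniform brightness offsets do not matter.
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t * t))

    best = (0, 0, -1.0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            w = window - window.mean()
            denom = t_norm * np.sqrt(np.sum(w * w))
            if denom == 0:
                continue  # flat window or flat template: score undefined
            score = np.sum(w * t) / denom
            if score > best[2]:
                best = (r, c, float(score))
    return best
```

The double loop is far too slow for real-time video at full resolution. It is written this way only to make the formula visible, and an actual system would likely use an optimized library call or a restricted search window.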

Your algorithm should detect at least four different hand shapes or gestures.

For your assignment, you are required to submit a detailed report. Please refer to the provided template for guidance on the content that should be included. In your report, create a confusion matrix to illustrate how well your system can classify the hand shapes or gestures. You are also asked to create a graphical display that responds to the movements of the recognized gestures. The graphics should be tasteful and appropriate to the gestural movements. Along with the program, submit the following information about your graphics program:

  1. An overall description
  2. How the graphics respond to different hand shapes and/or gestures
  3. Interesting and fun aspects of the graphics display
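
For the confusion matrix requested in the report, a small helper along the following lines may be useful. This is a sketch under assumptions: the class names are taken from the example hand shapes listed earlier, and the trial labels in the usage lines are invented purely to show the call, not measured results. Rows are the true classes and columns are the predicted classes.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of trials on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical class names and made-up trial outcomes, for illustration only.
classes = ["fist", "thumbs_up", "thumbs_down", "point"]
truth = ["fist", "fist", "thumbs_up", "thumbs_down", "point", "point"]
guess = ["fist", "point", "thumbs_up", "thumbs_up", "point", "point"]

matrix = confusion_matrix(truth, guess, classes)
for name, row in zip(classes, matrix):
    print(f"{name:12s} {row}")
print("accuracy:", accuracy(matrix))
```

Off-diagonal entries show which hand shapes the system confuses with each other, which is usually more informative than the overall accuracy alone.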

Submission

The programming assignment (code), along with the report (PDF) and results (a real-time demo video), should be submitted to Gradescope under "A2". Please submit each file separately (do not ZIP or archive them). Here are some guidelines to follow when submitting:

Margrit Betke, Professor
Computer Science Department
Email: betke@cs.bu.edu