CAS CS 585 Image and Video Computing - Spring 2021

Assignment 5

The written assignment is due at 11:59 PM on Friday, April 2nd. Please scan your written assignment as a single PDF file and submit it via Gradescope. The programming assignment must also be submitted electronically via Gradescope by the same deadline. We recommend that you team up in groups of 2-3 students. You do not have to work in a team, but working with others may make your project more fun, interesting, and rewarding.

Written Assignment:

Exercise 1: Single-Object Tracking

(a) What is meant by the term "tracking by detection" in computer vision?

(b) Briefly describe the difference between such a tracker and the alpha-beta tracker.

Exercise 2: Kalman Filter

  Assuming a state vector containing position and speed, and measurements of only position, write down a state evolution function / matrix and measurement model matrix for a constant velocity model.

Exercise 3: Kalman Filter

  Write down how you would change the state vector, state evolution matrix, and measurement model matrix from Exercise 2 for a constant position model (where you believe that the value is not changing but is only corrupted by noise).

Exercise 4: Multi-Object Tracking: Data Association

Briefly describe an advantage of using GNNSF over MHT.


Programming Assignment: Multiple Object Tracking

The goal of this part of the programming assignment is for you to learn more about the practical issues that arise when designing a tracking system. You are asked to track moving objects in video sequences, i.e., to identify the same object from frame to frame.

If you decide to use the advanced options (MHT, optimal data association, Kalman filter), you are encouraged to use a third-party library.

We provided two datasets for you.

1) The bat dataset shows bats in flight, where the bats appear bright against a dark sky. We included both grayscale and false-color images from this thermal image sequence; you may use whichever images you prefer.
[Image: bat flying scene]

2) The cell dataset shows mouse muscle stem cells moving in hydrogel microwells. The brightness of the pixels within the cells is very similar to that of the background.
[Image: microscopy image with mouse muscle stem cells]

In case you have trouble segmenting the images and distinguishing objects (multi-object labeling), we are providing segmentations and detections for the bat dataset. The segmentation of the bat dataset is provided as a set of label maps. Each map is 1024 by 1024 and contains one number per pixel, delimited by commas. Pixels with the value 0 are background. The detections are given in comma-delimited files, one file per frame. Each line contains one point, given as the X coordinate followed by the Y coordinate, delimited by a comma.
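The file formats described above can be read with a few lines of standard-library Python. The following is only a minimal sketch: the function names `parse_label_map`, `parse_detections`, and `label_centroids` are hypothetical, and it assumes that each line of a label map is one image row, that X indexes columns, and that Y indexes rows. Check these assumptions against the actual files before relying on them.

```python
import csv
import io


def parse_label_map(text):
    """Parse a comma-delimited label map into rows of integer labels.

    Assumption: each text line is one image row; 0 marks background.
    """
    rows = []
    for row in csv.reader(io.StringIO(text)):
        cells = [c.strip() for c in row if c.strip()]
        if cells:
            # int(float(...)) also accepts values written as "3.0"
            rows.append([int(float(c)) for c in cells])
    return rows


def parse_detections(text):
    """Parse one detection per line, X first and then Y, as (x, y) floats."""
    points = []
    for row in csv.reader(io.StringIO(text)):
        cells = [c.strip() for c in row if c.strip()]
        if len(cells) >= 2:
            points.append((float(cells[0]), float(cells[1])))
    return points


def label_centroids(label_map):
    """Return {label: (mean_x, mean_y)} for every non-background label.

    Assumption: x is the column index and y is the row index.
    """
    sums = {}
    for y, row in enumerate(label_map):
        for x, label in enumerate(row):
            if label == 0:
                continue
            sx, sy, n = sums.get(label, (0.0, 0.0, 0))
            sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}
```

To use these on a file, read its contents first, for example `parse_detections(open(path).read())`, where `path` points to one of the provided per-frame files. The centroids computed from a label map can serve as detections if you prefer to derive them yourself.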

For the cell dataset, no segmentation is provided. It is your task to do both segmentation and tracking. Note: 1) The filopodia, which cells have during migration, are long "feet" that are difficult to outline automatically. 2) Some cells split into daughter cells. Since accurate cell segmentation is very challenging, to obtain full credit, your focus should be on the multi-object tracking task (and detecting the birth of a new cell), while your segmentation result can be relatively coarse.

You do not need to use our segmentation/detection if you would like to see the results using your own algorithms. You may use any of your code from the previous assignments and any library functions you wish.

Display the results of your tracking algorithm on top of the original images. Use different colors to show that you successfully maintain track identity. Draw lines to show the history of the flight trajectories.
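One possible way to prepare this visualization is to give each track ID a stable, visually distinct color and to turn each track's position history into line segments. The sketch below uses only the standard library; `track_color` and `trajectory_segments` are hypothetical helper names, and the golden-ratio hue spacing is just one convenient choice, not a requirement of the assignment.

```python
import colorsys

# Stepping the hue by the golden ratio conjugate spreads consecutive
# track IDs far apart on the color wheel.
GOLDEN_RATIO_CONJUGATE = 0.618033988749895


def track_color(track_id):
    """Map a track ID to a fixed (r, g, b) color with components in 0..255."""
    hue = (track_id * GOLDEN_RATIO_CONJUGATE) % 1.0
    r, g, b = colorsys.hsv_to_rgb(hue, 0.9, 1.0)
    return (int(round(r * 255)), int(round(g * 255)), int(round(b * 255)))


def trajectory_segments(history):
    """Turn a list of (x, y) positions into consecutive integer line segments."""
    pts = [(int(round(x)), int(round(y))) for x, y in history]
    return list(zip(pts[:-1], pts[1:]))
```

Each segment can then be drawn on the frame with the line-drawing function of whichever imaging library you use, passing the color of the track it belongs to. Note that some libraries, such as OpenCV, expect colors in BGR rather than RGB order.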

In your write-up, you should discuss the following items:

  1. Show your tracking results on some portion of the sequence. In addition to showing your tracking results on an easy portion of the data, identify a challenging situation where your tracker succeeds, and a challenging situation where your tracker fails.
  2. How do you decide to begin new tracks and terminate old tracks as the objects enter and leave the field of view?
  3. What happens with your algorithm when objects touch and occlude each other, and how could you handle this so you do not break track?
  4. What happens when there are spurious detections that do not connect with other measurements in subsequent frames?
  5. What are the advantages and drawbacks of different kinematic models: Do you need to model the velocity of the objects, or is it sufficient to just consider the distances between the objects in subsequent frames?
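
As a starting point for thinking about items 2, 4, and 5, here is a minimal sketch of a greedy nearest-neighbor tracker. It is not the optimal data association or MHT mentioned above, and it is not a reference solution. All names (`Track`, `predict`, `update_tracks`) and the parameter values (`gate`, `max_missed`) are assumptions chosen for illustration. Unmatched detections start new tracks, tracks that go unmatched for more than `max_missed` consecutive frames are terminated, and the `use_velocity` flag switches between a constant-position and a simple constant-velocity prediction.

```python
import math


class Track:
    def __init__(self, track_id, position):
        self.id = track_id
        self.history = [position]  # list of matched (x, y) positions
        self.missed = 0            # consecutive frames without a match


def predict(track, use_velocity):
    """Predict the next position from the last one or two matched positions."""
    x, y = track.history[-1]
    if use_velocity and len(track.history) >= 2:
        px, py = track.history[-2]
        # Constant velocity: last position plus last displacement.
        # Simplification: missed frames in between are ignored.
        return (2 * x - px, 2 * y - py)
    return (x, y)


def update_tracks(tracks, detections, next_id,
                  gate=30.0, max_missed=3, use_velocity=True):
    """Associate detections with tracks greedily by increasing distance."""
    predictions = [predict(t, use_velocity) for t in tracks]
    pairs = sorted(
        (math.dist(predictions[ti], det), ti, di)
        for ti in range(len(tracks))
        for di, det in enumerate(detections)
    )
    used_tracks, used_dets = set(), set()
    for dist, ti, di in pairs:
        if dist > gate:
            break  # all remaining pairs are even farther apart
        if ti in used_tracks or di in used_dets:
            continue
        tracks[ti].history.append(detections[di])
        tracks[ti].missed = 0
        used_tracks.add(ti)
        used_dets.add(di)

    survivors = []
    for ti, track in enumerate(tracks):
        if ti not in used_tracks:
            track.missed += 1
        if track.missed <= max_missed:
            survivors.append(track)  # otherwise the track is terminated

    for di, det in enumerate(detections):
        if di not in used_dets:
            survivors.append(Track(next_id, det))  # begin a new track
            next_id += 1
    return survivors, next_id
```

In this sketch a spurious detection starts a track that is dropped again once it has gone unmatched for more than `max_missed` frames. Requiring several consecutive matches before a track is displayed would be one way to suppress such short-lived tracks. Because the matching is greedy, it can make locally good but globally poor assignments when objects are close together, which is where optimal assignment methods may help.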

You will demonstrate your code to one of the graders according to the demo schedule. Be prepared to show your code working and discuss any important issues that you discovered and how you addressed them.

Margrit Betke, Professor
Computer Science Department
Email: betke@cs.bu.edu