In your final project in the AI class, you are going to explore a computer vision task.
There are three options you can choose:
For the first option, you are asked to complete a video classification project related to the 2020 presidential election. It will be supervised by Stan. The second option is also a classification task, and it involves human pose estimation and motion tracking. It will be supervised by Yiwen. The third option is a computer vision topic of your own choosing.
The project will be graded on:
In this project, you are asked to design an AI system that can automatically analyze facial expressions. You will work with recent video clips showing 11 candidates in the 2020 United States presidential election. You are required to finish four stages of the project step by step:
We provide two sets of data. Both contain short video clips of presidential candidates for the 2020 US general election, labeled with one of three expression classes: "negative", "neutral", and "positive". The first set was collected in 2019 and the other in 2020. You can find the video clips and the labels in the directory "/projectnb2/cs640grp/PresidentialVideoData" on the SCC server. Note that both sets of data are class-imbalanced, so you need to find a way to address this.
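One common way to address class imbalance is to weight each class inversely to its frequency, for example in a weighted loss or a weighted sampler. The following is a minimal sketch of that idea, not a required approach; the function name `inverse_frequency_weights` and the label counts are made up for illustration and do not reflect the actual dataset.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    The scaling n_samples / (n_classes * count) gives every class a weight
    of 1.0 when the dataset is perfectly balanced.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label distribution, for illustration only.
labels = ["neutral"] * 6 + ["negative"] * 3 + ["positive"] * 1
weights = inverse_frequency_weights(labels)

for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

With these weights, every class contributes the same total weight to the loss, so rare classes are not drowned out by frequent ones. Oversampling rare classes or augmenting their clips are alternatives worth comparing.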
You will research deep models on facial expression analysis. You do not have to start from scratch. Instead, you will download existing models provided by the computer vision community and try them out. Use the data annotated in week 1 to help discover and choose the best performing model.
Fine-tune your model further to improve its performance. Search for the optimal set of hyperparameters for your model on the provided data. Report your results from stratified 5-fold cross-validation.
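Stratified 5-fold cross-validation splits the data into five folds that each keep roughly the same class proportions as the full set, which matters here because the classes are imbalanced. The sketch below shows one simple way to build such folds using only the standard library; the function name `stratified_k_fold` and the label counts are illustrative assumptions.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for stratified k-fold CV.

    The indices of each class are shuffled and dealt round-robin into k
    folds, so every fold gets a near-equal share of every class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Hypothetical labels, for illustration only.
labels = ["neutral"] * 50 + ["negative"] * 30 + ["positive"] * 20
splits = list(stratified_k_fold(labels, k=5))

for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(test)} test")
```

If scikit-learn is available, `sklearn.model_selection.StratifiedKFold` provides the same kind of split. Either way, report the metric for each fold as well as the mean across the five folds.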
More details on the report at the end of this document.
In this project, you are asked to design an AI system that can automatically analyze human motions. You will work with video recordings of people doing a set of exercises. Your binary classification task is to distinguish incorrect exercising from correct exercising. You are asked to work through three stages of the project step by step:
Being able to understand and modify models that are published by other research groups is an important skill for solving AI tasks. Much effort has been devoted to developing deep models for human pose estimation and human motion analysis. You will research them and choose one of the existing models provided by the computer vision community. In particular, you want a model that takes RGB videos as input and outputs keypoints representing human joints. You will run the model to replicate the results reported by the developers, and use it to make predictions on the video dataset we provide. To evaluate the model's performance, you can randomly choose one of our videos, annotate the keypoints using an online tool (more instructions later), and compare your annotations with the model's predictions. You can try to tune the model to improve its performance. Search for the optimal set of hyperparameters for your model on the provided data.
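One widely used way to compare predicted keypoints with annotated ones is PCK (Percentage of Correct Keypoints): a keypoint counts as correct if it lies within a fraction `alpha` of some reference length from its annotation. The sketch below is one possible formulation; the choice of reference length (for example, torso or head size), the value of `alpha`, and all coordinates are assumptions for illustration, not values specified by the course.

```python
import math


def pck(predicted, annotated, reference_length, alpha=0.5):
    """Fraction of keypoints predicted within alpha * reference_length.

    predicted, annotated: lists of (x, y) tuples in the same joint order.
    reference_length: normalizing length in pixels (an assumption here;
    common choices are torso size, head size, or bounding-box size).
    """
    if len(predicted) != len(annotated) or not annotated:
        raise ValueError("keypoint lists must be non-empty and of equal length")
    threshold = alpha * reference_length
    correct = sum(
        1
        for (px, py), (ax, ay) in zip(predicted, annotated)
        if math.hypot(px - ax, py - ay) <= threshold
    )
    return correct / len(annotated)


# Made-up coordinates for four joints in one frame.
annotated = [(100, 100), (120, 150), (80, 150), (100, 220)]
predicted = [(102, 101), (125, 149), (80, 175), (100, 221)]
score = pck(predicted, annotated, reference_length=40, alpha=0.5)
print(f"PCK@0.5 = {score:.2f}")
```

Here three of the four predictions fall within 20 pixels of their annotations, so the score is 0.75. State the reference length and threshold you used whenever you report a PCK number, since the value is not comparable otherwise.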
You will use the results from the previous step to train a model that classifies whether a given exercise performed by the subject in the recording is correct or not. To do that, you can either develop a separate classification model or add classification layers on top of the model you chose in the previous step.
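If you develop a separate classifier, one option is to turn the per-frame keypoints into a small, fixed-length feature vector first. The sketch below computes a joint angle from three keypoints and summarizes it over a clip; the joint names, the frame format, and the choice of summary statistics are hypothetical and only illustrate the idea.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must not coincide with the joint")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def angle_features(frames, triplet):
    """Summarize one joint angle over a clip as (min, max, mean)."""
    first, middle, last = triplet
    angles = [joint_angle(f[first], f[middle], f[last]) for f in frames]
    return min(angles), max(angles), sum(angles) / len(angles)


# Hypothetical frames: each maps a joint name to an (x, y) keypoint.
frames = [
    {"hip": (0, 0), "knee": (0, 1), "ankle": (0, 2)},  # straight leg
    {"hip": (0, 0), "knee": (0, 1), "ankle": (1, 1)},  # knee bent at a right angle
]
features = angle_features(frames, ("hip", "knee", "ankle"))
print(features)
```

Features like these are invariant to where the subject stands in the frame, and they can be fed to any standard binary classifier. Feeding raw keypoint sequences to a temporal model is an equally valid alternative.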
More details on the report at the end of this document.
You select your own project. It needs to be computer-vision related. We suggest you complete the project following these steps:
You need to choose your project based on your available time (at least 8 hours per week) and skills. Feel free to talk to Professor Betke or the TFs if you need help. A good topic should be specific and doable in one month. You will submit a project proposal to the TFs by the end of the first week. You may present it during office hours or send it in a detailed, well-written email.
In week 2, you should start working on your project. Find the relevant papers or research works on similar computer vision problems. Try out available tools, code libraries, and datasets.
In week 3, you should start designing and implementing experiments based on the work of week 2. Your experiments should be thorough so that you will gain deep insights into the problem.
More details on the report at the end of this document.
A big portion of your grade on this project comes from your paper-style report.
You can use any template (e.g., AAAI or NIPS) to write your report. If you write your report on Overleaf, you can easily import one of the templates it provides. You can also find these templates simply by searching with a query like “NIPS template.”
When writing, you should follow a general paper structure, which means the following must be included: (1) an abstract and introduction explaining the idea and contribution of your project, (2) a discussion of related work, (3) a detailed description of your method, (4) a description of your experiments and their results in words, numbers, and figures, (5) a discussion of your results and observations, (6) conclusions and ideas for future work.
Your report does not have to be long, but should be complete.
Here is a breakdown of the grading criteria.
|Index||Weight||Option 1||Option 2||Option 3|
|1||10%||Your performance should beat the baseline (accuracy >= 60% for each class in both datasets).||The performance of your model. You earn extra credit for high PCK (Percentage of Correct Keypoints) numbers.||The project idea is interesting and challenging.|
|2||10%||Your presentation is clear.|
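The Option 1 baseline in the table is stated per class, so overall accuracy alone is not enough to check it. The sketch below reads "accuracy for each class" as the fraction of samples of that class that are predicted correctly. That reading is an assumption, so confirm the exact definition with the TFs; the labels and predictions are made up.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {cls: correct[cls] / total[cls] for cls in total}


# Made-up labels and predictions, for illustration only.
y_true = ["negative", "negative", "neutral", "neutral", "neutral", "positive", "positive"]
y_pred = ["negative", "neutral", "neutral", "neutral", "positive", "positive", "positive"]

accuracy = per_class_accuracy(y_true, y_pred)
meets_baseline = all(value >= 0.60 for value in accuracy.values())

for cls, value in sorted(accuracy.items()):
    print(f"{cls}: {value:.2f}")
print("meets baseline:", meets_baseline)
```

In this example the "negative" class reaches only 0.50, so the baseline is not met even though five of the seven predictions are correct.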
You should submit your code and report by the deadline indicated on our course website. Submission instructions will be announced at a later date.
If you are using a third-party model, do not include it in the submission, as the file can be very large. Instead, cite the model in your report.