Meeting ID: 838 7213 1099
Passcode: 668523
The meeting will be open during class time. Please feel free to ask questions in the chat. I will answer by direct message or give an explanation to the whole class.

Google Classroom
Class code: k2m4zzk

All resources are provided to class participants only. Do not distribute them to outside parties.
The password for all resources is cvdd2021.

Slot, day Contents
01 3rd, Mar 24 Introduction
TED ~ Fei-Fei Li (17:58)
Settings – Google Colaboratory | Anaconda
  Video – Introduction of OpenCV x Colab
  home.jpg for setting verification
Getting Started With Google Colab (Google-free)
  How to use Google Colaboratory (YouTube)
  Computer Vision – Instructional Exercise (Colab)
02 4th, Mar 24 Image preprocessing and feature extraction (Chapter 13)
Slides 13
Video – Chapter 13-1 (51:06)
Homework: Select one paper and post its title as a comment for Assignment 3.
03 1st, Mar 31 Information for Assessment 1
Video – Assignment1_Intro (2:09)
Video – Chapter 13-2 (67:08)
The Pinhole camera (Chapter 14)
Slides 14
Video – Chapter 14 (60:04)
04 2nd, Mar 31 Models for transformation (Chapter 15)
Slides 15
Video – Chapter 15 (48:36)
05 3rd, Apr 7 Multiple cameras (Chapter 16)
Slides 16
Video – Chapter 16 (52:02)
06 4th, Apr 7 Assessment 1: Stereo Matching (Due: Apr 18)
Image set – (35.5MB)
Video – Assignment1 (38:53, Partially overlapped with Chapter 16)
07 3rd, Apr 14 Models for shape (Chapter 17)
Slides 17
Video – Chapter 17 (54:32)
08 4th, Apr 14 Augmented reality (OpenCV 3.x Chapter 10)
Textbook OpenCV_3.x_10+4
Video – Assignment2-1 (33:10, Based on OpenCV_3.x_10)
Link – XML files for face parts detection
Video – Assignment2-2 (14:22)
09 3rd, Apr 21 Assessment 3: Literature review (1) (Due: Apr 21/28+29)
Video – Assignment3 (2:09)
10 4th, Apr 21 Assessment 3: Literature review (2) & Assessment 2 (Due: May 1)
11 3rd, Apr 28 Assessment 3: Literature review (3)
12 4th, Apr 28 Assessment 3: Literature review (extra), Assessment 2

* 1st: 8:00-10:00, 2nd: 10:10-12:10, 3rd: 13:00-15:00, 4th: 15:10-17:10
(+1:00 in JST, 1st from 9:00-)

I assume you can use Google Colaboratory as the environment for python-opencv implementation, although I am well aware of the situation in China. If it is not available to you, please use Anaconda or another environment. I hope other students will help those who have trouble.
If you can access Google tools, please follow the instructions for starting up Google Colaboratory.
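Whichever environment you choose, it may help to confirm that the required packages can be imported before starting the assignments. The following is a minimal sketch, not an official course script: the function name `check_environment` and the package list are assumptions made for illustration.

```python
import importlib
import sys


def check_environment(packages=("numpy", "cv2")):
    """Return {package name: version string, or None if it cannot be imported}.

    The default package list is an assumption for this course
    (python-opencv is imported under the name "cv2").
    """
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing or its native libraries failed to load.
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


versions = check_environment()
print("Python", sys.version.split()[0])
for name, version in versions.items():
    print(name, version if version is not None else "NOT INSTALLED")
```

If both packages are reported with a version, a natural next step is to load home.jpg with `cv2.imread` and check that the result is not `None`.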

Literature review

You can find each paper by entering its full title in Google Scholar.

金宸极 (Chenji Jin)
SegGCN: Efficient 3D Point Cloud Segmentation With Fuzzy Spherical Kernel
xin wu
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Liu Weijian
Supervised Raw Video Denoising with a Benchmark Dataset on Dynamic Scenes
wenbin wu
MaskGAN: Towards Diverse and Interactive Facial Image Manipulation
Yuhan Wang
On Positive-Unlabeled Classification in GAN
赵江江 (Jiangjiang Zhao)
Syn2Real Transfer Learning for Image Deraining using Gaussian Processes

Transformer Interpretability Beyond Attention Visualization (CVPR2020 paper??)
PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
Zhenfei Wang
Dynamic Traffic Modeling From Overhead Imagery
Qingqi Huang
VIBE: Video Inference for Human Body Pose and Shape Estimation
Wang Hanxiang
Lightweight Multi-View 3D Pose Estimation Through Camera-Disentangled Representation
徐然 (Ran Xu)
G-TAD: Sub-Graph Localization for Temporal Action Detection
dingjie wu
Referring Image Segmentation via Cross-Modal Progressive Comprehension

夏念璋 (Nianzhang Xia)
Triple-GAN: Progressive Face Aging with Triple Translation Loss (CVPR Workshop?)
mei chen
Bringing Old Photos Back to Life
shaocheng xiang
Learning to Autofocus
Zhang Riheng
Interpretable and Accurate Fine-grained Recognition via Region Grouping
Dongjun Liu
Visual-Textual Capsule Routing for Text-Based Video Segmentation
毛锦涛 (Jintao Mao)
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation (CVPR2020 paper?? >> Accepted in CVPR2021)


You can ask any questions on Slack #vip.
If you have trouble with implementation, ask the Teaching Assistant (Mr. Yu Zhengsheng) first.

Assignment 1

Q. What’s wrong with this disparity result?
A. I can see the outline of the rabbits, so the matching itself seems to work. Is the range of values correct? The values look binary (0 or 255). This can happen when the image is displayed with the wrong data type, for example when a float64 image is displayed as if it were an int32 image.
cf. How to use `cv2.imshow` correctly for the float image returned by `cv2.distanceTransform`?
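One common way to avoid this problem is to rescale the floating-point disparity map to the 8-bit range before displaying it. The sketch below assumes the disparity is a NumPy float array. The helper name `to_displayable` and the sample values are hypothetical, and this is only one possible fix, not necessarily the one intended for the assignment.

```python
import numpy as np


def to_displayable(disparity):
    """Linearly rescale a float disparity map to uint8 in [0, 255]."""
    d = np.asarray(disparity, dtype=np.float64)
    valid = np.isfinite(d)  # ignore NaN/inf, e.g. unmatched pixels
    if not valid.any():
        return np.zeros(d.shape, dtype=np.uint8)
    lo, hi = d[valid].min(), d[valid].max()
    if hi == lo:
        # A constant image has no contrast to stretch.
        return np.zeros(d.shape, dtype=np.uint8)
    scaled = (d - lo) / (hi - lo)          # now in [0, 1]
    scaled = np.where(valid, scaled, 0.0)  # show invalid pixels as black
    return np.round(scaled * 255).astype(np.uint8)


# Hypothetical disparity values, only to illustrate the scaling.
disparity = np.array([[0.0, 16.0], [32.0, np.nan], [48.0, 64.0]])
vis = to_displayable(disparity)
print(vis.dtype, vis.min(), vis.max())
```

The resulting `vis` array can be passed to `cv2.imshow` as an ordinary 8-bit grayscale image. `cv2.normalize` with `cv2.NORM_MINMAX` is an alternative that performs a similar min-max rescaling.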


This course offers an opportunity to learn 2D and 3D computer vision and computer graphics. We do so by combining fundamentals such as image processing, computational geometry, machine learning, numerical computation, and linear algebra.

Methods for tracking, recognition, and other functions on 2D images are discussed first. More specifically, we cover feature detection, image descriptors, and machine learning algorithms for image recognition and classification. OpenCV, the de facto standard library for computer vision, will help students understand and prototype the methods.

Camera calibration for 3D depth estimation is the main topic of the next part. Special camera models and models of human vision are also covered.
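As a small illustration of why calibration matters for depth estimation: for a rectified stereo pair under the pinhole camera model, depth and disparity are related by Z = f · B / d, where f is the focal length in pixels, B is the baseline between the two cameras, and d is the disparity in pixels. The sketch below assumes this rectified setting. The function name and the numeric values are hypothetical and are not taken from the course material.

```python
import numpy as np


def depth_from_disparity(disparity, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (meters) for a rectified stereo pair.

    Uses Z = f * B / d. Pixels with non-positive disparity are treated as
    "no match" and assigned infinite depth.
    """
    d = np.asarray(disparity, dtype=np.float64)
    depth = np.full(d.shape, np.inf)
    positive = d > 0
    depth[positive] = focal_px * baseline_m / d[positive]
    return depth


# Hypothetical calibration values: f = 700 px, B = 0.5 m.
disparity = np.array([[35.0, 70.0], [0.0, 7.0]])
depth = depth_from_disparity(disparity, focal_px=700.0, baseline_m=0.5)
print(depth)
```

Both f and B come from calibration, so an error in either one scales every estimated depth by the same factor.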

In the last part, students study and explain several state-of-the-art algorithms for computer vision and computer graphics. By the end, students should be able to research newly published computer vision and computer graphics methods on their own, implement them, and benefit from them.

Students must have a command of linear algebra and calculus, programming skills (e.g., in Python, C++, or MATLAB), and an understanding of important algorithms and data structures. Students should also know the basics of image processing (e.g., image filtering) and machine learning (e.g., clustering, classifiers such as Support Vector Machines, dimensionality reduction, or regression).

Grading Policy

Technical report that involves programming (75%+5%)
    Stereo matching (Assignment 1)
    Video face tracking (Assignment 2)

Literature review and presentation (15%+5%) (Assignment 3)

* No classwork (10%) for online class


Computer Vision: Models, Learning and Inference
Simon J.D. Prince – Chapters 13-20

OpenCV 3.x with Python By Example
Gabriel Garrido and Prateek Joshi

OpenCV-Python Tutorials

Learning OpenCV 3
Adrian Kaehler and Gary Bradski


Slack #vip in HDU-UY-DD-2021
Masahiro Toyoura (