Meeting ID: 838 7213 1099
The meeting will be open during class time. Please feel free to ask questions in the chat. I will answer by direct message or give an explanation to all members.
All resources are provided to class participants only. Don’t distribute them to outside parties.
The password for all resources is cvdd2021.
|01||3rd, Mar 24||
TED ~ Fei-Fei Li (17:58)
Settings – Google Colaboratory | Anaconda
Video – Introduction of OpenCV x Colab
home.jpg for setting verification
Getting Started With Google Colab (Google-free)
How to use Google Colaboratory (YouTube)
Computer Vision – Instructional Exercise (Colab)
|02||4th, Mar 24||
Image preprocessing and feature extraction (Chapter 13)
Video – Chapter 13-1 (51:06)
Homework: Select one paper and post its title as a comment for Assignment 3.
|03||1st, Mar 31||
Information for Assessment 1
Video – Assignment1_Intro (2:09)
Video – Chapter 13-2 (67:08)
The Pinhole camera (Chapter 14)
Video – Chapter 14 (60:04)
|04||2nd, Mar 31||
Models for transformation (Chapter 15)
Video – Chapter 15 (48:36)
|05||3rd, Apr 7||
Multiple cameras (Chapter 16)
Video – Chapter 16 (52:02)
|06||4th, Apr 7||
Assessment 1: Stereo Matching (Due: Apr 18)
Image set – img.zip (35.5MB)
Video – Assignment1 (38:53, Partially overlapped with Chapter 16)
|07||3rd, Apr 14||
Models for shape (Chapter 17)
Video – Chapter 17 (54:32)
|08||4th, Apr 14||
Augmented reality (OpenCV 3.x Chapter 10)
Video – Assignment2-1 (33:10, Based on OpenCV_3.x_10)
Link – XML files for face parts detection
Video – Assignment2-2 (14:22)
|09||3rd, Apr 21||
Assessment 3: Literature review (1) (Due: Apr 21/28+29)
Video – Assignment3 (2:09)
|10||4th, Apr 21||Assessment 3: Literature review (2) & Assessment 2 (Due: May 1)|
|11||3rd, Apr 28||Assessment 3: Literature review (3)|
|12||4th, Apr 28||Assessment 3: Literature review (extra), Assessment 2|
* Periods: 1st 8:00-10:00, 2nd 10:10-12:10, 3rd 13:00-15:00, 4th 15:10-17:10
(Add one hour for JST; the 1st period starts at 9:00 JST.)
I assume you can use Google Colaboratory as the environment for python-opencv implementation, although I am well aware of the access situation in China. If it is not available to you, please use Anaconda or another environment. I hope students who have a working setup will help those who do not.
If you can access Google tools, please follow the instructions for starting up Google Colaboratory.
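Whichever environment you use, a quick import check can confirm that it is ready before the first exercise. The following is only a minimal sketch: `check_environment` is a hypothetical helper, not part of the course materials, and the package list (`numpy`, plus `cv2` as the import name of python-opencv) is an assumption about what the exercises need.

```python
import importlib


def check_environment(packages=("numpy", "cv2")):
    """Try to import each package and report its version.

    Returns a dict mapping package name to a version string,
    "unknown" if the module has no __version__ attribute,
    or None if the import failed.
    """
    report = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


# Print one line per package so a missing one is easy to spot.
for package, version in check_environment().items():
    status = "missing" if version is None else "version " + str(version)
    print(package + ": " + status)
```

If a package is reported as missing, install it in your environment and run the check again before continuing.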
To find a paper, enter its full title in Google Scholar.
金宸极 (Chenji Jin)
SegGCN: Efficient 3D Point Cloud Segmentation With Fuzzy Spherical Kernel
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Supervised Raw Video Denoising with a Benchmark Dataset on Dynamic Scenes
MaskGAN: Towards Diverse and Interactive Facial Image Manipulation
On Positive-Unlabeled Classification in GAN
赵江江 (Jiangjiang Zhao)
Syn2Real Transfer Learning for Image Deraining using Gaussian Processes
Transformer Interpretability Beyond Attention Visualization (CVPR2020 paper??)
PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
Dynamic Traffic Modeling From Overhead Imagery
VIBE: Video Inference for Human Body Pose and Shape Estimation
Lightweight Multi-View 3D Pose Estimation Through Camera-Disentangled Representation
徐然 (Ran Xu)
G-TAD: Sub-Graph Localization for Temporal Action Detection
Referring Image Segmentation via Cross-Modal Progressive Comprehension
夏念璋 (Nianzhang Xia)
Triple-GAN: Progressive Face Aging with Triple Translation Loss (CVPR Workshop?)
Bringing Old Photos Back to Life
Learning to Autofocus
Interpretable and Accurate Fine-grained Recognition via Region Grouping
Visual-Textual Capsule Routing for Text-Based Video Segmentation
毛锦涛 (Jintao Mao)
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation (CVPR2020 paper?? >> Accepted in CVPR2021)
You can ask any questions on Slack #vip.
If you have trouble with implementation, ask the Teaching Assistant (Mr. Yu Zhengsheng; email@example.com) first.
Q. What’s wrong with this disparity result?
A. I can see the outline of the rabbits. Is the range of values correct? The values look binary (0 or 255). This can happen, for example, when a float64 image is displayed as if it were an integer image.
cf) How to use `cv2.imshow` correctly for the float image returned by `cv2.distanceTransform`?
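One common way to avoid this kind of binary-looking result is to rescale the float image to 8-bit before displaying it. According to the OpenCV documentation, `cv2.imshow` multiplies floating-point pixel values by 255, so any value above 1 is shown as white. The sketch below assumes NumPy is available and that the disparity map is a float array; the function name `to_displayable_uint8` and the sample values are hypothetical and serve only as an illustration.

```python
import numpy as np


def to_displayable_uint8(image):
    """Linearly rescale a numeric image to uint8 in [0, 255] for display.

    Non-finite pixels (NaN, inf) are mapped to 0. A constant image
    becomes all zeros because there is no range to stretch.
    """
    data = np.asarray(image, dtype=np.float64)
    finite = np.isfinite(data)
    if not finite.any():
        return np.zeros(data.shape, dtype=np.uint8)

    lo = data[finite].min()
    hi = data[finite].max()
    if hi == lo:
        return np.zeros(data.shape, dtype=np.uint8)

    # Replace non-finite pixels with the minimum, then stretch to [0, 1].
    scaled = (np.where(finite, data, lo) - lo) / (hi - lo)
    return np.round(scaled * 255.0).astype(np.uint8)


# Hypothetical float64 disparity values, including one invalid pixel.
disparity = np.array([[0.0, 16.0, 32.0],
                      [48.0, 64.0, np.nan]])
view = to_displayable_uint8(disparity)
print(view.dtype, view.min(), view.max())
```

The resulting `view` array can be passed to `cv2.imshow` or saved with `cv2.imwrite`. OpenCV's `cv2.normalize` with `cv2.NORM_MINMAX` is an alternative for the same rescaling.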
This course offers an opportunity to learn 2D and 3D computer vision and computer graphics by combining fundamentals such as image processing, computational geometry, machine learning, numerical computation, and linear algebra.
The first part covers methods for tracking, recognition, and other functions on 2D images. Specifically, it discusses feature detection and image descriptors, and machine learning algorithms for image recognition and classification. OpenCV, the de facto standard library for computer vision, will help students understand and prototype the methods.
Camera calibration for 3D depth estimation is the main topic of the next part. Special camera models and the human vision model are also covered.
In the last part, students study and explain several state-of-the-art algorithms for computer vision and computer graphics. Students should become able to investigate newly available computer vision and computer graphics methods on their own, implement them, and benefit from them.
Students must have a command of linear algebra and calculus, programming skills (e.g., in Python, C++, or MATLAB), and an understanding of important algorithms and data structures. They should also know the basics of image processing techniques (e.g., image filtering) and machine learning (e.g., clustering, classifiers such as Support Vector Machines, dimensionality reduction, and regression).
Technical report that involves programming (75%+5%)
Stereo matching (Assignment 1)
Video face tracking (Assignment 2)
Literature review and presentation (15%+5%) (Assignment 3)
* No classwork (10%) for online class
OpenCV 3.x with Python By Example
Gabriel Garrido and Prateek Joshi
Learning OpenCV 3
Adrian Kaehler and Gary Bradski
Slack #vip in HDU-UY-DD-2021
Masahiro Toyoura (firstname.lastname@example.org)