Resource
Zoom:
https://us02web.zoom.us/j/83872131099?pwd=dEovMlhna0ZSSlBHRkFaenJHTm9Qdz09
Meeting ID: 838 7213 1099
Passcode: 668523
The Zoom room is open during class time. Please feel free to ask questions in the chat. I will answer by direct message or explain the answer to the whole class.
Report Submission:
UY Moodle – HDU-UY-DD Visual Information Processing
All resources are provided to class participants only. Don’t distribute them to outside parties.
Password for all resources is cvdd.
No. | Slot, day | Contents |
01 | 4th, Feb 22 | Introduction; Slides (PDF); TED ~ Fei-Fei Li (17:58); Settings – Google Colaboratory / Anaconda (PDF); Video – Introduction of OpenCV x Colab; sift.py, home.jpg for setting verification; Getting Started With Google Colab (Google-free); How to use Google Colaboratory (YouTube); Computer Vision – Instructional Exercise (Colab) |
02 | 1st, Feb 23 | Image preprocessing and feature extraction (Chapter 13); Slides 13 (PDF); Video – Chapter 13-1 (51:06); Homework: Select one paper and post its title as a comment for Assignment 3. |
03 | 3rd, Feb 28 | Video – Chapter 13-2 (67:08); The pinhole camera (Chapter 14); Slides 14 (PDF); Video – Chapter 14 (60:04) |
04 | 4th, Feb 28 | Models for transformation (Chapter 15); Slides 15 (PDF); Video – Chapter 15 (48:36) |
05 | 4th, Mar 1 | Multiple cameras (Chapter 16); Slides 16 (PDF); Video – Chapter 16 (52:02) |
06 | 1st, Mar 2 | Assignment 1: Stereo Matching (PDF); Image set – img.zip (35.5MB); Video – Assignment1 (38:53, partially overlaps with Chapter 16) |
07 | 3rd, Mar 7 | Models for shape (Chapter 17); Slides 17 (PDF); Video – Chapter 17 (54:32) |
08 | 4th, Mar 7 | Assignment 2: Deep monocular depth estimation (PDF); Video – Assignment2 (7:50) |
09 | 4th, Mar 8 | Assignment 3: Literature review (1) (PDF); Video – Assignment3 (3:51) |
10 | 1st, Mar 9 | Assignment 3: Literature review (2) & Assignment 2 |
11 | 3rd, Mar 14 | Assignment 3: Literature review (3) |
12 | 4th, Mar 14 | Assignment 3: Literature review (extra), Assignment 2 |
* 1st: 8:00-10:00, 2nd: 10:10-12:10, 3rd: 13:00-15:00, 4th: 15:10-17:10
(Add 1:00 for JST; the 1st slot starts at 9:00 JST.)
I assume you can use Google Colaboratory as the environment for python-opencv implementation, although I am well aware of the access situation in China. If it is not available to you, please use Anaconda or another environment. I hope classmates will help those who have trouble with the setup.
If you can access Google tools, please follow the instructions for starting up Google Colaboratory.
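Whichever environment you choose, a quick import check can confirm that it is ready before you try the course scripts. The following is a minimal sketch, not the course's sift.py: the helper name `check_environment` and the default package list (`cv2`, `numpy`) are assumptions made for illustration.

```python
import importlib
import sys


def check_environment(modules=("cv2", "numpy")):
    """Return {module name: version string, or None if the import fails}."""
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing or cannot be loaded in this environment.
            report[name] = None
        else:
            # Some modules do not expose __version__; fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in check_environment().items():
        print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

If a package is reported as not installed, install it in your environment (for example with pip or conda) before running the course scripts.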
Literature review
To find a paper, enter its full title in Google Scholar.
(1)
Yan Kang(颜康)
Learning Graph Embeddings for Compositional Zero-shot Learning
Yan Jinzhe(燕 劲哲)
Point Cloud Upsampling via Disentangled Refinement
Zhong Weizhen(钟 维真)
Closed-Form Factorization of Latent Semantics in GANs
Chen Zhoujie(陈 洲杰)
PointGuard: Provably Robust 3D Point Cloud Classification
Gong Aiyue(龚 嫒玥)
Composing Photos Like a Photographer
Wang Xiyuan(王希源)
Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling
Tan Xuan
Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation
(2)
Zhang Xinyuan
Track to Detect and Segment: An Online Multi-Object Tracker
Xi Wenlong(席 文龙)
Pose Recognition with Cascade Transformers
Wu Jingtao(吴景涛)
End-to-End Video Instance Segmentation with Transformers
Zhang Xi
Transformer Interpretability Beyond Attention Visualization
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers (ICCV2021)
Zhao Haifeng
Improving Sign Language Translation With Monolingual Data by Sign Back-Translation
Chen Zhe(陈 哲)
Depth from Camera Motion and Object Detection
Zhang Yixin
VirFace: Enhancing Face Recognition via Unlabeled Shallow Data
(3)
Zhou Haiqiang(周 海强)
CutPaste: Self-Supervised Learning for Anomaly Detection and Localization
Jiang Huasheng (蒋华胜)
Real-Time High-Resolution Background Matting
Zheng Haohao (郑 浩浩)
Distilling Knowledge via Knowledge Review
Bai Yizhuo(白 依卓)
D-NeRF: Neural Radiance Fields for Dynamic Scenes
Zhang Boyang
Body Meshes as Points
Su Xianhua(苏 先华)
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
Q&A
You can ask any questions on Slack #vip.
If you have trouble with implementation, ask the Teaching Assistant (Prof. Yu Zhengsheng; yuzhengsheng@hdu.edu.cn) first.
Syllabus
This course offers the opportunity to learn 2D and 3D computer vision and computer graphics. We do so by combining fundamentals such as image processing, computational geometry, machine learning, numerical computation, and linear algebra, among others.
The first part covers methods for tracking, recognition, and other functions on 2D images. Specifically, it covers feature detection and image descriptors, along with machine learning algorithms for image recognition and classification. OpenCV, the de facto standard library for computer vision, helps students understand and prototype these methods.
The next part focuses on camera calibration for 3D depth estimation. Special camera models and the human vision model are also covered.
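To illustrate why calibrated camera parameters matter for depth estimation, the sketch below uses the ideal pinhole model and the rectified-stereo relation Z = f B / d. The function names are hypothetical, and the focal length, principal point, and baseline are made-up values; in practice they come from camera calibration.

```python
def project_point(point_3d, focal_px, principal_point):
    """Project a 3D point (camera coordinates) to pixel coordinates
    with an ideal pinhole camera (no lens distortion, square pixels)."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    cx, cy = principal_point
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Assumed values for illustration only.
    focal = 800.0              # focal length in pixels
    center = (320.0, 240.0)    # principal point in pixels
    baseline = 0.5             # distance between the two cameras in meters

    point = (0.5, 0.0, 4.0)    # 3D point in the left camera frame (meters)
    left = project_point(point, focal, center)
    # The right camera is shifted by the baseline along the x axis.
    right = project_point((point[0] - baseline, point[1], point[2]), focal, center)

    disparity = left[0] - right[0]
    print("left pixel:", left)
    print("right pixel:", right)
    print("recovered depth:", depth_from_disparity(disparity, focal, baseline))
```

The recovered depth matches the z coordinate of the original point only because the same focal length and baseline are used for projection and reconstruction, which is why those parameters must be calibrated accurately.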
In the last part, students study and explain several state-of-the-art algorithms for computer vision and computer graphics. By the end, students should be able to research newly published computer vision and computer graphics methods on their own, implement them, and benefit from them.
Students must have a command of linear algebra and calculus, programming skills (e.g., in Python, C++, or MATLAB), and an understanding of important algorithms and data structures. Students should also know the basics of image processing (e.g., image filtering) and machine learning (e.g., clustering, classifiers such as Support Vector Machines, dimensionality reduction, and regression).
Grading Policy
Technical report that involves programming (75%+5%)
Stereo matching (Assignment 1)
Deep monocular depth estimation (Assignment 2)
Literature review and presentation (15%+5%) (Assignment 3)
* Classwork (10%) is not graded for the online class.
Textbooks
Computer Vision: Models, Learning and Inference
Simon J.D. Prince – Chapters 13-20
http://www.computervisionmodels.com/
https://www.amazon.cn/dp/B073FPHJ99/
OpenCV-Python Tutorials
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_tutorials.html
Reference:
OpenCV 3.x with Python By Example
Gabriel Garrido and Prateek Joshi
https://www.packtpub.com/application-development/opencv-3x-python-example-second-edition
https://github.com/PacktPublishing/OpenCV-3-x-with-Python-By-Example
Learning OpenCV 3
Adrian Kaehler and Gary Bradski
https://www.oreilly.com/library/view/learning-opencv-3/9781491937983/
Contact
Slack #vip in HDU-UY-DD-2022
Masahiro Toyoura (mtoyoura@yamanashi.ac.jp)