1. Problem Description:
We want to implement a 3D reconstruction-from-motion project that takes video from two cameras with a fixed relative position and orientation as input, outputs features with depth values, and finally reconstructs the 3D scene. If possible, we also want to estimate the camera ego-motion and the motion field in the video from the stereo images we capture.
2. Related Work:
a) Feature detection and description.
b) Depth recognition from two images.
c) 3D Reconstruction from features with depth.
d) Ego-motion estimation from stereo images.
e) Motion field estimation from stereo images.
3. Milestones:
The time for our final project is shorter than we expected, so we created three milestones (stages) for the whole project. Stage I and Stage II are what we must implement during this semester; Stage III is an extra stage that we may only implement partially. We will continue to polish the project over the winter break.
Stage I: (Preparation Stage)
a) Background reading: we are going to implement Hernán Badino and Takeo Kanade's paper "A Head-Wearable Short-Baseline Stereo System for the Simultaneous Estimation of Structure and Motion" [1], which describes a solid method for 3D reconstruction, camera-movement estimation, and motion-field estimation from stereo images, so we need to read it carefully until we fully understand it.
b) Hardware preparation: for this project we need at least two cameras with a fixed relative position and orientation (like human eyes). We plan to use two web-cams and duct-tape them to a plate. This is probably the simplest way to get the hardware, but we still need to experiment to see whether the two cameras must have the same focal length, resolution, and exposure.
c) Feature descriptor and feature matching: at this stage we can simply use corner features, run ANMS, extract the 40x40 neighborhood around each corner, blur it (Gaussian or geometric), subsample it to 8x8, and use that as the feature descriptor for an image. We can also use RANSAC to reject bad matches between the two images. These pieces were already done in our Project 3. We will leave the feat_desc and feat_match functions as virtual, so we can swap in other feature descriptors later; a rough sketch of this descriptor is given below.
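As a rough illustration (not our final code), the following Python/OpenCV sketch shows the kind of descriptor described in (c): corners are detected, the 40x40 window around each corner is blurred and subsampled to 8x8, and the flattened patch is normalized. The file name left.png and all parameter values are placeholder assumptions, and goodFeaturesToTrack's minimum-distance spacing stands in for ANMS here.

import cv2
import numpy as np

def feat_desc(gray, corners, patch=40, out=8):
    """Cut a 40x40 window around each corner, blur it, and
    subsample it to an 8x8 descriptor (flattened, normalized)."""
    half = patch // 2
    descs, kept = [], []
    for (x, y) in corners:
        x, y = int(round(x)), int(round(y))
        if x - half < 0 or y - half < 0 or \
           x + half > gray.shape[1] or y + half > gray.shape[0]:
            continue  # skip corners too close to the image border
        win = gray[y - half:y + half, x - half:x + half].astype(np.float32)
        win = cv2.GaussianBlur(win, (5, 5), 2.0)
        sub = cv2.resize(win, (out, out), interpolation=cv2.INTER_AREA).ravel()
        sub = (sub - sub.mean()) / (sub.std() + 1e-8)  # bias/gain normalization
        descs.append(sub)
        kept.append((x, y))
    return np.array(kept), np.array(descs)

# Corner detection; the minDistance spacing roughly plays the role of ANMS.
img = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
pts = cv2.goodFeaturesToTrack(img, maxCorners=500, qualityLevel=0.01, minDistance=10)
kept, descs = feat_desc(img, pts.reshape(-1, 2))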
Stage II: (Working Stage)
All of the work in this stage revolves around feature correspondence.
a) Calculate the depth map from the two images taken by the two "eye" cameras: this is the first step of the project. We need to extract feature descriptors from the two images, find the correspondences, and compute the depth map from the binocular disparity (see the sketch after this list).
b) Pure translational movement test: link all of the images with depth maps captured under pure translational motion into a single 3D scene.
c) Pure rotational movement test: link all of the images with depth maps captured under pure rotational motion into a single 3D scene.
d) Reconstruct the 3D scene from a sequence of images with depth maps.
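To make step (a) and (d) concrete, here is a minimal Python/OpenCV sketch: a dense disparity map from block matching, depth via Z = f * B / d, and back-projection of the valid pixels into a 3D point cloud. The focal length, baseline, principal point (assumed at the image center), and file names are placeholder assumptions; our actual pipeline would use the sparse feature correspondences from Stage I rather than dense block matching.

import cv2
import numpy as np

# Assumed calibration values; in practice they come from stereo calibration.
FOCAL_PX = 700.0      # focal length in pixels
BASELINE_M = 0.06     # distance between the two web-cams in meters

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching gives a dense disparity map (fixed-point, scaled by 16).
sbm = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = sbm.compute(left, right).astype(np.float32) / 16.0

# Depth from binocular disparity: Z = f * B / d  (valid where d > 0).
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = FOCAL_PX * BASELINE_M / disparity[valid]

# Back-project valid pixels into 3D points in the left-camera frame.
cx, cy = left.shape[1] / 2.0, left.shape[0] / 2.0
ys, xs = np.nonzero(valid)
Z = depth[valid]
X = (xs - cx) * Z / FOCAL_PX
Y = (ys - cy) * Z / FOCAL_PX
points = np.stack([X, Y, Z], axis=1)  # N x 3 point cloud for the scene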
Stage III: (Challenging Stage)
a) Ego-motion estimation: because we do not have any inertial sensors, we need to estimate the camera movement from the visual input alone. (a1) We will try to find the camera position and orientation using the data from Stage II (d); (a2) we can then interpolate the camera transformation between the key-frames obtained in (a1).
b) Motion-field estimation: (b1) directly apply an optical-flow algorithm to see whether we can recover the motion field while keeping the camera still; (b2) detect the motion field under camera movement, which requires knowing the camera velocity and all of the object velocities in 3D. A sketch of both ideas follows this list.
c) Acceleration: the first step in accelerating this project is migrating the whole framework to C++ with OpenCV. The second step is applying GPU acceleration to parallelize the computation in some key steps.
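A minimal sketch of ideas (a1) and (b1), assuming placeholder camera intrinsics and frame file names (Lucas-Kanade tracking stands in here for our own feature matcher): ego-motion is estimated by computing the essential matrix with RANSAC and decomposing it into a rotation and a translation (known only up to scale, which the stereo depth from Stage II would fix), and the motion field for a still camera is approximated with dense Farneback optical flow.

import cv2
import numpy as np

# Assumed intrinsics of the left camera (placeholder values).
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0,   0.0,   1.0]])

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Track corners from the previous frame into the current one.
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01, minDistance=10)
p1, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None)
good0, good1 = p0[status.ravel() == 1], p1[status.ravel() == 1]

# Ego-motion (a1): essential matrix with RANSAC, then decompose into R, t.
# t is recovered only up to scale; the stereo depth from Stage II fixes the scale.
E, inliers = cv2.findEssentialMat(good0, good1, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, good0, good1, K, mask=inliers)

# Motion field (b1): dense optical flow with the camera held still.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)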
4. Timeline:
Stage I should be finished within a week, ending before Thursday, December 1st, 2011.
Stage II will take the majority of our time; it will last for at least two weeks and should be mostly done before Friday, December 16th, 2011.
For now, we are still unsure of what we can deliver by the final deadline, so we have divided the stages into many sub-stages. Our best hope is to finish all of the work in Stages I, II, and III (a) before the due date, and we would like to polish the work over the winter break.