This gives us the distance zs, which is the z-axis coordinate in the camera-based real-world coordinate system. From the zs values, we can calculate the x-axis and y-axis coordinates with the following code:
% calculate x and y values from the pixel coordinates and depth zs
xs = -(xis-cx) ./ fx .* zs;
ys = -(yis-cy) ./ fy .* zs;
Here 'fx' and 'fy' are the focal lengths in the x-direction and y-direction, respectively. 'cx' and 'cy' are the coordinates of the image center (principal point) along each axis, in pixels. 'xis' and 'yis' are the pixel coordinates of the feature points. 'xs', 'ys' and 'zs' are the resulting world coordinates.
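As a quick sanity check of these formulas, here is a minimal sketch with made-up numbers. The intrinsic values (fx, fy, cx, cy), the two pixel positions and the depths below are all hypothetical, chosen only for illustration, and are not the values used in our experiment. A feature lying exactly at the image center should map to xs = 0 and ys = 0, and the minus signs mean that the world x and y axes point opposite to the image axes.

```matlab
% Hypothetical intrinsics (illustration only, not our calibration values)
fx = 500; fy = 500;      % focal lengths, assumed to be in pixels
cx = 320; cy = 240;      % image center in pixels

% Two made-up feature points: one at the image center, one offset from it
xis = [320 420];         % pixel x coordinates
yis = [240 140];         % pixel y coordinates
zs  = [1.5 2.0];         % depths along the camera z-axis

% Same back-projection as above, applied element-wise
xs = -(xis-cx) ./ fx .* zs;
ys = -(yis-cy) ./ fy .* zs;

% Stack the result as one 3D point per row: [x y z]
pts3d = [xs' ys' zs'];
disp(pts3d);
```

The first point comes out on the optical axis (x = 0, y = 0, z = 1.5), and the second at x = -0.4, y = 0.4, z = 2, which matches computing -(420-320)/500*2 and -(140-240)/500*2 by hand.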
Thus, we obtain the 3D coordinates of the feature points. In the next step, we use RANSAC to estimate the 3D homography transformation.







