Resolved: Structured-light 3D scanner – depth map from pixel correspondence

Question:

I am trying to create a structured-light 3D scanner.

Camera calibration


The camera calibration is a copy of the official OpenCV tutorial. As a result, I have the camera's intrinsic parameters (camera matrix).
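
For reference, a minimal sketch of that tutorial's calibration loop, assuming a 9×6 inner-corner chessboard and a hypothetical calib_images/ folder:

import glob
import cv2
import numpy as np

pattern_size = (9, 6)  # assumed inner-corner count of the chessboard
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for fname in glob.glob('calib_images/*.png'):  # hypothetical path
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

ret, camera_matrix, cam_dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)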

Projector calibration


The projector calibration may not be correct, but the process was: the projector shows a chessboard pattern and the camera takes photos of it from different angles. The images are undistorted with cv.undistort using the camera parameters, and the resulting images are then used for calibration following the official OpenCV tutorial. As a result, I have the projector's intrinsic parameters (projector matrix).
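
The undistortion step described above would look roughly like this, reusing camera_matrix and cam_dist from the camera calibration sketch (the image path is hypothetical):

import cv2

img = cv2.imread('projected_chessboard.png')  # a photo of the projected chessboard
undistorted = cv2.undistort(img, camera_matrix, cam_dist)  # remove camera lens distortion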

Rotation and Translation


From cv.calibrateCamera I get rotation and translation vectors, but there are as many vectors as there are images, and I think these are not the right ones because I moved the camera and the projector during calibration. My new idea is to project a chessboard onto the scanning background and perform calibration there, so that I get a single rotation vector and translation vector. I don't know whether that is the correct way.
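
If you try that idea, the pose of a chessboard lying on the scanning background can be recovered per view with cv2.solvePnP; a minimal sketch, assuming gray is an undistorted camera view of the board and objp / pattern_size are the board's 3D corner coordinates and inner-corner count from the calibration sketch above:

import cv2

found, corners = cv2.findChessboardCorners(gray, pattern_size)
# rvec/tvec give the board's (i.e. world) pose relative to the camera
ret, rvec, tvec = cv2.solvePnP(objp, corners, camera_matrix, cam_dist)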

Scanning


The scanning process is:
generate patterns -> undistort the patterns with the projector matrix -> project each pattern and take a photo with the camera -> undistort the captured photos with the camera matrix
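
For the pattern-generation step, the structured_light module from opencv-contrib can produce the Gray code images; a minimal sketch, assuming a 1280×720 projector (the Python binding name varies slightly across OpenCV versions):

import cv2

proj_w, proj_h = 1280, 720  # assumed projector resolution
graycode = cv2.structured_light_GrayCodePattern.create(proj_w, proj_h)
ret, patterns = graycode.generate()  # list of pattern images to project one by one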

Camera-projector pixel map


I use the Gray code pattern and, with cv.graycode.getProjPixel, I obtain the pixel mapping between camera and projector. My projector's resolution is not very high, so the last (finest) patterns are not very readable. I will create a custom function that generates the mapping without the last patterns.
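
A minimal sketch of how that mapping is typically built, assuming captured_patterns is the list of undistorted camera photos in the same order as the projected patterns, and graycode is the GrayCodePattern object used to generate them:

import numpy as np

cam_h, cam_w = captured_patterns[0].shape[:2]
cam2proj = np.full((cam_h, cam_w, 2), -1, dtype=np.int32)  # -1 marks undecodable pixels
for y in range(cam_h):
    for x in range(cam_w):
        err, proj_pix = graycode.getProjPixel(captured_patterns, x, y)
        if not err:  # err is True when the pixel cannot be decoded
            cam2proj[y, x] = proj_pix  # (projector x, projector y)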

Problem


I don’t know how to get the depth map (Z) from all this information. My confusion comes from the fact that there are three coordinate systems: camera, projector, and world.
How do I find Z in code? Can I get Z directly from the pixel mapping between image and pattern?
Information that I have:
  • p(x, y, 1) = R*q(x, y, z) + T, where p is the image point, q is the real-world point (maybe), and R and T are the rotation and translation vectors. How do I find R and T? (See the sketch after this list.)
  • Z = B*f/(x - x'), where Z is the depth coordinate, B is the baseline (the distance between camera and projector; I can measure it by hand, but maybe that is not the right way), f is the focal length, and (x - x') is the disparity between the camera pixel and the projector pixel. I don’t know how to get the baseline. Maybe it is the translation vector?
  • I tried to take four meaningful points, pass them to cv.getPerspectiveTransform, and use the result in cv.reprojectImageTo3D. But cv.getPerspectiveTransform returns a 3×3 matrix, while cv.reprojectImageTo3D needs Q, a 4×4 perspective transformation matrix that can be obtained with stereoRectify.
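
One common way to get R, T, and the 4×4 Q matrix in one go is to treat the projector as a second "camera" and run cv2.stereoCalibrate on chessboard corners observed by both devices, then cv2.stereoRectify. This is only a sketch under that assumption; obj_points, cam_points, proj_points, proj_dist, and image_size are hypothetical inputs from the two calibrations:

import cv2
import numpy as np

# Fix the intrinsics found earlier and solve only for the extrinsics R, T
ret, K_cam, d_cam, K_proj, d_proj, R, T, E, F = cv2.stereoCalibrate(
    obj_points, cam_points, proj_points,
    camera_matrix, cam_dist, projector_matrix, proj_dist,
    image_size, flags=cv2.CALIB_FIX_INTRINSIC)

baseline = np.linalg.norm(T)  # distance between camera and projector centres

# Q is the 4x4 perspective transformation matrix used in the answer below
R1, R2, P1, P2, Q, roi_cam, roi_proj = cv2.stereoRectify(
    K_cam, d_cam, K_proj, d_proj, image_size, R, T)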

Similar Questions:

There are many other resources, and I will keep updating this list in the comments. I have missed something, and I can’t figure out how to implement it.

Answer:

Let’s assume p(x, y) is the image point and (x - x') is the disparity. You can obtain the corresponding 3D point, and hence the depth Z, as follows:
import numpy as np
import cv2

disparity = x - x_prime  # x - x'
point_and_disparity = np.array([[[x, y, disparity]]], dtype=np.float32)
point_3d = cv2.perspectiveTransform(point_and_disparity, q_matrix)  # q_matrix: 4x4 Q from stereoRectify
depth = point_3d[0, 0, 2]  # the Z coordinate is the depth
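
For a whole disparity image, cv2.reprojectImageTo3D applies the same Q transform to every pixel at once; here disparity_map is assumed to be a float32 image of (x - x') values built from the camera-projector pixel map:

import cv2

points_3d = cv2.reprojectImageTo3D(disparity_map, q_matrix)  # HxWx3 array of (X, Y, Z)
depth_map = points_3d[:, :, 2]  # the Z channel is the depth map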

If you have a better answer, please add a comment about it. Thank you!