7.23. Perspective correction
Warning
The arbitrary 3-by-3 transform matrix
is only supported on the OpenMV Cam N6
– the keyword is silently ignored on every
other board. Applications that need to run
anywhere else must use the canned
rotation_corr() method
(with its corners= form) or pre-compute
the corrected image off-board.
The canned rotation_corr()
method packages a particular family of
perspective warps behind a small set of
parameters, and runs on every supported
board. Some applications need a warp that
does not fit that form: an arbitrary
projective remap from one quadrilateral to
another, a calibrated correction for a known
mounting that has already been worked out
off-line, or a warp matrix handed over
ready-made by some upstream algorithm. For those,
draw_image() – along with
copy(),
crop(), and
scale() – accepts a
transform keyword that takes a hand-built
3-by-3 matrix describing the warp directly.
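One way such a matrix can be worked out off-line is from four corner correspondences. The following is a minimal sketch, assuming desktop NumPy (not ulab) and a matrix that maps source coordinates to destination coordinates with its bottom-right entry fixed at 1. The helper names `homography_from_quads` and `apply_h`, and the corner values for a 320-by-240 image, are hypothetical.

```python
import numpy as np

def homography_from_quads(src, dst):
    """Return the 3-by-3 matrix that maps each of the four src points
    onto the matching dst point (bottom-right entry fixed at 1)."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (a x + b y + c) / (g x + h y + 1), rearranged to be linear in a..h
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        # v = (d x + e y + f) / (g x + h y + 1), rearranged the same way
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    a, b, c, d, e, f, g, h = np.linalg.solve(np.array(rows, dtype=float),
                                             np.array(rhs, dtype=float))
    return np.array([[a, b, c], [d, e, f], [g, h, 1.0]])

def apply_h(M, x, y):
    """Map one point through M, including the homogeneous divide."""
    xh, yh, w = np.dot(M, np.array([x, y, 1.0]))
    return xh / w, yh / w

# Hypothetical corners: full frame of a 320x240 image onto a skewed quadrilateral.
src = [(0, 0), (319, 0), (319, 239), (0, 239)]
dst = [(40, 20), (280, 0), (319, 239), (0, 220)]
M = homography_from_quads(src, dst)
```

The eight equations have a unique solution when no three of the four points in either quadrilateral lie on one line. Whether the keyword expects a source-to-destination or a destination-to-source matrix is not settled by this sketch; if it is the latter, swap `src` and `dst`.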
7.23.1. Affine and projective transformations
Geometric warps are expressed in homogeneous
coordinates: the pixel position (x, y)
with a 1 appended, multiplied by a 3-by-3
matrix.
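The affine form is the place to start. Its bottom row is fixed at \((0, 0, 1)\):
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
\begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
Written out, each output coordinate is a linear combination of the input coordinates plus a constant:
\[
x' = a x + b y + c, \qquad y' = d x + e y + f
\]
which covers scaling, rotation, shearing, and translation in any combination – and under all of them, parallel lines stay parallel.
The projective (perspective) form frees the bottom row:
\[
\begin{bmatrix} x' w' \\ y' w' \\ w' \end{bmatrix} =
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
Written out:
\[
x' = \frac{a x + b y + c}{g x + h y + 1}, \qquad
y' = \frac{d x + e y + f}{g x + h y + 1}
\]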
The division by \(w' = g x + h y + 1\) is what makes the transformation projective rather than merely affine. When \(g\) and \(h\) are both zero, \(w'\) stays at one and the division does nothing – the affine form again. When either is non-zero, \(w'\) varies with the input position and pixels at different positions get foreshortened by different amounts, which no longer keeps parallel lines parallel – it is exactly the keystone effect of looking at a flat plane from an oblique angle. A projective transformation is the most general geometric warp that takes straight lines to straight lines; scaling, flipping, transposing, rotating, and the four-corner rotation correction are all special cases of one.
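The effect of the division can be checked numerically. This is a minimal sketch, assuming desktop NumPy and a matrix read as mapping input positions to output positions. The helper `warp_point` and the value chosen for \(g\) are hypothetical.

```python
import numpy as np

def warp_point(M, x, y):
    """Map (x, y) through a 3-by-3 matrix using homogeneous coordinates."""
    xh, yh, w = np.dot(M, np.array([x, y, 1.0]))
    # With a bottom row of (0, 0, 1), w is 1 and this divide changes nothing.
    return xh / w, yh / w

affine = np.array([[1.2, 0.0, -20.0],
                   [0.0, 1.2, -15.0],
                   [0.0, 0.0, 1.0]])

# Same matrix with a small non-zero g (hypothetical value): now w' = 0.001 * x + 1.
projective = affine.copy()
projective[2, 0] = 0.001

near_affine = warp_point(affine, 0, 50)        # x = 0, so w' = 1 either way
near_proj = warp_point(projective, 0, 50)
far_affine = warp_point(affine, 200, 50)       # x = 200, so w' = 1.2 in the projective case
far_proj = warp_point(projective, 200, 50)
```

At `x = 0` the two matrices agree, while at `x = 200` the projective result is the affine result divided by 1.2. Points further along x are pulled in more strongly, which is the position-dependent foreshortening described above.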
The named transformations drop out of the affine form directly. The identity transformation is the identity matrix, and:
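As a sketch in the standard textbook convention, assuming the column-vector layout used above, the usual building blocks are scaling, translation, and rotation about the origin. The symbols \(s_x, s_y\), \(t_x, t_y\), and \(\theta\) are introduced here for illustration:
\[
S(s_x, s_y) = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
T(t_x, t_y) = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}, \qquad
R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
Transposing swaps the two coordinates, and a horizontal flip of an image \(W\) pixels wide (an assumed symbol) negates x and shifts it back into range:
\[
P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
F_h = \begin{bmatrix} -1 & 0 & W - 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
With the image y axis pointing down, a positive \(\theta\) in \(R(\theta)\) turns the image clockwise on screen.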
For most hand-built transforms an application starts with one of these as a base and multiplies in further matrices for each additional operation, ending with a single 3-by-3 matrix that describes the composite warp. Matrices apply right to left: \(M = T R S\) runs the scale first, then the rotation, then the translation. The composite most applications need eventually is rotation about the image centre. A bare rotation matrix spins the image about the pixel origin at the top-left corner, so the centred version moves the centre \((c_x, c_y)\) to the origin, rotates, and moves it back:
\[
M = T(c_x, c_y)\, R(\theta)\, T(-c_x, -c_y)
\]
where \(T(t_x, t_y)\) is a translation by \((t_x, t_y)\) and \(R(\theta)\) is a rotation by \(\theta\) about the origin.
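The centred rotation can be sketched in a few lines. This example assumes desktop NumPy and uses `np.dot` for the products; whether it runs unchanged under ulab has not been checked here. The helper names `translate`, `rotate`, and `rotate_about`, and the 320-by-240 image size, are hypothetical.

```python
import math
import numpy as np

def translate(tx, ty):
    """Translation by (tx, ty) as a 3-by-3 matrix."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def rotate(theta):
    """Rotation by theta radians about the pixel origin."""
    c, s = math.cos(theta), math.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rotate_about(theta, cx, cy):
    """Rotation by theta radians about the point (cx, cy)."""
    # Read right to left: move the centre to the origin, rotate, move it back.
    return np.dot(translate(cx, cy), np.dot(rotate(theta), translate(-cx, -cy)))

cx, cy = 159.5, 119.5                      # centre of a hypothetical 320x240 image
M = rotate_about(math.radians(90), cx, cy)

centre = np.dot(M, np.array([cx, cy, 1.0]))    # the centre should not move
corner = np.dot(M, np.array([0.0, 0.0, 1.0]))  # the top-left corner swings around it
```

Under this matrix the centre maps to itself and the top-left corner lands at (279, -40). A bare `rotate(theta)` would instead leave the origin fixed and move the centre.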
7.23.2. The transform keyword
The matrix goes in through a transform
keyword, supplied as a 3-by-3
ulab.numpy.ndarray. The method to
reach for is draw_image(),
which warps the source through the matrix as
it draws it onto a destination – the result
lands in a buffer the application controls,
and the warp composes with everything else on
the call: the scaling, the alpha blending,
the masking.
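import ulab.numpy as np

# img is the source image and canvas the destination image; both must already exist.
# Scale by 1.2 in x and y, then shift by (-20, -15) pixels.
M = np.array([[1.2, 0.0, -20.0],
              [0.0, 1.2, -15.0],
              [0.0, 0.0, 1.0]])
canvas.draw_image(img, transform=M)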
The example warps img onto canvas
scaled by 1.2 in each direction and shifted
left and up by 20 and 15 pixels respectively
– an affine warp built directly from the
matrix entries described above. The same
keyword on copy(),
crop(), and
scale() applies the warp
to the image itself.
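The example matrix can also be read as a composite under the right-to-left rule: a scale followed by a translation. This is a minimal sketch, assuming desktop NumPy; the names `S`, `T`, `scale_then_shift`, and `shift_then_scale` are chosen here for illustration.

```python
import numpy as np

S = np.array([[1.2, 0.0, 0.0],      # scale by 1.2 in x and y
              [0.0, 1.2, 0.0],
              [0.0, 0.0, 1.0]])
T = np.array([[1.0, 0.0, -20.0],    # shift by (-20, -15)
              [0.0, 1.0, -15.0],
              [0.0, 0.0, 1.0]])
M = np.array([[1.2, 0.0, -20.0],    # the matrix from the example above
              [0.0, 1.2, -15.0],
              [0.0, 0.0, 1.0]])

scale_then_shift = np.dot(T, S)     # rightmost factor runs first: scale, then shift
shift_then_scale = np.dot(S, T)     # shift first, so the shift gets scaled as well
```

Only `scale_then_shift` reproduces `M`. In `shift_then_scale` the offsets are multiplied by 1.2 as well, which is why the order of the factors matters when composing a transform by hand.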