7.21. Scale, flip, and crop
The previous subsections all worked on pixels at the same positions they started in. The transform family changes that. Scaling sends every input pixel to a different output position, possibly to several output positions at once (when upscaling) or to a position shared with several other input pixels (when downscaling). Flipping and rotating do the same thing through a different mapping. Cropping keeps a rectangular subset of input pixels and discards the rest.
The image module exposes that family through three methods that share most of their arguments and most of their behaviour:
copy() – produce a copy of the image, possibly scaled, cropped, or re-oriented.
crop() – the same operation as copy(), but with the expectation that the application is going to pick a sub-rectangle out of the source.
scale() – the same again, with the expectation that the application is going to resize the result.
The three share the same arguments and the
same transform machinery; the difference is
where the result lands by default.
copy() produces a new
image, while crop() and
scale() modify the source
in place.
7.21.2. Interpolation: AREA, BILINEAR, BICUBIC
When scaling sends each output pixel to a position that does not align with any single input pixel, the method has to pick what value to write. Three flags control how:
image.BILINEAR interpolates between the
four nearest input pixels weighted by their
distance from the output position. The result
is smoother than nearest-neighbour, with no
visible jaggies on diagonal lines, but the
extra arithmetic costs about four times the
nearest-neighbour pass. The right choice for
most upscaling work and for any non-integer
scale factor.
image.BICUBIC interpolates between the
sixteen nearest input pixels using a cubic
curve, which produces still-smoother results
at the cost of more arithmetic again. The
best-quality option for the applications that
need it; rarely worth the extra computation
for live frames the IDE will only display.
image.AREA averages every input pixel that
falls inside the output pixel’s footprint –
the right algorithm for downscaling.
Bilinear and bicubic are interpolators: they
estimate a value between source pixels,
which is what upscaling needs, but when
downscaling each output pixel covers many
source pixels and an interpolator reads only
the few nearest ones – the detail it skips
comes back as aliasing. image.AREA folds
every covered pixel into the average instead.
The default scaling algorithm without any hint is nearest-neighbour, which is the cheapest and the right answer when the source is already at the destination’s pixel resolution.
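The difference between picking one source pixel, blending the four nearest, and averaging a whole footprint can be seen on a tiny example. The following is a minimal pure-Python sketch, not the module's implementation: images are plain lists of rows of grayscale values, and the helper names (`nearest_downscale`, `area_downscale`, `bilinear_sample`) are hypothetical. It assumes integer downscale factors that divide the image size evenly and sample positions inside the image.

```python
# Illustrative model only: an image is a list of rows of grayscale values.

def nearest_downscale(img, factor):
    """Keep one source pixel per output pixel (the top-left of each block)."""
    return [row[::factor] for row in img[::factor]]

def area_downscale(img, factor):
    """Average every source pixel inside each factor x factor footprint."""
    out = []
    for y in range(0, len(img), factor):
        out_row = []
        for x in range(0, len(img[0]), factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def bilinear_sample(img, x, y):
    """Blend the four pixels around the fractional position (x, y)."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(img[0]) - 1)  # clamp at the right edge
    y1 = min(y0 + 1, len(img) - 1)     # clamp at the bottom edge
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

# A one-pixel checkerboard: the finest detail a 4x4 image can hold.
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)]
           for y in range(4)]

print(nearest_downscale(checker, 2))  # [[255, 255], [255, 255]]
print(area_downscale(checker, 2))     # [[127.5, 127.5], [127.5, 127.5]]
print(bilinear_sample([[0, 100], [100, 200]], 0.5, 0.5))  # 100.0
```

Nearest-neighbour happens to land on the white squares only, so the checkerboard collapses to solid white: the skipped detail has aliased. The area average reports the mid-grey that the pattern actually integrates to.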
7.21.3. Orientation: flips and rotations
The orientation flags are a small set of boolean transformations that compose freely with each other and with the interpolation flags:
image.VFLIP flips the image vertically (top becomes bottom).
image.HMIRROR mirrors it horizontally (left becomes right).
image.TRANSPOSE swaps the x and y axes (rows become columns).
Most rotations come from composing those three. The module also exposes named shortcuts:
image.ROTATE_90 (= VFLIP | TRANSPOSE)
image.ROTATE_180 (= HMIRROR | VFLIP)
image.ROTATE_270 (= HMIRROR | TRANSPOSE)
In code:
img.copy(hint=image.ROTATE_90, copy_to_fb=True)
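Why those particular flag pairs amount to rotations can be checked on a small grid. The sketch below is a pure-Python model, not the module's code: the helper functions are hypothetical, images are lists of rows, and it assumes the flips are applied before the transpose. Under that assumed ordering the three combinations reproduce one, two, and three clockwise quarter turns; with the opposite ordering the 90 and 270 cases would swap.

```python
# Illustrative model only: an image is a list of rows.

def vflip(img):
    """Top row becomes bottom row."""
    return img[::-1]

def hmirror(img):
    """Left column becomes right column."""
    return [row[::-1] for row in img]

def transpose(img):
    """Rows become columns."""
    return [list(col) for col in zip(*img)]

def rotate_cw(img):
    """Reference clockwise quarter turn, written without the helpers above."""
    h, w = len(img), len(img[0])
    return [[img[h - 1 - c][r] for c in range(h)] for r in range(w)]

img = [[1, 2, 3],
       [4, 5, 6]]

# Assumed order of application: flips first, transpose last.
rot90 = transpose(vflip(img))     # models VFLIP | TRANSPOSE
rot180 = hmirror(vflip(img))      # models HMIRROR | VFLIP
rot270 = transpose(hmirror(img))  # models HMIRROR | TRANSPOSE

print(rot90)   # [[4, 1], [5, 2], [6, 3]]
print(rot180)  # [[6, 5, 4], [3, 2, 1]]
print(rot270)  # [[3, 6], [2, 5], [1, 4]]
```

The 180-degree case does not depend on ordering, since the two flips commute; only the combinations involving the transpose do.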
7.21.4. Aspect handling
When the source’s aspect ratio does not match the rectangle it is being drawn into, three flags decide what to do with the mismatch:
image.SCALE_ASPECT_KEEP preserves the
source’s aspect ratio and letterboxes the
result – the source is scaled until it fits
inside the destination, with empty (zero)
pixels filling the rest of the destination.
The right choice when keeping the source
undistorted matters more than filling the
whole output.
image.SCALE_ASPECT_EXPAND preserves the
source’s aspect ratio and crops it – the
source is scaled until it fills the
destination, with the parts that extend past
the destination cut off. The right choice
when filling the whole output matters more
than seeing every part of the source.
image.SCALE_ASPECT_IGNORE ignores the
aspect ratio and stretches the source to fill
the destination, accepting whatever
distortion that introduces. The right choice
when the application has already accounted
for the distortion – when the destination is
not meant to show the scene at its true
proportions, for example.
The default (no aspect flag set) is the same
as SCALE_ASPECT_IGNORE: stretch to fill.
Applications that care about aspect ratio
specify one of the three explicitly.
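The arithmetic behind the three behaviours is a choice between the smaller and the larger of the two axis scale factors. The sketch below is an assumption-laden illustration, not the module's code: `fitted_size` and its `mode` strings are hypothetical names, and the rounding is a choice made for the example. It returns the size the source occupies after scaling, which can then be compared with the destination.

```python
def fitted_size(src_w, src_h, dst_w, dst_h, mode="ignore"):
    """Return the (width, height) of the scaled source for one aspect mode.

    mode is a hypothetical stand-in for the aspect flags:
    "keep" (letterbox), "expand" (crop), or "ignore" (stretch).
    """
    sx = dst_w / src_w
    sy = dst_h / src_h
    if mode == "keep":
        sx = sy = min(sx, sy)  # fit inside: the tighter axis decides
    elif mode == "expand":
        sx = sy = max(sx, sy)  # fill completely: the looser axis decides
    elif mode != "ignore":
        raise ValueError("unknown mode: " + mode)
    return round(src_w * sx), round(src_h * sy)

# A 320x240 source drawn into a 240x240 destination.
print(fitted_size(320, 240, 240, 240, "keep"))    # (240, 180)
print(fitted_size(320, 240, 240, 240, "expand"))  # (320, 240)
print(fitted_size(320, 240, 240, 240, "ignore"))  # (240, 240)
```

In the keep case 60 destination rows are left empty, in the expand case 80 source columns fall outside the destination, and in the ignore case the 4:3 source is squeezed into a square.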
7.21.5. When to reach for which
Most resizes use scale()
with an x_scale / y_scale pair and an
interpolation hint:
img.scale(x_scale=0.5, y_scale=0.5, hint=image.AREA)
Most rotations use the same call with
hint=image.ROTATE_90 or similar.
Cropping uses crop() with
a non-default roi:
img.crop(roi=(40, 30, 200, 150))
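What the rectangle selects can be modelled with list slicing. This sketch assumes the roi tuple is ordered (x, y, w, h), that is, top-left corner followed by width and height; that ordering is an assumption of the example rather than something stated above, and `crop_rows` is a hypothetical helper, not the module's method.

```python
def crop_rows(img, roi):
    """Keep the pixels inside roi; assumes roi is (x, y, w, h)."""
    x, y, w, h = roi
    return [row[x:x + w] for row in img[y:y + h]]

# A 6x4 test image whose pixel values encode their own position (10*y + x).
img = [[10 * y + x for x in range(6)] for y in range(4)]

print(crop_rows(img, (1, 2, 3, 2)))  # [[21, 22, 23], [31, 32, 33]]
```

Under the same assumption, the call above with roi=(40, 30, 200, 150) would keep a 200x150 rectangle whose top-left corner sits at pixel (40, 30).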
When the source has to survive the operation
– capturing a reference frame, taking a
thumbnail of a frame that is about to be
processed destructively –
copy() produces the result
as a new image and leaves the source
untouched:
thumbnail = img.copy(x_scale=0.25, y_scale=0.25, hint=image.AREA)
That default is the real difference behind
the three names: scale and crop
transform in place, copy allocates. The
result-placement keywords bridge the gap:
copy=True on scale or crop
allocates the result as a separate heap
buffer instead of overwriting the source, and
copy_to_fb=True on any of the three lands
it in the frame buffer for the IDE preview.