7.20. Regression and similarity

Two more measurements on the Image class summarise the image as something other than a distribution of pixel values. Linear regression of thresholded pixels gives an application a line it can act on – the classical input to a line-following robot. Similarity measurement gives an application a single number describing how alike two images are – the natural input to a golden-image regression test or a gross-change detector.

7.20.1. Linear regression

When the foreground pixels of an image happen to form a line across the frame – the tape on a track that a robot is following, the line of a horizon, the edge of a road or a corridor – the application usually does not want every individual foreground pixel. It wants the best-fit line through them all, parameterised so it can decide how the line is oriented and where it crosses the frame.

get_regression() does that fit. It takes the same threshold tuple form that binary() and find_blobs() use, identifies every pixel that matches the threshold, and returns a single line result describing the best-fit line through those pixels:

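# Fit a line through every pixel whose value falls in the 0..60 range.
line = img.get_regression([(0, 60)])
if line:  # None when no acceptable line was found
    # Draw the fitted segment in red between its clipped endpoints.
    img.draw_line((line.x1(), line.y1(),
                   line.x2(), line.y2()),
                  color=(255, 0, 0))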

The fit is Theil-Sen linear regression – a robust method that tolerates outliers better than the more familiar least-squares fit. A small handful of pixels far from the true line do not skew the result the way they would with least squares, which matches the noisy-foreground reality of a real threshold output.
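
To make the outlier claim concrete, here is a minimal pure-Python sketch of a median-of-slopes (Theil-Sen) fit next to an ordinary least-squares fit. It is an illustration only, not the library's implementation; the function names theil_sen and least_squares and the sample points are invented for this example.

```python
from itertools import combinations
from statistics import median


def theil_sen(points):
    """Return (slope, intercept) from the median of all pairwise slopes."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(points, 2)
              if x2 != x1]  # pairs in the same column have no finite slope
    slope = median(slopes)
    # The intercept is the median offset left once the slope is removed.
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept


def least_squares(points):
    """Return (slope, intercept) from the classical least-squares formulas."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Ten "pixels" exactly on y = 2x + 1, plus one stray pixel far from the line.
points = [(x, 2 * x + 1) for x in range(10)] + [(5, 60)]

ts_slope, ts_intercept = theil_sen(points)
ls_slope, ls_intercept = least_squares(points)
```

On this data the median-based fit recovers slope 2 and intercept 1 exactly, while the single stray point pulls the least-squares slope to roughly 2.27. The sketch also shows where the cost comes from: the number of pairwise slopes grows with the square of the pixel count.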

The line result carries the endpoints clipped to the image rectangle (x1, y1, x2, y2), the line length and magnitude (length, magnitude), and the line’s geometric description in polar form (theta, rho), where theta is the angle of the line from horizontal and rho is its perpendicular distance from the origin. The polar form is what a control loop usually wants: theta tells the robot which way the line is leaning, rho tells it where the line crosses the image, and a feedback loop on the two keeps the robot centred on the line.
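
A feedback loop on theta and rho can be as small as a proportional controller. The sketch below is hypothetical: the function names, the gains, and the target values are placeholders, theta is assumed to be in degrees, and the sign of the output depends on how a particular robot is wired. It also ignores any coupling between the sign of rho and the wrap of theta that a given polar convention might impose.

```python
def wrap_angle_deg(delta):
    """Wrap an angle difference into [-90, 90).

    A line has no direction, so its orientation repeats every 180 degrees.
    """
    return (delta + 90.0) % 180.0 - 90.0


def steering(theta, rho, theta_target, rho_target, k_theta=1.0, k_rho=0.5):
    """Combine the angle error and the offset error into one steering value.

    theta_target and rho_target describe the line as it appears when the
    robot is centred on it; k_theta and k_rho are illustrative gains.
    """
    theta_err = wrap_angle_deg(theta - theta_target)
    rho_err = rho - rho_target
    return k_theta * theta_err + k_rho * rho_err
```

The wrap step matters in practice: without it, a line reported at 175 degrees would look 175 degrees away from a 0-degree target rather than 5 degrees on the other side.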

A handful of keyword arguments tune the robustness and the cost. x_stride and y_stride skip pixels during the fit – larger strides make the regression cheaper at the cost of fitting fewer pixels. area_threshold and pixels_threshold reject lines that do not have enough matching pixels behind them. target_size re-scales the input to a smaller size before fitting – the regression runs faster on an 80-by-60 surrogate of the image without much loss in line direction accuracy.

If no acceptable line could be fit – if the threshold matched no pixels, or matched a pattern that does not look like a line – the method returns None. Real line-following code guards every get_regression() call with a None check before reaching for the line’s attributes.
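
One way to keep the None check in a single place is a small wrapper, sketched below. The helper name find_line is hypothetical, and reading theta and rho through accessor methods is an assumption made by analogy with line.x1() in the earlier example.

```python
def find_line(img, thresholds, **tuning):
    """Return (theta, rho) for the fitted line, or None when no line was found.

    `tuning` is passed straight through, so keyword arguments such as
    x_stride, y_stride, or pixels_threshold can be supplied by the caller.
    """
    line = img.get_regression(thresholds, **tuning)
    if line is None:
        # No pixels matched, or the match did not look like a line.
        return None
    # Assumed accessors, mirroring line.x1() / line.y1().
    return line.theta(), line.rho()
```

A caller can then write `result = find_line(img, [(0, 60)], x_stride=2)` and treat a None result as "hold the last command" or "stop", whichever suits the robot.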

7.20.2. Image similarity

A different kind of measurement: instead of asking “what does the image contain?”, ask “how alike are these two images?”. The operation to reach for is get_similarity(), which computes the Structural Similarity Index (SSIM) between the source image and a reference image.

s = img.get_similarity(reference)
# Mean SSIM and the spread of the per-tile scores.
print(s.mean(), s.stdev())

SSIM is a standard image-similarity metric in image processing because it tracks human intuition about similarity: a small shift or a small brightness change reduces the score slightly, while a large structural change (missing object, different scene) reduces it dramatically. The score ranges from -1 to +1: +1 means the two images are identical, 0 means they are unrelated, and -1 means they are structurally opposite. The returned similarity object exposes the mean SSIM across the image, plus the standard deviation, min, and max of the per-tile scores.
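
The behaviour of the score is easier to see with the formula written out. The sketch below computes SSIM once over a whole patch of 8-bit values, using the commonly used stabilising constants (0.01 and 0.03 times the dynamic range, squared). It is a simplification for illustration: the library reports statistics over per-tile scores, and its exact windowing and constants are not specified here.

```python
def ssim(a, b, dynamic_range=255):
    """Single-window SSIM between two equal-length sequences of pixel values."""
    n = len(a)
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    numerator = (2 * mean_a * mean_b + c1) * (2 * cov + c2)
    denominator = (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


patch = [10, 200, 30, 180, 220, 40, 190, 20]
brighter = [p + 10 for p in patch]   # same structure, slightly brighter
inverted = [255 - p for p in patch]  # structure flipped

same_score = ssim(patch, patch)
brighter_score = ssim(patch, brighter)
inverted_score = ssim(patch, inverted)
```

Here the identical patch scores 1, the brighter copy scores only slightly lower (about 0.996), and the inverted copy comes out strongly negative.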

For the kind of comparison where a small number is better than a large one – a regression test that should report zero on “nothing changed” and rise as changes accumulate – the dssim=True flag returns the structural dissimilarity: the mean SSIM subtracted from 1, so the return value is 0.0 for identical images and rises as they differ.

7.20.3. Use cases for SSIM

The two common applications:

Golden-image regression testing. A test framework captures a reference frame under known-good conditions and stores it as the golden image. Subsequent test runs capture under the same conditions and compare against the golden image with SSIM. A score above some threshold (0.95 or 0.98 depending on tolerance) is a pass; below is a fail. The test framework does not need to know what changed – the SSIM score is the signal.

Gross change detection. An application that wants a coarser version of frame differencing – one that ignores small brightness changes but reacts to large structural changes – can use SSIM against a reference frame instead of the per-pixel difference() followed by a threshold. SSIM is less sensitive to lighting drift than per-pixel differencing, which makes it the better choice when the goal is to detect “the scene looks materially different” rather than “any individual pixel changed.”

Both applications use the same call – img.get_similarity(reference) – and trigger on a threshold of the returned score. The difference is just whether the threshold is high (regression test, looking for a near-identical match) or low (change detection, looking for any large structural change).
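
The two thresholding styles can be sketched as a pair of small predicates. Everything named here is hypothetical: the helper names, the 0.5 change-detection threshold, and the mean_score shim, which accepts the mean either as an attribute or as a method because the accessor form is not pinned down above. Only the 0.95 default comes from the text.

```python
def mean_score(similarity):
    """Read the mean SSIM whether it is exposed as an attribute or a method."""
    m = similarity.mean
    return m() if callable(m) else m


def golden_image_passes(similarity, min_ssim=0.95):
    """Regression test: pass only when the frame is a near-identical match."""
    return mean_score(similarity) >= min_ssim


def scene_changed(similarity, max_ssim=0.5):
    """Change detection: fire only when the structure has clearly diverged."""
    return mean_score(similarity) < max_ssim
```

Both predicates would be fed the result of img.get_similarity(reference); only the threshold and the direction of the comparison differ.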

7.20.4. The transform-and-compare form

A useful subtlety: get_similarity() accepts the same x, y, x_scale, y_scale, roi, rgb_channel, alpha, color_palette, alpha_palette, hint, and transform parameters as draw_image(). The reference image is positioned, scaled, and transformed by those parameters before the SSIM comparison runs.

That means an application can ask “how similar is this scene to a reference frame after a known displacement, rotation, or scale” without preparing a pre-transformed reference image. It is a cheap way to build a tracker that searches a parameter space and reports which transform of the reference best matches the current frame.
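
A brute-force version of such a search, restricted to displacement, might look like the sketch below. The helper names are hypothetical; the x and y keywords are the ones listed above, and the mean is read through a small shim because its accessor form (attribute or method) is an assumption.

```python
def _mean(similarity):
    """Read the mean SSIM whether it is exposed as an attribute or a method."""
    m = similarity.mean
    return m() if callable(m) else m


def best_offset(img, reference, offsets):
    """Try each (dx, dy) placement of the reference and keep the best match.

    Returns (best_mean_score, (dx, dy)), or None if `offsets` is empty.
    """
    best = None
    for dx, dy in offsets:
        score = _mean(img.get_similarity(reference, x=dx, y=dy))
        if best is None or score > best[0]:
            best = (score, (dx, dy))
    return best
```

A caller might pass `offsets = [(dx, dy) for dx in range(-4, 5) for dy in range(-4, 5)]` to scan a 9-by-9 neighbourhood. Each candidate costs one full comparison, so a coarse grid followed by a finer one around the best hit is usually the more economical shape for the search.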