7.30. Template matching
The detectors covered so far answer questions about the content of a single frame: where the blobs are, where the lines go, what a printed code says. A different class of question compares one image against another. Does this region of the captured frame look like the reference patch I stored at calibration time? The matching methods answer that question.
Tonal and statistical analysis introduced
get_similarity() for the
related question – how alike are these
two same-sized images overall? – with
SSIM as the underlying metric. The
remaining matching question is the
localisation one: not “how alike are
these two images” but “where inside this
larger image does that smaller patch
appear?” The right tool for the
localisation question is template
matching.
7.30.1. The basic call
find_template() looks
for the first place a small template
image appears inside the captured frame.
The implementation uses normalised
cross-correlation (NCC): the template
slides across the frame, the per-position
match score is computed from the
correlation between template pixels and
the underlying frame pixels (normalised
against the local means and variances so
that gain changes don’t fool the match),
and the first position whose score clears
threshold is returned as a bounding
box:
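import image

# img is the frame captured earlier in the loop; it must already be grayscale.
template = image.Image("/sdcard/template.bmp", copy_to_fb=False)
template.to_grayscale()  # the matcher expects a grayscale template as well
match = img.find_template(template, threshold=0.7,
                          search=image.SEARCH_DS)
if match is not None:
    # match is an (x, y, w, h) bounding box
    img.draw_rectangle(match, color=(255, 0, 0))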
The method only works on grayscale
images. Capture in grayscale (the natural
choice for any camera without a colour
sensor), or convert in place via
to_grayscale() before
the call. The same applies to the template
loaded from disk: a colour template is
converted with the same method, and the
result is what the matcher expects.
threshold is a float from 0.0 to
1.0. A value of 1.0 demands a
perfect pixel-for-pixel match (which never
happens with real captured images), 0.0
accepts anything, and values between 0.6
and 0.8 cover the common case where the
template was captured under similar lighting
and the scene has not changed dramatically. Raise the
threshold to suppress false positives;
lower it to accept noisier matches at the
cost of more spurious hits.
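To make the normalisation concrete, here is a minimal pure-Python sketch of a zero-mean NCC score for a single template position. It illustrates the general technique and is not the firmware's implementation; the name ncc_score and the list-of-lists image layout are assumptions made for this example.

```python
import math

def ncc_score(patch, template):
    """Zero-mean normalised cross-correlation of two equal-sized 2D lists.

    Illustrative only: the function name and data layout are hypothetical.
    """
    a = [p for row in patch for p in row]
    b = [t for row in template for t in row]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Correlate the deviations from each image's own mean.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    if denom == 0:
        return 0.0  # a flat patch or template carries no structure to correlate
    return num / denom

template = [[10, 20], [30, 40]]
# Same structure under a gain and offset change (2x brighter, plus 5).
brighter = [[2 * v + 5 for v in row] for row in template]
# Same pixel values, different arrangement.
shuffled = [[40, 10], [20, 30]]

print(ncc_score(template, template))
print(ncc_score(brighter, template))
print(ncc_score(shuffled, template))
```

The identical patch and the brightened patch both score 1.0 (up to floating-point rounding), which is the gain tolerance the normalisation buys. The rearranged patch scores far below any sensible threshold even though it contains exactly the same pixel values.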
7.30.2. Search strategy
search chooses between two strategies.
image.SEARCH_EX is the exhaustive
search: the template slides through every
step-pixel position in the frame and
returns the first hit above threshold.
image.SEARCH_DS is the diamond
search: the matcher samples coarsely
first, then refines around the best score,
which is dramatically faster but can miss
a true match if the coarse pass locks onto
a local maximum and the refinement never
reaches the global one. For a real-time pipeline where
the template is well-defined and unlikely
to be confused, SEARCH_DS is the right
default; for a one-shot calibration where
the cost of a miss is higher than the cost
of a slower scan, SEARCH_EX is safer.
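The trade-off can be seen on a toy one-dimensional score landscape. The sketch below is a simplified coarse-then-refine search, not the diamond pattern the firmware uses; the function names, the stride and the scores are all made up for illustration.

```python
def exhaustive_peak(scores):
    """Index of the highest score, found by visiting every position."""
    return max(range(len(scores)), key=lambda i: scores[i])

def coarse_to_fine_peak(scores, stride):
    """Sample every `stride` positions, then hill-climb from the best sample."""
    pos = max(range(0, len(scores), stride), key=lambda i: scores[i])
    while True:
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(scores)]
        best = max(neighbours, key=lambda i: scores[i])
        if scores[best] <= scores[pos]:
            return pos  # no neighbour improves: a (possibly local) maximum
        pos = best

# Hypothetical per-position match scores: a narrow true peak at index 6
# and a lower, broader bump around index 8.
scores = [0.1, 0.2, 0.6, 0.2, 0.1, 0.1, 0.95, 0.1, 0.3, 0.2, 0.1, 0.1]

print(exhaustive_peak(scores))         # visits all 12 positions
print(coarse_to_fine_peak(scores, 4))  # samples indices 0, 4 and 8 first
```

The exhaustive pass lands on index 6. The coarse pass never samples the narrow peak, settles on the bump at index 8, and the refinement has no way back because index 7 scores lower than index 8.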
step controls the pixel skip during the
exhaustive pass (the diamond search manages
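its own step). Larger step values speed
up the scan at the cost of positional
accuracy: a match can only be reported at
a sampled position, and a narrow
correlation peak can fall between
samples. roi restricts the search to a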
region of the frame, both narrowing what
the matcher considers and reducing work.
The returned value is a (x, y, w, h)
bounding-box tuple identifying the best
match, or None if no position cleared
the threshold. The bounding box drops
directly into
draw_rectangle() or
crop() for the next
stage of processing.
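How step and roi shape an exhaustive scan can be sketched in pure Python. This is a rough model under stated assumptions, not the firmware code: find_template_ex and ncc_score are hypothetical names, images are lists of lists, and the sketch keeps the best-scoring position and returns it only if it clears the threshold.

```python
import math

def ncc_score(patch, template):
    """Zero-mean NCC of two equal-sized 2D lists (hypothetical helper)."""
    a = [p for row in patch for p in row]
    b = [t for row in template for t in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denom = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                      sum((y - mean_b) ** 2 for y in b))
    return num / denom if denom else 0.0

def find_template_ex(frame, template, threshold, step=1, roi=None):
    """Scan every `step`-th position inside roi; return (x, y, w, h) or None."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    rx, ry, rw, rh = roi if roi is not None else (0, 0, fw, fh)
    best_score, best_box = -1.0, None
    for y in range(ry, ry + rh - th + 1, step):
        for x in range(rx, rx + rw - tw + 1, step):
            patch = [row[x:x + tw] for row in frame[y:y + th]]
            score = ncc_score(patch, template)
            if score > best_score:
                best_score, best_box = score, (x, y, tw, th)
    return best_box if best_score >= threshold else None

# An 8x6 blank frame with a 2x2 checker patch pasted at x=3, y=2.
template = [[10, 200], [200, 10]]
frame = [[0] * 8 for _ in range(6)]
for dy, row in enumerate(template):
    for dx, value in enumerate(row):
        frame[2 + dy][3 + dx] = value

print(find_template_ex(frame, template, 0.7))                    # every position
print(find_template_ex(frame, template, 0.7, step=2))            # skips x=3
print(find_template_ex(frame, template, 0.7, roi=(2, 1, 4, 4)))  # smaller window
```

With step=1 the scan reports the box at (3, 2). With step=2 only even x positions are visited, so the patch at x=3 is never scored and the call returns None. The roi call finds the same box as the full scan while scoring only nine positions.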
7.30.3. The scale and rotation trap
The classical pitfall with template
matching is scale and rotation
sensitivity. The matcher compares the
template against the frame pixel-for-pixel;
a template captured at one distance does
not match the same object captured at a
different distance, and a template
captured straight-on does not match the
same object viewed off-axis. The match
score quietly drops below threshold
even when the object is plainly visible to
a human eye, and the method returns
None.
A few workarounds exist for the simple
cases. The application can capture
multiple templates at different scales and
run find_template() for
each in sequence, accepting the first
that clears threshold; the cost scales
with the number of templates. The
application can pre-process the frame
with rotation_corr() or
the polar transform (Geometric transforms)
to remove the offending rotation before
the match runs; the template itself still
has to have been captured in that corrected geometry.
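The multi-template workaround is a short loop. The helper below is a sketch: find_first_scale is a hypothetical name, and it assumes only that the frame object exposes find_template() with the threshold keyword used earlier in this section. Any extra keyword arguments (search, step, roi) are passed through untouched.

```python
def find_first_scale(img, templates, threshold=0.7, **search_kwargs):
    """Try each template in order; return (index, box) for the first hit.

    `templates` is a sequence of grayscale template images captured at
    different scales. Returns None if no template clears the threshold.
    """
    for index, template in enumerate(templates):
        box = img.find_template(template, threshold=threshold, **search_kwargs)
        if box is not None:
            # Stop at the first match so later templates cost nothing.
            return index, box
    return None
```

Ordering the list so the most likely scale comes first keeps the typical cost close to a single call; the worst case, with no match at all, still pays for every template. The returned index tells the caller which scale matched, which can double as a rough distance estimate.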
A useful idiom for QA-inspection pipelines
pairs the template matcher with the
similarity scorer introduced in Tonal and
statistical analysis:
find_template() locates
the part in the captured frame and the
returned bounding box is cropped out and
passed to get_similarity()
against the reference patch. The
template-match step decides where the
part is; the similarity-score step decides
whether the part is acceptable. The two
steps run every frame, a threshold on the
mean similarity score is the pass/fail gate, and the
matched bounding box drawn back into the
frame is the IDE preview the operator
watches.
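The locate-then-score idiom might be wrapped up as follows. Treat this as a sketch under assumptions rather than a reference implementation: inspect_part and the 0.8 pass level are hypothetical, crop() is assumed to accept roi and copy keywords so that the frame itself is left intact, and the similarity result is assumed to expose its mean through a mean() method.

```python
def inspect_part(img, template, reference, match_threshold=0.7, pass_mean=0.8):
    """Locate the part, then score it against a reference patch.

    Returns (box, mean_score, passed). `reference` must be the same size
    as `template`, because the cropped patch takes the template's size.
    """
    box = img.find_template(template, threshold=match_threshold)
    if box is None:
        return None, None, False  # part not found: treat as a failure

    # Assumption: copy=True returns a new image instead of cropping in place.
    patch = img.crop(roi=box, copy=True)
    # Assumption: the similarity result exposes mean() as a method.
    mean_score = patch.get_similarity(reference).mean()
    passed = mean_score >= pass_mean

    # Draw only after cropping so the overlay never leaks into the scored patch.
    img.draw_rectangle(box, color=(255, 0, 0))
    return box, mean_score, passed
```

Returning the score alongside the verdict lets the caller log borderline parts and tune pass_mean from real data instead of guessing it up front.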