7.26. Finding lines and segments

Some scene features are not connected regions of colour but oriented straight edges: a painted line on the floor, the seam between two surfaces, the side of a printed rectangle, the edge of a doorway. Asking the blob detector to find them is the wrong question – the edge is one pixel wide, the blob algorithm wants area-with-colour, and the answer comes back empty or noisy.

The right detector for oriented edges is the Hough line transform. The image module exposes it in two flavours: find_lines() returns infinite lines (every line extends across the full image); find_line_segments() returns finite segments (each line has endpoints inside the frame). Which one the application needs depends on whether the edges of interest are continuous across the whole frame or only span part of it.

7.26.1. How the Hough transform works

Both detectors share the same core idea, so it pays to understand it once. The image module first runs a Sobel-style edge filter on the input to score every pixel by how likely it is to lie on an oriented edge. Each such edge pixel then votes for all the lines it might lie on. The lines that collect the most votes win.

A line is parameterised in Hough space by two numbers: theta, the angle of the line (0 – 179 degrees), and rho, the perpendicular distance from the image origin to the line (signed, in pixels). Every line the image contains is one point in (theta, rho) space. Each edge pixel in the input contributes one vote to every (theta, rho) combination consistent with its position – conceptually, a curve through Hough space. Where many such curves cross, many edge pixels agree on the same line, and that crossing is a detection.
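The voting step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the image module's implementation: `hough_votes` is a hypothetical helper, it casts one unweighted vote per pixel, and it assumes the common textbook convention rho = x·cos(theta) + y·sin(theta), which may differ in sign or origin from the firmware's.

```python
import math

def hough_votes(edge_points, theta_step=1):
    """Cast one vote per (theta, rho) bin for every edge point.

    Assumes rho = x*cos(theta) + y*sin(theta); hypothetical helper,
    not part of the image module.
    """
    accumulator = {}
    for x, y in edge_points:
        for theta in range(0, 180, theta_step):
            t = math.radians(theta)
            # Signed perpendicular distance of the candidate line from the origin
            rho = round(x * math.cos(t) + y * math.sin(t))
            key = (theta, rho)
            accumulator[key] = accumulator.get(key, 0) + 1
    return accumulator

# One hundred edge pixels on the vertical line x = 5
points = [(5, y) for y in range(100)]
votes = hough_votes(points)

# The bin where all the curves cross collects the most votes
best_bin = max(votes, key=votes.get)
print("strongest (theta, rho):", best_bin, "votes:", votes[best_bin])
```

Every pixel on the line agrees on a single (theta, rho) bin, while its votes for other angles are spread over several rho values, which is why the crossing stands out as a maximum.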

The detector returns the local maxima in Hough space whose vote totals exceed a threshold. Each returned Line carries both representations: x1, y1, x2 and y2 for the endpoint form (clipped to the image bounds for the infinite case); theta and rho for the Hough form; and length and magnitude for size and vote count respectively.

7.26.2. Infinite lines

find_lines() runs the Hough transform and returns the strongest lines, each extended across the full image:

lines = img.find_lines(threshold=1500, theta_margin=25, rho_margin=25)

for l in lines:
    img.draw_line(l, color=(255, 0, 0))

The threshold is the minimum vote total for a line to be accepted. The vote total adds up the Sobel edge magnitudes of every contributing pixel, so a larger threshold demands longer or stronger edges. The right value therefore depends on the image resolution (the same line covers more pixels at a higher resolution and accumulates more votes) as well as on the scene, and it has to be tuned for the particular application. As rough starting points to tune from: 1000 for a modest line in a clear image, 500 or below for weak contrast or short lines, 2000 or more for busy scenes where clusters of edge noise line up into false-positive lines.

theta_margin and rho_margin control merging of nearby maxima. A single physical edge produces a small cluster of high-vote bins around its true (theta, rho), and the detector collapses each cluster to its peak before returning. theta_margin=25 (degrees) merges any peaks within 25 degrees of orientation; rho_margin=25 (pixels) merges peaks within 25 pixels of distance. The defaults are reasonable; raising them returns fewer, more distinct lines, and lowering them returns more lines, some of them duplicates of the same edge.
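The effect of the two margins can be pictured with a small greedy suppression pass over a list of peaks. This is a sketch under stated assumptions: `merge_peaks` is a hypothetical helper, the peak values are made up, and the firmware's actual merging rule may differ (in particular, this version ignores the wrap-around between 179 and 0 degrees).

```python
def merge_peaks(peaks, theta_margin, rho_margin):
    """Keep only the strongest peak in each (theta, rho) neighbourhood.

    peaks is a list of (theta, rho, magnitude) tuples. Hypothetical helper;
    wrap-around at theta = 0/179 is deliberately not handled.
    """
    kept = []
    # Visit the strongest peaks first so weaker neighbours are the ones dropped
    for theta, rho, mag in sorted(peaks, key=lambda p: p[2], reverse=True):
        near_stronger = any(
            abs(theta - kt) < theta_margin and abs(rho - kr) < rho_margin
            for kt, kr, _ in kept
        )
        if not near_stronger:
            kept.append((theta, rho, mag))
    return kept

# Three bins from one physical edge plus one bin from a different edge
peaks = [(90, 40, 1800), (92, 43, 1500), (88, 38, 1200), (10, 100, 1600)]

wide = merge_peaks(peaks, theta_margin=25, rho_margin=25)
narrow = merge_peaks(peaks, theta_margin=1, rho_margin=1)
print(len(wide), "lines with wide margins;", len(narrow), "with narrow margins")
```

With wide margins the cluster around (90, 40) collapses to its strongest member; with narrow margins every bin survives as its own line.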

x_stride and y_stride set how far the detector steps in each direction while collecting edge pixels for voting, the same way they step through pixels in find_blobs(). The defaults of 2 and 1 work for the common case; raising them speeds up the search at the cost of resolution. roi restricts the search to a region of the frame, which both narrows the lines returned and reduces work.

Each returned line is drawable directly: the Line object passes straight into draw_line(), which reads its first four fields (x1, y1, x2, y2) as the endpoints. l.theta is the angle in degrees, which classifies the line as horizontal, vertical, or diagonal in one comparison. l.magnitude is the vote total, which sorts the returned lines from strongest to weakest.
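A minimal sketch of that classification and sorting, runnable off the camera: `StubLine` is a hypothetical stand-in for the detector's result object with made-up values, `classify` is an illustrative helper, and the mapping of theta to orientation assumes the rho = x·cos(theta) + y·sin(theta) convention (theta near 0 or 180 for a vertical line, near 90 for a horizontal one). Check that mapping against real detections before relying on it.

```python
from collections import namedtuple

# Stand-in for the detector's result object; field names follow the text above.
StubLine = namedtuple("StubLine", "x1 y1 x2 y2 length magnitude theta rho")

def classify(theta, tolerance=10):
    """Label a line by its Hough angle (assumed convention, see lead-in)."""
    if theta <= tolerance or theta >= 180 - tolerance:
        return "vertical"
    if abs(theta - 90) <= tolerance:
        return "horizontal"
    return "diagonal"

# Made-up detections for illustration only
lines = [
    StubLine(0, 60, 159, 62, 159, 1400, 91, 60),
    StubLine(80, 0, 82, 119, 119, 2100, 1, 80),
    StubLine(0, 0, 119, 119, 168, 900, 135, 0),
]

# Strongest line first, using the vote total as the sort key
strongest_first = sorted(lines, key=lambda l: l.magnitude, reverse=True)
for l in strongest_first:
    print(classify(l.theta), l.magnitude)
```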

7.26.3. Line segments

find_lines() is the right detector for edges that span the whole frame, but many real edges – the left side of a printed barcode, the top edge of a label, the visible side of a ruler – only run across part of the image. find_line_segments() returns finite segments whose endpoints are inside the frame:

segments = img.find_line_segments(merge_distance=5, max_theta_difference=10)

for s in segments:
    img.draw_line(s, color=(0, 255, 0))

The segment detector traces along oriented edge pixels directly, rather than voting in Hough space, and the result is a collection of short straight runs. merge_distance sets the maximum pixel gap that two collinear short runs can span and still merge into one returned segment; max_theta_difference sets how many degrees of orientation the merger tolerates between adjacent runs. A generous merge (merge_distance=10, max_theta_difference=15) returns a small number of long segments at the cost of sometimes bridging genuinely separate edges; a strict merge (merge_distance=0, max_theta_difference=5) returns many short segments and lets the application sort them out in Python.
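The two merge parameters amount to a pairwise test on neighbouring runs. The sketch below expresses that test in plain Python as described in the paragraph above; `segment_angle`, `endpoint_gap` and `can_merge` are hypothetical helpers, segments are plain (x1, y1, x2, y2) tuples, and the firmware's own merge logic may measure gap and angle differently.

```python
import math

def segment_angle(seg):
    """Orientation of a segment in degrees, folded into [0, 180)."""
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180

def endpoint_gap(a, b):
    """Smallest distance between an endpoint of a and an endpoint of b."""
    ends_a = [(a[0], a[1]), (a[2], a[3])]
    ends_b = [(b[0], b[1]), (b[2], b[3])]
    return min(math.hypot(ax - bx, ay - by)
               for ax, ay in ends_a for bx, by in ends_b)

def can_merge(a, b, merge_distance, max_theta_difference):
    """True if two runs are close enough in position and orientation."""
    diff = abs(segment_angle(a) - segment_angle(b))
    diff = min(diff, 180 - diff)  # 179 degrees and 1 degree are nearly parallel
    return diff <= max_theta_difference and endpoint_gap(a, b) <= merge_distance

# Two nearly collinear runs separated by a gap of about four pixels
left = (0, 0, 20, 0)
right = (24, 1, 50, 2)

print("generous merge:", can_merge(left, right, 10, 15))
print("strict merge:  ", can_merge(left, right, 0, 5))
```

The same pair of runs is joined under the generous settings and kept apart under the strict ones, which is the trade-off between few long segments and many short ones.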

The result objects are the same Line type as find_lines() returns, with the same properties, so a pipeline can process either kind of detection through the same downstream code path. The only practical difference is that the segments’ endpoints are the actual ends of the line in the image, whereas the infinite lines’ endpoints are wherever the line crosses the image border.

7.26.4. When to use each

The choice between the two methods comes down to a single question: does the application care where the line stops?

find_lines() is the right tool when the answer is no. A line-following robot needs to know which way the line is going and where it crosses the bottom of the frame; the line itself runs to the horizon and beyond. A horizon detector wants the strongest oriented edge in the image; it does not need to know where the horizon ends.
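For the line-following case, the crossing point on a given row can be computed from theta and rho alone. The sketch below assumes the convention rho = x·cos(theta) + y·sin(theta) with the origin at the top-left pixel; `x_at_row` is a hypothetical helper, and the convention should be verified against the values the detector actually returns.

```python
import math

def x_at_row(theta_deg, rho, y):
    """x coordinate where the line (theta, rho) crosses image row y.

    Assumes rho = x*cos(theta) + y*sin(theta). Returns None for a
    horizontal line, which never crosses a row at a single point.
    """
    t = math.radians(theta_deg)
    c = math.cos(t)
    if abs(c) < 1e-6:
        return None
    return (rho - y * math.sin(t)) / c

frame_height = 120  # illustrative frame size, not taken from the text

# A vertical line five pixels from the left edge crosses every row at x = 5
print(x_at_row(0, 5, frame_height - 1))

# A 45-degree line through the point (10, 10)
rho_diag = 10 * math.cos(math.radians(45)) + 10 * math.sin(math.radians(45))
print(x_at_row(45, rho_diag, 10))
```

A steering routine could compare the returned x against the centre column of the frame; the infinite-line form is enough because only the crossing point and the angle matter.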

find_line_segments() is the right tool when the answer is yes. Identifying the four sides of a printed rectangle needs four segments with known endpoints. Tracking a finger pointing at a display means following a short segment whose endpoints are the finger’s tip and base. Measuring the length of a visible scratch needs the segment’s actual extent in pixels.

Both detectors share a common limitation: they need contrast. The Sobel edge filter they build on responds to brightness gradients; a coloured edge against an equally bright background (a red line on a green wall of the same luminance) produces no gradient and no detection. When that case shows up in practice, the fix is to extract a single LAB channel as a grayscale image with the right contrast before searching, and hand that channel image to the line detector. For red against green the channel to pick is a, the green-to-red axis of LAB (b is the blue-to-yellow axis): to_grayscale() with the a channel selected separates the two colours where the luminance channel alone is flat.
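The reason a single LAB channel helps can be checked numerically off the camera. The following is a plain-Python sketch of the standard sRGB to CIE LAB conversion (D65 white point); `srgb_to_lab` is a hypothetical helper that does not use the image module, and the two sample colours are chosen only because their lightness values come out close. It shows the a component (the green-to-red axis) separating a red and a green that the L component barely distinguishes.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIE LAB (D65). Standalone illustrative helper."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB to CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    # Normalise by the D65 reference white
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

red = srgb_to_lab(255, 0, 0)
green = srgb_to_lab(0, 150, 0)

print("L difference:", abs(red[0] - green[0]))  # small: little brightness contrast
print("a difference:", abs(red[1] - green[1]))  # large: strong contrast on the a axis
```

An edge filter run on the L values would see almost nothing at the boundary between these two colours, while the same filter run on the a values would see a steep step.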