7.17. A catalogue of standard kernels¶
Classical image processing has accumulated a
fair-sized catalogue of kernel weight patterns
that come up over and over – edge detectors,
sharpeners, embosses, smoothers, motion blurs
– and every one of them runs through
morph(). Each is short, each
does one thing, and most are straightforward to
read once the basic logic of the weights makes
sense.
The kernels below are all 3-by-3 unless noted,
so they all use size=1 in the call. Each
kernel’s weight structure is described
alongside it, since reading the weights is what
builds the intuition for why one kernel
embosses and another sharpens.
7.17.1. The identity kernel¶
The simplest possible kernel is the identity – one in the centre, zero everywhere else:
identity = [0, 0, 0,
0, 1, 0,
0, 0, 0]
img.morph(1, identity)
Each output pixel takes its value from the centre of the neighbourhood, which is the input pixel at the same position. The image passes through unchanged. The identity has no practical use as a filter, but it is the useful baseline for understanding every other kernel: any non-identity kernel is the identity plus some modification.
A kernel whose centre weight is large with small negative weights around it subtracts the surround from the centre. A kernel with a zero centre weight ignores the pixel itself and responds only to differences among its neighbours. Reading a kernel this way – what the centre weight does to the pixel, what the surrounding weights add or take away – is the fastest way to predict its effect.
7.17.2. Edge detection¶
Edge-detection kernels respond strongly to positions where brightness is changing rapidly in a particular direction, and produce near-zero output where brightness is uniform. They are the family whose weights sum to zero: a flat patch (every pixel the same value) produces zero output, because every positive weight is exactly cancelled by an equal-magnitude negative weight.
Sobel-x is the canonical example. It detects vertical edges (left/right brightness transitions):
sobel_x = [-1, 0, 1,
-2, 0, 2,
-1, 0, 1]
img.morph(1, sobel_x, mul=0.25, add=128)
The matching Sobel-y is the same pattern rotated 90 degrees; it detects horizontal edges (up/down brightness transitions):
sobel_y = [-1, -2, -1,
0, 0, 0,
1, 2, 1]
The middle row of Sobel-x has weights -2
and 2 rather than -1 and 1. The
extra weight on the centre row gives the kernel
a small built-in smoothing in the direction
along the edge, which makes it more robust
against noise than the simpler Prewitt
operator that drops those extra magnitudes:
prewitt_x = [-1, 0, 1,
-1, 0, 1,
-1, 0, 1]
prewitt_y = [-1, -1, -1,
0, 0, 0,
1, 1, 1]
Prewitt weighs every row equally, so its response is a touch sharper than Sobel’s, at the cost of being more sensitive to single-pixel noise (the cost of running the kernel is identical – the convolution does the same work whatever the weights are). On a clean image with strong edges, it is a perfectly serviceable substitute for Sobel.
Scharr goes the other direction. Its weights are larger and tuned for accurate detection of edge direction at finer angles:
scharr_x = [-3, 0, 3,
-10, 0, 10,
-3, 0, 3]
img.morph(1, scharr_x, mul=0.0625, add=128)
The mul=0.0625 divisor (1/16) brings the
output back inside 0 – 255 after the
larger sum-of-products. Scharr is the right
answer when the application needs the most
geometrically faithful gradient response and is
willing to pay slightly more arithmetic for it.
7.17.3. The Laplacian¶
A Laplacian kernel responds to edges in any direction at once. Where the Sobels each detect brightness changes along one axis, the Laplacian’s symmetric weight pattern responds the same way regardless of which direction the edge is going:
laplacian_4 = [ 0, -1, 0,
-1, 4, -1,
0, -1, 0]
img.morph(1, laplacian_4, add=128)
The structure: centre weight 4, four
horizontal/vertical neighbours weighted -1,
the four diagonals weighted zero. The kernel
sums to zero, so flat patches produce zero
output. Where brightness is changing, the
centre value differs from the average of its
four cardinal neighbours, and the output is the
size of that difference.
The 8-connected variant includes the diagonal neighbours:
laplacian_8 = [-1, -1, -1,
-1, 8, -1,
-1, -1, -1]
Each kernel detects slightly different things. The 4-connected version produces cleaner output on horizontal and vertical edges; the 8-connected one is more isotropic – it responds equally well in every direction – but produces slightly noisier output. The 8-connected kernel also circulates under the name outline, after its use for visualising edges.
7.17.5. Emboss¶
An emboss kernel produces the lit-from-the-side effect found in classical image editors. The output looks like the image was extruded into a relief and then lit from one corner:
emboss = [-2, -1, 0,
-1, 1, 1,
0, 1, 2]
img.morph(1, emboss, add=128)
The trick is the asymmetry across the diagonal.
The upper-left corner has the most negative
weight, the lower-right has the most positive
weight, and the diagonal from corner to corner
runs from negative through one to positive. At
each pixel the kernel essentially computes
“brightness on my lower-right minus brightness
on my upper-left,” which is positive where the
image gets brighter in that direction and
negative where it gets darker. Adding 128
recentres the signed output to mid-grey so the
effect is visible.
Rotating the asymmetry across the other diagonal embosses from the opposite direction:
emboss_alt = [ 0, 1, 2,
-1, 1, 1,
-2, -1, 0]
img.morph(1, emboss_alt, add=128)
The two emboss directions are useful in combination – subtracting one from the other, or running each on the same image and comparing the responses – when an application needs to detect orientation.
7.17.6. Smoothing¶
Smoothing kernels are the family whose weights sum to one (and are all non-negative). A flat patch through such a kernel produces the same flat brightness, because the kernel averages the pixel values together rather than amplifying their differences.
The simplest is the box blur, which is
exactly what mean() computes:
box_blur = [1, 1, 1,
1, 1, 1,
1, 1, 1]
img.morph(1, box_blur)
The kernel sums to 9, so the auto-division
by the kernel sum turns the sum-of-products
into a true average over the nine
neighbourhood pixels. In practice
mean() is the better way to
run this kernel – it produces the same output
faster, through a path optimised for computing
the mean and nothing else, where morph runs
the general convolution machinery. The box blur
is in the catalogue because it is the right
baseline for understanding every other
smoothing kernel.
A 3-by-3 approximation of the Gaussian weights the centre and the cardinal neighbours more than the corners:
gaussian = [1, 2, 1,
2, 4, 2,
1, 2, 1]
img.morph(1, gaussian)
The weights are the Pascal-triangle row
1, 2, 1 outer-producted with itself. The
centre weight 4 is the largest because the
centre pixel contributes most to its own
output; the corners are 1 because they are
the farthest from the centre. The kernel sums
to 16, and the auto-division by the kernel
sum handles the normalisation – no mul
argument needed. The 3-by-3 form is a coarse
approximation of a true Gaussian and
indistinguishable from
gaussian() at size=1;
the morph form is mostly useful when an
application wants to compose the smoothing
with another operation in the same pass.
7.17.7. Motion blur¶
A motion-blur kernel averages pixels along one direction, leaving the perpendicular direction unblurred. The simplest case is horizontal:
motion_h = [0, 0, 0,
1, 1, 1,
0, 0, 0]
img.morph(1, motion_h)
The middle row averages three pixels along the
horizontal axis; the top and bottom rows are
zero. The kernel sums to 3, so the
auto-division by the kernel sum produces a
true three-pixel average without any mul
needed. The output is a horizontally-smeared
copy of the input – the effect a camera
captures when the subject is moving sideways
during the exposure. The vertical motion blur
is the same pattern rotated:
motion_v = [0, 1, 0,
0, 1, 0,
0, 1, 0]
A diagonal motion blur uses the main diagonal:
motion_diag = [1, 0, 0,
0, 1, 0,
0, 0, 1]
img.morph(1, motion_diag)
Motion blur kernels are useful both as an effect (deliberately blurring a frame for visual purposes) and as a test pattern for algorithms that need to be robust against motion artefacts (run the algorithm on a motion-blurred input and check that it still produces the right answer).
7.17.8. Reading kernels at a glance¶
A few rules of thumb make new kernels easier to read at sight:
Sum-to-one with non-negative weights ⇒ smoothing (preserves average brightness).
Sum-to-zero with both positive and negative weights ⇒ edge response (zero on flat patches).
Sum-to-one with a large positive centre and small negative surrounds ⇒ sharpening (identity plus edge response).
Asymmetric across a diagonal with sum to one ⇒ embossing (highlights one side of every brightness transition).
Concentrated along one axis with sum to one ⇒ directional blur.
The first one of these the kernel matches is usually the right guess at what it does. Most useful kernels are recognisable from the layout of their weight pattern alone.
When none of the standard kernels does what
the application wants, the next step is to
hand-tune one. The combination of the rules
above and the mul / add controls covers
nearly every linear pass that a classical
machine vision pipeline has ever wanted; from
there it is a matter of trying weights, looking
at the output, and iterating.