7.11. Frame differencing

Frame differencing compares each new frame against a stored reference frame to find the parts of the scene that have changed. It is the workhorse of camera applications that watch for something happening – motion-triggered capture, intrusion alerts, “save a video when something moves” – and it is built entirely from the pixel-wise operations covered earlier: an absolute difference, a threshold, and a region search, run on every frame.

7.11.1. The basic pipeline

The first stage is to acquire a reference. At some point near startup – ideally when the scene is in the state that “no change” means – the application captures a frame and keeps it. The frame becomes the baseline that every subsequent capture will be compared against.

reference = csi0.snapshot().copy()

The .copy() matters. On its own, csi0.snapshot() returns an Image whose buffer lives in the frame buffer, where the next call to snapshot() will overwrite it. .copy() allocates a separate buffer for the reference, so the pixels of this frame survive past the next capture.
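The aliasing problem can be reproduced without a camera. The sketch below is a plain-Python model, not OpenMV code: FakeCamera is a hypothetical stand-in that, like the real frame buffer, hands back the same buffer on every capture.

```python
# Plain-Python model of a camera that reuses one frame buffer.
# FakeCamera is hypothetical; it is not part of any OpenMV module.

class FakeCamera:
    def __init__(self, frames):
        self._frames = iter(frames)
        self._fb = bytearray(4)          # the single, shared frame buffer

    def snapshot(self):
        # Every capture overwrites the same buffer in place and returns it.
        self._fb[:] = next(self._frames)
        return self._fb


FRAMES = [bytes([10, 10, 10, 10]), bytes([10, 200, 10, 10])]

# Without a copy: the "reference" is just another name for the frame buffer.
cam = FakeCamera(FRAMES)
aliased = cam.snapshot()
cam.snapshot()                           # second capture overwrites it
print(list(aliased))                     # [10, 200, 10, 10]

# With a copy: the reference owns its own buffer and keeps frame 1.
cam = FakeCamera(FRAMES)
reference = bytearray(cam.snapshot())    # plays the role of .copy()
cam.snapshot()
print(list(reference))                   # [10, 10, 10, 10]
```

Without the copy, differencing the current frame against the reference would compare the frame buffer with itself and always report no change.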

The second stage runs on every frame: capture a fresh image, then compute the absolute difference between it and the reference. That is exactly what difference() does:

current = csi0.snapshot()
current.difference(reference)

After this call, current holds the difference image: its non-zero pixels mark every position where the scene has changed since the reference was taken, and the value of each pixel is the size of the brightness change at that position.
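To make the arithmetic concrete, here is a minimal plain-Python sketch of a per-pixel absolute difference on two tiny grayscale frames. It models what the text describes and is not the library's implementation; abs_difference is a hypothetical helper, and it returns a new frame rather than modifying current in place.

```python
def abs_difference(current, reference):
    """Return |current - reference| per pixel for two equal-sized frames."""
    return [
        [abs(c - r) for c, r in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, reference)
    ]


reference = [
    [50, 50, 50, 50],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
current = [
    [50, 52, 50, 50],    # +2: a noise-sized change
    [50, 50, 180, 20],   # +130 (brighter) and -30 (darker)
    [50, 50, 175, 50],   # +125
]

diff = abs_difference(current, reference)
for row in diff:
    print(row)
# [0, 2, 0, 0]
# [0, 0, 130, 30]
# [0, 0, 125, 0]
```

Both the brightened and the darkened pixels come out positive, and the noise-sized change is still present at this stage.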

The third stage thresholds the difference image. The raw difference always contains some noise: small brightness variations from sensor shot noise, gradual brightness changes from lighting drift, sub-pixel jitter from slight camera motion. A threshold pass – binary() with a threshold set above that noise floor – keeps only the changes large enough to count as real motion and discards the rest, producing a binary image whose non-zero pixels are the positions that genuinely changed.
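The thresholding step can be modelled in a few lines of plain Python. This is a sketch of the idea only: threshold_mask and the noise floor of 25 are illustrative assumptions, and the real binary() call takes its thresholds in the library's own format.

```python
def threshold_mask(diff, noise_floor):
    """Return a binary mask: 1 where the change exceeds noise_floor, else 0."""
    return [[1 if p > noise_floor else 0 for p in row] for row in diff]


# A small difference image: a few noise-sized values plus one real change.
diff = [
    [0, 2, 0, 0],
    [0, 0, 130, 30],
    [0, 3, 125, 0],
]

mask = threshold_mask(diff, noise_floor=25)
for row in mask:
    print(row)
# [0, 0, 0, 0]
# [0, 0, 1, 1]
# [0, 0, 1, 0]
```

Setting the floor too low lets noise through as false motion; setting it too high hides small or low-contrast movement, so the value is usually tuned against the actual scene.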

The fourth stage extracts connected regions of that binary mask – groups of adjacent non-zero pixels that form contiguous patches. find_blobs() does that in one call, returning a list of motion regions, each with a bounding box and a pixel count, that the rest of the application can act on.
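For readers who want to see what a connected-region search does, the following plain-Python sketch groups adjacent non-zero mask pixels with a flood fill. It is an illustration rather than the find_blobs() algorithm: find_regions is a hypothetical helper, and the choice of 4-connectivity (horizontal and vertical neighbours only) is an assumption.

```python
from collections import deque


def find_regions(mask):
    """Group 4-connected non-zero pixels; return one dict per region."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited seed pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            xs, ys = [], []
            while queue:
                x, y = queue.popleft()
                xs.append(x)
                ys.append(y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append({
                # Bounding box as (x, y, width, height).
                "rect": (min(xs), min(ys),
                         max(xs) - min(xs) + 1, max(ys) - min(ys) + 1),
                "pixels": len(xs),
            })
    return regions


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
regions = find_regions(mask)
for region in regions:
    print(region)
# {'rect': (1, 1, 2, 2), 'pixels': 3}
# {'rect': (4, 2, 1, 2), 'pixels': 2}
```

An application would typically ignore regions whose pixel count falls below some minimum, which removes isolated pixels that survived the threshold.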

Figure: The frame-differencing pipeline: a reference frame plus a current frame become a difference image; thresholding turns the difference into a binary mask of changed positions; a connected-region step turns the mask into a list of motion regions.

7.11.2. In-memory and on-disk references

The basic pipeline keeps the reference frame in RAM. That is the right choice when the reference is captured during the current run of the script and only has to survive for as long as the script keeps running.

For a long-running application – a camera that should resume change detection after a power cycle, or an intermittent script that needs to detect any change since some earlier moment – the reference frame has to outlive the running script. The pattern is to save the reference to disk:

csi0.snapshot().save("/sdcard/reference.bmp")

and to load it back at the start of each run:

reference = image.Image("/sdcard/reference.bmp")

The differencing logic does not change; the only thing that changes is where the reference lives between runs. A few refinements naturally extend this on-disk variant – automatic re-capture of the reference on a timer, optional rolling averages to track slow lighting drift – but the substitution at the centre is the same.
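The rolling-average refinement can be sketched as a weighted blend of each new frame into the reference. The code below is a plain-Python model under stated assumptions: update_reference is a hypothetical helper, the weight is expressed out of 256 so the arithmetic stays in integers, and the 25% weight is chosen only to make the numbers easy to follow.

```python
def update_reference(reference, current, weight=64):
    """Blend `current` into `reference`; weight is out of 256 (64 = 25%)."""
    return [
        [(r * (256 - weight) + c * weight) // 256
         for r, c in zip(ref_row, cur_row)]
        for ref_row, cur_row in zip(reference, current)
    ]


# The scene slowly brightens from 100 to 120 and then stays there.
reference = [[100, 100]]
brighter = [[120, 120]]

trace = []
for _ in range(3):
    reference = update_reference(reference, brighter)
    trace.append(reference[0][0])

print(trace)   # [105, 108, 111]
```

The reference creeps toward the new lighting level instead of jumping to it, so a slow drift is absorbed while a sudden change still stands out in the difference image. The trade-off is that an object that stays in the scene long enough is eventually absorbed into the reference as well.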

7.11.3. Light-source isolation

The same subtraction pattern shows up in a slightly different setting: isolating a light source against the rest of the scene. The trick is to capture a “lights-off” reference – a frame taken when whatever is being detected (an IR beacon, a screen pixel, a status indicator) is not illuminated – and to subtract that reference from each subsequent frame. The result has zero brightness everywhere the scene was the same in both captures, and non-zero brightness only where the light source actually lit up.

7.11.4. Choosing difference or sub

Which arithmetic operation to pick depends on what is being detected. difference() returns the absolute value of the change, with the sign discarded, which makes it sensitive to change in either direction (brightening or darkening) at the cost of not telling the application which direction the change went. For pure motion detection that is the right choice: anything that moved is interesting, regardless of which way the brightness shifted.

For light-source detection, the lit pixel is always brighter than the lights-off reference, so sub() (with its clipping at zero) is the more honest choice. Anywhere the current frame is darker than the reference (which would be sensor noise around the unlit value) clips to zero rather than reporting a spurious “the light was on” signal.
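The contrast between the two operations is easy to see on a single row of pixels. This is a plain-Python model of the arithmetic described above, not the library code; abs_diff and saturating_sub are hypothetical names standing in for difference() and sub().

```python
def abs_diff(current, reference):
    """Absolute change per pixel: sensitive to both directions."""
    return [abs(c - r) for c, r in zip(current, reference)]


def saturating_sub(current, reference):
    """current - reference per pixel, clipped at zero."""
    return [max(c - r, 0) for c, r in zip(current, reference)]


# One row of a lights-off reference and a frame where pixel 2 is lit.
reference = [12, 10, 11, 10, 13]
current = [10, 11, 240, 12, 9]

print(abs_diff(current, reference))        # [2, 1, 229, 2, 4]
print(saturating_sub(current, reference))  # [0, 1, 229, 2, 0]
```

The lit pixel is identical in both results. The difference is in the pixels that came out slightly darker than the reference: the absolute difference reports them as change, while the clipped subtraction sends them to zero.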