7.32. Saving and compression

Every page until now has worked with images on the cam: captured into the frame buffer or allocated on the MicroPython heap, manipulated through the image-module methods, and either displayed in the IDE preview or fed into a downstream stage in the same script. Most applications need at some point to do the opposite: take an image that is currently in RAM and put it somewhere persistent – onto the SD card, onto a USB host, over a network – where something other than the camera can read it.

The image module exposes two paths for that work. The save path writes the image to a file on the filesystem, with the file format chosen by the extension and the encoding details handled by the method. The to-format path returns an Image object containing the encoded byte stream, suitable for handing to a streaming or networking call without ever touching the filesystem. Each fits a different application; both build on the same compression engine underneath.

7.32.1. Saving to a file

save() writes the image to the filesystem at a path:

img.save("/sdcard/capture.jpg")
img.save("/sdcard/capture.bmp")
img.save("/sdcard/region.jpg", roi=(40, 60, 200, 150), quality=85)

The format is picked from the file extension. Five extensions are recognised: .bmp writes a Windows bitmap (lossless, no compression, byte-for-byte the captured pixels); .pgm writes a portable graymap (lossless, grayscale only); .ppm writes a portable pixmap (lossless, RGB); .jpg and .jpeg both write a JPEG (lossy, compressed). The receiver image must already be in the right colour format for the chosen container – a colour image saved as .pgm is an error.
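The extension rules above can be summarised as a small lookup. The sketch below is a hypothetical helper, not part of the image module: the names FORMATS, format_for_path and check_save, the is_grayscale flag, and the lower-casing of the extension are all assumptions made for illustration, and save() performs its own checks regardless.

```python
# Hypothetical helper mirroring the extension rules described in the text.
# It never touches the image module; it only classifies a path.
FORMATS = {
    ".bmp": ("bitmap", "lossless"),
    ".pgm": ("graymap", "lossless"),
    ".ppm": ("pixmap", "lossless"),
    ".jpg": ("jpeg", "lossy"),
    ".jpeg": ("jpeg", "lossy"),
}


def format_for_path(path):
    """Return (container, kind) for a save path, or raise ValueError."""
    dot = path.rfind(".")
    # Lower-casing the extension is an assumption of this sketch.
    ext = path[dot:].lower() if dot != -1 else ""
    if ext not in FORMATS:
        raise ValueError("unrecognised extension: " + repr(ext))
    return FORMATS[ext]


def check_save(path, is_grayscale):
    """Reject the one mismatch the text names: a colour image as .pgm."""
    container, kind = format_for_path(path)
    if container == "graymap" and not is_grayscale:
        raise ValueError("a colour image cannot be saved as .pgm")
    return container, kind
```

Running a check like this before save() lets a script report a bad path or a colour-format mismatch in its own terms instead of relying on whatever error the save raises.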

roi restricts the save to a sub-rectangle of the image, the way every other image-module method’s roi keyword does; the default is the full image. The keyword is ignored when the image being saved is already JPEG-compressed: the existing bytes cover the full frame, and cropping would mean decoding and re-encoding them, which defeats the point of saving the compressed bytes as they are.

quality is the JPEG compression quality from 0 to 100 and is only meaningful when the output is JPEG (the keyword is ignored for the lossless formats). The default of 50 is a reasonable balance for most applications; 70 to 85 is the band for higher visual quality, 30 to 50 suits small thumbnails and bandwidth-constrained transmission, and 90 and up is reserved for cases where the image will be inspected manually or run through a downstream algorithm sensitive to compression artefacts.

The receiver image is returned so the call chains: img.save("/sdcard/x.jpg").draw_string(0, 0, "saved"). The returned object is the same in-memory image; the save is a side effect.

A typical use is the capture-and-log pattern. A trigger fires (a blob is detected, a button is pressed, a timer elapses); the script captures a frame; it appends a timestamp to the filename; and it calls save() to push the image to the SD card. The IDE preview keeps running, the next trigger fires, and the saved files accumulate.
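The capture-and-log pattern can be sketched as a single helper. This is a minimal illustration rather than a prescribed implementation: log_frame, its parameters, and the filename layout are hypothetical, and img is only assumed to provide save(path, quality=...) as described above.

```python
def log_frame(img, timestamp_ms, directory="/sdcard", quality=50):
    """Save img under a timestamped name and return the path used.

    img is assumed to behave like an image-module image, i.e. to offer
    save(path, quality=...). timestamp_ms is whatever integer clock value
    the script has to hand when the trigger fires.
    """
    # Zero-padding keeps the files in capture order when sorted by name.
    path = "{}/capture_{:010d}.jpg".format(directory, timestamp_ms)
    img.save(path, quality=quality)
    return path
```

On the cam the timestamp might come from a millisecond tick counter such as time.ticks_ms(); a counter of that kind restarts at boot, so names can repeat across power cycles unless a real-time clock or a boot counter is folded into the filename.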

7.32.2. Encoding to memory

When the destination is not the filesystem but a network connection, a serial port, or another module’s input, the application needs the encoded byte stream in memory rather than on disk. to_jpeg() and to_png() produce exactly that:

encoded = img.to_jpeg(quality=80, copy=True)
bytes_to_send = encoded.bytearray()
sock.send(bytes_to_send)

The default behaviour is in-place conversion: the receiver is converted into a JPEG (or PNG) image and the same object is returned. With copy=True the conversion writes into a freshly allocated heap object; with copy_to_fb=True the output lands in the frame buffer. The choice is the same one any other conversion method offers – in-place by default, copy when the original is needed afterwards.
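A receiver reading from a stream needs to know where one encoded frame ends and the next begins. The sketch below shows one way to handle that, under stated assumptions: the 4-byte little-endian length header is a convention invented for this example, not something the image module defines, and send_jpeg and _send_all are hypothetical names. sock is assumed to offer send(data) returning the number of bytes accepted.

```python
import struct


def _send_all(sock, data):
    """Keep calling send() until every byte of data has been accepted."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        sent += sock.send(view[sent:])


def send_jpeg(img, sock, quality=80):
    """Encode img to JPEG in memory and send it behind a length header.

    copy=True leaves the original image untouched so it can still be
    processed or displayed after the send.
    """
    encoded = img.to_jpeg(quality=quality, copy=True)
    payload = encoded.bytearray()
    # Assumed framing: 4-byte little-endian payload length, then the JPEG.
    _send_all(sock, struct.pack("<I", len(payload)))
    _send_all(sock, payload)
    return len(payload)
```

The receiving side reads four bytes, unpacks the length, and then reads exactly that many bytes to recover one complete JPEG.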

quality and subsampling are the same JPEG tuning knobs the save path exposes. subsampling chooses the chroma-subsampling scheme: image.JPEG_SUBSAMPLING_AUTO picks the best for the chosen quality, image.JPEG_SUBSAMPLING_444 keeps chroma at full resolution (largest file, best colour accuracy), image.JPEG_SUBSAMPLING_422 and image.JPEG_SUBSAMPLING_420 halve the chroma resolution along one or both axes (smaller files, slight colour softening that is invisible at typical viewing distances). The default of AUTO is the right choice unless the application has a specific need.
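To make the effect of the three fixed schemes concrete, the sketch below counts the samples handed to the encoder for each one. It is illustrative arithmetic only: the string keys and the function name are labels invented here, not image-module constants, the dimensions are assumed to be even, and the count is the encoder's input, not the final file size.

```python
# (horizontal divisor, vertical divisor) applied to each of the two
# chroma planes. The luma plane always stays at full resolution.
CHROMA_DIVISORS = {
    "444": (1, 1),  # chroma at full resolution
    "422": (2, 1),  # chroma halved horizontally
    "420": (2, 2),  # chroma halved on both axes
}


def samples_before_coding(width, height, scheme):
    """Count luma plus chroma samples for one frame under a scheme."""
    h_div, v_div = CHROMA_DIVISORS[scheme]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)
    return luma + chroma


for scheme in ("444", "422", "420"):
    print(scheme, samples_before_coding(640, 480, scheme))
```

For a 640-by-480 frame this gives 921,600, 614,400 and 460,800 samples respectively: 4:2:0 feeds the encoder half as much data as 4:4:4 before any quantisation takes place.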

PNG via to_png() is lossless but slower to encode, and it produces larger files than JPEG for photographic content, which compresses badly under PNG’s prediction scheme. Use PNG when the image is line art, a screenshot, or contains hard-edged graphics drawn over a captured frame – the lossless encoding preserves the sharp edges that JPEG would soften. Otherwise JPEG is the right default.
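The choice between the two encoders can be kept in one place. The following is a minimal sketch, assuming the script already knows whether the frame carries hard-edged content; encode_frame and its hard_edged flag are hypothetical, and img is assumed to offer to_png() and to_jpeg() with the copy and quality keywords described above.

```python
def encode_frame(img, hard_edged, quality=50):
    """Return an encoded copy of img: PNG for hard edges, else JPEG.

    hard_edged is supplied by the caller, for example True after the
    script has drawn text or boxes over the frame.
    """
    if hard_edged:
        # Lossless: keeps drawn lines and text crisp.
        return img.to_png(copy=True)
    # Lossy: much smaller for plain photographic frames.
    return img.to_jpeg(quality=quality, copy=True)
```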

Both to_jpeg() and to_png() accept the same drawing-style positional and scale keywords other conversion methods take – x_scale, y_scale, roi, rgb_channel, alpha, color_palette, alpha_palette, hint – so the same call can encode a scaled, cropped, or palette-mapped version of the source in one step. compress() is the legacy spelling of to_jpeg(); the two take the same arguments and produce the same result.
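As an illustration of encoding a cropped and scaled version in one step, the sketch below builds a thumbnail from a region of interest. The function name and its default values are assumptions chosen for the example; the keywords passed to to_jpeg() are the ones listed above.

```python
def encode_thumbnail(img, roi, scale=0.5, quality=40):
    """Crop img to roi, scale it, and JPEG-encode it in a single call.

    roi is an (x, y, w, h) tuple. copy=True keeps the full-size original
    available for further processing.
    """
    return img.to_jpeg(
        roi=roi,
        x_scale=scale,
        y_scale=scale,
        quality=quality,
        copy=True,
    )
```

A call such as encode_thumbnail(img, (40, 60, 200, 150)) would request a half-size JPEG of that rectangle without the script creating an intermediate cropped image itself.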

7.32.3. What compression buys

The numbers behind the JPEG-versus-raw trade-off are worth working through once.

At two bytes per pixel, a 320-by-240 RGB565 frame is 153,600 bytes (one captured frame at QVGA). A 640-by-480 frame is 614,400 bytes; a 1280-by-960 frame is 2,457,600 bytes. None of those is large by desktop or phone standards, but they are substantial on a cam that has a few MB of RAM in total, an SD card with finite write bandwidth, and a host link that typically runs over USB CDC, a UART, or a wireless module at modest speeds.
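The frame sizes quoted above follow directly from width times height times bytes per pixel. A short sketch of that arithmetic, with a hypothetical helper name:

```python
BYTES_PER_PIXEL_RGB565 = 2  # RGB565 packs one pixel into 16 bits


def raw_frame_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL_RGB565):
    """Size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


for w, h in ((320, 240), (640, 480), (1280, 960)):
    print("{}x{}: {} bytes".format(w, h, raw_frame_bytes(w, h)))
```

Passing bytes_per_pixel=1 gives the grayscale figure for the same resolutions, which is half the RGB565 size.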

JPEG at quality=50 typically compresses a captured photographic frame by 10x to 20x: that 614 KB 640-by-480 frame becomes a 30 to 60 KB encoded byte stream. At quality=85 the ratio drops to 5x to 10x (60 to 120 KB for the same frame). At quality=10 – artefact-laden but still recognisable – the ratio reaches 30x to 50x (12 to 20 KB).
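The size ranges above are the raw frame size divided by the two ends of each ratio range. The sketch below reproduces that division; the ratios are the typical figures quoted in the text, not measurements, and actual sizes depend on scene content. The names are hypothetical.

```python
VGA_RGB565_BYTES = 640 * 480 * 2  # 614,400 bytes

# quality -> (lowest typical ratio, highest typical ratio), from the text
TYPICAL_RATIOS = {
    10: (30, 50),
    50: (10, 20),
    85: (5, 10),
}


def compressed_range(raw_bytes, ratio_low, ratio_high):
    """Return (smallest, largest) expected encoded size in bytes."""
    return raw_bytes // ratio_high, raw_bytes // ratio_low


for quality in sorted(TYPICAL_RATIOS):
    low, high = TYPICAL_RATIOS[quality]
    smallest, largest = compressed_range(VGA_RGB565_BYTES, low, high)
    print("quality={}: {} to {} bytes".format(quality, smallest, largest))
```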

Those numbers determine what is practical to do with the saved frames. An SD card path sustaining 10 MB/s handles 30 frames per second of quality=50 JPEG-encoded VGA content with room to spare (about 1 to 2 MB/s); saving the same content uncompressed requires over 18 MB/s, past what the cam’s filesystem path sustains to the card. A USB host pulling JPEG-encoded frames over CDC at 1 MB/s receives 30 to 60 KB frames at roughly 15 to 30 frames per second; pulling raw frames over the same link, it gets between one and two frames a second.
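The throughput figures can be checked with two one-line formulas: bandwidth needed is frame size times frame rate, and achievable frame rate is link speed divided by frame size. The sketch below uses decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes) as an assumption; the function names are hypothetical.

```python
def required_bytes_per_second(frame_bytes, fps):
    """Sustained bandwidth needed to move frames of this size at fps."""
    return frame_bytes * fps


def achievable_fps(link_bytes_per_second, frame_bytes):
    """Frame rate a link of the given speed can carry, ignoring overhead."""
    return link_bytes_per_second / frame_bytes


RAW_VGA = 640 * 480 * 2  # 614,400 bytes
USB_CDC = 1000000        # 1 MB/s, the link speed used in the text

print("raw VGA at 30 fps:", required_bytes_per_second(RAW_VGA, 30), "B/s")
print("60 KB JPEG at 30 fps:", required_bytes_per_second(60000, 30), "B/s")
print("60 KB JPEG over CDC:", achievable_fps(USB_CDC, 60000), "fps")
print("raw VGA over CDC:", achievable_fps(USB_CDC, RAW_VGA), "fps")
```

Protocol overhead and encode time are left out, so real frame rates will sit somewhat below what these formulas give.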

In short: the compression methods are not just a convenience for saving. They are what makes the captured frame usable outside the cam at frame rates the application cares about. Choosing the right compression – JPEG quality 50 for general logging, 80 for quality work, PNG for line-art capture – is part of the routine work of any non-trivial cam application.