image — machine vision

The image module is the heart of the OpenMV machine-vision stack. It exposes the Image class – the in-memory pixel buffer that every drawing, filtering, transformation and feature-extraction routine operates on – together with the supporting result objects returned by those routines (Blob, Line, Circle, Rect, QRCode, AprilTag, DataMatrix, BarCode, …) and the helper classes used to configure them (Threshold, Histogram, Statistics, HaarCascade, Similarity, Percentile, Displacement, ImageIO).

Acquiring an Image

There are four ways to get an Image into RAM:

  • Live capture from the camera sensor. Call csi.CSI.snapshot() to capture the next frame straight into the frame buffer; the returned Image references that buffer.

  • From a file. Pass a path to the Image constructor (image.Image("/sd/photo.jpg")); supported on-disk formats are BMP, PPM/PGM, JPEG, PNG and the OpenMV ImageIO recording format.

  • From an ndarray. Pass a float32 (h, w) or (h, w, 3) ndarray to the Image constructor. The pixels are scaled from 0.0 – 255.0 into a GRAYSCALE or RGB565 image respectively. Use this to bring tensor output from ml (or any ulab pipeline) back into a drawable image.

  • Empty buffer. Construct an Image with a given size and pixel format (image.Image(320, 240, image.RGB565)) to draw into from scratch, or to use as a scratch surface for image arithmetic.

Pixel formats

Every Image has one of the following pixel formats; the choice trades off memory, processing cost and what algorithms can run on it. Use BINARY, GRAYSCALE, RGB565, BAYER, YUV422, JPEG or PNG as the pixformat argument when constructing an image or configuring the camera sensor:

  • BINARY (1 bpp) – one bit per pixel. The smallest format; used internally by thresholding and morphology routines but rarely captured directly from the sensor.

  • GRAYSCALE (8 bpp) – one byte per pixel (the Y channel of YUV422). Fastest format for most computer-vision algorithms (AprilTag, edge detection, optical flow).

  • RGB565 (16 bpp) – two bytes per pixel, 5-bit red / 6-bit green / 5-bit blue. The default colour format.

  • BAYER (8 bpp) – raw Bayer-pattern colour data straight off the sensor. Useful for custom de-mosaicing or for storing more pixels in less memory before debayering on demand.

  • YUV422 (16 bpp) – 4:2:2 chroma-subsampled colour, two bytes per pixel. Useful when you want chroma-specific algorithms without paying the full RGB cost.

  • JPEG / PNG – compressed buffers. Best for storage and network transmission. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.
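As a rough guide to the memory side of that trade-off, the bits-per-pixel figures above translate into buffer sizes as follows. This is a plain-Python sketch that runs on a desktop interpreter: the BITS_PER_PIXEL table and frame_bytes() are illustrative names, not part of the image module, and the calculation ignores any row padding or alignment the firmware may add.

```python
# Bits per pixel for the uncompressed formats, as listed above.
BITS_PER_PIXEL = {
    "BINARY": 1,
    "GRAYSCALE": 8,
    "RGB565": 16,
    "BAYER": 8,
    "YUV422": 16,
}


def frame_bytes(width, height, pixformat):
    """Approximate buffer size in bytes (assumes no row padding)."""
    bits = width * height * BITS_PER_PIXEL[pixformat]
    return (bits + 7) // 8  # round up to whole bytes


for fmt in BITS_PER_PIXEL:
    print(fmt, frame_bytes(320, 240, fmt))
```

JPEG and PNG are left out because their size depends on image content and compression settings rather than on resolution alone.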

Working with results

The detection / feature-extraction methods on Image return objects you can iterate over and combine – an Image.find_blobs() call returns a list of Blob, an Image.find_apriltags() call returns a list of AprilTag, etc. Each result class exposes the geometric properties of the detection (centroid, bounding box, area, code value, etc.) so you can act on them directly or pass them back into drawing methods (Image.draw_rectangle(), Image.draw_string(), …).
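The pattern of filtering and ranking results can be sketched off-device with a stand-in type. The Blob namedtuple below is hypothetical and only mimics a few of the properties mentioned above; on a camera the list would come from Image.find_blobs() instead.

```python
from collections import namedtuple

# Hypothetical stand-in for a detection result: bounding box plus pixel count.
Blob = namedtuple("Blob", ["x", "y", "w", "h", "pixels"])


def largest_blob(blobs, min_pixels=0):
    """Return the blob with the most pixels, or None if none qualifies."""
    candidates = [b for b in blobs if b.pixels >= min_pixels]
    return max(candidates, key=lambda b: b.pixels) if candidates else None


detections = [
    Blob(10, 10, 20, 20, 310),
    Blob(100, 40, 60, 50, 2400),
    Blob(5, 200, 4, 4, 12),
]

target = largest_blob(detections, min_pixels=100)
# Centre of the bounding box, e.g. to aim a crosshair at.
centre = (target.x + target.w // 2, target.y + target.h // 2)
print(target, centre)
```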

Color-space helpers

The module also exposes small pure functions for converting individual pixel values between the binary / grayscale / RGB / LAB / YUV colour spaces. These are useful when you need to convert threshold values or palette entries in Python before passing them into image operations – for full-image conversion use the Image to_* methods, which are much faster than calling these helpers in a loop.

Classes

Functions

Color-space conversion helpers

Each of the X_to_Y functions below performs a single pixel-value conversion. They all take/return values in the canonical OpenMV ranges:

  • binary – int 0 – 1.

  • grayscale – int 0 – 255.

  • RGB – (r, g, b) tuple of 8-bit integers (each 0 – 255).

  • LAB – (l, a, b) tuple with L in 0 – 100 and A/B in -128 – 127.

  • YUV – (y, u, v) tuple with Y in 0 – 255 and U/V in -128 – 127.

For full-image conversion use the Image to_* methods, which are much faster than calling these helpers in a loop.
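When threshold values come from user input or a configuration file, it can help to check them against these ranges before calling a conversion helper. A minimal plain-Python sketch; in_range() and the RANGES table are illustrative names, not part of the image module.

```python
# Canonical per-channel ranges from the list above: (min, max) inclusive.
RANGES = {
    "binary": [(0, 1)],
    "grayscale": [(0, 255)],
    "rgb": [(0, 255), (0, 255), (0, 255)],
    "lab": [(0, 100), (-128, 127), (-128, 127)],
    "yuv": [(0, 255), (-128, 127), (-128, 127)],
}


def in_range(space, value):
    """Return True if value fits the canonical range for the colour space."""
    channels = value if isinstance(value, tuple) else (value,)
    limits = RANGES[space]
    if len(channels) != len(limits):
        return False
    return all(lo <= c <= hi for c, (lo, hi) in zip(channels, limits))


print(in_range("lab", (50, -20, 127)))  # inside every LAB channel range
print(in_range("rgb", (256, 0, 0)))     # red channel exceeds 255
```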

image.binary_to_grayscale(value: int) int

Convert a binary value to a grayscale value.

image.binary_to_rgb(value: int) Tuple[int, int, int]

Convert a binary value to an RGB tuple.

image.binary_to_lab(value: int) Tuple[int, int, int]

Convert a binary value to a LAB tuple.

image.binary_to_yuv(value: int) Tuple[int, int, int]

Convert a binary value to a YUV tuple.

image.grayscale_to_binary(value: int) int

Convert a grayscale value to a binary value.

image.grayscale_to_rgb(value: int) Tuple[int, int, int]

Convert a grayscale value to an RGB tuple.

image.grayscale_to_lab(value: int) Tuple[int, int, int]

Convert a grayscale value to a LAB tuple.

image.grayscale_to_yuv(value: int) Tuple[int, int, int]

Convert a grayscale value to a YUV tuple.

image.rgb_to_binary(value: Tuple[int, int, int]) int

Convert an RGB tuple to a binary value.

image.rgb_to_grayscale(value: Tuple[int, int, int]) int

Convert an RGB tuple to a grayscale value.

image.rgb_to_lab(value: Tuple[int, int, int]) Tuple[int, int, int]

Convert an RGB tuple to a LAB tuple.

image.rgb_to_yuv(value: Tuple[int, int, int]) Tuple[int, int, int]

Convert an RGB tuple to a YUV tuple.

image.lab_to_binary(value: Tuple[int, int, int]) int

Convert a LAB tuple to a binary value.

image.lab_to_grayscale(value: Tuple[int, int, int]) int

Convert a LAB tuple to a grayscale value.

image.lab_to_rgb(value: Tuple[int, int, int]) Tuple[int, int, int]

Convert a LAB tuple to an RGB tuple.

image.lab_to_yuv(value: Tuple[int, int, int]) Tuple[int, int, int]

Convert a LAB tuple to a YUV tuple.

image.yuv_to_binary(value: Tuple[int, int, int]) int

Convert a YUV tuple to a binary value.

image.yuv_to_grayscale(value: Tuple[int, int, int]) int

Convert a YUV tuple to a grayscale value.

image.yuv_to_rgb(value: Tuple[int, int, int]) Tuple[int, int, int]

Convert a YUV tuple to an RGB tuple.

image.yuv_to_lab(value: Tuple[int, int, int]) Tuple[int, int, int]

Convert a YUV tuple to a LAB tuple.

Feature descriptors

image.HaarCascade(path: str, stages: int = -1) Cascade

Load a Haar Cascade and return a Cascade handle for use with Image.find_features().

path may be either:

  • the literal string "frontalface" or "eye" to load one of the two cascades baked into firmware ROM, or

  • a filesystem path to a custom .cascade binary file produced by the OpenMV cascade-converter tools.

stages selects how many cascade stages to evaluate at detection time. -1 uses every stage stored in the file. Reducing this value speeds up detection at the cost of more false positives.

image.load_descriptor(path: str) kp_desc | lbp_desc

Load a descriptor from the file at path and return it. The file’s internal type tag selects which descriptor class is reconstructed: an ORB keypoint descriptor (kp_desc) or an LBP descriptor (lbp_desc).

image.save_descriptor(descriptor: kp_desc | lbp_desc, path: str) None

Serialise descriptor (an ORB keypoint or LBP descriptor) to the file at path in the OpenMV descriptor file format. The same file can later be reloaded via image.load_descriptor().

image.match_descriptor(descriptor0, descriptor1, threshold: int = 85, filter_outliers: bool = False) int | kptmatch

Match two descriptors of the same type.

  • For two LBP descriptors – returns an integer Hamming distance between them (lower is a closer match).

  • For two ORB keypoint descriptors – returns a kptmatch describing the matched-keypoint cluster, or None if no match passes threshold.

threshold (0 – 100) sets how strict ORB matching is when accepting a keypoint pair. Lower values tighten matching by rejecting weak nearest-neighbour matches.

filter_outliers enables RANSAC-style outlier rejection across the set of matched keypoints. Use it when you expect a single rigid transform between the two views; disable it when the matched keypoints span multiple objects.
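For reference, a Hamming distance counts the bit positions at which two equal-length bit strings differ. The sketch below computes it for two bytes objects in plain Python; it illustrates the metric only and makes no claim about how LBP descriptors are laid out in memory. The function name is illustrative.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


print(hamming_distance(b"\x0f\x00", b"\x0f\x01"))  # differs in one bit
```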

Blob geometry helpers

These helpers take a Blob (as returned by Image.find_blobs()) and compute additional geometric properties on demand. They live at module scope – not on Blob – so the basic find_blobs() path doesn’t pay for them unless you ask.

image.get_solidity(blob: blob) float

Return the solidity (blob.pixels / convex_hull_area) of blob. Float, 0 – 1; 1.0 means the blob fully fills its convex hull.

image.get_convexity(blob: blob) float

Return the convexity (convex_hull_perimeter / blob.perimeter) of blob. Float, 0 – 1; 1.0 is a perfectly convex blob.
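Solidity and convexity can both be reproduced by hand for a simple shape. The plain-Python sketch below uses a polygon outline rather than a pixel blob, so polygon area stands in for blob.pixels; all function names are illustrative and the firmware's hull construction may differ. An L-shaped outline has a dent, so both values come out below 1.

```python
import math


def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def area(poly):
    """Shoelace formula for a simple polygon."""
    n = len(poly)
    s = sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )
    return abs(s) / 2


def perimeter(poly):
    """Sum of edge lengths around a closed polygon."""
    n = len(poly)
    return sum(math.dist(poly[i], poly[(i + 1) % n]) for i in range(n))


# L-shaped outline: a 4x4 square with a 2x2 corner removed.
outline = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
hull = convex_hull(outline)

solidity = area(outline) / area(hull)             # area ratio
convexity = perimeter(hull) / perimeter(outline)  # perimeter ratio
print(round(solidity, 3), round(convexity, 3))
```

For a plain rectangle the hull equals the outline, so both ratios are exactly 1.0.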

image.get_major_axis_line(blob: blob) line

Return a Line along the major axis of blob (the longer of the two principal axes of the minimum-area rotated rectangle).

image.get_minor_axis_line(blob: blob) line

Return a Line along the minor axis of blob (the shorter of the two principal axes of the minimum-area rotated rectangle).

image.get_enclosing_circle(blob: blob) circle

Return a Circle that encloses blob.

image.get_enclosed_ellipse(blob: blob) Tuple[int, int, int, int, int]

Return a 5-tuple (cx, cy, a, b, rotation) describing the ellipse inscribed in the minimum-area rotated rectangle around blob:

  • cx / cy – ellipse centre in pixels (integer).

  • a / b – semi-axis lengths in pixels (integer).

  • rotation – ellipse rotation in degrees (integer).

This is a plain tuple, not an attrtuple, so the fields are accessible only by index.
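Because the fields are positional, unpacking the tuple into named variables (or wrapping it in a namedtuple) keeps downstream code readable. A plain-Python sketch with made-up sample values; the Ellipse type is an illustrative helper, not something the module provides.

```python
import math
from collections import namedtuple

# Illustrative wrapper; field order follows the 5-tuple described above.
Ellipse = namedtuple("Ellipse", ["cx", "cy", "a", "b", "rotation"])

raw = (160, 120, 40, 25, 30)  # sample values standing in for a real result

cx, cy, a, b, rotation = raw  # plain positional unpacking
ellipse = Ellipse(*raw)       # or named access via a wrapper

# Area of an ellipse from its semi-axes: pi * a * b.
ellipse_area = math.pi * ellipse.a * ellipse.b
print(ellipse.cx, ellipse.cy, round(ellipse_area, 1))
```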

Constants

Pixel formats

Pass any of the following as the pixformat argument to the Image constructor or to csi.CSI.pixformat().

image.BINARY: int

1-bit-per-pixel bitmap. Smallest format – used internally by thresholding and morphology, rarely captured directly from a sensor.

image.GRAYSCALE: int

8-bit-per-pixel grayscale (one byte per pixel). The fastest format for most computer-vision algorithms (AprilTag, edge detection, optical flow).

image.RGB565: int

16-bit-per-pixel colour packed as 5 bits red / 6 bits green / 5 bits blue. The default colour format.
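The 5/6/5 split can be made concrete with a little bit arithmetic. This plain-Python sketch packs an 8-bit (r, g, b) triple into a 16-bit value and expands it again; the helper names are illustrative, and the in-memory byte order used by the firmware is not modelled here.

```python
def pack_rgb565(r, g, b):
    """Keep the top 5/6/5 bits of each 8-bit channel."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_rgb565(value):
    """Expand to 8 bits per channel by replicating the high bits."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


print(hex(pack_rgb565(255, 0, 0)))                # 0xf800
print(unpack_rgb565(pack_rgb565(255, 255, 255)))  # (255, 255, 255)
```

The round trip is lossy for most colours, since the low 3 (or 2) bits of each channel are discarded when packing.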

image.BAYER: int

8-bit-per-pixel raw Bayer data straight off the sensor. Most image processing methods are not available on Bayer images; use this when you want to debayer on demand or store more pixels in less memory.

image.YUV422: int

4:2:2 chroma-subsampled colour, two bytes per pixel, packed as Y1, U, Y2, V per pixel pair. Only some image processing methods work directly on YUV422.

image.JPEG: int

Compressed JPEG buffer. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.

image.PNG: int

Compressed PNG buffer. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.

Colour palettes

Pass any of the following to Image.to_rainbow(), Image.to_ironbow(), Image.draw_image() (color_palette=) or to csi.CSI.color_palette() to colorize a grayscale image.

image.PALETTE_RAINBOW: int

Smooth rainbow colour wheel. The default OpenMV palette for thermal imagery.

image.PALETTE_IRONBOW: int

Non-linear “ironbow” palette that mimics the look of the FLIR Lepton thermal viewfinder.

image.PALETTE_DEPTH: int

Depth-image palette. Only available on builds with depth-sensor support (the ToF pipeline – e.g. OpenMV Cam AE3 or any cam with a ToF Pmod attached).

image.PALETTE_EVT_DARK: int

Palette for visualising GENX320 event-camera frames on a dark background. Pass to csi.CSI.color_palette() to have the GENX320 driver emit colourised RGB565 frames in histogram mode, or to Image.draw_image() color_palette= when colourising a grayscale event image.

Only available on builds with GENX320 support (OpenMV Cam AE3 and the GENX320 Pmod).

image.PALETTE_EVT_LIGHT: int

Palette for visualising GENX320 event-camera frames on a light background. Same dispatch and availability as PALETTE_EVT_DARK.

Scaling modes

Pass any of the following as the hint argument to Image.draw_image(), Image.scale(), or similar scaling methods.

image.AREA: int

Area-averaging scaler. Averaging applies when downscaling; nearest-neighbour is used when upscaling.

image.BILINEAR: int

Bilinear scaler. Subsamples when downscaling.

image.BICUBIC: int

Bicubic scaler. Higher quality than BILINEAR but slower. Subsamples when downscaling.

Drawing / draw_image hints

Bit-OR any of these together and pass as the hint argument of Image.draw_image().

image.VFLIP: int

Vertically flip the source while drawing.

image.HMIRROR: int

Horizontally mirror the source while drawing.

image.TRANSPOSE: int

Transpose (swap x/y) the source while drawing.

image.CENTER: int

Centre the source on the destination. Any explicit x/y offsets then become offsets from the centre instead of from the top-left.

image.EXTRACT_RGB_CHANNEL_FIRST: int

When extracting an RGB channel via Image.draw_image(), extract the channel before scaling. Without this hint, the channel is extracted after scaling.

image.APPLY_COLOR_PALETTE_FIRST: int

When applying a colour palette via Image.draw_image(), apply the palette before scaling. Without this hint, the palette is applied after scaling.

image.SCALE_ASPECT_KEEP: int

Scale the source to fit inside the destination while maintaining aspect ratio (letterboxes when ratios differ).

image.SCALE_ASPECT_EXPAND: int

Scale the source to fill the destination while maintaining aspect ratio (crops when ratios differ).

image.SCALE_ASPECT_IGNORE: int

Scale the source to fill the destination, ignoring aspect ratio.
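The three aspect hints differ only in how the scale factor is chosen. A plain-Python sketch of that choice, using illustrative string mode names rather than the module constants; the rounding behaviour is an assumption.

```python
def scaled_size(src_w, src_h, dst_w, dst_h, mode):
    """Return the (width, height) the source is scaled to."""
    sx, sy = dst_w / src_w, dst_h / src_h
    if mode == "keep":      # fit inside: the smaller factor wins
        sx = sy = min(sx, sy)
    elif mode == "expand":  # fill: the larger factor wins, overflow is cropped
        sx = sy = max(sx, sy)
    elif mode != "ignore":  # ignore: stretch each axis independently
        raise ValueError(mode)
    return round(src_w * sx), round(src_h * sy)


# A 320x240 source drawn into a 128x128 destination.
for mode in ("keep", "expand", "ignore"):
    print(mode, scaled_size(320, 240, 128, 128, mode))
```

With "keep" the result is narrower than the destination on one axis (letterboxing); with "expand" it is larger on one axis and the excess is cropped.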

image.BLACK_BACKGROUND: int

Tell the alpha-blending path that the destination is known-black so it can skip the read-back of the destination pixel. Speeds up alpha effects on freshly-cleared buffers.

image.ROTATE_90: int

Shortcut for VFLIP | TRANSPOSE (rotate 90 degrees clockwise).

image.ROTATE_180: int

Shortcut for HMIRROR | VFLIP (rotate 180 degrees).

image.ROTATE_270: int

Shortcut for HMIRROR | TRANSPOSE (rotate 270 degrees clockwise).
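The rotation shortcuts can be checked on a tiny grid. In this plain-Python sketch the flag values are placeholders (not the firmware's), and the order of operations (mirror and flip first, then transpose) is an assumption chosen so that the combinations match the rotations described above.

```python
# Placeholder bit flags for illustration only.
HMIRROR, VFLIP, TRANSPOSE = 1, 2, 4
ROTATE_90 = VFLIP | TRANSPOSE
ROTATE_180 = HMIRROR | VFLIP
ROTATE_270 = HMIRROR | TRANSPOSE


def apply_hint(grid, hint):
    """Apply mirror/flip first, then transpose (assumed order)."""
    if hint & HMIRROR:
        grid = [row[::-1] for row in grid]
    if hint & VFLIP:
        grid = grid[::-1]
    if hint & TRANSPOSE:
        grid = [list(col) for col in zip(*grid)]
    return grid


grid = [[1, 2],
        [3, 4]]
print(apply_hint(grid, ROTATE_90))   # [[3, 1], [4, 2]]
print(apply_hint(grid, ROTATE_180))  # [[4, 3], [2, 1]]
print(apply_hint(grid, ROTATE_270))  # [[2, 4], [1, 3]]
```

Applying the transpose before the flip instead would turn the 90-degree case into a counter-clockwise rotation, which is why the order is called out as an assumption.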

JPEG subsampling

Pass any of the following as the subsampling argument to Image.to_jpeg(), Image.compress(), or Image.save() when writing a JPEG.

image.JPEG_SUBSAMPLING_AUTO: int

Pick chroma subsampling automatically based on the JPEG quality setting.

image.JPEG_SUBSAMPLING_444: int

Force 4:4:4 chroma subsampling (no chroma compression).

image.JPEG_SUBSAMPLING_422: int

Force 4:2:2 chroma subsampling. Recommended when streaming MJPEG to third-party video players that misbehave with 4:2:0.

image.JPEG_SUBSAMPLING_420: int

Force 4:2:0 chroma subsampling.
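To see what each mode costs before entropy coding, count the samples. A plain-Python sketch with illustrative names; it assumes even image dimensions and says nothing about final file size, which also depends on quality and image content.

```python
# Horizontal and vertical chroma decimation factors for each mode.
CHROMA_DIVISORS = {"444": (1, 1), "422": (2, 1), "420": (2, 2)}


def sample_count(width, height, mode):
    """Total Y + Cb + Cr samples for an image with even dimensions."""
    dx, dy = CHROMA_DIVISORS[mode]
    luma = width * height
    chroma = (width // dx) * (height // dy)
    return luma + 2 * chroma


for mode in ("444", "422", "420"):
    print(mode, sample_count(640, 480, mode))
```

Relative to 4:4:4, the 4:2:2 mode carries two thirds of the samples and 4:2:0 carries half.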

Template matching

Pass either of the following as the search argument to Image.find_template().

image.SEARCH_EX: int

Exhaustive search – evaluates every position in the ROI. Slowest but guaranteed to find the best match.

image.SEARCH_DS: int

Diamond search – coarse-to-fine search that is much faster than SEARCH_EX but may miss the global optimum on highly self-similar templates.
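An exhaustive search is easy to picture as two nested loops over every candidate offset. The plain-Python sketch below scores each position with a sum of absolute differences, a stand-in metric chosen for brevity and not necessarily what Image.find_template() uses; the function name is illustrative.

```python
def exhaustive_search(image, template):
    """Return ((x, y), score) of the lowest-SAD placement of template."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Sum of absolute differences over the template window.
            score = sum(
                abs(image[y + j][x + i] - template[j][i])
                for j in range(th)
                for i in range(tw)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (x, y), score
    return best_pos, best_score


image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
    [0, 0, 0, 0],
]
template = [[9, 8],
            [7, 6]]
print(exhaustive_search(image, template))  # ((1, 1), 0)
```

A diamond search visits only a subset of these offsets, which is where both its speed-up and its risk of missing the best match come from.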

Edge detection

Pass either of the following as the algorithm argument to Image.find_edges().

image.EDGE_CANNY: int

Canny edge detector – gradient magnitude + non-max suppression + hysteresis. Higher quality, slower.

image.EDGE_SIMPLE: int

Thresholded high-pass-filter edge detector. Faster but produces thicker, noisier edges than EDGE_CANNY.

ORB corner detectors

Pass either of the following as the corner_detector argument to Image.find_keypoints().

image.CORNER_FAST: int

FAST corner detector. Faster than CORNER_AGAST but less accurate.

image.CORNER_AGAST: int

AGAST corner detector. Slower than CORNER_FAST but produces more stable keypoints.

AprilTag families

Bit-OR any combination of the following and pass as the families argument to Image.find_apriltags(). Each family is gated by its own build option in firmware; the constant for a family that was not compiled in is absent from the module at runtime rather than defined as zero.

image.TAG16H5: int

AprilTag 16h5 family (30 unique IDs, 0-bit error correction).

image.TAG25H9: int

AprilTag 25h9 family (35 unique IDs, up to 3-bit error correction).

image.TAG36H10: int

AprilTag 36h10 family (2320 unique IDs, up to 3-bit error correction).

image.TAG36H11: int

AprilTag 36h11 family (587 unique IDs, up to 4-bit error correction). The most common family.

image.TAGCIRCLE21H7: int

AprilTag Circle21h7 family.

image.TAGCIRCLE49H12: int

AprilTag Circle49h12 family.

image.TAGCUSTOM48H12: int

AprilTag Custom48h12 family.

image.TAGSTANDARD41H12: int

AprilTag Standard41h12 family.

image.TAGSTANDARD52H13: int

AprilTag Standard52h13 family.
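Because a family constant may be missing from a given build, code that should run across boards can assemble the mask defensively. In this sketch a SimpleNamespace with placeholder values stands in for the image module so the snippet runs anywhere; on a camera you would pass the real module instead. The helper name is illustrative.

```python
from types import SimpleNamespace

# Stand-in for the image module on a build that lacks TAG25H9.
# The numeric values are placeholders, not the firmware's.
fake_image = SimpleNamespace(TAG16H5=1, TAG36H11=8)


def family_mask(module, names):
    """Bit-OR the families the module defines; report the rest."""
    mask, missing = 0, []
    for name in names:
        value = getattr(module, name, None)
        if value is None:
            missing.append(name)
        else:
            mask |= value
    return mask, missing


mask, missing = family_mask(fake_image, ["TAG16H5", "TAG25H9", "TAG36H11"])
print(mask, missing)  # 9 ['TAG25H9']
```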

Barcode symbologies

The values reported in BarCode.type for entries returned by Image.find_barcodes().

image.EAN2: int

EAN-2 supplemental barcode.

image.EAN5: int

EAN-5 supplemental barcode.

image.EAN8: int

EAN-8 barcode.

image.UPCE: int

UPC-E barcode.

image.ISBN10: int

ISBN-10 barcode.

image.UPCA: int

UPC-A barcode.

image.EAN13: int

EAN-13 barcode.

image.ISBN13: int

ISBN-13 barcode.

image.I25: int

Interleaved 2-of-5 barcode.

image.DATABAR: int

GS1 DataBar barcode.

image.DATABAR_EXP: int

GS1 DataBar Expanded barcode.

image.CODABAR: int

Codabar barcode.

image.CODE39: int

Code 39 barcode.

image.PDF417: int

PDF417 2D stacked barcode.

image.CODE93: int

Code 93 barcode.

image.CODE128: int

Code 128 barcode.