`image` — machine vision¶

The image module is the heart of the OpenMV machine-vision stack. It exposes the Image class – the in-memory pixel buffer that every drawing, filtering, transformation and feature-extraction routine operates on – together with the result objects those routines return (Blob, Line, Circle, Rect, QRCode, AprilTag, DataMatrix, BarCode, …) and the helper classes used to configure them (Threshold, Histogram, Statistics, HaarCascade, Similarity, Percentile, Displacement, ImageIO).

Acquiring an Image¶

There are four ways to get an Image into RAM:

Live capture from the camera sensor. Call csi.CSI.snapshot() to capture the next frame straight into the frame buffer; the returned Image references that buffer.
From a file. Pass a path to the Image constructor (image.Image("/sd/photo.jpg")); supported on-disk formats are BMP, PPM/PGM, JPEG, PNG and the OpenMV ImageIO recording format.
From an ndarray. Pass a float32 (h, w) or (h, w, 3) ndarray to the Image constructor. The pixels are scaled from 0.0 – 255.0 into a GRAYSCALE or RGB565 image respectively. Use this to bring tensor output from ml (or any ulab pipeline) back into a drawable image.
Empty buffer. Construct an Image with a given size and pixel format (image.Image(320, 240, image.RGB565)) to draw into from scratch, or to use as a scratch surface for image arithmetic.

Pixel formats¶

Every Image has one of the following pixel formats; the choice trades off memory, processing cost and what algorithms can run on it. Use BINARY, GRAYSCALE, RGB565, BAYER, YUV422, JPEG or PNG as the pixformat argument when constructing an image or configuring the camera sensor:

BINARY (1 bpp) – one bit per pixel. The smallest format; used internally by thresholding and morphology routines but rarely captured directly from the sensor.
GRAYSCALE (8 bpp) – one byte per pixel (the Y channel of YUV422). Fastest format for most computer-vision algorithms (AprilTag, edge detection, optical flow).
RGB565 (16 bpp) – two bytes per pixel, 5-bit red / 6-bit green / 5-bit blue. The default colour format.
BAYER (8 bpp) – raw Bayer-pattern colour data straight off the sensor. Useful for custom de-mosaicing or for storing more pixels in less memory before debayering on demand.
YUV422 (16 bpp) – 4:2:2 chroma-subsampled colour, two bytes per pixel. Useful when you want chroma-specific algorithms without paying the full RGB cost.
JPEG / PNG – compressed buffers. Best for storage and network transmission. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.
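To make the memory trade-off concrete, the following is a minimal sketch that estimates the buffer size of an uncompressed frame from the bits-per-pixel figures listed above. The helper name frame_bytes and the string keys are illustrative and not part of the image module, and the estimate assumes tightly packed rows with no padding.

```python
# Bits per pixel for the uncompressed formats listed above.
BITS_PER_PIXEL = {
    "BINARY": 1,
    "GRAYSCALE": 8,
    "BAYER": 8,
    "RGB565": 16,
    "YUV422": 16,
}


def frame_bytes(width, height, pixformat):
    """Approximate buffer size in bytes, ignoring any row padding."""
    bits = width * height * BITS_PER_PIXEL[pixformat]
    return (bits + 7) // 8  # round up to a whole number of bytes


for name in BITS_PER_PIXEL:
    print(name, frame_bytes(320, 240, name))
```

At 320x240 this gives 9600 bytes for BINARY, 76800 for GRAYSCALE and BAYER, and 153600 for RGB565 and YUV422. JPEG and PNG are left out because their size depends on the image content.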

Working with results¶

The detection / feature-extraction methods on Image return objects you can iterate over and combine – an Image.find_blobs() call returns a list of Blob, an Image.find_apriltags() call returns a list of AprilTag, etc. Each result class exposes the geometric properties of the detection (centroid, bounding box, area, code value, etc.) so you can act on them directly or pass them back into drawing methods (Image.draw_rectangle(), Image.draw_string(), …).

Color-space helpers¶

The module also exposes small pure functions for converting individual pixel values between the binary / grayscale / RGB / LAB / YUV colour spaces. These are useful when you need to convert threshold values or palette entries in Python before passing them into image operations – for full-image conversion use the Image to_* methods, which are much faster than calling these helpers in a loop.

Classes¶

Functions¶

Color-space conversion helpers¶

Each of the X_to_Y functions below performs a single pixel-value conversion. They all take/return values in the canonical OpenMV ranges:

binary – int 0 – 1.
grayscale – int 0 – 255.
RGB – (r, g, b) tuple of 8-bit integers (each 0 – 255).
LAB – (l, a, b) tuple with L in 0 – 100 and A/B in -128 – 127.
YUV – (y, u, v) tuple with Y in 0 – 255 and U/V in -128 – 127.

For full-image conversion use the Image to_* methods, which are much faster than calling these helpers in a loop.
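One typical use of these helpers is turning a single reference colour into a threshold. The sketch below builds a threshold window around a LAB value and clamps it to the canonical ranges listed above. It assumes the six-value (l_lo, l_hi, a_lo, a_hi, b_lo, b_hi) LAB threshold layout used for colour thresholds; lab_threshold is a hypothetical helper, not a module function, and the example LAB value is made up rather than computed.

```python
def lab_threshold(center, tolerance):
    """Build (l_lo, l_hi, a_lo, a_hi, b_lo, b_hi) around a LAB centre value."""
    l, a, b = center
    tl, ta, tb = tolerance

    def clamp(value, lo, hi):
        return max(lo, min(hi, value))

    return (
        clamp(l - tl, 0, 100), clamp(l + tl, 0, 100),      # L stays in 0..100
        clamp(a - ta, -128, 127), clamp(a + ta, -128, 127),  # A stays in -128..127
        clamp(b - tb, -128, 127), clamp(b + tb, -128, 127),  # B stays in -128..127
    )


# On the camera the centre would come from image.rgb_to_lab((r, g, b)).
# The LAB value below is an illustrative placeholder.
example_center = (50, 60, 40)
print(lab_threshold(example_center, (20, 15, 15)))  # (30, 70, 45, 75, 25, 55)
```

Doing this once in Python is cheap because only one pixel value is converted; the resulting tuple can then be reused for every frame.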

image.binary_to_grayscale(value: int) → int¶: Convert a binary value to a grayscale value.

image.binary_to_rgb(value: int) → Tuple[int, int, int]¶: Convert a binary value to an RGB tuple.

image.binary_to_lab(value: int) → Tuple[int, int, int]¶: Convert a binary value to a LAB tuple.

image.binary_to_yuv(value: int) → Tuple[int, int, int]¶: Convert a binary value to a YUV tuple.

image.grayscale_to_binary(value: int) → int¶: Convert a grayscale value to a binary value.

image.grayscale_to_rgb(value: int) → Tuple[int, int, int]¶: Convert a grayscale value to an RGB tuple.

image.grayscale_to_lab(value: int) → Tuple[int, int, int]¶: Convert a grayscale value to a LAB tuple.

image.grayscale_to_yuv(value: int) → Tuple[int, int, int]¶: Convert a grayscale value to a YUV tuple.

image.rgb_to_binary(value: Tuple[int, int, int]) → int¶: Convert an RGB tuple to a binary value.

image.rgb_to_grayscale(value: Tuple[int, int, int]) → int¶: Convert an RGB tuple to a grayscale value.

image.rgb_to_lab(value: Tuple[int, int, int]) → Tuple[int, int, int]¶: Convert an RGB tuple to a LAB tuple.

image.rgb_to_yuv(value: Tuple[int, int, int]) → Tuple[int, int, int]¶: Convert an RGB tuple to a YUV tuple.

image.lab_to_binary(value: Tuple[int, int, int]) → int¶: Convert a LAB tuple to a binary value.

image.lab_to_grayscale(value: Tuple[int, int, int]) → int¶: Convert a LAB tuple to a grayscale value.

image.lab_to_rgb(value: Tuple[int, int, int]) → Tuple[int, int, int]¶: Convert a LAB tuple to an RGB tuple.

image.lab_to_yuv(value: Tuple[int, int, int]) → Tuple[int, int, int]¶: Convert a LAB tuple to a YUV tuple.

image.yuv_to_binary(value: Tuple[int, int, int]) → int¶: Convert a YUV tuple to a binary value.

image.yuv_to_grayscale(value: Tuple[int, int, int]) → int¶: Convert a YUV tuple to a grayscale value.

image.yuv_to_rgb(value: Tuple[int, int, int]) → Tuple[int, int, int]¶: Convert a YUV tuple to an RGB tuple.

image.yuv_to_lab(value: Tuple[int, int, int]) → Tuple[int, int, int]¶: Convert a YUV tuple to a LAB tuple.

Feature descriptors¶

image.HaarCascade(path: str, stages: int = -1) → Cascade¶

Load a Haar Cascade and return a Cascade handle for use with Image.find_features().

path may be either:

the literal string "frontalface" or "eye" to load one of the two cascades baked into firmware ROM, or

a filesystem path to a custom .cascade binary file produced by the OpenMV cascade-converter tools.

stages selects how many cascade stages to evaluate at detection time. -1 uses every stage stored in the file. Reducing this value speeds up detection at the cost of more false positives.

image.load_descriptor(path: str) → kp_desc | lbp_desc¶

Load a descriptor from the file at path and return it. The file’s internal type tag selects which descriptor class is reconstructed:

ORB keypoint descriptor – saved by Image.find_keypoints() followed by image.save_descriptor().

LBP descriptor – saved by Image.find_lbp() followed by image.save_descriptor().

image.save_descriptor(descriptor: kp_desc | lbp_desc, path: str) → None¶: Serialise descriptor (an ORB keypoint or LBP descriptor) to the file at path in the OpenMV descriptor file format. The same file can later be reloaded via image.load_descriptor().

image.match_descriptor(descriptor0, descriptor1, threshold: int = 85, filter_outliers: bool = False) → int | kptmatch | None¶

Match two descriptors of the same type.

For two LBP descriptors – returns an integer Hamming distance between them (lower is a closer match).
For two ORB keypoint descriptors – returns a kptmatch describing the matched-keypoint cluster, or None if no match passes threshold.

threshold (0 – 100) sets how strict ORB matching is when accepting a keypoint pair. Lower values tighten matching by rejecting weak nearest-neighbour matches.

filter_outliers enables RANSAC-style outlier rejection across the set of matched keypoints. Use it when you expect a single rigid transform between the two views; disable it when the matched keypoints span multiple objects.
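For readers unfamiliar with the metric, here is a minimal sketch of a Hamming distance over two equal-length byte strings. It only illustrates what the LBP result measures (the number of differing bits); the byte-string representation is an assumption made for the example and says nothing about how the firmware stores descriptors.

```python
def hamming_distance(a, b):
    """Count the bits that differ between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 bit wherever the inputs disagree; count those bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


print(hamming_distance(b"\x0f\xff", b"\x00\xff"))  # 4
print(hamming_distance(b"\xaa\x55", b"\xaa\x55"))  # 0
```

A distance of 0 means the two descriptors are bit-for-bit identical, which is why lower values indicate a closer match.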

Blob geometry helpers¶

These helpers take a Blob (as returned by Image.find_blobs()) and compute additional geometric properties on demand. They live at module scope – not on Blob – so the basic find_blobs() path doesn’t pay for them unless you ask.

image.get_solidity(blob: blob) → float¶: Return the solidity (blob.pixels / convex_hull_area) of blob. Float, 0 – 1; 1.0 means the blob fully fills its convex hull.

image.get_convexity(blob: blob) → float¶: Return the convexity (convex_hull_perimeter / blob.perimeter) of blob. Float, 0 – 1; 1.0 is a perfectly convex blob.

image.get_major_axis_line(blob: blob) → line¶: Return a Line along the major axis of blob (the longer of the two principal axes of the minimum-area rotated rectangle).

image.get_minor_axis_line(blob: blob) → line¶: Return a Line along the minor axis of blob (the shorter of the two principal axes of the minimum-area rotated rectangle).

image.get_enclosing_circle(blob: blob) → circle¶: Return a Circle that encloses blob.

image.get_enclosed_ellipse(blob: blob) → Tuple[int, int, int, int, int]¶

Return a 5-tuple (cx, cy, a, b, rotation) describing the ellipse inscribed in the minimum-area rotated rectangle around blob:

cx / cy – ellipse centre in pixels (integer).

a / b – semi-axis lengths in pixels (integer).

rotation – ellipse rotation in degrees (integer).

This is a plain tuple, not an attrtuple, so the fields are accessible only by index.
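The solidity and convexity ratios defined above can be worked through on a simple polygon. The sketch below computes both for an L-shaped outline in pure Python. It is an illustration of the definitions only: the firmware works on pixel blobs (blob.pixels and blob.perimeter), whereas this example uses polygon area and edge length, and all function names here are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    n = len(vertices)
    twice = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice) / 2


def polygon_perimeter(vertices):
    n = len(vertices)
    return sum(
        math.hypot(
            vertices[(i + 1) % n][0] - vertices[i][0],
            vertices[(i + 1) % n][1] - vertices[i][1],
        )
        for i in range(n)
    )


# L-shaped outline: area 12, perimeter 16.
outline = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
hull = convex_hull(outline)

solidity = polygon_area(outline) / polygon_area(hull)             # 12 / 14
convexity = polygon_perimeter(hull) / polygon_perimeter(outline)  # < 1 for a concave shape
print(round(solidity, 3), round(convexity, 3))
```

The notch in the L lowers both ratios below 1.0; a filled rectangle would score 1.0 on both.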

Constants¶

Pixel formats¶

Pass any of the following as the pixformat argument to the Image constructor or to csi.CSI.pixformat().

image.BINARY: int¶: 1-bit-per-pixel bitmap. Smallest format – used internally by thresholding and morphology, rarely captured directly from a sensor.

image.GRAYSCALE: int¶: 8-bit-per-pixel grayscale (one byte per pixel). The fastest format for most computer-vision algorithms (AprilTag, edge detection, optical flow).

image.RGB565: int¶: 16-bit-per-pixel colour packed as 5 bits red / 6 bits green / 5 bits blue. The default colour format.

image.BAYER: int¶: 8-bit-per-pixel raw Bayer data straight off the sensor. Most image processing methods are not available on Bayer images; use this when you want to debayer on demand or store more pixels in less memory.

image.YUV422: int¶: 4:2:2 chroma-subsampled colour, two bytes per pixel, packed as Y1, U, Y2, V per pixel pair. Only some image processing methods work directly on YUV422.

image.JPEG: int¶: Compressed JPEG buffer. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.

image.PNG: int¶: Compressed PNG buffer. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.
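The 5/6/5 bit split of RGB565 can be sketched as follows. This shows only the bit layout within a 16-bit value; the byte order in which the firmware stores that value in memory is not covered here, and the function names are illustrative rather than part of the module.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit r, g, b into one 16-bit value: 5 bits red, 6 green, 5 blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_rgb565(value):
    """Expand a 16-bit RGB565 value back to 8-bit channels."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    # Replicate the top bits into the low bits so full scale maps to 255.
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


print(hex(pack_rgb565(255, 0, 0)))  # 0xf800
print(unpack_rgb565(0xFFFF))        # (255, 255, 255)
```

Because the low bits of each channel are dropped when packing, a round trip through RGB565 is lossy for most 8-bit colours.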

Colour palettes¶

Pass any of the following to Image.to_rainbow(), Image.to_ironbow(), Image.draw_image() (color_palette=) or to csi.CSI.color_palette() to colorize a grayscale image.

image.PALETTE_RAINBOW: int¶: Smooth rainbow colour wheel. The default OpenMV palette for thermal imagery.

image.PALETTE_IRONBOW: int¶: Non-linear “ironbow” palette that mimics the look of the FLIR Lepton thermal viewfinder.

image.PALETTE_DEPTH: int¶: Depth-image palette. Only available on builds with depth-sensor support (the ToF pipeline – e.g. OpenMV Cam AE3 or any cam with a ToF Pmod attached).

image.PALETTE_EVT_DARK: int¶

Palette for visualising GENX320 event-camera frames on a dark background. Pass to csi.CSI.color_palette() to have the GENX320 driver emit colourised RGB565 frames in histogram mode, or to Image.draw_image() color_palette= when colourising a grayscale event image.

Only available on builds with GENX320 support (OpenMV Cam AE3 and the GENX320 Pmod).

image.PALETTE_EVT_LIGHT: int¶: Palette for visualising GENX320 event-camera frames on a light background. Same dispatch and availability as PALETTE_EVT_DARK.

Scaling modes¶

Pass any of the following as the hint argument to Image.draw_image(), Image.scale(), or similar scaling methods.

image.AREA: int¶: Area-averaging scaler. Averaging applies only when downscaling; upscaling falls back to nearest-neighbour.

image.BILINEAR: int¶: Bilinear scaler. Subsamples when downscaling.

image.BICUBIC: int¶: Bicubic scaler. Higher quality than BILINEAR but slower. Subsamples when downscaling.
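The practical difference between area averaging and subsampling when downscaling shows up on fine patterns. The sketch below shrinks a one-pixel checkerboard by a factor of two both ways, using nested lists as a stand-in for a grayscale image. It is a simplified model: the subsampling function picks pixels without any interpolation, which is not how the firmware's BILINEAR or BICUBIC filters compute their output.

```python
def downscale_area(pixels, factor):
    """Average each factor x factor block into one output pixel."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for y in range(0, h - factor + 1, factor):
        row = []
        for x in range(0, w - factor + 1, factor):
            block = [
                pixels[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def downscale_subsample(pixels, factor):
    """Keep every factor-th pixel and discard the rest."""
    return [row[::factor] for row in pixels[::factor]]


# 4x4 checkerboard alternating 255 and 0.
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]

print(downscale_area(checker, 2))       # [[127, 127], [127, 127]]
print(downscale_subsample(checker, 2))  # [[255, 255], [255, 255]]
```

Averaging yields the mid-grey a viewer would expect, while subsampling happens to land only on white pixels and loses the pattern entirely, which is the aliasing that AREA avoids.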

Drawing / draw_image hints¶

Bit-OR any of these together and pass as the hint argument of Image.draw_image().

image.VFLIP: int¶: Vertically flip the source while drawing.

image.HMIRROR: int¶: Horizontally mirror the source while drawing.

image.TRANSPOSE: int¶: Transpose (swap x/y) the source while drawing.

image.CENTER: int¶: Centre the source on the destination. Any explicit x/y offsets then become offsets from the centre instead of from the top-left.

image.EXTRACT_RGB_CHANNEL_FIRST: int¶: When extracting an RGB channel via Image.draw_image(), extract the channel before scaling. Without this hint, the channel is extracted after scaling.

image.APPLY_COLOR_PALETTE_FIRST: int¶: When applying a colour palette via Image.draw_image(), apply the palette before scaling. Without this hint, the palette is applied after scaling.

image.SCALE_ASPECT_KEEP: int¶: Scale the source to fit inside the destination while maintaining aspect ratio (letterboxes when ratios differ).

image.SCALE_ASPECT_EXPAND: int¶: Scale the source to fill the destination while maintaining aspect ratio (crops when ratios differ).

image.SCALE_ASPECT_IGNORE: int¶: Scale the source to fill the destination, ignoring aspect ratio.

image.BLACK_BACKGROUND: int¶: Tell the alpha-blending path that the destination is known-black so it can skip the read-back of the destination pixel. Speeds up alpha effects on freshly-cleared buffers.

image.ROTATE_90: int¶: Shortcut for VFLIP | TRANSPOSE (rotate 90 degrees clockwise).

image.ROTATE_180: int¶: Shortcut for HMIRROR | VFLIP (rotate 180 degrees).

image.ROTATE_270: int¶: Shortcut for HMIRROR | TRANSPOSE (rotate 270 degrees clockwise).
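The rotation shortcuts can be checked on a small grid. The sketch below models the three basic hints on nested lists and confirms that the documented combinations produce clockwise rotations. The bit values are hypothetical placeholders (read the real ones from the image module), and the order of operations, flips first and transpose last, is an assumption of this model.

```python
# Hypothetical bit values for illustration only.
VFLIP = 1 << 0
HMIRROR = 1 << 1
TRANSPOSE = 1 << 2

ROTATE_90 = VFLIP | TRANSPOSE
ROTATE_180 = HMIRROR | VFLIP
ROTATE_270 = HMIRROR | TRANSPOSE


def apply_hints(pixels, hint):
    """Apply flip/mirror/transpose hints to a list-of-rows image."""
    # Assumption: flips act on the source before the transpose.
    if hint & VFLIP:
        pixels = pixels[::-1]                   # reverse row order
    if hint & HMIRROR:
        pixels = [row[::-1] for row in pixels]  # reverse each row
    if hint & TRANSPOSE:
        pixels = list(zip(*pixels))             # swap x and y
    return [list(row) for row in pixels]


source = [[1, 2, 3],
          [4, 5, 6]]

print(apply_hints(source, ROTATE_90))   # [[4, 1], [5, 2], [6, 3]]
print(apply_hints(source, ROTATE_180))  # [[6, 5, 4], [3, 2, 1]]
print(apply_hints(source, ROTATE_270))  # [[3, 6], [2, 5], [1, 4]]
```

Note that the 90 and 270 degree results swap width and height, so the destination must be sized accordingly.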

JPEG subsampling¶

Pass any of the following as the subsampling argument to Image.to_jpeg(), Image.compress(), or Image.save() when writing a JPEG.

image.JPEG_SUBSAMPLING_AUTO: int¶: Pick chroma subsampling automatically based on the JPEG quality setting.

image.JPEG_SUBSAMPLING_444: int¶: Force 4:4:4 sampling (chroma is kept at full resolution, i.e. not subsampled).

image.JPEG_SUBSAMPLING_422: int¶: Force 4:2:2 chroma subsampling. Recommended when streaming MJPEG to third-party video players that misbehave with 4:2:0.

image.JPEG_SUBSAMPLING_420: int¶: Force 4:2:0 chroma subsampling.
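As a rough guide to what each mode costs, the sketch below counts the luma and chroma samples that go into the encoder for a given frame size. It reflects the standard meaning of the 4:4:4, 4:2:2 and 4:2:0 notations, not measured OpenMV output sizes; the final JPEG size also depends on quality and image content. The names used are illustrative.

```python
# (horizontal divisor, vertical divisor) applied to each of the two chroma planes.
CHROMA_DIVISORS = {"444": (1, 1), "422": (2, 1), "420": (2, 2)}


def ycbcr_samples(width, height, mode):
    """Total Y + Cb + Cr samples before compression for a subsampling mode."""
    hdiv, vdiv = CHROMA_DIVISORS[mode]
    luma = width * height
    chroma_w = (width + hdiv - 1) // hdiv   # round up for odd sizes
    chroma_h = (height + vdiv - 1) // vdiv
    return luma + 2 * chroma_w * chroma_h


for mode in ("444", "422", "420"):
    print(mode, ycbcr_samples(320, 240, mode))
```

For a 320x240 frame this gives 230400, 153600 and 115200 samples respectively, so 4:2:0 feeds the encoder half as much data as 4:4:4.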

Template matching¶

Pass either of the following as the search argument to Image.find_template().

image.SEARCH_EX: int¶: Exhaustive search – evaluates every position in the ROI. Slowest but guaranteed to find the best match.

image.SEARCH_DS: int¶: Diamond search – coarse-to-fine search that is much faster than SEARCH_EX but may miss the global optimum on highly self-similar templates.
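To show why exhaustive search is both the slowest and the safest option, here is a minimal sketch that scores every candidate position. It uses the sum of absolute differences as the score, chosen for brevity; this is a swapped-in metric and is not claimed to be what Image.find_template() uses. Images are modelled as nested lists and the function name is hypothetical.

```python
def find_template_exhaustive(image, template):
    """Return (score, x, y) of the best match, or None if the template does not fit."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    # Visit every top-left position where the template fits inside the image.
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            sad = sum(
                abs(image[y + dy][x + dx] - template[dy][dx])
                for dy in range(th)
                for dx in range(tw)
            )
            if best is None or sad < best[0]:
                best = (sad, x, y)
    return best


frame = [
    [0, 0, 0, 0],
    [0, 0, 9, 8],
    [0, 0, 7, 6],
    [0, 0, 0, 0],
]
patch = [[9, 8], [7, 6]]

print(find_template_exhaustive(frame, patch))  # (0, 2, 1)
```

The cost grows with the number of positions times the template area, which is the work a diamond search avoids by sampling only a subset of positions.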

Edge detection¶

Pass either of the following as the algorithm argument to Image.find_edges().

image.EDGE_CANNY: int¶: Canny edge detector – gradient magnitude + non-max suppression + hysteresis. Higher quality, slower.

image.EDGE_SIMPLE: int¶: Thresholded high-pass-filter edge detector. Faster but produces thicker, noisier edges than EDGE_CANNY.

ORB corner detectors¶

Pass either of the following as the corner_detector argument to Image.find_keypoints().

image.CORNER_FAST: int¶: FAST corner detector. Faster than CORNER_AGAST but less accurate.

image.CORNER_AGAST: int¶: AGAST corner detector. Slower than CORNER_FAST but produces more stable keypoints.

AprilTag families¶

Bit-OR any combination of the following and pass the result as the families argument to Image.find_apriltags(). Each family is gated by its own firmware build option; when a family is compiled out, its constant is not defined at all, rather than being defined as zero.

image.TAG16H5: int¶: AprilTag 16h5 family (30 unique IDs, 0-bit error correction).

image.TAG25H9: int¶: AprilTag 25h9 family (35 unique IDs, up to 3-bit error correction).

image.TAG36H10: int¶: AprilTag 36h10 family (2320 unique IDs, up to 3-bit error correction).

image.TAG36H11: int¶: AprilTag 36h11 family (587 unique IDs, up to 4-bit error correction). The most common family.

image.TAGCIRCLE21H7: int¶: AprilTag Circle21h7 family.

image.TAGCIRCLE49H12: int¶: AprilTag Circle49h12 family.

image.TAGCUSTOM48H12: int¶: AprilTag Custom48h12 family.

image.TAGSTANDARD41H12: int¶: AprilTag Standard41h12 family.

image.TAGSTANDARD52H13: int¶: AprilTag Standard52h13 family.
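Because a family constant may be missing on a given build, code that should run on several firmware variants may want to assemble the mask defensively. The sketch below is one possible approach: it ORs together whichever of the requested constants exist and reports the rest. FakeImageModule and its bit values are stand-ins so the example runs off-camera; on a board you would pass the real image module.

```python
def families_mask(module, names):
    """OR together the family constants that exist; list the ones that do not."""
    mask = 0
    missing = []
    for name in names:
        value = getattr(module, name, None)  # None when the family is compiled out
        if value is None:
            missing.append(name)
        else:
            mask |= value
    return mask, missing


class FakeImageModule:
    # Hypothetical values; this pretend build lacks the Circle21h7 family.
    TAG16H5 = 1
    TAG36H11 = 8


mask, missing = families_mask(
    FakeImageModule, ["TAG36H11", "TAG16H5", "TAGCIRCLE21H7"]
)
print(mask, missing)  # 9 ['TAGCIRCLE21H7']
```

The returned mask is what would be passed as the families argument, and the missing list can be logged so a silent loss of detection coverage does not go unnoticed.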

Barcode symbologies¶

The values reported in BarCode.type for entries returned by Image.find_barcodes().

image.EAN2: int¶: EAN-2 supplemental barcode.

image.EAN5: int¶: EAN-5 supplemental barcode.

image.EAN8: int¶: EAN-8 barcode.

image.UPCE: int¶: UPC-E barcode.

image.ISBN10: int¶: ISBN-10 barcode.

image.UPCA: int¶: UPC-A barcode.

image.EAN13: int¶: EAN-13 barcode.

image.ISBN13: int¶: ISBN-13 barcode.

image.I25: int¶: Interleaved 2-of-5 barcode.

image.DATABAR: int¶: GS1 DataBar barcode.

image.DATABAR_EXP: int¶: GS1 DataBar Expanded barcode.

image.CODABAR: int¶: Codabar barcode.

image.CODE39: int¶: Code 39 barcode.

image.PDF417: int¶: PDF417 2D stacked barcode.

image.CODE93: int¶: Code 93 barcode.

image.CODE128: int¶: Code 128 barcode.
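Since BarCode.type is an integer, a reverse lookup is handy for logging. The sketch below builds a value-to-name table from whichever symbology constants a module defines. FakeImageModule and its numeric values are placeholders so the example runs off-camera; on a board you would pass the real image module, and the helper name is hypothetical.

```python
BARCODE_NAMES = (
    "EAN2", "EAN5", "EAN8", "UPCE", "ISBN10", "UPCA", "EAN13", "ISBN13",
    "I25", "DATABAR", "DATABAR_EXP", "CODABAR", "CODE39", "PDF417",
    "CODE93", "CODE128",
)


def barcode_type_names(module):
    """Map each symbology constant's value to its name, skipping undefined ones."""
    return {
        getattr(module, name): name
        for name in BARCODE_NAMES
        if hasattr(module, name)
    }


class FakeImageModule:
    # Hypothetical values for illustration only.
    EAN13 = 9
    CODE128 = 15


names = barcode_type_names(FakeImageModule)
print(names.get(9, "unknown"))   # EAN13
print(names.get(42, "unknown"))  # unknown
```

Building the table once at start-up keeps the per-frame work down to a single dictionary lookup.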
