image — machine vision¶
The image module is the heart of the OpenMV machine-vision
stack. It exposes the Image class – the in-memory pixel
buffer that every drawing, filtering, transformation and feature-
extraction routine operates on – together with the supporting
result objects returned by those routines (Blob,
Line, Circle, Rect,
QRCode, AprilTag,
DataMatrix, BarCode, …)
and the helper classes used to configure them
(Threshold, Histogram,
Statistics, HaarCascade,
Similarity, Percentile,
Displacement, ImageIO).
Acquiring an Image¶
There are four ways to get an Image into RAM:
Live capture from the camera sensor. Call csi.CSI.snapshot() to capture the next frame straight into the frame buffer; the returned Image references that buffer.
From a file. Pass a path to the Image constructor (image.Image("/sd/photo.jpg")); supported on-disk formats are BMP, PPM/PGM, JPEG, PNG and the OpenMV ImageIO recording format.
From an ndarray. Pass a float32 (h, w) or (h, w, 3) ndarray to the Image constructor. The pixels are scaled from 0.0 – 255.0 into a GRAYSCALE or RGB565 image respectively. Use this to bring tensor output from ml (or any ulab pipeline) back into a drawable image.
Empty buffer. Construct an Image with a given size and pixel format (image.Image(320, 240, image.RGB565)) to draw into from scratch, or to use as a scratch surface for image arithmetic.
Pixel formats¶
Every Image has one of the following pixel formats; the
choice trades off memory, processing cost and what algorithms can
run on it. Use BINARY, GRAYSCALE, RGB565,
BAYER, YUV422, JPEG or PNG as the
pixformat argument when constructing an image or configuring
the camera sensor:
BINARY (1 bpp) – one bit per pixel. The smallest format; used internally by thresholding and morphology routines but rarely captured directly from the sensor.
GRAYSCALE (8 bpp) – one byte per pixel (the Y channel of YUV422). Fastest format for most computer-vision algorithms (AprilTag, edge detection, optical flow).
RGB565 (16 bpp) – two bytes per pixel, 5-bit red / 6-bit green / 5-bit blue. The default colour format.
BAYER (8 bpp) – raw Bayer-pattern colour data straight off the sensor. Useful for custom de-mosaicing or for storing more pixels in less memory before debayering on demand.
YUV422 (16 bpp) – 4:2:2 chroma-subsampled colour, two bytes per pixel. Useful when you want chroma-specific algorithms without paying the full RGB cost.
JPEG / PNG – compressed buffers. Best for storage and network transmission. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.
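The memory trade-off between the uncompressed formats can be estimated from the bits-per-pixel figures listed above. The following is a minimal host-side sketch in plain Python, not part of the image module: the names BITS_PER_PIXEL and frame_bytes are hypothetical, and the calculation assumes no row padding or alignment, which real buffers may add.

```python
# Bits per pixel for the uncompressed formats, taken from the list above.
# JPEG and PNG are omitted because their size depends on the image content.
BITS_PER_PIXEL = {
    "BINARY": 1,
    "GRAYSCALE": 8,
    "RGB565": 16,
    "BAYER": 8,
    "YUV422": 16,
}


def frame_bytes(width, height, pixformat):
    """Estimate the buffer size in bytes (assumes no row padding)."""
    bits = width * height * BITS_PER_PIXEL[pixformat]
    return (bits + 7) // 8  # round up to a whole number of bytes


if __name__ == "__main__":
    # Compare the footprint of one 320x240 frame in each format.
    for fmt in BITS_PER_PIXEL:
        print(fmt, frame_bytes(320, 240, fmt))
```

Under these assumptions a 320x240 RGB565 frame needs twice the memory of the same frame in GRAYSCALE or BAYER, which is the trade-off the list describes.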
Working with results¶
The detection / feature-extraction methods on Image
return objects you can iterate over and combine – an
Image.find_blobs() call returns a list of
Blob, an Image.find_apriltags() call returns
a list of AprilTag, etc. Each result class
exposes the geometric properties of the detection (centroid,
bounding box, area, code value, etc.) so you can act on them
directly or pass them back into drawing methods
(Image.draw_rectangle(), Image.draw_string(), …).
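The "iterate and combine" pattern can be illustrated off-device with stand-in data. This is only a sketch: the (x, y, w, h) tuples below are hypothetical placeholders for detection results, and the helper names area, centre and largest are not part of the image module. On the camera the same filtering would be applied to the list returned by a find_* call, using whatever accessors the result class provides.

```python
# Hypothetical stand-ins for detections: (x, y, w, h) bounding boxes.
detections = [(10, 20, 30, 40), (100, 50, 8, 6), (60, 60, 50, 20)]


def area(rect):
    """Bounding-box area in pixels."""
    _, _, w, h = rect
    return w * h


def centre(rect):
    """Integer centre point of the bounding box."""
    x, y, w, h = rect
    return (x + w // 2, y + h // 2)


def largest(rects, min_area=0):
    """Return the biggest box with at least min_area pixels, or None."""
    candidates = [r for r in rects if area(r) >= min_area]
    return max(candidates, key=area) if candidates else None


if __name__ == "__main__":
    best = largest(detections, min_area=100)
    if best is not None:
        print("largest box:", best, "centre:", centre(best))
```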
Color-space helpers¶
The module also exposes small pure functions for converting
individual pixel values between the binary / grayscale / RGB / LAB
/ YUV colour spaces. These are useful when you need to convert
threshold values or palette entries in Python before passing them
into image operations – for full-image conversion use the
Image to_* methods, which are much faster than calling
these helpers in a loop.
Classes¶
- class Image – Image object
- class ImageIO – ImageIO object
- class HaarCascade – Feature Descriptor
- class Similarity – Similarity Object
- class Histogram – Histogram Object
- class Percentile – Percentile Object
- class Threshold – Threshold Object
- class Statistics – Statistics Object
- class Blob – Blob object
- class Line – Line object
- class Circle – Circle object
- class Rect – Rectangle Object
- class QRCode – QRCode object
- class AprilTag – AprilTag object
- class DataMatrix – DataMatrix object
- class BarCode – BarCode object
- class Displacement – Displacement object
- class kptmatch – Keypoint match object
Functions¶
Color-space conversion helpers¶
Each of the X_to_Y functions below performs a single pixel-value
conversion. They all take/return values in the canonical OpenMV ranges:
binary – int 0 – 1.
grayscale – int 0 – 255.
RGB – (r, g, b) tuple of 8-bit integers (each 0 – 255).
LAB – (l, a, b) tuple with L in 0 – 100 and A/B in -128 – 127.
YUV – (y, u, v) tuple with Y in 0 – 255 and U/V in -128 – 127.
For full-image conversion use the Image to_* methods, which
are much faster than calling these helpers in a loop.
- image.rgb_to_lab(value: Tuple[int, int, int]) -> Tuple[int, int, int]¶
Convert an RGB tuple to a LAB tuple.
- image.rgb_to_yuv(value: Tuple[int, int, int]) -> Tuple[int, int, int]¶
Convert an RGB tuple to a YUV tuple.
- image.lab_to_rgb(value: Tuple[int, int, int]) -> Tuple[int, int, int]¶
Convert a LAB tuple to an RGB tuple.
- image.lab_to_yuv(value: Tuple[int, int, int]) -> Tuple[int, int, int]¶
Convert a LAB tuple to a YUV tuple.
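To give a feel for what a single-pixel conversion involves, here is a host-side sketch of an RGB to YUV conversion that produces values in the ranges listed above. It uses the common BT.601 full-range coefficients, which is an assumption: the firmware may use different coefficients, rounding or lookup tables, so the results need not match image.rgb_to_yuv() exactly. The names clamp and rgb_to_yuv_bt601 are hypothetical.

```python
def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def rgb_to_yuv_bt601(rgb):
    """Convert an (r, g, b) tuple (0-255 each) to (y, u, v).

    Assumes BT.601 full-range coefficients; Y is 0-255 and U/V are
    centred on zero and clamped to -128..127.
    """
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b
    v = 0.5 * r - 0.418688 * g - 0.081312 * b
    return (
        clamp(round(y), 0, 255),
        clamp(round(u), -128, 127),
        clamp(round(v), -128, 127),
    )


if __name__ == "__main__":
    # Neutral greys carry no chroma, so U and V come out as zero.
    for colour in [(0, 0, 0), (128, 128, 128), (255, 255, 255)]:
        print(colour, "->", rgb_to_yuv_bt601(colour))
```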
Feature descriptors¶
- image.HaarCascade(path: str, stages: int = -1) -> Cascade¶
Load a Haar Cascade and return a Cascade handle for use with Image.find_features().
path may be either:
the literal string "frontalface" or "eye" to load one of the two cascades baked into firmware ROM, or
a filesystem path to a custom .cascade binary file produced by the OpenMV cascade-converter tools.
stages selects how many cascade stages to evaluate at detection time. -1 uses every stage stored in the file. Reducing this value speeds up detection at the cost of more false positives.
- image.load_descriptor(path: str) -> kp_desc | lbp_desc¶
Load a descriptor from the file at path and return it. The file’s internal type tag selects which descriptor class is reconstructed:
ORB keypoint descriptor – produced by Image.find_keypoints() and saved with image.save_descriptor().
LBP descriptor – produced by Image.find_lbp() and saved with image.save_descriptor().
- image.save_descriptor(descriptor: kp_desc | lbp_desc, path: str) -> None¶
Serialise descriptor (an ORB keypoint or LBP descriptor) to the file at path in the OpenMV descriptor file format. The same file can later be reloaded via image.load_descriptor().
- image.match_descriptor(descriptor0, descriptor1, threshold: int = 85, filter_outliers: bool = False) -> int | kptmatch¶
Match two descriptors of the same type.
For two LBP descriptors – returns an integer Hamming distance between them (lower is a closer match).
For two ORB keypoint descriptors – returns a kptmatch describing the matched-keypoint cluster, or None if no match passes threshold.
threshold (0 – 100) sets how strict ORB matching is when accepting a keypoint pair. Lower values tighten matching by rejecting weak nearest-neighbour matches.
filter_outliers enables RANSAC-style outlier rejection across the set of matched keypoints. Use it when you expect a single rigid transform between the two views; disable it when the matched keypoints span multiple objects.
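For readers unfamiliar with the term, a Hamming distance counts the bit positions at which two equal-length bit strings differ. The sketch below shows the general idea on plain bytes objects. It is a generic illustration only: the function name hamming_distance is hypothetical, and the internal layout of OpenMV descriptors is not described here, so this does not claim to reproduce what image.match_descriptor() computes.

```python
def hamming_distance(a, b):
    """Count differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # XOR leaves a 1 wherever the two bytes differ; count those bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    first = bytes([0b00001111, 0b11111111])
    second = bytes([0b00000000, 0b11111111])
    # The two inputs differ only in the low four bits of the first byte.
    print(hamming_distance(first, second))
```

A distance of zero means the two bit strings are identical, which is why a lower value indicates a closer match.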
Blob geometry helpers¶
These helpers take a Blob (as returned by
Image.find_blobs()) and compute additional geometric properties on
demand. They live at module scope – not on Blob – so
the basic find_blobs() path doesn’t pay for them unless you ask.
- image.get_solidity(blob: blob) -> float¶
Return the solidity (blob.pixels / convex_hull_area) of blob. Float, 0 – 1; 1.0 means the blob fully fills its convex hull.
- image.get_convexity(blob: blob) -> float¶
Return the convexity (convex_hull_perimeter / blob.perimeter) of blob. Float, 0 – 1; 1.0 is a perfectly convex blob.
- image.get_major_axis_line(blob: blob) -> line¶
Return a Line along the major axis of blob (the longer of the two principal axes of the minimum-area rotated rectangle).
- image.get_minor_axis_line(blob: blob) -> line¶
Return a Line along the minor axis of blob (the shorter of the two principal axes of the minimum-area rotated rectangle).
- image.get_enclosed_ellipse(blob: blob) -> Tuple[int, int, int, int, int]¶
Return a 5-tuple (cx, cy, a, b, rotation) describing the ellipse inscribed in the minimum-area rotated rectangle around blob:
cx / cy – ellipse centre in pixels (integer).
a / b – semi-axis lengths in pixels (integer).
rotation – ellipse rotation in degrees (integer).
This is a plain tuple, not an attrtuple, so the fields are accessible only by index.
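The solidity and convexity ratios defined above can be worked through on a small polygon. The sketch below is a host-side illustration under stated assumptions: the shape is given as a list of outline vertices rather than a Blob, its area comes from the shoelace formula instead of a pixel count, and the hull is built with the monotone-chain algorithm. All function names are hypothetical and none of this is claimed to mirror the firmware implementation.

```python
import math


def cross(o, a, b):
    """Z component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone-chain convex hull, returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly):
    """Shoelace formula for a simple polygon."""
    n = len(poly)
    total = sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )
    return abs(total) / 2


def polygon_perimeter(poly):
    """Sum of the edge lengths of a closed polygon."""
    n = len(poly)
    return sum(
        math.hypot(poly[(i + 1) % n][0] - poly[i][0],
                   poly[(i + 1) % n][1] - poly[i][1])
        for i in range(n)
    )


def solidity(outline):
    """Shape area divided by convex-hull area."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


def convexity(outline):
    """Convex-hull perimeter divided by shape perimeter."""
    return polygon_perimeter(convex_hull(outline)) / polygon_perimeter(outline)


if __name__ == "__main__":
    # An L-shaped outline: a 4x4 square with a 2x2 corner removed.
    l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
    print("solidity:", solidity(l_shape))
    print("convexity:", convexity(l_shape))
```

A convex shape such as a square scores 1.0 on both measures, while the notch in the L-shape pulls both ratios below 1.0.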
Constants¶
Pixel formats¶
Pass any of the following as the pixformat argument to the
Image constructor or to csi.CSI.pixformat().
- image.BINARY: int¶
1-bit-per-pixel bitmap. Smallest format – used internally by thresholding and morphology, rarely captured directly from a sensor.
- image.GRAYSCALE: int¶
8-bit-per-pixel grayscale (one byte per pixel). The fastest format for most computer-vision algorithms (AprilTag, edge detection, optical flow).
- image.RGB565: int¶
16-bit-per-pixel colour packed as 5 bits red / 6 bits green / 5 bits blue. The default colour format.
- image.BAYER: int¶
8-bit-per-pixel raw Bayer data straight off the sensor. Most image processing methods are not available on Bayer images; use this when you want to debayer on demand or store more pixels in less memory.
- image.YUV422: int¶
4:2:2 chroma-subsampled colour, two bytes per pixel, packed as Y1, U, Y2, V per pixel pair. Only some image processing methods work directly on YUV422.
- image.JPEG: int¶
Compressed JPEG buffer. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.
- image.PNG: int¶
Compressed PNG buffer. Pixel-level operations require Image.to_grayscale() or Image.to_rgb565() first.
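The packed layouts described for RGB565 and YUV422 can be made concrete with a little bit arithmetic. This is a host-side sketch, not firmware code: it assumes the conventional RGB565 arrangement with red in the top five bits and blue in the bottom five, says nothing about byte order in memory, and treats the YUV422 bytes as raw unsigned values. The function names are hypothetical.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit r, g, b into one 16-bit value (assumes red in the high bits)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def unpack_rgb565(pixel):
    """Expand a 16-bit RGB565 value back to 8-bit r, g, b."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    # Replicate the top bits into the low bits so full scale maps to 255.
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


def split_yuv422_pair(four_bytes):
    """Split one Y1, U, Y2, V group into two (y, u, v) pixels.

    Both pixels share the same U and V samples, which is what
    4:2:2 chroma subsampling means.
    """
    y1, u, y2, v = four_bytes
    return (y1, u, v), (y2, u, v)


if __name__ == "__main__":
    print(hex(pack_rgb565(255, 0, 0)))
    print(unpack_rgb565(0xFFFF))
    print(split_yuv422_pair(bytes([10, 20, 30, 40])))
```

Packing discards the low bits of each channel, so a round trip through RGB565 is lossy for most colours.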
Colour palettes¶
Pass any of the following to Image.to_rainbow(),
Image.to_ironbow(), Image.draw_image() (color_palette=)
or to csi.CSI.color_palette() to colorize a grayscale image.
- image.PALETTE_RAINBOW: int¶
Smooth rainbow colour wheel. The default OpenMV palette for thermal imagery.
- image.PALETTE_IRONBOW: int¶
Non-linear “ironbow” palette that mimics the look of the FLIR Lepton thermal viewfinder.
- image.PALETTE_DEPTH: int¶
Depth-image palette. Only available on builds with depth-sensor support (the ToF pipeline – e.g. OpenMV Cam AE3 or any cam with a ToF Pmod attached).
- image.PALETTE_EVT_DARK: int¶
Palette for visualising GENX320 event-camera frames on a dark background. Pass to csi.CSI.color_palette to have the GENX320 driver emit colorized RGB565 frames in histogram mode, or to Image.draw_image() color_palette= when colorising a grayscale event image.
Only available on builds with GENX320 support (OpenMV Cam AE3 and the GENX320 Pmod).
- image.PALETTE_EVT_LIGHT: int¶
Palette for visualising GENX320 event-camera frames on a light background. Same dispatch and availability as
PALETTE_EVT_DARK.
Scaling modes¶
Pass any of the following as the hint argument to
Image.draw_image(), Image.scale(), or similar scaling
methods.
Drawing / draw_image hints¶
Bit-OR any of these together and pass as the hint argument of
Image.draw_image().
- image.CENTER: int¶
Centre the source on the destination. Any explicit x/y offsets then become offsets from the centre instead of from the top-left.
- image.EXTRACT_RGB_CHANNEL_FIRST: int¶
When extracting an RGB channel via
Image.draw_image(), extract the channel before scaling. Without this hint, the channel is extracted after scaling.
- image.APPLY_COLOR_PALETTE_FIRST: int¶
When applying a colour palette via
Image.draw_image(), apply the palette before scaling. Without this hint, the palette is applied after scaling.
- image.SCALE_ASPECT_KEEP: int¶
Scale the source to fit inside the destination while maintaining aspect ratio (letterboxes when ratios differ).
- image.SCALE_ASPECT_EXPAND: int¶
Scale the source to fill the destination while maintaining aspect ratio (crops when ratios differ).
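The difference between fitting inside the destination and filling it comes down to which of the two axis ratios drives the scale. The sketch below works through that geometry on the host. It is an illustration of the idea behind SCALE_ASPECT_KEEP, SCALE_ASPECT_EXPAND and CENTER only: the function names are hypothetical and the firmware's exact rounding is not specified here.

```python
def scaled_size(src_w, src_h, dst_w, dst_h, expand=False):
    """Return the source size after aspect-preserving scaling.

    expand=False fits the source inside the destination (letterbox);
    expand=True fills the destination (the overflow would be cropped).
    """
    sx = dst_w / src_w
    sy = dst_h / src_h
    scale = max(sx, sy) if expand else min(sx, sy)
    return round(src_w * scale), round(src_h * scale)


def centred_offset(out_w, out_h, dst_w, dst_h):
    """Top-left position that centres an out_w x out_h image in the destination."""
    return (dst_w - out_w) // 2, (dst_h - out_h) // 2


if __name__ == "__main__":
    # A 320x240 source drawn onto a 128x128 destination.
    keep = scaled_size(320, 240, 128, 128)
    fill = scaled_size(320, 240, 128, 128, expand=True)
    print("keep:", keep, "offset:", centred_offset(*keep, 128, 128))
    print("expand:", fill, "offset:", centred_offset(*fill, 128, 128))
```

A negative offset in the expand case indicates how much of the scaled source falls outside the destination on that axis.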
JPEG subsampling¶
Pass any of the following as the subsampling argument to
Image.to_jpeg(), Image.compress(), or Image.save()
when writing a JPEG.
- image.JPEG_SUBSAMPLING_AUTO: int¶
Pick chroma subsampling automatically based on the JPEG quality setting.
Template matching¶
Pass either of the following as the search argument to
Image.find_template().
Edge detection¶
Pass either of the following as the algorithm argument to
Image.find_edges().
- image.EDGE_CANNY: int¶
Canny edge detector – gradient magnitude + non-max suppression + hysteresis. Higher quality, slower.
- image.EDGE_SIMPLE: int¶
Thresholded high-pass-filter edge detector. Faster but produces thicker, noisier edges than
EDGE_CANNY.
ORB corner detectors¶
Pass either of the following as the corner_detector argument to
Image.find_keypoints().
- image.CORNER_FAST: int¶
FAST corner detector. Faster than CORNER_AGAST but less accurate.
- image.CORNER_AGAST: int¶
AGAST corner detector. Slower than CORNER_FAST but produces more stable keypoints.
AprilTag families¶
Bit-OR any combination of the following and pass as the families
argument to Image.find_apriltags(). Each family is gated by its
own build option in firmware; unsupported families are absent at
runtime rather than always-zero.
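Because a family that is not compiled in is missing from the module rather than defined as zero, code that should run on several builds may want to build the mask defensively. The sketch below shows one way to do that on the host, using a stand-in object in place of the image module. The constant names FAMILY_A, FAMILY_B and FAMILY_C and their values are hypothetical placeholders, not real family constants, and family_mask is not part of the API.

```python
from types import SimpleNamespace

# Stand-in for the image module on a build where FAMILY_C is not compiled in.
# The names and bit values here are hypothetical.
fake_image = SimpleNamespace(FAMILY_A=1, FAMILY_B=2)


def family_mask(module, names):
    """Bit-OR together the family constants that exist on this build."""
    mask = 0
    for name in names:
        # Absent constants contribute nothing instead of raising AttributeError.
        mask |= getattr(module, name, 0)
    return mask


if __name__ == "__main__":
    wanted = ["FAMILY_A", "FAMILY_C"]
    print(family_mask(fake_image, wanted))
```

Checking that the resulting mask is non-zero before calling the detector avoids silently searching for no families at all.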
Barcode symbologies¶
The values reported in BarCode.type for entries
returned by Image.find_barcodes().