class Image – Image object

The image object is the basic object for machine vision operations.

Hint flags

Many Image methods accept a hint argument which is a logical OR of the following flags:

image.AREA: Use area scaling when downscaling versus the default of nearest neighbor.

image.BILINEAR: Use bilinear scaling versus the default of nearest neighbor scaling.

image.BICUBIC: Use bicubic scaling versus the default of nearest neighbor scaling.

image.CENTER: Center the image being drawn on the display. This is applied after scaling.

image.HMIRROR: Horizontally mirror the image.

image.VFLIP: Vertically flip the image.

image.TRANSPOSE: Transpose the image (swap x/y).

image.EXTRACT_RGB_CHANNEL_FIRST: Do rgb_channel extraction before scaling.

image.APPLY_COLOR_PALETTE_FIRST: Apply color palette before scaling.

image.SCALE_ASPECT_KEEP: Scale the image being drawn to fit inside the display.

image.SCALE_ASPECT_EXPAND: Scale the image being drawn to fill the display (results in cropping).

image.SCALE_ASPECT_IGNORE: Scale the image being drawn to fill the display (results in stretching).

image.ROTATE_90: Rotate the image by 90 degrees (this is just VFLIP | TRANSPOSE).

image.ROTATE_180: Rotate the image by 180 degrees (this is just HMIRROR | VFLIP).

image.ROTATE_270: Rotate the image by 270 degrees (this is just HMIRROR | TRANSPOSE).

image.BLACK_BACKGROUND: Assume the background image being drawn on is black, which speeds up blending. Only supported by Image.draw_image() and Image.get_similarity().
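The rotation flags above are described as combinations of the mirror, flip, and transpose flags. The following is a minimal pure-Python sketch of why those combinations produce rotations, using a tiny image stored as a list of rows. The helper functions are hypothetical stand-ins and are not part of the image module. The sketch assumes the mirror/flip is applied before the transpose and that rotations are clockwise; neither is stated above.

```python
# Illustrative stand-ins for the geometric hint flags (not library code).

def hmirror(rows):
    """Mirror left-to-right (the effect described for HMIRROR)."""
    return [list(reversed(row)) for row in rows]

def vflip(rows):
    """Flip top-to-bottom (the effect described for VFLIP)."""
    return [list(row) for row in reversed(rows)]

def transpose(rows):
    """Swap x and y (the effect described for TRANSPOSE)."""
    return [list(col) for col in zip(*rows)]

src = [[1, 2, 3],
       [4, 5, 6]]

# Assumption: flips run first, the transpose runs last.
rot90 = transpose(vflip(src))     # VFLIP | TRANSPOSE
rot180 = vflip(hmirror(src))      # HMIRROR | VFLIP
rot270 = transpose(hmirror(src))  # HMIRROR | TRANSPOSE

for name, result in (("90", rot90), ("180", rot180), ("270", rot270)):
    print(name, result)
```

The 180 degree case does not depend on the order of the two flips. The 90 and 270 degree cases do, which is why the ordering assumption matters.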

If arg is a string then this creates a new image object from a file at arg path. Supports loading bmp/pgm/ppm/jpg/jpeg/png image files from disk. If copy_to_fb is true the image is copied to the frame buffer versus being allocated on the heap.

If arg is an ndarray then this creates a new image object from the ndarray. ndarray objects with a shape of (w, h) are treated as grayscale images, and those with a shape of (w, h, 3) are treated as RGB565 images. Only float32 ndarrays are supported at this time. When creating an image this way, if you pass a buffer argument it will be used to store the image data versus allocating space on the heap. If copy_to_fb is true the image is copied to the frame buffer versus being allocated on the heap or using the buffer.

If arg is an int it is then considered the width of a new image, and a height value and a format value must follow to create a new blank image object. format can be any image pixformat value like image.GRAYSCALE. The image will be initialized to all zeros. Note that a buffer value is expected for compressed image formats. buffer is considered as the source of image data for creating images this way. If used with copy_to_fb the data from buffer is copied to the frame buffer. If you’d like to create a JPEG image from a JPEG bytes() or bytearray() object, pass the width, height, and image.JPEG for the JPEG, and set buffer to the JPEG byte stream.

Images support “[]” notation. Do image[index] = 8/16-bit value to assign an image pixel or image[index] to get an image pixel, which will be either an 8-bit value for grayscale/bayer images or a 16-bit value for RGB565/YUV images. Binary images return a 1-bit value.

For JPEG images the “[]” notation allows you to access the compressed JPEG image blob as a byte-array. Reading and writing to the data array is opaque, however, as JPEG images are compressed byte streams.

Images also support read buffer operations. You can pass images to all sorts of MicroPython functions as if the image were a byte-array object. In particular, if you’d like to transmit an image you can just pass it to the UART/SPI/I2C write functions to be transmitted automatically.

Basic Methods

width() → int: Returns the image width in pixels.

height() → int: Returns the image height in pixels.

format() → int: Returns image.GRAYSCALE for grayscale images, image.RGB565 for RGB565 images, image.BAYER for bayer pattern images, and image.JPEG for JPEG images.

size() → int: Returns the image size in bytes.

bytearray() → bytearray: Returns a bytearray object that points to the image data for byte-level read/write access.

Note

Image objects are automatically cast as bytes objects when passed to a MicroPython driver that requires a bytes-like object. This is read-only access. Call bytearray() to get read/write access.

get_pixel(x: int, y: int, rgbtuple: bool | None = None) → int | Tuple[int, int, int]

For grayscale images: Returns the grayscale pixel value at location (x, y). For RGB565 images: Returns the RGB888 pixel tuple (r, g, b) at location (x, y). For bayer pattern images: Returns the pixel value at the location (x, y).

Returns None if x or y is outside of the image.

x and y may either be passed independently or as a tuple.

rgbtuple if True causes this method to return an RGB888 tuple. Otherwise, this method returns the integer value of the underlying pixel. I.e. for RGB565 images this method returns a RGB565 value. Defaults to True for RGB565 images and False otherwise.

Not supported on compressed images.
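The difference between an RGB888 tuple and the underlying RGB565 value can be sketched in plain Python. The two helper functions below are hypothetical and not part of the image module. They assume the standard 5-6-5 bit layout (red in the top five bits) and expand back to 8 bits by bit replication; the library's exact rounding and in-memory byte order may differ.

```python
def rgb888_to_rgb565(r, g, b):
    """Pack 8-bit r, g, b into 16 bits: 5 bits red, 6 green, 5 blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def rgb565_to_rgb888(value):
    """Expand a 16-bit RGB565 value to an (r, g, b) tuple of 8-bit values."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    # Replicate the high bits into the low bits so full scale maps to 255
    # (assumed convention, not confirmed for the library).
    return ((r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2))

packed = rgb888_to_rgb565(255, 0, 0)
print(hex(packed), rgb565_to_rgb888(packed))
```

Because RGB565 keeps fewer bits per channel, a round trip through the packed value generally loses the low bits of each channel.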

Note

Image.get_pixel() and Image.set_pixel() are the only methods that allow you to manipulate bayer pattern images. Bayer pattern images are literal images where pixels in the image are R/G/R/G/etc. for even rows and G/B/G/B/etc. for odd rows. Each pixel is 8-bits. If you call this method with rgbtuple set then Image.get_pixel() will debayer the source image at that pixel location and return a valid RGB888 tuple for the pixel location.
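The bayer layout described in the note can be expressed as a small lookup. This is an illustrative sketch only: bayer_color is a hypothetical helper, and it assumes rows and columns are counted from 0 so that row 0 is an "even" row starting with R.

```python
def bayer_color(x, y):
    """Return which color filter covers pixel (x, y) in the layout above."""
    if y % 2 == 0:
        # Even rows alternate R/G/R/G...
        return "R" if x % 2 == 0 else "G"
    # Odd rows alternate G/B/G/B...
    return "G" if x % 2 == 0 else "B"

# Print the top-left 4x4 corner of the pattern.
for y in range(4):
    print(" ".join(bayer_color(x, y) for x in range(4)))
```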

set_pixel(x: int, y: int, pixel: int | Tuple[int, int, int]) → Image

For grayscale images: Sets the pixel at location (x, y) to the grayscale value pixel. For RGB565 images: Sets the pixel at location (x, y) to the RGB888 tuple (r, g, b) pixel. For bayer pattern images: Sets the pixel value at the location (x, y) to the value pixel.

Returns the image object so you can call another method using . notation.

x and y may either be passed independently or as a tuple.

pixel may either be an RGB888 tuple (r, g, b) or the underlying pixel value (i.e. a RGB565 value for RGB565 images or an 8-bit value for grayscale images).

Not supported on compressed images.

Note

Image.get_pixel() and Image.set_pixel() are the only methods that allow you to manipulate bayer pattern images. Bayer pattern images are literal images where pixels in the image are R/G/R/G/etc. for even rows and G/B/G/B/etc. for odd rows. Each pixel is 8-bits. If you call this method with an RGB888 tuple the grayscale value of that RGB888 tuple is extracted and set to the pixel location.

Conversion Methods

to_ndarray(dtype: str, buffer: bytes | bytearray | memoryview | None = None) → ndarray

Returns a ndarray object created from the image. This only works for GRAYSCALE or RGB565 images currently.

dtype can be b, B, or f for creating a signed 8-bit, unsigned 8-bit, or 32-bit floating point ndarray. GRAYSCALE images are directly converted to unsigned 8-bit ndarray objects. For signed 8-bit ndarray objects the values (0:255) are mapped to (-128:127). For 32-bit float ndarray objects the values are mapped to (0.0:255.0). RGB565 images are converted to 3-channel ndarray objects and the same process described above for GRAYSCALE images is applied to each channel depending on dtype. Note that dtype also accepts the integer values (e.g. ord()) of b, B, and f respectively.

buffer if not None is a bytearray object to use as the buffer for the ndarray. If None a new buffer is allocated on the heap to store the ndarray image data. You can use the buffer argument to directly allocate the ndarray in a pre-allocated buffer saving a heap allocation and a copy operation.

The ndarray returned has the shape of (height, width) for GRAYSCALE images and (height, width, 3) for RGB565 images.
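The dtype mapping and output shape for a GRAYSCALE image can be sketched with nested lists standing in for an ndarray. The functions here are hypothetical and only model the behavior described above; in particular, the sketch assumes the signed mapping is a plain offset of 128, so 0 becomes -128 and 255 becomes 127.

```python
def convert_value(value, dtype):
    """Map one 8-bit pixel value according to dtype ('b', 'B', or 'f')."""
    if isinstance(dtype, int):
        dtype = chr(dtype)  # ord('b'), ord('B'), ord('f') are also accepted
    if dtype == "B":
        return value          # unsigned 8-bit: unchanged
    if dtype == "b":
        return value - 128    # signed 8-bit: assumed offset by 128
    if dtype == "f":
        return float(value)   # 32-bit float: 0.0 to 255.0
    raise ValueError("dtype must be 'b', 'B', or 'f'")

def gray_to_nested(pixels, width, height, dtype):
    """Arrange row-major pixels as a (height, width) nested list."""
    return [[convert_value(pixels[y * width + x], dtype)
             for x in range(width)]
            for y in range(height)]

pixels = [0, 128, 255, 10, 20, 30]  # a 3x2 grayscale image, row-major
print(gray_to_nested(pixels, 3, 2, "b"))
```

Note that the outer dimension is the height, matching the (height, width) shape stated above.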

to_bitmap(x_scale: float = 1.0, y_scale: float = 1.0, roi: Tuple[int, int, int, int] | None = None, rgb_channel: int = -1, alpha: int = 255, color_palette: int | Image | None = None, alpha_palette: Image | None = None, hint: int = 0, transform: ndarray | None = None, copy: bool = False, copy_to_fb: bool = False) → Image

Converts an image to a bitmap image (1 bit per pixel).

x_scale controls how much the displayed image is scaled by in the x direction (float). If this value is negative the image will be flipped horizontally. Note that if y_scale is not specified then it will match x_scale to maintain the aspect ratio.

y_scale controls how much the displayed image is scaled by in the y direction (float). If this value is negative the image will be flipped vertically. Note that if x_scale is not specified then it will match y_scale to maintain the aspect ratio.

roi is the region-of-interest rectangle tuple (x, y, w, h) of the source image to draw. This allows you to extract just the pixels in the ROI to scale and draw on the destination image.

rgb_channel is the RGB channel (0=R, 1=G, 2=B) to extract from an RGB565 image (if passed) and to render onto the destination image. For example, if you pass rgb_channel=1 this will extract the green channel of the source RGB565 image and draw that in grayscale on the destination image.

alpha controls how much of the source image to blend into the destination image. A value of 255 draws an opaque source image while a value lower than 255 produces a blend between the source and destination image. 0 results in no modification to the destination image.
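The effect of alpha on a single channel value can be sketched as a weighted average. This is an illustration under assumptions, not the library's implementation: blend is a hypothetical helper, and the integer rounding used here is a guess.

```python
def blend(src, dst, alpha=255):
    """Blend one 8-bit source value into a destination value.

    alpha=255 returns src unchanged, alpha=0 returns dst unchanged.
    """
    return (src * alpha + dst * (255 - alpha)) // 255

print(blend(200, 50, 255))  # fully opaque source
print(blend(200, 50, 128))  # roughly half source, half destination
print(blend(200, 50, 0))    # destination untouched
```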

color_palette if not None can be a color palette enum or a 256 pixel in total RGB565 image to use as a color lookup table on the grayscale value of whatever the source image is. This is applied after rgb_channel extraction if used.

alpha_palette if not None can be a 256 pixel in total GRAYSCALE image to use as an alpha palette which modulates the alpha value of the source image being drawn at a per-pixel level, allowing you to precisely control the alpha value of pixels based on their grayscale value. A pixel value of 255 in the alpha lookup table is opaque, while anything less than 255 becomes more transparent until 0. This is applied after rgb_channel extraction if used.
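Both palettes act as 256-entry lookup tables indexed by the grayscale value of the source pixel. The sketch below models them with plain lists. The helper names are hypothetical, and combining the global alpha with the palette entry by multiplication is an assumption about how "modulates" works.

```python
def apply_color_palette(gray, color_palette):
    """Look up the output color for a grayscale value (0 to 255)."""
    return color_palette[gray]

def effective_alpha(gray, alpha, alpha_palette):
    """Scale the global alpha by the palette entry for this pixel.

    Assumption: the two are multiplied and renormalized to 0..255.
    """
    return (alpha * alpha_palette[gray]) // 255

# A linear ramp: dark pixels are transparent, bright pixels are opaque.
ramp = list(range(256))

# A two-tone color palette: dark half maps to 0x0000, bright half to 0xFFFF.
two_tone = [0x0000] * 128 + [0xFFFF] * 128

print(effective_alpha(255, 255, ramp))      # bright pixel, fully opaque
print(effective_alpha(128, 128, ramp))      # mid-gray pixel at half alpha
print(hex(apply_color_palette(200, two_tone)))
```

A ramp like this makes bright regions of the source dominate the blend while dark regions leave the destination mostly unchanged.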

hint is a logical OR of the flags listed in Hint flags (excluding image.BLACK_BACKGROUND which is not supported here).

transform is a 3x3 ndarray that is used to perform a perspective transformation on the image. Only supported on the OpenMV Cam N6 currently as it has a GPU that can do this in hardware.
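For readers unfamiliar with 3x3 perspective matrices, the following sketch shows the standard way such a matrix maps a point: multiply in homogeneous coordinates, then divide by the third component. apply_homography is a hypothetical helper using nested lists in place of an ndarray. Whether the library interprets the matrix as source-to-destination or the inverse is not stated above and is left open here.

```python
def apply_homography(m, x, y):
    """Map point (x, y) through a 3x3 perspective matrix m (nested lists)."""
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    # The divide by w is what distinguishes a perspective transform
    # from a plain affine one.
    return (xh / w, yh / w)

identity = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]

shift = [[1, 0, 10],
         [0, 1, 5],
         [0, 0, 1]]

print(apply_homography(identity, 3, 4))
print(apply_homography(shift, 3, 4))
```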

copy if True creates a deep-copy on the heap of the image that’s been converted versus converting the original image in-place.

copy_to_fb if True loads the image directly into the frame buffer. copy_to_fb has priority over copy. This has no special effect if the image is already in the frame buffer.

Note

Bitmap images are like grayscale images with only two pixel values - 0 and 1. Additionally, bitmap images are packed such that they only store 1 bit per pixel, making them very small. The OpenMV image library allows bitmap images to be used in all places sensor.GRAYSCALE and sensor.RGB565 images can be used. However, many operations when applied on bitmap images don’t make any sense because bitmap images only have 2 values. OpenMV recommends using bitmap images for mask values in operations and such as they fit on the MicroPython heap quite easily. Finally, bitmap image pixel values 0 and 1 are interpreted as black and white when being applied to sensor.GRAYSCALE or sensor.RGB565 images. The library automatically handles conversion.

Returns the image object so you can call another method using . notation.
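To see why 1 bit per pixel makes bitmap images small, here is a sketch of packing one row of 0/1 pixel values into bytes. pack_row is a hypothetical helper. The bit order (least significant bit first) and the absence of row padding are assumptions for illustration; the library's actual memory layout may differ.

```python
def pack_row(bits):
    """Pack a sequence of 0/1 pixel values into bytes, 8 pixels per byte."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            # Assumed bit order: pixel 0 goes in the least significant bit.
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

row = [1, 0, 0, 0, 0, 0, 0, 0, 1]  # 9 pixels
packed = pack_row(row)
print(len(row), "pixels ->", len(packed), "bytes")

# A 320-pixel row takes 40 bytes packed versus 320 bytes as grayscale.
print(len(pack_row([0] * 320)))
```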

to_grayscale(x_scale: float = 1.0, y_scale: float = 1.0, roi: Tuple[int, int, int, int] | None = None, rgb_channel: int = -1, alpha: int = 255, color_palette: int | Image | None = None, alpha_palette: Image | None = None, hint: int = 0, transform: ndarray | None = None, copy: bool = False, copy_to_fb: bool = False) → Image

Converts an image to a grayscale image (8-bits per pixel).