For OpenMV firmware v5.0.0 · based on MicroPython v1.28 · Docs built 07 May 2026

Machine vision,
made simple.

Live face detection, AprilTag tracking, QR scanning, and YOLO object detection. All on-device, in pure MicroPython. No host computer, no cloud.

Hello world

import csi
import time
import ml
from ml.postprocessing.ultralytics import YoloV8

csi0 = csi.CSI()
csi0.reset()
csi0.pixformat(csi.RGB565)
csi0.framesize(csi.VGA)

# Built-in single-class person detector model.
model = ml.Model("/rom/yolov8n_192.tflite",
                 postprocess=YoloV8(threshold=0.4))
clock = time.clock()

while True:
    clock.tick()
    img = csi0.snapshot()
    # predict returns a list per class of ((x, y, w, h), score) tuples.
    for class_dets in model.predict([img]):
        for rect, score in class_dets:
            img.draw_rectangle(rect, color=(0, 255, 0))
    print(clock.fps(), "fps")

Real-time person tracking

The on-board YOLOv8 model is a single-class person detector — int8 quantised and shipped in ROM.

Loaded from /rom/yolov8n_192.tflite — no SD card or download needed.
Runs in real time on NPU-equipped boards — the OpenMV N6 and AE3.
Train your own YOLOv8 in Ultralytics — same three lines of glue.
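
Training your own model changes only the load step. A minimal sketch, assuming you export an int8 .tflite from Ultralytics and copy it to the camera's filesystem as /my_yolov8n.tflite (a hypothetical path):

import ml
from ml.postprocessing.ultralytics import YoloV8

# Hypothetical path: your own Ultralytics int8 export, copied to the
# camera's flash or SD card instead of loaded from /rom.
model = ml.Model("/my_yolov8n.tflite",
                 postprocess=YoloV8(threshold=0.4))
# The capture/predict/draw loop is identical to the hello world above.
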
import csi
import math
import time

csi0 = csi.CSI()
csi0.reset()
csi0.pixformat(csi.RGB565)
csi0.framesize(csi.QVGA)
csi0.auto_gain(False)
csi0.auto_whitebal(False)

clock = time.clock()

while True:
    clock.tick()
    img = csi0.snapshot()
    for tag in img.find_apriltags():
        img.draw_detection(tag, color1=(255, 0, 0), color2=(0, 255, 0))
        deg = math.degrees(tag.rotation)
        print("ID %d  rotation %.1f deg" % (tag.id, deg))
    print(clock.fps(), "fps")

Locate and identify AprilTags

AprilTags are 2D fiducial markers — robust to motion blur and partial occlusion, and they give you full 3D pose.

Built-in detector — no model file or training needed.
Returns ID plus full 6-DoF pose — x/y/z translation and x/y/z rotation (see the pose sketch below).
Use for robotics calibration, AR markers, and indoor positioning.
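
To read the pose, pass your lens parameters to find_apriltags. A sketch under two assumptions: the intrinsics below are rough placeholders for a 2.8 mm lens on a QVGA sensor (calibrate them for your optics), and the pose attributes use the same property style as tag.id and tag.rotation above.

# Placeholder intrinsics: (focal length mm / sensor size mm) * resolution.
f_x = (2.8 / 3.984) * 320
f_y = (2.8 / 2.952) * 240
c_x, c_y = 320 // 2, 240 // 2

img = csi0.snapshot()
for tag in img.find_apriltags(fx=f_x, fy=f_y, cx=c_x, cy=c_y):
    # Translation is in tag-size units; scale by your tag's physical
    # size to get real-world distances.
    print("T (%.2f, %.2f, %.2f)  R (%.2f, %.2f, %.2f)" % (
        tag.x_translation, tag.y_translation, tag.z_translation,
        tag.x_rotation, tag.y_rotation, tag.z_rotation))
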
import csi
import time
import ml
from ml.postprocessing.mediapipe import BlazeFace

csi0 = csi.CSI()
csi0.reset()
csi0.pixformat(csi.RGB565)
csi0.framesize(csi.VGA)
csi0.window((400, 400))  # square window for best results

model = ml.Model("/rom/blazeface_front_128.tflite",
                 postprocess=BlazeFace(threshold=0.4))
clock = time.clock()

while True:
    clock.tick()
    img = csi0.snapshot()
    for rect, score, keypoints in model.predict([img]):
        img.draw_rectangle(rect, color=(0, 0, 255))
        ml.utils.draw_keypoints(img, keypoints, color=(255, 0, 0))
    print(clock.fps(), "fps")

Detect faces with BlazeFace

Google's BlazeFace is a lightweight TensorFlow Lite face detector that returns bounding boxes plus six landmarks per face.

Loaded from /rom/blazeface_front_128.tflite — pre-quantised, no download needed.
Six keypoints per face: eyes, nose, mouth, and ears (named in the sketch below).
No privacy concerns — frames never leave the camera.
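
To use the landmarks directly rather than just drawing them, name them as you iterate. A sketch under two assumptions: keypoints is a sequence of (x, y) pixel coordinates, and the order follows MediaPipe's BlazeFace convention (right eye, left eye, nose tip, mouth, right ear tragion, left ear tragion); verify both against your firmware.

# Assumed MediaPipe BlazeFace landmark order; verify for your firmware.
NAMES = ("right eye", "left eye", "nose", "mouth",
         "right ear", "left ear")

img = csi0.snapshot()
for rect, score, keypoints in model.predict([img]):
    for name, kp in zip(NAMES, keypoints):
        print("%s at (%d, %d)" % (name, kp[0], kp[1]))
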
import csi
import time

csi0 = csi.CSI()
csi0.reset()
csi0.pixformat(csi.RGB565)
csi0.framesize(csi.QVGA)
csi0.auto_gain(False)

clock = time.clock()

while True:
    clock.tick()
    img = csi0.snapshot()
    for code in img.find_qrcodes():
        img.draw_rectangle(code.rect, color=(255, 0, 0))
        print(code.payload)
    print(clock.fps(), "fps")

Scan QR codes from a live feed

The built-in QR decoder handles tilted, distorted, and partially occluded codes.

Each result also exposes version, ECC level, and corner coordinates (see the sketch below).
Numeric, alphanumeric, binary, and Kanji data modes.
Returns the decoded payload as a Python string — ready to use.
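
A sketch of reading that metadata, assuming version, ecc_level, and corners use the same property style as payload and rect above:

img = csi0.snapshot()
for code in img.find_qrcodes():
    # version is the QR symbol version (1-40); ecc_level is the error
    # correction level; corners gives the four corner points.
    print("v%d  ECC %d  %s" % (code.version, code.ecc_level, code.payload))
    for corner in code.corners:
        img.draw_cross(corner, color=(0, 255, 0))
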
import csi
import time

csi0 = csi.CSI()
csi0.reset()
csi0.pixformat(csi.RGB565)
csi0.framesize(csi.QVGA)
csi0.auto_gain(False)
csi0.auto_whitebal(False)

# LAB thresholds: (L_min, L_max, A_min, A_max, B_min, B_max)
thresholds = [
    (30, 100, 15, 127, 15, 127),   # red
    (30, 100, -64, -8, -32, 32),   # green
]

clock = time.clock()

while True:
    clock.tick()
    img = csi0.snapshot()
    for blob in img.find_blobs(thresholds, pixels_threshold=200):
        img.draw_rectangle(blob.rect, color=(255, 0, 0))
        img.draw_cross((blob.cx, blob.cy))
    print(clock.fps(), "fps")

Find blobs of color

find_blobs returns connected pixel regions matching one or more LAB thresholds.

Tune thresholds for your lighting — disable auto-gain and auto-whitebal first.
Pass multiple thresholds for multi-color tracking in one call.
pixels_threshold filters tiny detections; merge=True joins overlapping blobs.
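
A sketch of both options together, assuming blob.code follows the documented bitmask convention (one bit per threshold in the list, so bit 0 means red and bit 1 means green here):

img = csi0.snapshot()
# merge=True joins overlapping blobs; blob.code says which threshold(s)
# matched (bit 0 = red, bit 1 = green).
for blob in img.find_blobs(thresholds, pixels_threshold=200, merge=True):
    color = (255, 0, 0) if blob.code & 1 else (0, 255, 0)
    img.draw_rectangle(blob.rect, color=color)
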
import csi
import time

csi0 = csi.CSI()
csi0.reset()
csi0.pixformat(csi.GRAYSCALE)
csi0.framesize(csi.VGA)
csi0.window((640, 80))  # narrow strip for fast linear scanning
csi0.auto_gain(False)
csi0.auto_whitebal(False)

clock = time.clock()

while True:
    clock.tick()
    img = csi0.snapshot()
    for code in img.find_barcodes():
        img.draw_rectangle(code.rect, color=(0, 255, 0))
        print(code.payload, "(quality %d)" % code.quality)
    print(clock.fps(), "fps")

Read 1D barcodes

Find 1D barcodes anywhere in the frame and decode their payloads.

Powered by the ZBar library — recognises EAN, UPC, Code 39/93/128, Codabar, ITF, ISBN, and DataBar.
Use a windowed strip in grayscale for the fastest linear scan.
Each result has format, payload, rotation, corners, and a bounding rect.
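
A sketch of dumping that metadata, assuming the same property style as payload and rect above, with type as the ZBar symbology and rotation in radians:

import math

img = csi0.snapshot()
for code in img.find_barcodes():
    # type identifies the symbology (EAN-13, Code 128, ...); rotation is
    # converted from radians; corners gives the four corner points.
    print("type %d  rotation %.1f deg  %s" % (
        code.type, math.degrees(code.rotation), code.payload))
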
import csi
import time
import ml
from ml.postprocessing.mediapipe import HandLandmarks

csi0 = csi.CSI()
csi0.reset()
csi0.pixformat(csi.RGB565)
csi0.framesize(csi.VGA)
csi0.window((400, 400))  # square window for the model

# Connections between the 21 keypoints — palm + 5 fingers.
hand_lines = ((0, 1), (1, 2), (2, 3), (3, 4), (0, 5), (5, 6),
              (6, 7), (7, 8), (5, 9), (9, 10), (10, 11), (11, 12),
              (9, 13), (13, 14), (14, 15), (15, 16), (13, 17), (17, 18),
              (18, 19), (19, 20), (0, 17))

model = ml.Model("/rom/hand_landmarks_full_224.tflite",
                 postprocess=HandLandmarks(threshold=0.4))
clock = time.clock()

while True:
    clock.tick()
    img = csi0.snapshot()
    # predict returns a list per hand: index 0 = left, index 1 = right.
    for detections in model.predict([img]):
        for rect, score, keypoints in detections:
            ml.utils.draw_skeleton(img, keypoints, hand_lines,
                                   kp_color=(255, 0, 0),
                                   line_color=(0, 255, 0))
    print(clock.fps(), "fps")

Track 21 hand keypoints

Google's MediaPipe Hand Landmarks model places 21 joints on each detected hand — wrist, knuckles, and fingertips.

Loaded from /rom/hand_landmarks_full_224.tflite — running standalone here, without palm detection upstream.
Returns one list per hand — index 0 is left, index 1 is right.
ml.utils.draw_skeleton draws all 21 joints and connections in one call.
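
One way to put the keypoints to work: a pinch detector comparing thumb tip and index fingertip. A sketch assuming keypoints holds (x, y) pixel coordinates in MediaPipe's 21-point order; is_pinching and its 60-pixel threshold are hypothetical, not part of the API.

# Hypothetical helper built on the keypoints above. Indices follow
# MediaPipe's 21-point hand order: 4 = thumb tip, 8 = index fingertip.
# The 60 px default is a placeholder; tune it for your framing.
def is_pinching(keypoints, max_dist=60):
    tx, ty = keypoints[4][0], keypoints[4][1]
    ix, iy = keypoints[8][0], keypoints[8][1]
    return (ix - tx) ** 2 + (iy - ty) ** 2 < max_dist ** 2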

New to OpenMV?

Start with the step-by-step tutorial — it covers hardware setup, the IDE, basic scripts, and tips for your first real project.

Core libraries

Cameras, image processing, ML, ndarrays, I/O, displays, multitasking, networking, and Bluetooth — all from MicroPython.

View all libraries →

Explore by board

Select your OpenMV Cam to see its pinout, specs, and board-specific quick reference.

View all supported boards →

More resources

Community & links