7.11. NPUs
The H7 and the RT1062 run inference on a Cortex-M CPU through
TFLM (TensorFlow Lite for Microcontrollers) and CMSIS-NN. The AE3
and the N6 add a dedicated NPU on the same die – a tensor pipeline in
fixed silicon that runs the heavy operators without occupying the
CPU. The two NPUs in OpenMV’s lineup come from different vendors
and use different toolchains, but the cam exposes both
through the same ml.Model API. What differs is the model file
on disk and the runtime that walks it.
7.11.1. AE3 – Arm Ethos-U55
The AE3 carries an Arm Ethos-U55 NPU on the same die as the
Cortex-M55 application core. Vela is the offline compiler that
prepares a model for it: it takes a standard .tflite in and
emits a .tflite out in which each NPU-eligible subgraph has been
folded into a custom Ethos-U operator carrying the byte commands
the NPU runs. At inference time, TFLM walks the file
normally; each Ethos-U operator dispatches its byte commands
through the Ethos-U driver, and any operator Vela did not fold
falls back to CMSIS-NN on the M55.
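The folding step described above can be sketched in a few lines. This is a minimal illustration, assuming the model can be treated as a flat list of operator names (a real .tflite graph is not a flat list, and Vela's actual partitioning is more involved). The set NPU_ELIGIBLE, the "ETHOS_U" and "CPU" tags, and the functions fold_for_npu and backends are hypothetical names chosen for this sketch; they are not part of Vela or TFLM.

```python
from itertools import groupby

# Hypothetical set of operators the NPU can run. The real list depends on
# the Vela version and the Ethos-U configuration.
NPU_ELIGIBLE = {"CONV_2D", "DEPTHWISE_CONV_2D", "ADD", "MAX_POOL_2D"}


def fold_for_npu(ops, eligible=NPU_ELIGIBLE):
    """Collapse each run of consecutive NPU-eligible ops into one entry.

    Returns a list of (tag, ops) pairs. A run of eligible ops becomes a
    single ("ETHOS_U", [...]) entry, standing in for the custom operator.
    Every other op stays as its own ("CPU", [op]) entry.
    """
    folded = []
    for on_npu, run in groupby(ops, key=lambda op: op in eligible):
        run = list(run)
        if on_npu:
            folded.append(("ETHOS_U", run))
        else:
            folded.extend(("CPU", [op]) for op in run)
    return folded


def backends(folded):
    """List which backend each folded entry would run on, in order."""
    return ["npu" if tag == "ETHOS_U" else "cpu" for tag, _ in folded]
```

With an operator list such as ["CONV_2D", "DEPTHWISE_CONV_2D", "UNSUPPORTED_OP", "ADD"], the two leading convolutions fold into one NPU entry, the unsupported operator stays on the CPU, and the trailing ADD forms a second NPU entry. An unsupported operator in the middle of a model therefore splits the NPU work into two dispatches with a CPU step between them.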
7.11.2. N6 – ST Neural-ART
The N6 carries ST’s Neural-ART NPU and runs STAI – ST’s runtime for it – in place of TFLM. STEdgeAI is the offline compiler: it takes a model in and emits a relocatable network blob laid out for the Neural-ART hardware. STAI loads the blob from ROMFS and walks it directly on the NPU. Operator coverage is whatever STEdgeAI supports for the part.
7.11.3. Same script, different cam
Both NPUs expose the same input and output tensors, with the same quantization parameters, as a CPU-run model would. A script written against one cam runs on another once it loads a model file prepared for the target cam’s NPU. Detection thresholds, ROI handling, and post-processor wiring – the script-level decisions – do not change.
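Why the script-level decisions carry over can be shown with plain Python. The sketch below assumes the usual affine quantization scheme, real = scale * (q - zero_point), and assumes the only per-cam difference is which model file is loaded. The file names in MODEL_FOR_BOARD and the helpers model_file, dequantize, and indices_above are hypothetical, made up for this example; they are not part of the ml module.

```python
# Hypothetical model file names, one per cam. Real names depend on how the
# model was exported for each target.
MODEL_FOR_BOARD = {
    "H7": "model.tflite",        # CPU path
    "RT1062": "model.tflite",    # CPU path
    "AE3": "model_vela.tflite",  # prepared by Vela
    "N6": "model_n6.bin",        # prepared by STEdgeAI
}


def model_file(board):
    """Return the model file to load on the given cam."""
    return MODEL_FOR_BOARD[board]


def dequantize(q_values, scale, zero_point):
    """Map quantized integers to real values: scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


def indices_above(q_values, scale, zero_point, threshold):
    """Return the indices whose dequantized score meets the threshold."""
    scores = dequantize(q_values, scale, zero_point)
    return [i for i, s in enumerate(scores) if s >= threshold]
```

Only model_file depends on the cam. Because the output tensor has the same scale and zero point on every target, indices_above and the threshold passed to it stay the same whether the tensor came from the CPU or from either NPU.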