NPUs
====

The H7 and the RT1062 run inference on a Cortex-M CPU through :doc:`TFLM ` and :doc:`CMSIS-NN `. The AE3 and the N6 add a dedicated NPU on the same die -- a tensor pipeline in fixed silicon that runs the heavy operators without occupying the CPU.

The two NPUs in OpenMV's lineup come from different vendors and their toolchains are different, but the cam exposes both through the same :class:`ml.Model` API. What differs is the file on disk and the runtime that walks it.

AE3 -- Arm Ethos-U55
--------------------

The AE3 carries an Arm *Ethos-U55* NPU on the same die as the Cortex-M55 application core. *Vela* is the offline compiler that prepares a model for it: Vela takes a standard ``.tflite`` in and emits a ``.tflite`` out whose NPU-eligible subgraphs have been folded into a custom *Ethos-U* operator carrying the byte commands the NPU runs.

At inference time, :doc:`TFLM ` walks the file normally; the Ethos-U operator dispatches its byte commands through the Ethos-U driver, and any operator Vela did not fold falls back to :doc:`CMSIS-NN ` on the M55.

N6 -- ST Neural-ART
-------------------

The N6 carries ST's *Neural-ART* NPU and runs *STAI* -- ST's runtime for it -- in place of TFLM. *STEdgeAI* is the offline compiler: it takes a model in and emits a relocatable network blob laid out for the Neural-ART hardware. STAI loads the blob from :doc:`ROMFS <../ml-module/romfs>` and walks it directly on the NPU. Operator coverage is whatever STEdgeAI supports for the part.

Same script, different cam
--------------------------

Both NPUs expose the same input and output tensors with the same quantization parameters as a CPU-run model would. A script written against one cam runs on another by loading a model file prepared for that cam's NPU. Detection thresholds, ROI handling, and post-processor wiring -- the script-level decisions -- do not change.
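
A minimal sketch of that idea, not a reference implementation: the only per-cam piece of the script is the model path. The file names, the ``/rom`` prefix, the cam keys, and the helpers ``model_path_for`` and ``load_model`` are all hypothetical placeholders. The sketch also assumes that :class:`ml.Model` accepts the model's path as its first argument.

.. code-block:: python

    # Hypothetical model files, one per cam. The names and extensions are
    # placeholders; use whatever your toolchain actually emitted.
    MODEL_FILES = {
        "H7": "/rom/detector.tflite",          # plain .tflite, run on the CPU
        "RT1062": "/rom/detector.tflite",      # plain .tflite, run on the CPU
        "AE3": "/rom/detector_vela.tflite",    # Vela output for the Ethos-U55
        "N6": "/rom/detector_neural_art.bin",  # STEdgeAI output for Neural-ART
    }


    def model_path_for(cam):
        """Return the model file prepared for the given cam."""
        try:
            return MODEL_FILES[cam]
        except KeyError:
            raise ValueError("no model prepared for cam: %s" % cam)


    def load_model(cam):
        """Load the cam-specific file through the shared ml.Model API."""
        import ml  # only available in the OpenMV firmware

        # Assumption: ml.Model takes the model path as its first argument.
        return ml.Model(model_path_for(cam))

On a cam, ``model = load_model("AE3")`` and ``model = load_model("N6")`` would differ only in the file that gets opened. The thresholds, ROI handling, and post-processing that follow in the script stay the same.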