Loading a model
===============

:class:`ml.Model` reads a model file from flash, parses it, allocates the RAM
the network needs during inference, and returns an object that carries
everything the rest of the script needs to know about the loaded network.

The constructor
---------------

The constructor takes a path and an optional post-processor::

    model = ml.Model("/rom/blazeface_front_128.tflite", postprocess=BlazeFace())

Models on ``/rom/`` (the flash-resident filesystem) are read in place: the
network's weights stay in flash and the loaded model spends only the tensor
arena's worth of RAM. Models on ``/sdcard/`` are copied into RAM at load time,
so the total cost is model file size plus tensor arena. Either path works; the
trade-off is RAM.

If a sibling ``.txt`` file with the same basename exists, its contents are
loaded into :attr:`~ml.Model.labels` automatically. The ``postprocess=``
keyword registers a callable that :meth:`~ml.Model.predict` runs after each
inference.

Read-only properties
--------------------

A loaded model exposes a small set of read-only properties that describe the
network without anyone running it.

*File and memory.*

* :attr:`~ml.Model.len` -- on-disk model file size, in bytes.
* :attr:`~ml.Model.ram` -- size of the *tensor arena* the network needs for
  its intermediate activations during inference, in bytes.

*Input tensors.*

* :attr:`~ml.Model.input_shape` -- a list of tuples, one per input tensor,
  giving the shape the network expects. Vision networks have one input with
  shape ``(1, H, W, C)``.
* :attr:`~ml.Model.input_dtype` -- list of single-character dtype codes
  (``'b'`` int8, ``'B'`` uint8, ``'h'`` int16, ``'H'`` uint16, ``'f'``
  float32), one per input.
* :attr:`~ml.Model.input_scale` and :attr:`~ml.Model.input_zero_point` -- the
  *quantization parameters* that convert between the real-valued input the
  network was trained on and the integer representation the cam runs against.

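
The quantization parameters can be illustrated with the affine mapping that
TFLite-style quantized models conventionally use,
``real = scale * (q - zero_point)``. The sketch below is plain Python and not
part of the ``ml`` module: the helper names ``quantize`` and ``dequantize``
and the example scale and zero point are hypothetical, and a real script
would take the values from :attr:`~ml.Model.input_scale` and
:attr:`~ml.Model.input_zero_point` instead::

    def quantize(real, scale, zero_point, lo=-128, hi=127):
        """Map a real value to its integer code, clamped to the int8 range."""
        q = int(round(real / scale)) + zero_point
        return max(lo, min(hi, q))


    def dequantize(q, scale, zero_point):
        """Map an integer code back to the real value it stands for."""
        return scale * (q - zero_point)


    # Hypothetical parameters, chosen only to keep the arithmetic easy to follow.
    scale = 0.5
    zero_point = -3

    q = quantize(10.0, scale, zero_point)    # 10.0 / 0.5 = 20, plus -3
    real = dequantize(q, scale, zero_point)  # undoes the mapping above

Values that fall outside the representable range are clamped rather than
wrapped, which is why ``quantize`` takes explicit ``lo`` and ``hi`` bounds;
for a ``'B'`` (uint8) tensor the bounds would be ``0`` and ``255``.
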

*Output tensors.* Mirror of the input set: :attr:`~ml.Model.output_shape`,
:attr:`~ml.Model.output_dtype`, :attr:`~ml.Model.output_scale`,
:attr:`~ml.Model.output_zero_point`. Detection networks produce two or three
output tensors (boxes, confidence scores, sometimes class probabilities);
classification networks produce one.

*Extras.* :attr:`~ml.Model.labels` is the class-name list loaded from the
sibling ``.txt`` file, or :data:`None`. :attr:`~ml.Model.postprocess` is the
registered post-processor, or :data:`None`.

Inspecting BlazeFace
--------------------

Loading the shipped BlazeFace model and printing each property gives the
actual numbers::

    import ml
    from ml.postprocessing.mediapipe import BlazeFace

    model = ml.Model("/rom/blazeface_front_128.tflite", postprocess=BlazeFace())

    print("file size: ", model.len, "bytes")
    print("tensor arena: ", model.ram, "bytes")
    print("input shape: ", model.input_shape)
    print("input dtype: ", model.input_dtype)
    print("input scale: ", model.input_scale)
    print("input zp: ", model.input_zero_point)
    print("output shape: ", model.output_shape)
    print("output dtype: ", model.output_dtype)
    print("output scale: ", model.output_scale)
    print("output zp: ", model.output_zero_point)

The numbers identify the network's interface concretely: a single
``(1, 128, 128, 3)`` ``int8`` input tensor and two ``int8`` outputs -- one for
box-regression coefficients, one for per-anchor confidence scores. The
quantization parameters describe how those int8 values map to the real floats
the network was trained against; the post-processor uses them to undo the
quantization before decoding the boxes.

Every property is the single source of truth for what it describes. Scripts
read :attr:`~ml.Model.input_shape` to know what to capture at, read
:attr:`~ml.Model.output_scale` and :attr:`~ml.Model.output_zero_point` to
decode tensors by hand, and read :attr:`~ml.Model.labels` for human-readable
class names -- never hard-coded, never assumed.
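
As one way of reading the interface instead of hard-coding it, the sketch
below turns an ``input_shape`` list and an ``input_dtype`` list into a
per-input summary a script could use to pick its capture size. It is plain
Python rather than part of the ``ml`` module: ``describe_inputs`` and
``DTYPE_NAMES`` are hypothetical names, the function assumes the
``(1, H, W, C)`` vision layout, and the sample values are the BlazeFace
interface described above::

    # dtype codes as listed under "Input tensors".
    DTYPE_NAMES = {
        "b": "int8",
        "B": "uint8",
        "h": "int16",
        "H": "uint16",
        "f": "float32",
    }


    def describe_inputs(input_shape, input_dtype):
        """Return one (height, width, channels, dtype_name) tuple per input."""
        summary = []
        for shape, code in zip(input_shape, input_dtype):
            _, height, width, channels = shape  # assumes (1, H, W, C)
            summary.append((height, width, channels, DTYPE_NAMES[code]))
        return summary


    # On a device these two lists would be model.input_shape and
    # model.input_dtype; here they are written out by hand.
    info = describe_inputs([(1, 128, 128, 3)], ["b"])
    height, width, channels, dtype_name = info[0]

A network whose input is not four-dimensional would make the tuple unpacking
raise ``ValueError``, which is a reasonable way for a vision script to fail
early on a model it was not written for.
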