7.4. Loading a model

ml.Model reads a model file from flash, parses it, allocates the RAM the network needs during inference, and returns an object that carries everything the rest of the script needs to know about the loaded network.

7.4.1. The constructor

The constructor takes a path and an optional post-processor:

model = ml.Model("/rom/blazeface_front_128.tflite",
                 postprocess=BlazeFace())

Models on /rom/ (the flash-resident filesystem) are read in place: the network’s weights stay in flash and the loaded model spends only the tensor arena’s worth of RAM. Models on /sdcard/ are copied into RAM at load time, so the total cost is model file size plus tensor arena. Either path works; the trade-off is RAM.
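The trade-off above can be sketched as a small calculation. This is only an illustration: estimated_ram_cost is a hypothetical helper, not part of the ml module, and it assumes that any path outside /rom/ behaves like /sdcard/ and is copied into RAM. The sizes in the example calls are made-up numbers, not measurements of a shipped model.

```python
def estimated_ram_cost(path, file_size, arena_size):
    """Return the approximate RAM, in bytes, that loading a model costs.

    Hypothetical helper for illustration; not part of the ml module.
    """
    if path.startswith("/rom/"):
        # Weights stay in flash, so only the tensor arena lives in RAM.
        return arena_size
    # Assumption: every other location is copied into RAM like /sdcard/.
    return file_size + arena_size


# Made-up sizes, chosen only to show the difference between the two paths.
print(estimated_ram_cost("/rom/example.tflite", 200000, 50000))
print(estimated_ram_cost("/sdcard/example.tflite", 200000, 50000))
```

On a real board, model.len and model.ram supply the two sizes after loading.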

If a sibling .txt file with the same basename exists, its contents are loaded into labels automatically. The postprocess= keyword registers a callable that predict() runs after each inference.

7.4.2. Read-only properties

A loaded model exposes a small set of read-only properties that describe the network without running it.

File and memory.

len – on-disk model file size, in bytes.
ram – size of the tensor arena the network needs for its intermediate activations during inference, in bytes.

Input tensors.

input_shape – a list of tuples, one per input tensor, giving the shape the network expects. Vision networks have one input with shape (1, H, W, C).
input_dtype – list of single-character dtype codes ('b' int8, 'B' uint8, 'h' int16, 'H' uint16, 'f' float32), one per input.
input_scale and input_zero_point – the quantization parameters that convert between the real-valued input the network was trained on and the integer representation the camera actually computes with.
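As a rough sketch of what the input quantization parameters mean, the function below applies the usual affine quantization rule used by TensorFlow Lite models, real = scale * (q - zero_point), in the real-to-integer direction. The name quantize and the default int8 clamp range are choices made for this example; the scale and zero point are passed in as plain numbers so that nothing is assumed about how the properties are indexed.

```python
def quantize(real_value, scale, zero_point, lo=-128, hi=127):
    """Map a real value to its quantized integer, clamped to [lo, hi].

    Illustrative helper; the defaults assume an int8 ('b') tensor.
    """
    q = int(round(real_value / scale)) + zero_point
    # Values outside the representable range saturate at the limits.
    return max(lo, min(hi, q))


# Example parameters chosen for easy arithmetic, not taken from a real model.
print(quantize(1.0, 0.5, 10))      # 1.0 / 0.5 = 2, plus zero point 10
print(quantize(1000.0, 0.5, 0))    # saturates at the int8 maximum
```

For a uint8 ('B') input the same rule applies with lo=0 and hi=255.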

Output tensors. Mirror of the input set: output_shape, output_dtype, output_scale, output_zero_point. Detection networks produce two or three output tensors (boxes, confidence scores, sometimes class probabilities); classification networks produce one.
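A shape and a dtype code together determine how many bytes a tensor occupies, which can help when sizing buffers for outputs read by hand. The sketch below is a standalone illustration; DTYPE_BYTES and tensor_nbytes are names invented for this example, and the table covers only the five codes listed above.

```python
# Bytes per element for each dtype code listed above.
DTYPE_BYTES = {"b": 1, "B": 1, "h": 2, "H": 2, "f": 4}


def tensor_nbytes(shape, dtype_code):
    """Return the size in bytes of a tensor with the given shape and dtype."""
    count = 1
    for dim in shape:
        count *= dim
    return count * DTYPE_BYTES[dtype_code]


# A (1, 128, 128, 3) int8 image tensor holds 128 * 128 * 3 one-byte elements.
print(tensor_nbytes((1, 128, 128, 3), "b"))
```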

Extras. labels is the class-name list loaded from the sibling .txt file, or None. postprocess is the registered post-processor, or None.
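Because labels may be None, code that turns a class index into a name needs a fallback. One possible pattern is sketched below; class_name is a hypothetical helper, and the fallback string format is an arbitrary choice for the example.

```python
def class_name(labels, index):
    """Return the label for a class index, or a generic name if unavailable."""
    if labels is None or not 0 <= index < len(labels):
        # No label file was found, or the index falls outside the list.
        return "class %d" % index
    return labels[index]


print(class_name(["cat", "dog"], 1))
print(class_name(None, 3))
```

With a loaded model, the first argument would be model.labels.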

7.4.3. Inspecting BlazeFace

Loading the shipped BlazeFace model and printing each property gives the actual numbers:

import ml
from ml.postprocessing.mediapipe import BlazeFace

model = ml.Model("/rom/blazeface_front_128.tflite",
                 postprocess=BlazeFace())

print("file size:    ", model.len, "bytes")
print("tensor arena: ", model.ram, "bytes")
print("input shape:  ", model.input_shape)
print("input dtype:  ", model.input_dtype)
print("input scale:  ", model.input_scale)
print("input zp:     ", model.input_zero_point)
print("output shape: ", model.output_shape)
print("output dtype: ", model.output_dtype)
print("output scale: ", model.output_scale)
print("output zp:    ", model.output_zero_point)

The numbers identify the network’s interface concretely: a single (1, 128, 128, 3) int8 input tensor and two int8 outputs – one for box-regression coefficients, one for per-anchor confidence scores. The quantization parameters describe how those int8 values map to the real floats the network was trained against; the post-processor uses them to undo the quantization before decoding the boxes.
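Undoing the quantization follows the standard affine rule, real = scale * (q - zero_point). The sketch below shows that arithmetic on plain Python lists; it is not the BlazeFace post-processor's actual code, and dequantize and dequantize_all are names made up for this example. The sample values use a scale and zero point chosen for easy arithmetic rather than the model's real parameters.

```python
def dequantize(q, scale, zero_point):
    """Convert one quantized integer back to its real value."""
    return scale * (q - zero_point)


def dequantize_all(values, scale, zero_point):
    """Convert a sequence of quantized integers to a list of real values."""
    return [dequantize(q, scale, zero_point) for q in values]


# Illustrative parameters only: scale 0.25, zero point -128.
print(dequantize_all([-128, 0, 127], 0.25, -128))
```

On a real model the two parameters would come from output_scale and output_zero_point for the tensor being decoded.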

Every property is the single source of truth for what it describes. Scripts read input_shape to know what to capture at, read output_scale and output_zero_point to decode tensors by hand, and read labels for human-readable class names – never hard-coded, never assumed.
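As a small illustration of reading the capture size instead of hard-coding it, the helper below unpacks a (1, H, W, C) entry from an input_shape list. input_size is a hypothetical name, and the sketch assumes the vision-network layout described earlier; a network with a different input layout would need different unpacking.

```python
def input_size(input_shape, index=0):
    """Return (width, height, channels) for one (1, H, W, C) input tensor."""
    _, height, width, channels = input_shape[index]
    return width, height, channels


# The shape reported for the BlazeFace input above.
print(input_size([(1, 128, 128, 3)]))
```

With a loaded model this would be called as input_size(model.input_shape).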