7.4. Loading a model
ml.Model reads a model file from flash, parses it, allocates
the RAM the network needs during inference, and returns an object
that carries everything the rest of the script needs to know about
the loaded network.
7.4.1. The constructor
The constructor takes a path and an optional post-processor:
model = ml.Model("/rom/blazeface_front_128.tflite",
                 postprocess=BlazeFace())
Models on /rom/ (the flash-resident filesystem) are read in
place: the network’s weights stay in flash and the loaded model
spends only the tensor arena’s worth of RAM. Models on /sdcard/
are copied into RAM at load time, so the total cost is model file
size plus tensor arena. Either path works; the trade-off is RAM.
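The RAM rule above can be written down as a small calculation. The following is a minimal sketch, not part of the ml module: estimated_ram_cost is a hypothetical helper, and the byte counts are illustrative placeholders rather than measurements of any shipped model. On a device, the two sizes could be taken from the len and ram properties described in 7.4.2.

```python
def estimated_ram_cost(path, file_size, arena_size):
    """Rough RAM estimate in bytes for a loaded model (hypothetical helper)."""
    if path.startswith("/rom/"):
        # Weights are read in place from flash; only the tensor arena uses RAM.
        return arena_size
    # Any other location (e.g. /sdcard/) is copied into RAM at load time.
    return file_size + arena_size


# Illustrative numbers only.
print(estimated_ram_cost("/rom/net.tflite", 200000, 150000))     # 150000
print(estimated_ram_cost("/sdcard/net.tflite", 200000, 150000))  # 350000
```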
If a sibling .txt file with the same basename exists, its
contents are loaded into labels automatically.
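To make the naming rule concrete, the sketch below derives the sibling label path from a model path and splits label text into a list. Both functions are hypothetical illustrations, not the loader's actual code, and the one-label-per-line layout is an assumption about the file format.

```python
def label_path_for(model_path):
    """Return the .txt path that shares the model file's basename."""
    dot = model_path.rfind(".")
    slash = model_path.rfind("/")
    if dot > slash:
        # Swap the extension: foo.tflite -> foo.txt
        return model_path[:dot] + ".txt"
    # No extension in the file name: just append .txt
    return model_path + ".txt"


def parse_labels(text):
    """Split label text into a list, assuming one label per line."""
    return [line.strip() for line in text.split("\n") if line.strip()]


print(label_path_for("/rom/blazeface_front_128.tflite"))
# /rom/blazeface_front_128.txt
```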
The postprocess= keyword registers a callable that
predict() runs after each inference.
7.4.2. Read-only properties
A loaded model exposes a small set of read-only properties that describe the network without running inference.
File and memory.
len – on-disk model file size, in bytes.
ram – size of the tensor arena the network needs for its intermediate activations during inference, in bytes.
Input tensors.
input_shape – a list of tuples, one per input tensor, giving the shape the network expects. Vision networks have one input with shape (1, H, W, C).
input_dtype – list of single-character dtype codes ('b' int8, 'B' uint8, 'h' int16, 'H' uint16, 'f' float32), one per input.
input_scale and input_zero_point – the quantization parameters that convert between the real-valued input the network was trained on and the integer representation the cam runs against.
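As an illustration of what the scale and zero point do, the sketch below converts a real value to its integer representation. It assumes the standard TensorFlow Lite affine scheme, real = scale * (q - zero_point), which the text above does not spell out. quantize and DTYPE_RANGE are hypothetical names, and the scale used in the example is a placeholder, not a value read from a model.

```python
# Integer ranges for the integer dtype codes ('f' needs no quantization).
DTYPE_RANGE = {
    "b": (-128, 127),
    "B": (0, 255),
    "h": (-32768, 32767),
    "H": (0, 65535),
}


def quantize(real, scale, zero_point, dtype="b"):
    """Map a real value to its quantized integer, clamped to the dtype range."""
    lo, hi = DTYPE_RANGE[dtype]
    q = int(round(real / scale)) + zero_point
    return max(lo, min(hi, q))


# Placeholder parameters: scale = 1/128, zero point = 0.
print(quantize(0.5, 0.0078125, 0))  # 64
print(quantize(2.0, 0.0078125, 0))  # 127 (clamped to the int8 maximum)
```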
Output tensors. These mirror the input set:
output_shape, output_dtype,
output_scale, and
output_zero_point. Detection networks produce two
or three output tensors (boxes, confidence scores, sometimes class
probabilities); classification networks produce one.
Extras. labels is the class-name list loaded
from the sibling .txt file, or None.
postprocess is the registered post-processor, or
None.
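Because labels can be None, code that prints class names needs a fallback. A minimal sketch follows; class_name is a hypothetical helper, and the label list in the example is made up.

```python
def class_name(labels, index):
    """Return a readable name for a class index, even when labels is None."""
    if labels is None or not 0 <= index < len(labels):
        # No label file was found, or the index is out of range.
        return "class %d" % index
    return labels[index]


print(class_name(None, 3))              # class 3
print(class_name(["face", "hand"], 1))  # hand
```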
7.4.3. Inspecting BlazeFace
Loading the shipped BlazeFace model and printing each property gives the actual numbers:
import ml
from ml.postprocessing.mediapipe import BlazeFace
model = ml.Model("/rom/blazeface_front_128.tflite",
                 postprocess=BlazeFace())
print("file size: ", model.len, "bytes")
print("tensor arena: ", model.ram, "bytes")
print("input shape: ", model.input_shape)
print("input dtype: ", model.input_dtype)
print("input scale: ", model.input_scale)
print("input zp: ", model.input_zero_point)
print("output shape: ", model.output_shape)
print("output dtype: ", model.output_dtype)
print("output scale: ", model.output_scale)
print("output zp: ", model.output_zero_point)
The numbers identify the network’s interface concretely: a single
(1, 128, 128, 3) int8 input tensor and two int8 outputs
– one for box-regression coefficients, one for per-anchor
confidence scores. The quantization parameters describe how those
int8 values map to the real floats the network was trained against;
the post-processor uses them to undo the quantization before
decoding the boxes.
Every property is the single source of truth for what it describes.
Scripts should read input_shape to know what resolution to capture
at, read output_scale and
output_zero_point to decode tensors by hand, and
read labels for human-readable class names –
never hard-coded, never assumed.
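A sketch of reading these properties instead of hard-coding them is shown below. It uses plain stand-in values so that it runs without a camera: the input shape is the BlazeFace shape quoted above, while the raw scores, scale, and zero point are invented for illustration. capture_size and dequantize_all are hypothetical helpers, and the decoding assumes the standard TensorFlow Lite affine formula real = scale * (q - zero_point).

```python
def capture_size(input_shape):
    """Return (width, height, channels) from the first input's (1, H, W, C) shape."""
    _batch, height, width, channels = input_shape[0]
    return width, height, channels


def dequantize_all(values, scale, zero_point):
    """Convert quantized integers back to real values."""
    return [scale * (q - zero_point) for q in values]


# Stand-in for model.input_shape.
input_shape = [(1, 128, 128, 3)]
print(capture_size(input_shape))  # (128, 128, 3)

# Stand-ins for one output tensor, model.output_scale, model.output_zero_point.
raw_scores = [-128, 0, 127]
print(dequantize_all(raw_scores, 0.5, -128))  # [0.0, 64.0, 127.5]
```

On a device, the stand-in values would be replaced by the corresponding model properties, so the same script keeps working when a different network is loaded.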