Writing your own
================

When the catalogue does not cover a model -- a research network whose output
layout is bespoke, a tweak to an existing architecture, a tensor whose semantic
interpretation is application-specific -- the application provides its own
post-processor.

The protocol is plain: a callable that takes ``(model, inputs, outputs)`` and
returns whatever the application expects from :meth:`~ml.Model.predict`. A
class with ``__call__`` is the conventional form::

    class MyPostprocessor:
        def __init__(self, threshold=0.5):
            self.threshold = threshold

        def __call__(self, model, inputs, outputs):
            ...
            return result

A plain function works too -- the engine only checks that the object is
callable.

Hooking it in
-------------

Two attachment points. The ``postprocess=`` kwarg on the constructor binds the
callable for every :meth:`~ml.Model.predict` call on the model::

    model = ml.Model("/rom/my_model.tflite", postprocess=MyPostprocessor())

To override the binding for a single call -- swap decoders without re-loading
the model -- pass ``callback=`` to predict directly::

    result = model.predict([img], callback=MyOtherPostprocessor())

The callable signature is the same in either case.

What the callable receives
--------------------------

* ``model`` -- the :class:`~ml.Model` instance, useful for the quantization
  parameters (:attr:`~ml.Model.output_scale`,
  :attr:`~ml.Model.output_zero_point`, :attr:`~ml.Model.output_dtype`) and the
  input dimensions (:attr:`~ml.Model.input_shape`).
* ``inputs`` -- the list of inputs the application passed to
  :meth:`~ml.Model.predict`. The first element is usually the bound
  :class:`~ml.preprocessing.Normalization` instance; its ``roi`` attribute is
  what :class:`~ml.utils.NMS` expects for remapping boxes back into the
  original image.
* ``outputs`` -- the raw output tensors as a list of
  :class:`~ulab.numpy.ndarray` objects, in their native dtype. Float outputs
  arrive as-is; integer outputs arrive quantized.
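As a rough illustration of the three arguments in use, the sketch below picks
the best-scoring class from a single quantized output. It is not taken from the
shipped decoders: ``FakeModel`` and ``TopClass`` are hypothetical names, plain
Python lists stand in for the :class:`~ulab.numpy.ndarray` outputs so the
example runs off-device, and it assumes the quantization attributes hold one
entry per output tensor.

```python
class FakeModel:
    """Hypothetical stand-in for ml.Model; only the attributes read below."""

    # Assumption: one scale / zero point per output tensor.
    output_scale = [0.00390625]  # 1 / 256
    output_zero_point = [-128]


class TopClass:
    """Return (class_index, score) for the best class, or None if too weak."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def __call__(self, model, inputs, outputs):
        scale = model.output_scale[0]
        zero_point = model.output_zero_point[0]
        scores = outputs[0]

        # Argmax on the raw integers: the affine mapping is monotonic for a
        # positive scale, so the ordering of the quantized values is preserved.
        best = max(range(len(scores)), key=lambda i: scores[i])

        # Dequantize only the winner: real = scale * (q - zero_point).
        score = (scores[best] - zero_point) * scale
        if score < self.threshold:
            return None
        return best, score


# Off-device usage: `inputs` is unused by this post-processor.
raw_outputs = [[-128, 100, -20]]
result = TopClass(threshold=0.5)(FakeModel(), [None], raw_outputs)
print(result)
```

On a device the same object would be passed as ``postprocess=`` or
``callback=``, with the engine supplying the real model, inputs and output
tensors.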
Quantized arithmetic
--------------------

The shipped decoders all reach for the same helpers in :mod:`ml.utils`, and a
custom one usually wants the same pattern: :func:`~ml.utils.quantize` lifts a
float threshold into the model's quantized space, :func:`~ml.utils.threshold`
filters without dequantizing the whole tensor, and
:func:`~ml.utils.dequantize` runs once on the survivors.
:func:`~ml.utils.sigmoid` and :func:`~ml.utils.logit` are available for
networks whose output channels are pre-sigmoid logits (the MediaPipe detectors
are the canonical case).

For models with float outputs -- regression heads, models with a final
dequantize layer baked in -- the quantization helpers pass through unchanged,
so the same post-processor code works against either dtype without
special-casing.

Return value
------------

Whatever the callable returns is what :meth:`~ml.Model.predict` returns. For
box-emitting decoders the convention is to push candidates through an
:class:`~ml.utils.NMS` and return its per-class lists -- the call shape
:doc:`non-max suppression ` documents and the :doc:`YOLOv8 walkthrough ` builds
in context. For anything else, return whatever the application finds
convenient: a single :class:`~ulab.numpy.ndarray`, a label string, a tuple of
``(class, score, embedding)``, a dictionary.
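The threshold-first pattern described under "Quantized arithmetic" can be
sketched in plain Python. This is only an illustration of the arithmetic, under
the usual affine scheme ``real = scale * (q - zero_point)``; ``to_quantized``
and ``to_real`` are hypothetical helpers written for this example and do not
reproduce the signatures of the :mod:`ml.utils` functions.

```python
def to_quantized(value, scale, zero_point):
    """Map a real-valued threshold into the quantized domain."""
    # Inverse of real = scale * (q - zero_point).
    return int(round(value / scale)) + zero_point


def to_real(q, scale, zero_point):
    """Map one quantized value back to a real number."""
    return (q - zero_point) * scale


# Example int8 output with hypothetical quantization parameters.
scale, zero_point = 0.00390625, -128  # scale = 1 / 256
raw = [-128, 0, 64, 127, -100]

# 1. Lift the float threshold into quantized space, once.
q_threshold = to_quantized(0.5, scale, zero_point)

# 2. Filter on the raw integers; nothing is dequantized yet.
survivors = [(i, q) for i, q in enumerate(raw) if q >= q_threshold]

# 3. Dequantize only the values that passed the filter.
scores = [(i, to_real(q, scale, zero_point)) for i, q in survivors]
print(q_threshold, scores)
```

The comparison in step 2 touches every element, but the multiply and subtract
in step 3 run only on the few candidates that survive, which is the point of
thresholding before dequantizing.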