`ml.utils` --- ML Utilities

The ml.utils module contains utility classes and functions for machine learning.

Functions

ml.utils.logit(x: ndarray) → ndarray: Returns the logit of every value in the ndarray passed in.

ml.utils.sigmoid(x: ndarray) → ndarray: Returns the sigmoid of every value in the ndarray passed in.
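As a reminder of what these two functions compute, here is a minimal scalar sketch using the standard mathematical definitions. It is not the module's implementation (which operates element-wise on ndarrays); the function bodies are assumptions based on the usual formulas.

```python
import math

def sigmoid(x):
    # Logistic function: maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    # Inverse of sigmoid: maps a probability in (0, 1) back to a real number.
    return math.log(p / (1.0 - p))

probabilities = [0.1, 0.5, 0.9]
logits = [logit(p) for p in probabilities]
recovered = [sigmoid(z) for z in logits]  # approximately equal to probabilities
```

Because the two functions are inverses, applying sigmoid to a logit recovers the original probability up to floating-point error.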

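ml.utils.threshold(scores: ndarray, threshold: float, scale: float, find_max: bool = False, find_max_axis: int = 1) → ndarray

Thresholds scores (a quantized ndarray of int8, uint8, int16, or uint16 values) against the quantized threshold and returns an ndarray of all indices that pass the threshold.

scale is tested to determine whether the dequantized values are positive or negative. When scale > 0, the indices where scores > threshold are returned; otherwise, the indices where scores < threshold are returned.

find_max, if True, internally replaces scores with an ndarray of the maximum (or minimum, depending on scale) taken along find_max_axis.

find_max_axis is the axis along which the max/min reduction is computed when find_max is True.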
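The comparison rule described above can be sketched in plain Python. The helpers `threshold_indices` and `reduce_rows` are hypothetical names used only for illustration; they work on lists rather than ndarrays and do not reproduce the exact return shape of ml.utils.threshold.

```python
def threshold_indices(scores, q_threshold, scale):
    # Compare in the quantized domain. A negative scale flips the ordering
    # of the dequantized values, so the comparison direction flips with it.
    if scale > 0:
        return [i for i, s in enumerate(scores) if s > q_threshold]
    return [i for i, s in enumerate(scores) if s < q_threshold]

def reduce_rows(rows, scale):
    # Sketch of find_max=True with find_max_axis=1 on a 2-D list:
    # keep one value per row (max for scale > 0, min otherwise).
    pick = max if scale > 0 else min
    return [pick(row) for row in rows]

scores = [-20, 35, 90, 10]  # pretend int8 model output
passed_pos = threshold_indices(scores, 30, 0.5)   # [1, 2]
passed_neg = threshold_indices(scores, 30, -0.5)  # [0, 3]
```

Thresholding in the quantized domain avoids dequantizing the whole tensor just to discard most of it.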

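ml.utils.quantize(model: ml.Model, value: ndarray, index: int = 0) → ndarray

Converts the ndarray passed in by dividing it by the scale and adding the model's zero point. Returns value unchanged when the model output dtype at index is float.

model is the model whose output quantization parameters are used.

value is the ndarray to quantize.

index selects which of the model's output tensors to quantize for.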

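ml.utils.dequantize(model: ml.Model, value: ndarray, index: int = 0) → ndarray

Converts the ndarray passed in by subtracting the zero point and multiplying by the model's scale. Returns value unchanged when the model output dtype at index is float.

model is the model whose output quantization parameters are used.

value is the ndarray to dequantize.

index selects which of the model's output tensors to dequantize for.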
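The arithmetic behind quantize and dequantize can be sketched without a model. Here `scale` and `zero_point` are hypothetical values standing in for the model's output quantization parameters, and rounding to the nearest integer is an assumption of this sketch rather than documented behavior.

```python
def quantize(values, scale, zero_point):
    # value / scale + zero_point, as described above.
    # Rounding to the nearest integer is an assumption of this sketch.
    return [int(round(v / scale + zero_point)) for v in values]

def dequantize(quantized, scale, zero_point):
    # (value - zero_point) * scale, the inverse mapping.
    return [(q - zero_point) * scale for q in quantized]

scale, zero_point = 0.25, -10  # hypothetical quantization parameters
values = [0.0, 1.0, -2.5]
q = quantize(values, scale, zero_point)      # [-10, -6, -20]
restored = dequantize(q, scale, zero_point)  # [0.0, 1.0, -2.5]
```

The round trip is exact here only because every input is a multiple of the scale; in general the recovered value can differ from the original by up to half a quantization step.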

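ml.utils.draw_predictions(image: image.Image, boxes: list[tuple[float, float, float, float]], labels: list[str], colors: list[tuple[int, int, int]], scores: list[float] | None = None, format: str = 'pascal_voc', font_width: int = 8, font_height: int = 10, text_color: tuple[int, int, int] = (255, 255, 255)) → None

Draws bounding boxes (or center-point markers) with text labels on image.

boxes is a list of (x, y, w, h) tuples, or of (xmin, ymin, xmax, ymax) tuples when format is "pascal_voc".

labels is a list of label strings, one per bounding box.

colors is a list of (r, g, b) tuples, one per bounding box.

scores, if not None, is a list of confidence scores, one per bounding box. When provided, the score formatted with " %.2f" is appended to each drawn label.

format controls how the bounding box coordinates are interpreted:

"pascal_voc": normalized (xmin, ymin, xmax, ymax) in the range 0.0 to 1.0.

"point": absolute pixel (x, y, w, h); a filled circle marker is drawn at the center of the bounding box instead of a rectangle (useful for center-point detectors).

Any other value: absolute pixel (x, y, w, h); drawn as a rectangle.

font_width is the width of each character in the label, in pixels.

font_height is the height of the label background, in pixels.

text_color is the (r, g, b) color used for the label text.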
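To make the format options concrete, the following sketch converts a box into an absolute-pixel rectangle and builds the label text. The helpers `to_pixel_rect`, `label_text`, and `center` are hypothetical, and scaling normalized coordinates by the image width and height is an assumption about how "pascal_voc" boxes are mapped to pixels.

```python
def to_pixel_rect(box, fmt, img_w, img_h):
    # Returns an absolute-pixel (x, y, w, h) rectangle.
    if fmt == "pascal_voc":
        xmin, ymin, xmax, ymax = box
        # Assumption: normalized values scale by the image width/height.
        x, y = xmin * img_w, ymin * img_h
        return (int(x), int(y), int(xmax * img_w - x), int(ymax * img_h - y))
    x, y, w, h = box  # "point" and every other format are already in pixels
    return (int(x), int(y), int(w), int(h))

def label_text(label, score=None):
    # The score, when given, is appended using the " %.2f" format.
    return label if score is None else label + " %.2f" % score

def center(rect):
    # Where a "point" marker would sit: the middle of the box.
    x, y, w, h = rect
    return (x + w // 2, y + h // 2)

rect = to_pixel_rect((0.25, 0.5, 0.75, 1.0), "pascal_voc", 320, 240)
text = label_text("cat", 0.876)
marker = center(rect)
```

With a 320x240 image, the normalized box above becomes the rectangle (80, 120, 160, 120), its label reads "cat 0.88", and its center is (160, 180).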

ml.utils.draw_keypoints(image: image.Image, keypoints: ndarray, radius: int = 4, color: tuple[int, int, int] = (255, 0, 0), thickness: int = 1, fill: bool = False) → None

Draws an ndarray of keypoint (x, y, ...) values on image.

radius is the radius of each keypoint circle. When radius == 0, keypoints are drawn as single pixels.

color is the (r, g, b) color of the keypoints.

thickness is the outline thickness of the circles.

fill, if True, fills the keypoint circles.

ml.utils.draw_skeleton(image: image.Image, keypoints: ndarray, lines: list[tuple[int, int]], kp_radius: int = 4, kp_color: tuple[int, int, int] = (255, 0, 0), kp_thickness: int = 1, kp_fill: bool = False, line_color: tuple[int, int, int] = (0, 255, 0), line_thickness: int = 1) → None

Draws an ndarray of keypoint (x, y, ...) values on image, then connects them with line segments.

lines is a list of (kp0_idx, kp1_idx) tuples specifying which pairs of keypoints to connect.

kp_radius is the radius of each keypoint circle (passed to draw_keypoints).

kp_color is the (r, g, b) color of the keypoints.

kp_thickness is the outline thickness of the keypoint circles.

kp_fill, if True, fills the keypoint circles.

line_color is the (r, g, b) color of the lines.

line_thickness is the thickness of the lines.
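The relationship between keypoints and lines can be illustrated by resolving each index pair into a line segment. The helper `skeleton_segments` is hypothetical and uses plain tuples in place of an ndarray; it shows only the indexing, not the drawing.

```python
def skeleton_segments(keypoints, lines):
    # Each entry in lines names two keypoint rows; only the first two
    # columns (x, y) of a row are used, extra columns are ignored.
    segments = []
    for a, b in lines:
        x0, y0 = keypoints[a][0], keypoints[a][1]
        x1, y1 = keypoints[b][0], keypoints[b][1]
        segments.append((x0, y0, x1, y1))
    return segments

keypoints = [(10, 20, 0.9), (30, 40, 0.8), (50, 20, 0.7)]  # (x, y, score) rows
lines = [(0, 1), (1, 2)]  # connect keypoint 0 to 1, and 1 to 2
segments = skeleton_segments(keypoints, lines)
```

Each resulting (x0, y0, x1, y1) tuple corresponds to one line that would be drawn between two keypoints.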

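class NMS -- Soft Non-Maximum Suppression

An NMS object collects a list of bounding boxes with associated scores, filters out lower-scoring overlapping boxes using Soft-NMS, and remaps boxes detected in a sub-window back to the original image coordinates.

class ml.utils.NMS(window_w: int, window_h: int, roi: tuple[int, int, int, int])

Creates an NMS object.

window_w and window_h are the width and height of the model's input tensor/window.

roi is the (x, y, w, h) region of interest (ROI) of the original image that the model was run on (typically returned by a Normalization() object). It is used to remap detected bounding boxes back to the original image coordinate space. roi[2] and roi[3] must be >= 1.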
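One way to picture the remapping from window space back to image space is a linear scale plus offset. The helper `remap_to_image` below is hypothetical, and the linear mapping is an assumption; the class may round or clamp differently.

```python
def remap_to_image(box, window_w, window_h, roi):
    # box is (xmin, ymin, xmax, ymax) in window pixels;
    # roi is (x, y, w, h) in original image pixels.
    xmin, ymin, xmax, ymax = box
    roi_x, roi_y, roi_w, roi_h = roi
    # Assumption: the window is a uniformly scaled view of the ROI.
    sx, sy = roi_w / window_w, roi_h / window_h
    x = roi_x + xmin * sx
    y = roi_y + ymin * sy
    w = (xmax - xmin) * sx
    h = (ymax - ymin) * sy
    return (int(x), int(y), int(w), int(h))

mapped = remap_to_image((10, 10, 60, 35), 100, 100, (40, 20, 200, 200))
```

Here a 100x100 window covers a 200x200 ROI whose top-left corner is at (40, 20), so every window coordinate is doubled and then offset, giving (60, 40, 100, 50).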

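add_bounding_box(xmin: float, ymin: float, xmax: float, ymax: float, score: float, label_index: int, keypoints: ndarray | None = None) → None

Adds a bounding box to the NMS object. Boxes whose score is outside the range [0.0, 1.0], or whose width or height is zero or negative after clipping, are discarded.

xmin, ymin, xmax, ymax are the bounding box coordinates in window pixel space; they are clipped to [0, window_w] / [0, window_h].

score is the confidence score of the bounding box (0.0-1.0).

label_index is the index of the class label associated with the bounding box.

keypoints is an optional ndarray of keypoint (x, y, ...) values associated with the bounding box.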
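The acceptance rules described above (score range check, clipping, and rejection of degenerate boxes) can be sketched as follows. The helper `clip_box` is hypothetical and returns None where the real method would silently discard the box.

```python
def clip_box(xmin, ymin, xmax, ymax, score, window_w, window_h):
    # Reject scores outside [0.0, 1.0].
    if not 0.0 <= score <= 1.0:
        return None
    # Clip the coordinates to the window.
    xmin = min(max(xmin, 0), window_w)
    xmax = min(max(xmax, 0), window_w)
    ymin = min(max(ymin, 0), window_h)
    ymax = min(max(ymax, 0), window_h)
    # Reject boxes left with zero or negative width/height.
    if xmax - xmin <= 0 or ymax - ymin <= 0:
        return None
    return (xmin, ymin, xmax, ymax)

partly_outside = clip_box(-5, 10, 50, 120, 0.9, 100, 100)    # clipped
fully_outside = clip_box(110, 10, 150, 50, 0.9, 100, 100)    # None
bad_score = clip_box(10, 10, 50, 50, 1.2, 100, 100)          # None
```

A box that only partly leaves the window survives in clipped form, while a box entirely outside the window collapses to zero width and is dropped.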

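get_bounding_boxes(threshold: float = 0.1, sigma: float = 0.1) → list[list[tuple]]

Runs Soft-NMS on all added bounding boxes and returns a list of lists, one per class, indexed by label_index. Each inner list contains ((x, y, w, h), score) tuples mapped back to the original image coordinates. If keypoints were provided when a box was added, its tuple is extended with the remapped keypoints ndarray.

After calling this method, create a new NMS object to process a new set of bounding boxes.

threshold is the minimum score a bounding box must retain after Soft-NMS suppression in order to be kept.

sigma controls the Gaussian function used to penalize the scores of overlapping bounding boxes. A smaller sigma suppresses more aggressively. sigma <= 0.0 disables the Gaussian penalty (the scores of overlapping boxes are not decayed).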
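To show how threshold and sigma interact, here is a minimal single-class sketch of Gaussian Soft-NMS in its commonly published form, where each remaining score is multiplied by exp(-IoU² / sigma). The exact penalty used by the class is not stated above, so treat the formula and the helper names `iou` and `soft_nms` as assumptions.

```python
import math

def iou(a, b):
    # Intersection over union of two (xmin, ymin, xmax, ymax) boxes.
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def soft_nms(boxes, scores, threshold=0.1, sigma=0.1):
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        best = max(remaining, key=lambda item: item[1])
        remaining.remove(best)
        if best[1] < threshold:
            break  # every other remaining score is lower still
        kept.append(best)
        if sigma > 0.0:
            # Gaussian penalty: heavier overlap with the kept box decays more.
            remaining = [(b, s * math.exp(-(iou(best[0], b) ** 2) / sigma))
                         for b, s in remaining]
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)                   # the overlapping box decays away
kept_no_decay = soft_nms(boxes, scores, sigma=0.0)  # no penalty, all three kept
```

In this example the second box overlaps the first heavily, so its score decays below the threshold and it is dropped, while the distant third box keeps its score. With sigma set to 0.0 the penalty is skipped and all three boxes survive.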
