github roboflow/supervision 0.28.0
supervision-0.28.0: CompactMask & SAM3


🔦 Spotlight

Memory-efficient masks with sv.CompactMask

Segmentation models produce one full-resolution bitmap per instance. On a 1920×1080 image with 28 detections that is ~55 MB of mask data. Most pixels are background. sv.CompactMask stores only the tight bounding-box crop, RLE-encoded — the same 28 masks drop to ~237 KB of crops, a 240× reduction before RLE kicks in.
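The figures above can be checked with a few lines of arithmetic. This is a sketch that assumes one byte per mask pixel (as with a NumPy bool array) and takes the ~237 KB crop total as given; the variable names are illustrative only.

```python
# Dense storage: one full-resolution bitmap per detection.
num_masks = 28
height, width = 1080, 1920
bytes_per_pixel = 1  # assumption: bool masks, one byte per pixel

dense_bytes = num_masks * height * width * bytes_per_pixel
dense_mib = dense_bytes / 1024 / 1024

# Crop storage: total quoted in the release notes, before RLE.
crop_bytes = 237 * 1024
ratio = dense_bytes / crop_bytes

print(f"{dense_mib:.1f} MiB dense, {ratio:.0f}x smaller as crops")
# prints "55.4 MiB dense, 239x smaller as crops"
```

The quoted 240× is this ratio rounded; the actual saving depends on how much of the frame each object covers.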

It's a drop-in replacement: annotators, filters, and area all work unchanged.

import supervision as sv

# any segmentation model — RF-DETR Seg, YOLO-Seg, SAM3
detections = model.predict(image)  # sv.Detections with dense masks

dense_mb = detections.mask.nbytes / 1024 / 1024
compact = sv.CompactMask.from_dense(
    masks=detections.mask,
    xyxy=detections.xyxy,
    image_shape=image.shape[:2],
)
detections.mask = compact  # swap in — API unchanged

# filter by pixel area without materialising dense masks
large = detections[compact.area > 1000]

# annotators call .to_dense() internally
annotated = sv.MaskAnnotator().annotate(image.copy(), detections)
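
To illustrate the idea behind crop-plus-RLE storage, here is a toy sketch in plain Python. It is not supervision's implementation: the helper names (crop_to_box, rle_encode, rle_decode) are hypothetical, the box is assumed to use exclusive x2/y2, and runs are counted in row-major order for readability.

```python
from typing import List, Tuple

Mask = List[List[bool]]


def crop_to_box(mask: Mask, xyxy: Tuple[int, int, int, int]) -> Mask:
    """Keep only the pixels inside the box (x2 and y2 exclusive)."""
    x1, y1, x2, y2 = xyxy
    return [row[x1:x2] for row in mask[y1:y2]]


def rle_encode(crop: Mask) -> List[int]:
    """Alternating run lengths over the flattened crop, starting with False."""
    runs: List[int] = []
    current, length = False, 0
    for row in crop:
        for pixel in row:
            if pixel == current:
                length += 1
            else:
                runs.append(length)
                current, length = pixel, 1
    runs.append(length)
    return runs


def rle_decode(runs: List[int], height: int, width: int) -> Mask:
    """Inverse of rle_encode for a crop of the given size."""
    flat: List[bool] = []
    value = False
    for length in runs:
        flat.extend([value] * length)
        value = not value
    return [flat[r * width:(r + 1) * width] for r in range(height)]


# A 6x8 frame with one small object near the middle.
frame = [[False] * 8 for _ in range(6)]
for y, x in [(2, 3), (2, 4), (2, 5), (3, 3), (3, 4)]:
    frame[y][x] = True

crop = crop_to_box(frame, (3, 2, 6, 4))
runs = rle_encode(crop)          # [0, 5, 1]
area = sum(runs[1::2])           # True runs sit at odd indices -> 5
restored = rle_decode(runs, height=2, width=3)
```

Because the True runs can be summed directly, the pixel area is available without rebuilding the bitmap, which is the property the area-based filter above relies on.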

SAM3 text-prompted segmentation

SAM3 segments objects by free-text prompt — no class list, no bounding boxes. sv.Detections.from_sam3() parses both PCS (multi-prompt) and PVS (video) response formats into a standard sv.Detections, with class_id set to the prompt index.

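import base64
import os

import cv2
import requests
import supervision as sv

# Assumption: the API key is read from an environment variable of your choosing.
API_KEY = os.environ["ROBOFLOW_API_KEY"]
PROMPTS = ["person", "bag"]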

with open("image.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    f"https://api.roboflow.com/inferenceproxy/seg-preview?api_key={API_KEY}",
    json={
        "image": {"type": "base64", "value": img_b64},
        "prompts": [{"type": "text", "text": p} for p in PROMPTS],
    },
    headers={"Content-Type": "application/json"},
)
sam3_result = response.json()

h, w = cv2.imread("image.jpg").shape[:2]
detections = sv.Detections.from_sam3(sam3_result=sam3_result, resolution_wh=(w, h))
# class_id == 0 → "person", class_id == 1 → "bag"
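
Since class_id is the prompt index, mapping detections back to prompt text is a list lookup. A minimal sketch, using a plain list as a stand-in for detections.class_id so it runs without a model:

```python
PROMPTS = ["person", "bag"]

# Stand-in for detections.class_id (normally a NumPy integer array).
class_ids = [0, 1, 0]

# int() keeps the lookup valid for NumPy integer scalars as well.
labels = [PROMPTS[int(class_id)] for class_id in class_ids]
print(labels)  # prints ['person', 'bag', 'person']
```

The resulting list can then be passed as the labels argument when annotating, for example to sv.LabelAnnotator.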

🔄 Migration

VideoInfo.fps is now float

NTSC frame rates (23.976, 29.97, 59.94) were silently truncated. fps is now the true float — cast at call sites that need an integer.

Before
import collections
import supervision as sv

info = sv.VideoInfo.from_video_path("clip.mp4")
buf = collections.deque(maxlen=info.fps)  # TypeError now that fps is a float
trace = sv.TraceAnnotator(trace_length=info.fps)
After
import collections
import supervision as sv

info = sv.VideoInfo.from_video_path("clip.mp4")
buf = collections.deque(maxlen=int(info.fps))
trace = sv.TraceAnnotator(trace_length=int(info.fps))
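
Why the truncation mattered can be seen with a short calculation. This sketch assumes a one-hour clip at 29.97 fps and compares the duration derived from the true rate with the one derived from the truncated integer rate:

```python
fps = 29.97          # true frame rate
duration_s = 3600    # assumption: a one-hour clip

frame_count = round(fps * duration_s)           # 107892 frames
true_duration = frame_count / fps               # ~3600 s
truncated_duration = frame_count / int(fps)     # computed with fps == 29
drift_s = truncated_duration - true_duration    # ~120 s

print(f"{frame_count} frames, drift {drift_s:.1f} s")
```

For buffer sizes, int(fps) rounds down (29 here); round(fps) gives 30 if "about one second of frames" is the intent. Either is a call-site decision.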

sv.ByteTrack deprecated — use ByteTrackTracker

Tracker implementations now live in the dedicated trackers package. sv.ByteTrack remains available in 0.28–0.29 with DeprecationWarning; removal in 0.30.0.

Before
tracker = sv.ByteTrack()
detections = tracker.update_with_detections(detections)
After
# pip install trackers
from trackers import ByteTrackTracker

tracker = ByteTrackTracker()
detections = tracker.update(detections)
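
To find remaining uses of the deprecated API before 0.30.0, one option is to promote DeprecationWarning to an error in tests. The sketch below uses a hypothetical legacy_update function as a stand-in for the deprecated call, so it runs without supervision installed; exactly where sv.ByteTrack emits its warning is not specified here.

```python
import warnings


def legacy_update() -> str:
    """Hypothetical stand-in for a deprecated call such as sv.ByteTrack()."""
    warnings.warn("legacy_update is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"


# Inside this block, any DeprecationWarning is raised as an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        legacy_update()
        raised = False
    except DeprecationWarning:
        raised = True

print(raised)  # prints True
```

The same effect is available for a whole test run with python -W error::DeprecationWarning.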

🚀 Added

  • Memory-efficient masks with sv.CompactMask. Sparse segmentation masks are now stored as a crop region plus RLE-encoded data instead of full-resolution bitmaps, cutting memory use by 10–100× for typical instance-segmentation outputs. It's a drop-in change — sv.Detections.mask, filtering, merging, and area all keep working without materialising the full array. (#2159)

  • SAM3 detection and PVS support in from_inference. sv.Detections.from_inference now parses SAM3 detection and point-video-segmentation outputs, both from the local inference package and from Roboflow-hosted server responses. (#2103, #2152)

  • Compressed COCO RLE masks in from_inference. Inference responses with rle or rle_mask fields containing a compressed counts string (as produced by pycocotools) are decoded directly into binary masks, skipping the lossy polygon round-trip. (#2178)

  • Standard logging module instead of print. Diagnostic output is now emitted under the supervision logger, so applications can capture, filter, or silence it through standard logging configuration. (#2154)

  • RGBA hex codes in sv.Color. sv.Color.from_hex accepts 8-digit hex (#ff00ff80), and Color.as_hex() round-trips alpha when not fully opaque. New top-level helpers: sv.hex_to_rgba, sv.rgba_to_hex, and sv.is_valid_hex. (#2004)

  • Dynamic kernel sizing in blur and pixelate annotators. BlurAnnotator(kernel_size=None) and PixelateAnnotator(pixel_size=None) (the new default) compute the kernel per detection as a fraction of the shorter bounding-box side, giving visually consistent results across object scales. (#709)

  • sv.ImageAssets for sample images. A counterpart to the existing video assets — downloads sample images for examples and tutorials. (#932)

  • Boundary warnings in InferenceSlicer. Emits a warning when callback detections fall outside tile boundaries, helping you spot coordinate-system bugs in custom callbacks early. (#2186)
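
With diagnostics routed through the logging module, filtering them is ordinary logger configuration. A minimal sketch, assuming the logger is named "supervision" as the note above suggests; ListHandler is a hypothetical helper, and the two log calls simulate library output:

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages in memory (illustrative helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


sv_logger = logging.getLogger("supervision")  # assumed logger name
handler = ListHandler()
sv_logger.addHandler(handler)
sv_logger.setLevel(logging.WARNING)  # drop INFO and DEBUG output

# Simulated library messages; real ones come from supervision itself.
sv_logger.info("loading dataset")
sv_logger.warning("detections fall outside tile boundaries")

print(handler.messages)  # prints ['detections fall outside tile boundaries']
```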

⚠️ Breaking Changes

  • sv.VideoInfo.fps is now float, not int. Frame rates like 23.976, 29.97, and 59.94 are no longer truncated. If you pass fps to APIs that require an integer (deque(maxlen=...), TraceAnnotator(trace_length=...)), wrap with int(...). (#2210)

  • sv.rle_to_mask returns bool, not uint8. This matches the long-declared signature. Code that does mask * 255 still works via NumPy broadcasting, but explicit casts like mask.view(np.uint8) will break. Add .astype(np.uint8) if you relied on the undocumented integer output. (#2178)

See the Migration section above for before/after snippets.
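
The dtype change for sv.rle_to_mask can be illustrated with plain NumPy. The array below is a stand-in for the function's bool output, not a call into supervision:

```python
import numpy as np

# Stand-in for the bool array now returned by sv.rle_to_mask.
mask = np.array([[True, False], [False, True]])

# Multiplying by 255 still yields 0/255 values, as an integer array.
scaled = mask * 255

# Code that needs uint8 specifically (for example image-writing calls)
# should convert explicitly.
as_uint8 = mask.astype(np.uint8) * 255

print(mask.dtype, as_uint8.dtype)  # prints "bool uint8"
```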

🌱 Changed

  • Metric arrays use float32 instead of float64. sv.MeanAveragePrecisionResult and related arrays (mAP_scores, ap_per_class, iou_thresholds, precision/recall) drop to float32, reducing memory and speeding up computation. Numerical results may differ in the last few digits. (#2169)

  • rle_to_mask and mask_to_rle moved. New canonical path: supervision.detection.utils.converters. The old supervision.dataset.utils import still works but is deprecated. (#2178)
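
Because float32 carries roughly seven significant decimal digits, tests that compared metric values for exact equality may need a tolerance. A sketch using only the standard library; to_float32 is a hypothetical helper that rounds a Python float through single precision, and the metric value is made up:

```python
import math
import struct


def to_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


old_value = 0.123456789012          # a metric as stored in float64
new_value = to_float32(old_value)   # the same metric stored in float32

exact_match = old_value == new_value                       # False
close = math.isclose(old_value, new_value, rel_tol=1e-6)   # True
print(exact_match, close)  # prints "False True"
```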

🗑️ Deprecated

  • normalized_xyxy argument renamed to xyxy in denormalize_boxes. sv.denormalize_boxes(normalized_xyxy=...) still works but emits a FutureWarning; switch to xyxy=. Scheduled for removal in 0.30.0.

  • sv.ByteTrack → ByteTrackTracker (external trackers package). Install with pip install trackers; the method renames from update_with_detections() to update(). Scheduled for removal in 0.30.0. (#2215)

  • supervision.keypoint → supervision.key_points. Also deprecated: the LMM enum (use VLM), from_lmm (use from_vlm), create_tiles in supervision.utils.image, ensure_cv2_image_for_processing in supervision.utils.conversion, and the keypoint validators in supervision.validators. (#2214)

🔧 Fixed

  • PolygonZone no longer double-counts overlapping zones. When two polygons contain the same anchor, each zone now reflects its own containment instead of every zone claiming the detection. (#1991)

  • LineZone respects class identity across reused tracker IDs. Trackers that recycle tracker_id across classes no longer leak crossing state from one object to another. (#1868)

  • process_video raises immediately on callback errors. Previously the exception was swallowed and the process hung until the writer was flushed. (#2022)

  • DetectionDataset populates class_name. Loaded annotations now carry data["class_name"], matching what model connectors produce. (#2156)

  • ByteTrack preserves externally assigned tracker_id. No longer overwrites caller-assigned IDs on the first update. (#1364)

  • Confusion matrix double-counting fixed. evaluate_detection_batch now correctly matches multiple predictions to the same target, so FP/FN counts match expectations. (#1853)

  • MeanAverageRecall mAR@K is now COCO-compliant. Computed using top-K detections per image; previous values were inflated relative to pycocotools. (#2136)

  • Detections.is_empty() handles empty tracker_id. Returns True for zero-row detections regardless of whether tracker_id is None or an empty array. (#2209)

  • CSVSink and JSONSink slice custom_data per row. NumPy arrays, lists, and tuples whose length matches the detection count are now indexed per row, instead of being written whole for every detection. (#2199, #2216)

  • TraceAnnotator smooth mode handles stationary tracks. Deduplicates anchor points and falls back to a raw polyline when fewer than 4 unique points remain, which is too few for splprep to fit. (#2217)

  • load_coco_annotations rejects path-traversal annotations. Refuses file_name entries that escape the images directory via ../ or absolute paths. (#2218)

  • OBB datasets no longer blow up memory. Loading oriented-bounding-box datasets stopped allocating full-image masks per box. (#2187)

  • KeyPoints boolean mask indexing fixed. Uniform-count selection now works correctly when all instances share the same keypoint count. (#2188)

  • DetectionDataset.as_coco() preserves area and iscrowd. No longer dropped silently in the round-trip. (#2185)

  • force_mask=True precision and COCO empty-polygon export. Annotation conversion no longer loses precision, and COCO export tolerates empty polygons across formats. (#1746, #1086, #265)
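
The kind of check described for load_coco_annotations above can be sketched with pathlib. This is an illustration of the general technique, not the library's code; is_inside_images_dir is a hypothetical name.

```python
from pathlib import Path


def is_inside_images_dir(images_dir: str, file_name: str) -> bool:
    """Return True only if file_name resolves to a path under images_dir."""
    base = Path(images_dir).resolve()
    # Joining with an absolute file_name discards base, so absolute
    # paths are rejected by the containment test below.
    candidate = (base / file_name).resolve()
    return base in candidate.parents


print(is_inside_images_dir("/data/images", "train/0001.jpg"))   # True
print(is_inside_images_dir("/data/images", "../secrets.txt"))   # False
print(is_inside_images_dir("/data/images", "/etc/passwd"))      # False
```

Resolving before comparing means a name like a/../0001.jpg is still accepted, since it stays inside the images directory.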


🏆 Contributors

A huge thank you to everyone who shipped this release:


Full changelog: 0.27.0...0.28.0
