🔦 Spotlight
Memory-efficient masks with sv.CompactMask
Segmentation models produce one full-resolution bitmap per instance. On a 1920×1080 image with 28 detections that is ~55 MB of mask data. Most pixels are background. sv.CompactMask stores only the tight bounding-box crop, RLE-encoded — the same 28 masks drop to ~237 KB of crops, a 240× reduction before RLE kicks in.
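The figures above can be checked with a little arithmetic. The sketch below is a plain NumPy illustration, not the library's implementation: it assumes one byte per pixel (NumPy bool) and, hypothetically, that every instance fits in a 93×93 px box, a size chosen only because it lands close to the ~237 KB figure. The `run_lengths` helper is an illustrative name that shows the general idea of run-length encoding a crop.

```python
import numpy as np

H, W, N = 1080, 1920, 28          # image height, width, number of detections

# Dense storage: one full-resolution boolean bitmap per instance (1 byte/pixel).
dense_bytes = N * H * W
dense_mb = dense_bytes / 1024 / 1024

# Hypothetical crop size: assume every instance fits in a 93x93 px box.
CROP = 93
crop_bytes = N * CROP * CROP
crop_kb = crop_bytes / 1024


def run_lengths(crop: np.ndarray) -> np.ndarray:
    """Run-length encode a boolean crop in row-major order.

    Returns the lengths of alternating runs, starting with a run of
    False values (which may have length 0).
    """
    flat = crop.ravel().astype(np.int8)
    # Indices where the value changes, plus both ends of the array.
    change = np.flatnonzero(np.diff(flat)) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    runs = np.diff(bounds)
    if flat.size and flat[0]:
        runs = np.concatenate(([0], runs))   # leading zero-length False run
    return runs


# A toy instance: a filled 40x40 square inside the 93x93 crop.
crop = np.zeros((CROP, CROP), dtype=bool)
crop[20:60, 30:70] = True
runs = run_lengths(crop)

print(f"dense: {dense_mb:.1f} MB, crops: {crop_kb:.0f} KB, "
      f"ratio: {dense_bytes / crop_bytes:.0f}x, runs per crop: {runs.size}")
```

Most of the saving comes from cropping alone; the run-length step then replaces each crop's pixels with a short list of run lengths.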
It's a drop-in replacement: annotators, filters, and area all work unchanged.
import supervision as sv
# any segmentation model — RF-DETR Seg, YOLO-Seg, SAM3
detections = model.predict(image) # sv.Detections with dense masks
dense_mb = detections.mask.nbytes / 1024 / 1024
compact = sv.CompactMask.from_dense(
    masks=detections.mask,
    xyxy=detections.xyxy,
    image_shape=image.shape[:2],
)
detections.mask = compact # swap in — API unchanged
# filter by pixel area without materialising dense masks
large = detections[compact.area > 1000]
# annotators call .to_dense() internally
annotated = sv.MaskAnnotator().annotate(image.copy(), detections)
SAM3 text-prompted segmentation
SAM3 segments objects by free-text prompt — no class list, no bounding boxes. sv.Detections.from_sam3() parses both PCS (multi-prompt) and PVS (video) response formats into a standard sv.Detections, with class_id set to the prompt index.
import base64
import cv2
import requests
import supervision as sv
API_KEY = "YOUR_API_KEY"  # placeholder: replace with your own key
PROMPTS = ["person", "bag"]
with open("image.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()
response = requests.post(
    f"https://api.roboflow.com/inferenceproxy/seg-preview?api_key={API_KEY}",
    json={
        "image": {"type": "base64", "value": img_b64},
        "prompts": [{"type": "text", "text": p} for p in PROMPTS],
    },
    headers={"Content-Type": "application/json"},
)
sam3_result = response.json()
h, w = cv2.imread("image.jpg").shape[:2]
detections = sv.Detections.from_sam3(sam3_result=sam3_result, resolution_wh=(w, h))
# class_id == 0 → "person", class_id == 1 → "bag"
🔄 Migration
VideoInfo.fps is now float
NTSC frame rates (23.976, 29.97, 59.94) were silently truncated. fps is now the true float — cast at call sites that need an integer.
Before
info = sv.VideoInfo.from_video_path("clip.mp4")
buf = collections.deque(maxlen=info.fps)
trace = sv.TraceAnnotator(trace_length=info.fps)
After
info = sv.VideoInfo.from_video_path("clip.mp4")
buf = collections.deque(maxlen=int(info.fps))
trace = sv.TraceAnnotator(trace_length=int(info.fps))
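One thing to keep in mind: int() truncates toward zero, so a 29.97 fps clip gives a 29-frame window. Where the window is meant to cover a duration, rounding fps times seconds is one alternative. The sketch below uses a hard-coded fps value instead of reading a video, and `frames_for` is a hypothetical helper name, not part of supervision.

```python
# NTSC "29.97" is exactly 30000/1001 frames per second.
fps = 30000 / 1001

truncated = int(fps)      # drops the fractional part
rounded = round(fps)      # nearest integer


def frames_for(seconds: float, fps: float) -> int:
    """Number of frames covering `seconds` of video, at least 1."""
    return max(1, round(seconds * fps))


two_seconds = frames_for(2.0, fps)
print(truncated, rounded, two_seconds)
```

Whether truncation or rounding is appropriate depends on the call site; the point is only that the choice is now explicit.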
sv.ByteTrack deprecated — use ByteTrackTracker
Tracker implementations now live in the dedicated trackers package. sv.ByteTrack remains available in 0.28–0.29 with DeprecationWarning; removal in 0.30.0.
Before
tracker = sv.ByteTrack()
detections = tracker.update_with_detections(detections)
After
# pip install trackers
from trackers import ByteTrackTracker
tracker = ByteTrackTracker()
detections = tracker.update(detections)
🚀 Added
- Memory-efficient masks with sv.CompactMask. Sparse segmentation masks are now stored as a crop region plus RLE-encoded data instead of full-resolution bitmaps, cutting memory use by 10–100× for typical instance-segmentation outputs. It's a drop-in change: sv.Detections.mask, filtering, merging, and area all keep working without materialising the full array. (#2159)
- SAM3 detection and PVS support in from_inference. sv.Detections.from_inference now parses SAM3 detection and point-video-segmentation outputs, both from the local inference package and from Roboflow-hosted server responses. (#2103, #2152)
- Compressed COCO RLE masks in from_inference. Inference responses with rle or rle_mask fields containing a compressed counts string (as produced by pycocotools) are decoded directly into binary masks, skipping the lossy polygon round-trip. (#2178)
- Standard logging module instead of print. Diagnostic output is now emitted under the supervision logger, so applications can capture, filter, or silence it through standard logging configuration. (#2154)
- RGBA hex codes in sv.Color. sv.Color.from_hex accepts 8-digit hex (#ff00ff80), and Color.as_hex() round-trips alpha when not fully opaque. New top-level helpers: sv.hex_to_rgba, sv.rgba_to_hex, and sv.is_valid_hex. (#2004)
- Dynamic kernel sizing in blur and pixelate annotators. BlurAnnotator(kernel_size=None) and PixelateAnnotator(pixel_size=None) (the new default) compute the kernel per detection as a fraction of the shorter bounding-box side, giving visually consistent results across object scales. (#709)
- sv.ImageAssets for sample images. A counterpart to the existing video assets that downloads sample images for examples and tutorials. (#932)
- Boundary warnings in InferenceSlicer. Emits a warning when callback detections fall outside tile boundaries, helping you spot coordinate-system bugs in custom callbacks early. (#2186)
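Since diagnostics now go through the standard logging module under the supervision logger, they can be routed like any other library's logs. A minimal sketch using only the standard library follows; the message format, the WARNING threshold, and the child logger name "supervision.detection" are illustrative assumptions, not library defaults.

```python
import logging

# Application-wide default: INFO and above to stderr.
logging.basicConfig(level=logging.INFO, format="%(name)s %(levelname)s %(message)s")

# Quiet the library: only warnings and errors from the "supervision" logger.
sv_logger = logging.getLogger("supervision")
sv_logger.setLevel(logging.WARNING)

# Child loggers (name assumed here for illustration) inherit this threshold
# unless they set their own level.
child = logging.getLogger("supervision.detection")
inherits = child.getEffectiveLevel() == logging.WARNING
print(inherits)
```

To silence the output entirely, the same logger can be given a higher level or a `logging.NullHandler` with propagation disabled.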
⚠️ Breaking Changes
-
sv.VideoInfo.fpsis nowfloat, notint. Frame rates like 23.976, 29.97, and 59.94 are no longer truncated. If you passfpsto APIs that require an integer (deque(maxlen=...),TraceAnnotator(trace_length=...)), wrap withint(...). (#2210) -
sv.rle_to_maskreturnsbool, notuint8. This matches the long-declared signature. Code that doesmask * 255still works via NumPy broadcasting, but explicit casts likemask.view(np.uint8)will break. Add.astype(np.uint8)if you relied on the undocumented integer output. (#2178)
See the migration guide above for before/after snippets.
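The practical effect of a bool mask instead of a uint8 one can be seen with plain NumPy. The sketch below uses a hand-made boolean array as a stand-in for the function's output (no supervision call is made), so it only illustrates NumPy's dtype behaviour.

```python
import numpy as np

# Stand-in for a boolean mask; not produced by supervision here.
mask = np.array([[True, False], [False, True]])

scaled = mask * 255                  # bool is promoted to an integer dtype
as_u8 = mask.astype(np.uint8)        # explicit 0/1 uint8 copy
as_img = as_u8 * 255                 # 0/255 uint8, e.g. for image I/O

# Arithmetic differs between dtypes: bool + bool is a logical OR.
overlap_bool = mask + mask           # stays bool
overlap_u8 = as_u8 + as_u8           # uint8 with values 0 and 2

print(scaled.dtype, as_img.dtype, overlap_bool.dtype, overlap_u8.max())
```

Code that summed masks to count overlaps, or handed them to routines expecting uint8, is where the explicit `.astype(np.uint8)` matters.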
🌱 Changed
- Metric arrays use float32 instead of float64. sv.MeanAveragePrecisionResult and related arrays (mAP_scores, ap_per_class, iou_thresholds, precision/recall) drop to float32, reducing memory and speeding up computation. Numerical results may differ in the last few digits. (#2169)
- rle_to_mask and mask_to_rle moved. New canonical path: supervision.detection.utils.converters. The old supervision.dataset.utils import still works but is deprecated. (#2178)
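The float32 change has two visible consequences: arrays take half the memory, and values can differ from earlier float64 results in the low-order digits. The sketch below demonstrates both with random NumPy data standing in for metric arrays; the tolerance of 1e-5 is an illustrative choice, not a value taken from the library.

```python
import numpy as np

rng = np.random.default_rng(0)
scores64 = rng.random(10_000)                 # float64 by default
scores32 = scores64.astype(np.float32)

print(scores64.nbytes, scores32.nbytes)       # float32 uses half the bytes

# float32 carries about 7 significant decimal digits, so comparisons against
# previously stored float64 results should use a tolerance, not equality.
mean64 = float(scores64.mean())
mean32 = float(scores32.mean())
close = bool(np.isclose(mean64, mean32, rtol=1e-5))
print(close)
```

Regression tests that compare metric values against stored baselines with exact equality are the most likely place to need such a tolerance.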
🗑️ Deprecated
- normalized_xyxy argument renamed to xyxy in denormalize_boxes. sv.denormalize_boxes(normalized_xyxy=...) still works but emits a FutureWarning; switch to xyxy=. Scheduled for removal in 0.30.0.
- sv.ByteTrack → ByteTrackTracker (external trackers package). Install with pip install trackers; the method is renamed from update_with_detections() to update(). Scheduled for removal in 0.30.0. (#2215)
- supervision.keypoint → supervision.key_points. Also deprecated: the LMM enum (use VLM), from_lmm (use from_vlm), create_tiles in supervision.utils.image, ensure_cv2_image_for_processing in supervision.utils.conversion, and the keypoint validators in supervision.validators. (#2214)
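One way to find remaining deprecated call sites before 0.30.0 is to turn deprecation-style warnings into errors during a test run. The sketch below uses only the standard library; `old_api` is a hypothetical stand-in that mimics a renamed argument and is not a supervision function.

```python
import warnings


def old_api(normalized_xyxy=None, xyxy=None):
    """Hypothetical stand-in for a function with a renamed argument."""
    if normalized_xyxy is not None:
        warnings.warn(
            "normalized_xyxy is deprecated; use xyxy",
            FutureWarning,
            stacklevel=2,
        )
        xyxy = normalized_xyxy
    return xyxy


# Turn deprecation-style warnings into errors, e.g. in a test run,
# so every remaining deprecated call site fails loudly.
with warnings.catch_warnings():
    warnings.simplefilter("error", FutureWarning)
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api(normalized_xyxy=[0.1, 0.2, 0.3, 0.4])
        raised = False
    except FutureWarning:
        raised = True
    new_style = old_api(xyxy=[0.1, 0.2, 0.3, 0.4])   # no warning, no error

print(raised)
```

The same effect is available without code changes through Python's `-W error::FutureWarning` command-line option.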
🔧 Fixed
- PolygonZone no longer double-counts overlapping zones. When two polygons contain the same anchor, each zone now reflects its own containment instead of every zone claiming the detection. (#1991)
- LineZone respects class identity across reused tracker IDs. Trackers that recycle tracker_id across classes no longer leak crossing state from one object to another. (#1868)
- process_video raises immediately on callback errors. Previously the exception was swallowed and the process hung until the writer was flushed. (#2022)
- DetectionDataset populates class_name. Loaded annotations now carry data["class_name"], matching what model connectors produce. (#2156)
- ByteTrack preserves externally assigned tracker_id. No longer overwrites caller-assigned IDs on the first update. (#1364)
- Confusion matrix double-counting fixed. evaluate_detection_batch now correctly matches multiple predictions to the same target, so FP/FN counts match expectations. (#1853)
- MeanAverageRecall mAR@K is now COCO-compliant. Computed using top-K detections per image; previous values were inflated relative to pycocotools. (#2136)
- Detections.is_empty() handles empty tracker_id. Returns True for zero-row detections regardless of whether tracker_id is None or an empty array. (#2209)
- CSVSink and JSONSink slice custom_data per row. NumPy arrays, lists, and tuples whose length matches the detection count are now indexed per row, instead of being written whole for every detection. (#2199, #2216)
- TraceAnnotator smooth mode handles stationary tracks. Deduplicates anchor points and falls back to a raw polyline when fewer than 4 unique points remain, which splprep cannot fit. (#2217)
- load_coco_annotations rejects path-traversal annotations. Refuses file_name entries that escape the images directory via ../ or absolute paths. (#2218)
- OBB datasets no longer blow up memory. Loading oriented-bounding-box datasets stopped allocating full-image masks per box. (#2187)
- KeyPoints boolean mask indexing fixed. Uniform-count selection now works correctly when all instances share the same keypoint count. (#2188)
- DetectionDataset.as_coco() preserves area and iscrowd. No longer dropped silently in the round-trip. (#2185)
- force_mask=True precision and COCO empty-polygon export. Annotation conversion no longer loses precision, and COCO export tolerates empty polygons across formats. (#1746, #1086, #265)
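The path-traversal fix above describes a containment check: a file_name is acceptable only if it resolves to a location inside the images directory. The sketch below shows one common way to write such a check with pathlib. It is an illustration of the idea, not the code used by load_coco_annotations, and `is_safe_file_name` is a hypothetical name.

```python
from pathlib import Path


def is_safe_file_name(images_dir: str, file_name: str) -> bool:
    """Return True if images_dir/file_name resolves to a path inside images_dir."""
    root = Path(images_dir).resolve()
    # Joining with an absolute file_name discards root entirely, which the
    # containment check below then rejects.
    candidate = (root / file_name).resolve()
    return root in candidate.parents


checks = {
    "a.jpg": is_safe_file_name("/data/images", "a.jpg"),
    "sub/a.jpg": is_safe_file_name("/data/images", "sub/a.jpg"),
    "../secret.txt": is_safe_file_name("/data/images", "../secret.txt"),
    "sub/../../x.jpg": is_safe_file_name("/data/images", "sub/../../x.jpg"),
    "/etc/passwd": is_safe_file_name("/data/images", "/etc/passwd"),
}
print(checks)
```

Resolving both paths before comparing them is what catches entries that hide the escape behind an innocuous-looking prefix such as sub/../../.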
🏆 Contributors
A huge thank you to everyone who shipped this release:
- @Erol444: SAM3 detection and PVS parsing
- @leeclemnet: compressed COCO RLE masks and rle_to_mask correctness
- @abritton2002: VideoInfo.fps as float and Detections.is_empty() fix
- @shaun0927: sink slicing, trace annotator, COCO path-traversal hardening
- @happyhj: class_name in DetectionDataset
- @farukalamai: CSVSink NumPy slicing
- @stop1one: COCO-compliant MeanAverageRecall
- @Adithi-Sreenath: PolygonZone overlap fix
- @JESUSROYETH: LineZone class-aware tracker IDs
- @realh4m: process_video error propagation
- @rolson24: ByteTrack preserves external tracker IDs
- @panagiotamoraiti: confusion matrix correctness
- @Youho99, @kirilllzaitsev: COCO empty polygons and force_masks consistency
- @aza-ali: RGBA hex support in sv.Color
- @Clemens-E: dynamic kernel sizing for blur and pixelate annotators
- @NickHerrig: sv.ImageAssets
- @0xD4rky: force_mask=True precision fix
- @Borda: CompactMask, metrics float32, deprecations
Full changelog: 0.27.0...0.28.0