🌟 Summary
Release v8.3.44 focuses on improving Triton Inference Server integration with enhanced metadata support, optimizing model export functionality, and refining core features for greater flexibility, stability, and a better user experience. 🚀
📊 Key Changes
Triton Inference Enhancements:
- Added the ability to retrieve and store model metadata during export via the `on_export_end` callback.
- Triton Model Repository configurations (`config.pbtxt`) now dynamically include this metadata.
- Enhanced `TritonRemoteModel` to handle metadata for better customization and traceability (see the sketch after this list).
- Set a default task (`task=detect`) for Triton Server models when none is specified.
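As an illustration of the workflow these changes target, here is a minimal sketch of exporting a model and querying it through a Triton endpoint. The server URL, model name, and image path are assumptions for the example, and a running Triton Inference Server with a populated model repository is required.

```python
from ultralytics import YOLO

# Export a model to ONNX; with this release the export callbacks also
# record model metadata that the Triton model repository configuration
# (config.pbtxt) can include.
model = YOLO("yolo11n.pt")
model.export(format="onnx", dynamic=True)

# Run inference against a Triton endpoint. The URL and model name below
# are placeholders; task="detect" is now the default when unspecified.
triton_model = YOLO("http://localhost:8000/yolo", task="detect")
results = triton_model("https://ultralytics.com/images/bus.jpg")
```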
Core Features & Fixes:
- Dependency Change: Reverted from `lapx` back to the original, widely used `lap` package for improved compatibility and stability.
- Model Dynamics: Refined handling of dynamic ONNX models by setting the `dynamic` property based on input image size (imgsz) rather than batch size.
- Custom Backend Flexibility: Expanded `AutoBackend` to directly accept in-memory PyTorch models, in addition to weight file paths (see the sketch after this list).
- AMP Checks: Hardcoded known-failing GPUs (e.g., GTX 16 series, Quadro T series) to prevent NaN losses when training with AMP (Automatic Mixed Precision).
- New Utility: Introduced an `empty_like` function to streamline tensor and array operations and improve maintainability.
- Segment Resampling Fix: Resolved an issue with preserving original segment points during resampling.
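To illustrate the `AutoBackend` change, here is a minimal sketch of wrapping an already-loaded PyTorch model instead of a weights path. The constructor arguments shown are assumptions based on common usage and may differ from the exact signature.

```python
import torch
from ultralytics import YOLO
from ultralytics.nn.autobackend import AutoBackend

# Load a model and grab the underlying torch.nn.Module.
pt_model = YOLO("yolo11n.pt").model

# AutoBackend can now wrap an in-memory PyTorch model directly,
# in addition to accepting a path to a weights file.
backend = AutoBackend(pt_model, device=torch.device("cpu"), fp16=False)
```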
🎯 Purpose & Impact
💡 Triton Support Improvements:
- Enriched metadata enables smoother deployment configurations with better traceability and error prevention, and streamlines the export process for Triton Inference Server users.
🧠 User-Friendly Adjustments:
- Setting a default task reduces errors for users unfamiliar with Triton configurations.
- Allowing in-memory PyTorch models boosts integration convenience in customized workflows.
🚀 Stability and Performance Gains:
- Using `lap` simplifies dependency management and ensures better compatibility in tracking workflows.
- Fixed AMP-related compatibility issues for specific GPUs, preventing unstable training outcomes.
🔧 Code Optimization:
- Introduced the `empty_like` utility to centralize tensor creation logic, reducing repetitive code and improving maintainability (see the sketch below).
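For context, a combined helper of this kind typically dispatches on the input type. The following is a sketch of the idea, not necessarily the exact Ultralytics implementation.

```python
import numpy as np
import torch

def empty_like(x):
    """Return an uninitialized float32 tensor/array with the same shape as x."""
    if isinstance(x, torch.Tensor):
        return torch.empty_like(x, dtype=torch.float32)
    return np.empty_like(x, dtype=np.float32)
```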
This update bridges gaps in stability, scalability, and user experience for Triton deployment and core Ultralytics functionalities, ensuring a smoother and more reliable platform for all users! 🌟
What's Changed
- Revert `lapx` to `lap` by @Laughing-q in #17908
- Preserve original points in `resample_segments` by @Y-T-G in #18051
- Hardcode failing GPUs in AMP checks by @Y-T-G in #17977
- Set `dynamic` to True only if imgsz is dynamic for ONNX by @Y-T-G in #17872
- Set `task` for Triton inference by @Laughing-q in #18064
- Modified the parameter description of 'weights' in the AutoBackend class by @ye-yangshuo in #18059
- Fix `np.empty_like` and `torch.empty_like` input type by @Laughing-q in #18062
- `ultralytics 8.3.44` improve Triton Inference Server metadata by @Y-T-G in #17921
New Contributors
- @ye-yangshuo made their first contribution in #18059
Full Changelog: v8.3.43...v8.3.44