deepspeed 0.15.0 on Python PyPI

What's Changed

Update version.txt after 0.14.5 release by @loadams in #5982
move pynvml install to setup.py by @Rohan138 in #5840
add moe topk(k>2) gate support by @inkcherry in #5881
Move inf_or_nan_tracker to cpu for cpu offload by @BacharL in #5826
Enable dynamic shapes for pipeline parallel engine inputs by @tohtana in #5481
Add and Remove ZeRO 3 Hooks by @jomayeri in #5658
DeepNVMe GDS by @jomayeri in #5852
Pin transformers version on nv-nightly by @loadams in #6002
DeepSpeed on Windows blog by @tjruwase in #6364
Bug Fix 5880 by @jomayeri in #6378
Update linear.py compatible with torch 2.4.0 by @terry-for-github in #5811
GDS Swapping Fix by @jomayeri in #6386
Long sequence parallelism (Ulysses) integration with HuggingFace by @samadejacobs in #5774
reduce cpu host overhead when using moe by @ranzhejiang in #5578
fix fp16 Qwen2 series model to DeepSpeed-FastGen by @ZonePG in #6028
Add Japanese translation of Windows support blog by @tohtana in #6394
Correct op_builder path to xpu files to trigger XPU tests by @loadams in #6398
add pip install cutlass version check by @GuanhuaWang in #6393
[XPU] API align with new intel pytorch extension release by @YizhouZ in #6395
Pydantic v2 migration by @mrwyattii in #5167
Fix torch check by @loadams in #6402

Full Changelog: v0.14.5...v0.15.0