Hotfix: Fix crash when running multiple models simultaneously
Fixed a bug where the server process terminated when two or more models received requests at the same time.
Symptom: Server crashes when multiple models are used concurrently (e.g., a VLM as the interface model plus an LLM for chat in Open WebUI).
Cause: Each model engine ran its GPU operations on its own thread, causing Metal command buffer races on Apple Silicon.
Fix: All model GPU operations now run on a single shared thread. No impact on single-model performance.
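The serialization pattern behind the fix can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `run_on_gpu_thread` and `fake_gpu_op` are hypothetical, and a single-worker executor stands in for the shared GPU thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A single-worker executor: every GPU submission from every model
# engine is funneled through one thread, so two engines can never
# race on the same Metal command buffer.
_gpu_executor = ThreadPoolExecutor(max_workers=1)

def run_on_gpu_thread(fn, *args, **kwargs):
    """Submit GPU work and block until the shared thread finishes it."""
    return _gpu_executor.submit(fn, *args, **kwargs).result()

# Stand-in for a real GPU operation (hypothetical).
completed = []
def fake_gpu_op(name):
    completed.append(name)
    return name

# Two "model engines" submit work concurrently; the executor still
# runs their GPU steps one at a time.
threads = [
    threading.Thread(target=run_on_gpu_thread, args=(fake_gpu_op, f"model-{i}"))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because only one worker drains the queue, callers pay at most a queuing delay under contention; a single model submitting work sees no extra synchronization, which is why single-model performance is unaffected.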