Envoy AI Gateway v1.0.0 — General Availability
Envoy AI Gateway v1.0.0 marks General Availability. With this release the core control-plane API — AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, and MCPRoute, all served at v1beta1 — is declared stable: within the 1.x series we will not make breaking changes to it unless required by a critical security fix, and any such change will ship with a documented migration path. Upgrading from v0.7 requires no changes to your resources. 1.0 brings together everything built since the first release in February 2025: a single OpenAI-compatible API across 16 providers with cross-provider translation, a full Model Context Protocol gateway, multimodal and audio endpoints, enterprise-grade observability, and multi-tenant, quota-aware routing — all as an additive layer on CNCF Envoy Gateway.
🎉 What 1.0 Means
1.0 is a commitment, not just another feature release. From the first release we said the major version would arrive once we had a first stable control-plane API. That moment is here. General Availability means:
- A stable API. Your
v1beta1resources will not break under you within the 1.x series. - Predictable upgrades. Upgrading the controller will not break a valid, migrated configuration; any change requiring action ships with a documented path.
- A complete platform. Everything assembled since v0.1 is now production-ready on the proven Envoy Gateway foundation.
Our API stability commitment
For stable releases, we will never break the APIs unless there is a critical security issue, and we will always provide a migration path in the release notes if we ever must. Following Semantic Versioning, the v1beta1 control-plane API remains backward compatible for the entire 1.x series — breaking changes would only ever land in a future 2.0. See the full support policy.
✨ The 1.0 Feature Surface
These are the capabilities the stable 1.0 control plane brings together.
A Stable, Versioned Control-Plane API
- Core CRDs now covered by the stability guarantee — The control-plane API you build on —
AIGatewayRoute,AIServiceBackend,BackendSecurityPolicy,GatewayConfig, andMCPRoute, all served atv1beta1— is now a stable contract. Within the 1.x series these APIs will not change in a breaking way unless required by a critical security fix, and any such change will ship with a documented migration path.
Universal LLM Access
- One OpenAI-compatible API across 16 providers — Reach OpenAI, Azure OpenAI, Google Gemini, Google Vertex AI, AWS Bedrock, Anthropic, Mistral, Cohere, Groq, Together AI, DeepInfra, DeepSeek, Hunyuan, SambaNova, Grok, and the Tetrate Agent Router Service through a single endpoint. Switch or mix providers without changing client code.
- Cross-provider request/response translation — Translate between provider protocols transparently — Anthropic
/v1/messagesto OpenAI/v1/chat/completions, and Anthropic Messages to AWS Bedrock Converse and InvokeModel — including streaming, tool use, reasoning/thinking blocks, and images. - Model virtualization with
modelNameOverride— Expose stable, application-facing model names while the gateway maps them to provider-specific models, enabling A/B testing, gradual migrations, and multi-provider strategies without touching client code.
Full Endpoint Coverage
- Chat, completions, embeddings, and images —
/v1/chat/completions,/v1/completions,/v1/embeddings, and/v1/images/generationsacross compatible providers. - Audio: transcription, translation, and speech —
/v1/audio/transcriptions,/v1/audio/translations, and/v1/audio/speechbring speech-to-text and text-to-speech workloads through the gateway. - OpenAI Responses API and multimodal inputs —
/v1/responsesis supported, including on Azure OpenAI backends, and chat requests accept image,audio_url, andvideo_urlcontent parts for compatible backends.
MCP Gateway
- Aggregate and route Model Context Protocol servers — Multiplex multiple MCP servers behind one endpoint with
MCPRoute, including tool routing and include/exclude filtering. - Fine-grained, CEL-based authorization — Enforce per-tool authorization.
tools/listapplies the same rules astools/call, so callers only discover the tools they are allowed to invoke. - Per-backend header forwarding with JWT claim projection — Forward selected request headers and project JWT claims to individual MCP backends.
Traffic Management & Multi-Tenancy
- Hostname-based multi-tenant routing — Serve different model sets per hostname from a single Gateway with
AIGatewayRoute.spec.hostnames; the/v1/modelsendpoint scopes its response to the matching host. - Token- and quota-aware rate limiting — Rate limit on model tokens and per
QuotaPolicy, with backend rate limit filter injection to enforce quota-based throttling. - Provider fallback and InferencePool support — Automatic failover across providers, plus intelligent endpoint selection for self-hosted models via the Gateway API Inference Extension.
Provider Authentication & Compliance
BackendSecurityPolicyfor upstream authentication — Centralize provider credentials with API key, AWS, Azure, and GCP cloud-native identity, including GKE Workload Identity via Application Default Credentials.- Request/response body redaction — Redact sensitive request and response bodies to meet compliance requirements.
Enterprise Observability
- OpenTelemetry tracing with OpenInference — Full request-lifecycle tracing, compatible with AI evaluation tools like Arize Phoenix.
- GenAI token metrics and reasoning-token accounting — Prometheus metrics for token usage, time-to-first-token, and inter-token latency, with separate accounting for reasoning tokens.
🔗 API Updates
- The
v1beta1API is now stable — v1.0 does not change the API surface. Instead it elevates the existingv1beta1CRDs to a stable contract under our support policy: no new apiVersion is introduced and no resource migration is required. New fields added during the 1.x series will remain backward compatible.
⚠️ Breaking Changes
None. v1.0 introduces no breaking changes. The v1beta1 API is unchanged — 1.0 declares it stable rather than altering it — so there is no apiVersion bump and no resource migration. If you are running v0.7, your existing resources work as-is.
🛡️ Support & Compatibility Policy
With 1.0, the project's support policy applies in full:
- API compatibility. The
v1beta1CRDs are stable for the 1.x series. New fields are added in a backward-compatible way; breaking changes are reserved for a future major version and would ship with a migration path. - Controller upgrades. Upgrading the controller will not break a valid configuration. Upgrade at most two minor versions at a time, following any documented migration steps.
- Envoy Gateway compatibility. Each release is built on the latest stable Envoy Gateway (and therefore Envoy Proxy); keep Envoy Gateway up to date before upgrading Envoy AI Gateway.
- End of life. A release is supported until two releases after it, consistent with prior versions.
📖 Upgrade Guidance
Upgrading from v0.7 is a drop-in change — there are no API or resource changes:
- Update the Helm chart / controller image to the v1.0.0 release.
- Roll out as usual. Your existing
v1beta1resources require no edits.
If you are on an older release, upgrade one or two minor versions at a time and follow the migration steps in each series' release notes (notably the v0.6 promotion of the core CRDs to v1beta1) before moving to 1.0.
📦 Dependency Versions
| Dependency | Version |
|---|---|
| Go | 1.26.4 |
| Envoy Gateway | v1.8.1 |
| Envoy Proxy | v1.38.1 |
| Gateway API | v1.5.1 |
| Gateway API Inference Extension | v1.0.2 |
| MCP Go SDK | v1.6.1 |
🙏 Acknowledgements
1.0 belongs to everyone who got us here. Our deepest thanks to:
- The maintainers across Tetrate, Bloomberg, Tencent, and Nutanix, and the many independent contributors who shaped the project through code, reviews, and weekly community meetings.
- The early adopters — including Bloomberg, LY Corporation, Alan by Comma Soft, and NRP — who ran Envoy AI Gateway in production and fed back what mattered.
- The broader Gateway API, Envoy, and CNCF communities whose standards this project is built on.
🔮 What's Next
A stable API is a starting line, not a finish line. On the roadmap:
- A dedicated
MCPBackendCRD, decoupling MCP backend configuration fromMCPRoute. - Deeper MCP authorization across tools, resources, and prompts.
- Fuller quota-aware routing that automatically steers around rate-limited upstreams.
- More provider translation paths and expanded multimodal support.