github envoyproxy/ai-gateway v1.0.0

7 hours ago

Envoy AI Gateway v1.0.0 — General Availability

Envoy AI Gateway v1.0.0 marks General Availability. With this release the core control-plane API — AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, and MCPRoute, all served at v1beta1 — is declared stable: within the 1.x series we will not make breaking changes to it unless required by a critical security fix, and any such change will ship with a documented migration path. Upgrading from v0.7 requires no changes to your resources. 1.0 brings together everything built since the first release in February 2025: a single OpenAI-compatible API across 16 providers with cross-provider translation, a full Model Context Protocol gateway, multimodal and audio endpoints, enterprise-grade observability, and multi-tenant, quota-aware routing — all as an additive layer on CNCF Envoy Gateway.

🎉 What 1.0 Means

1.0 is a commitment, not just another feature release. From the first release we said the major version would arrive once we had a first stable control-plane API. That moment is here. General Availability means:

  • A stable API. Your v1beta1 resources will not break under you within the 1.x series.
  • Predictable upgrades. Upgrading the controller will not break a valid, migrated configuration; any change requiring action ships with a documented path.
  • A complete platform. Everything assembled since v0.1 is now production-ready on the proven Envoy Gateway foundation.

Our API stability commitment

For stable releases, we will never break the APIs unless there is a critical security issue, and we will always provide a migration path in the release notes if we ever must. Following Semantic Versioning, the v1beta1 control-plane API remains backward compatible for the entire 1.x series — breaking changes would only ever land in a future 2.0. See the full support policy.

✨ The 1.0 Feature Surface

These are the capabilities the stable 1.0 control plane brings together.

A Stable, Versioned Control-Plane API

  • Core CRDs now covered by the stability guarantee — The control-plane API you build on — AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, and MCPRoute, all served at v1beta1 — is now a stable contract. Within the 1.x series these APIs will not change in a breaking way unless required by a critical security fix, and any such change will ship with a documented migration path.

Universal LLM Access

  • One OpenAI-compatible API across 16 providers — Reach OpenAI, Azure OpenAI, Google Gemini, Google Vertex AI, AWS Bedrock, Anthropic, Mistral, Cohere, Groq, Together AI, DeepInfra, DeepSeek, Hunyuan, SambaNova, Grok, and the Tetrate Agent Router Service through a single endpoint. Switch or mix providers without changing client code.
  • Cross-provider request/response translation — Translate between provider protocols transparently — Anthropic /v1/messages to OpenAI /v1/chat/completions, and Anthropic Messages to AWS Bedrock Converse and InvokeModel — including streaming, tool use, reasoning/thinking blocks, and images.
  • Model virtualization with modelNameOverride — Expose stable, application-facing model names while the gateway maps them to provider-specific models, enabling A/B testing, gradual migrations, and multi-provider strategies without touching client code.

Full Endpoint Coverage

  • Chat, completions, embeddings, and images/v1/chat/completions, /v1/completions, /v1/embeddings, and /v1/images/generations across compatible providers.
  • Audio: transcription, translation, and speech/v1/audio/transcriptions, /v1/audio/translations, and /v1/audio/speech bring speech-to-text and text-to-speech workloads through the gateway.
  • OpenAI Responses API and multimodal inputs/v1/responses is supported, including on Azure OpenAI backends, and chat requests accept image, audio_url, and video_url content parts for compatible backends.

MCP Gateway

  • Aggregate and route Model Context Protocol servers — Multiplex multiple MCP servers behind one endpoint with MCPRoute, including tool routing and include/exclude filtering.
  • Fine-grained, CEL-based authorization — Enforce per-tool authorization. tools/list applies the same rules as tools/call, so callers only discover the tools they are allowed to invoke.
  • Per-backend header forwarding with JWT claim projection — Forward selected request headers and project JWT claims to individual MCP backends.

Traffic Management & Multi-Tenancy

  • Hostname-based multi-tenant routing — Serve different model sets per hostname from a single Gateway with AIGatewayRoute.spec.hostnames; the /v1/models endpoint scopes its response to the matching host.
  • Token- and quota-aware rate limiting — Rate limit on model tokens and per QuotaPolicy, with backend rate limit filter injection to enforce quota-based throttling.
  • Provider fallback and InferencePool support — Automatic failover across providers, plus intelligent endpoint selection for self-hosted models via the Gateway API Inference Extension.

Provider Authentication & Compliance

  • BackendSecurityPolicy for upstream authentication — Centralize provider credentials with API key, AWS, Azure, and GCP cloud-native identity, including GKE Workload Identity via Application Default Credentials.
  • Request/response body redaction — Redact sensitive request and response bodies to meet compliance requirements.

Enterprise Observability

  • OpenTelemetry tracing with OpenInference — Full request-lifecycle tracing, compatible with AI evaluation tools like Arize Phoenix.
  • GenAI token metrics and reasoning-token accounting — Prometheus metrics for token usage, time-to-first-token, and inter-token latency, with separate accounting for reasoning tokens.

🔗 API Updates

  • The v1beta1 API is now stable — v1.0 does not change the API surface. Instead it elevates the existing v1beta1 CRDs to a stable contract under our support policy: no new apiVersion is introduced and no resource migration is required. New fields added during the 1.x series will remain backward compatible.

⚠️ Breaking Changes

None. v1.0 introduces no breaking changes. The v1beta1 API is unchanged — 1.0 declares it stable rather than altering it — so there is no apiVersion bump and no resource migration. If you are running v0.7, your existing resources work as-is.

🛡️ Support & Compatibility Policy

With 1.0, the project's support policy applies in full:

  • API compatibility. The v1beta1 CRDs are stable for the 1.x series. New fields are added in a backward-compatible way; breaking changes are reserved for a future major version and would ship with a migration path.
  • Controller upgrades. Upgrading the controller will not break a valid configuration. Upgrade at most two minor versions at a time, following any documented migration steps.
  • Envoy Gateway compatibility. Each release is built on the latest stable Envoy Gateway (and therefore Envoy Proxy); keep Envoy Gateway up to date before upgrading Envoy AI Gateway.
  • End of life. A release is supported until two releases after it, consistent with prior versions.

📖 Upgrade Guidance

Upgrading from v0.7 is a drop-in change — there are no API or resource changes:

  1. Update the Helm chart / controller image to the v1.0.0 release.
  2. Roll out as usual. Your existing v1beta1 resources require no edits.

If you are on an older release, upgrade one or two minor versions at a time and follow the migration steps in each series' release notes (notably the v0.6 promotion of the core CRDs to v1beta1) before moving to 1.0.

📦 Dependency Versions

Dependency Version
Go 1.26.4
Envoy Gateway v1.8.1
Envoy Proxy v1.38.1
Gateway API v1.5.1
Gateway API Inference Extension v1.0.2
MCP Go SDK v1.6.1

🙏 Acknowledgements

1.0 belongs to everyone who got us here. Our deepest thanks to:

  • The maintainers across Tetrate, Bloomberg, Tencent, and Nutanix, and the many independent contributors who shaped the project through code, reviews, and weekly community meetings.
  • The early adopters — including Bloomberg, LY Corporation, Alan by Comma Soft, and NRP — who ran Envoy AI Gateway in production and fed back what mattered.
  • The broader Gateway API, Envoy, and CNCF communities whose standards this project is built on.

🔮 What's Next

A stable API is a starting line, not a finish line. On the roadmap:

  • A dedicated MCPBackend CRD, decoupling MCP backend configuration from MCPRoute.
  • Deeper MCP authorization across tools, resources, and prompts.
  • Fuller quota-aware routing that automatically steers around rate-limited upstreams.
  • More provider translation paths and expanded multimodal support.

Don't miss a new ai-gateway release

NewReleases is sending notifications on new releases.