langwatch: v2.0.0

2.0.0 (2026-01-28)

Features

  • add AI scenario generation (#1110) (7da469d)
  • add CI/CD execution support for evaluations v3 (#1118) (d28adac)
  • add COSS licensing enforcement for self-hosted deployments (#1170) (37c30ec)
  • add http agent (#1053) (02284be)
  • add link to setup evaluations from sdk (2130e30)
  • add orchestrator pattern for Claude Code context management (#1163) (7b3415b)
  • analytics: track onboarding progress metrics in PostHog (1533de5)
  • claude: add rogerio-cto-review agent and worktree command (#1192) (326196a)
  • claude: add workflow commands for worktrees and PR review (#1135) (8643e92)
  • clickhouse trace filtering (#1079) (12f4b03)
  • dev: add Docker Compose dev environment with profiles (#1188) (72e8df5)
  • evaluations v3 execution and new evaluations results page (#1113) (510f65d)
  • evaluations-v3: add lambda warmup for faster evaluation runs (cc95cca)
  • evaluations-v3: implement HTTP agent support (#1196) (7afb24e)
  • evaluations-v3: improve table column resizing and overflow handling (d1d3831)
  • evaluations-v3: major table performance improvements, prompts to experiment button and other bugfixes (#1181) (2cbf430)
  • evaluations-v3: support evaluators/{id} path for database evaluators (2f65327)
  • evaluators: add "Use via API" dialog with code snippets (58ccaf5)
  • event sourcing powered evaluations (#1090) (fd9898e)
  • improve trace/span event sourcing pipeline (#980) (d67854d)
  • integrate HTTP agent into scenario/simulations quick run (#1071) (3e3a8d4)
  • introduce first step towards dark mode (#1143) (426d776)
  • licensing: add centralized license enforcement with resource limits (#1208) (a511233)
  • llm-config: upgrade model registry with dynamic parameters and OpenRouter sync (#1115) (f03a283)
  • new online evaluations and guardrails setup (#1151) (7c8a804)
  • new simulation card design (#1106) (3a116af)
  • projects: add drawer-based project creation (#1068) (5620034)
  • prompts: show icon-only buttons with tooltips in compare mode (4f4ecfe)
  • refactor model providers UI to drawer-based (#1050) (8c8df73)
  • regenerate api key (#1083) (e09bf3f)
  • scenarios: add help text and tooltips to scenario form fields (#1128) (fec3e73)
  • sdk: add online evaluations API and ensureSetup for TypeScript (2209258)
  • traces: add reasoning tokens and effort support for LLM models (16f1d4a)
  • track events as spans for REST API (#1089) (ec8243e)
  • ui: revamp LLM parameter controls with button-based selects (9a42d93)
  • update onboarding for new go sdk shape (#1225) (ae6b6a2)
  • use programmatic langwatch config in scenario runner (#1074) (34a9d62)
  • walking skeleton for scenarios (#1047) (f6acbb8)

Bug Fixes

  • add vendor folder before installation to fix docker build (292fe83)
  • add z-index to tooltip (#1078) (1804329)
  • annotation highlight scroll (#1073) (7e3471d)
  • base64 markdown rendering (8017548)
  • check if graph exists (#1067) (eef4089)
  • ci: add pnpm-lock.yaml for agentic-e2e-tests (#1216) (761a1a0)
  • clickhouse replication issue with goose migrations + tables not replicating correctly (#1116) (db6638f)
  • cluster goose db (#1140) (2cc0e69)
  • config: disable HSTS and upgrade-insecure-requests in development (#1149) (f88086e)
  • dspy: capture full message output including reasoning_content (8257cae)
  • elasticsearch migrations for batch evals for new target fields (530bb73)
  • evaluations-v3: display Code Agent outputs with custom field names (#1226) (9a69c53)
  • evaluations-v3: fix type errors in httpAgentUtils and dslAdapter (3196cc7)
  • evaluations-v3: pass all LLM params including reasoning to targets (c786a73)
  • evaluations-v3: persist all LLM parameters in local prompt config (4b87561)
  • evaluations-v3: prevent autosave data loss on back navigation (335e571)
  • event sourcing improvements from testing (#1109) (2a400db)
  • fix emojis without breaking multiline prompt evaluators anymore (2d47925)
  • fix failing unit tests (9f9ad87)
  • goose migrate missing priming row (#1145) (5698c57)
  • goose migration directory was wrong in dockerfile (#1105) (ecd620b)
  • improve dedupe logic, and fix span dropping issue in span storage event handler (#1201) (3b43fae)
  • improve locking contention delay config and error handling (#1171) (5d84748)
  • light mode token changes + hide theme selector if no feature flag (#1152) (6729925)
  • litellm: fix Anthropic model integration issues (#1197) (1ed2c7f)
  • llm-config: smart max_tokens handling on model switch (7513131)
  • make otlp validation and parsing less strict, to support more otlp protocol versions (#1148) (dc1e1eb)
  • fix navigation to the same drawer URL and get the trace ID button on the conversation working again (f906f56)
  • normalize otlp ids to guaranteed otel ids (#1164) (2e54acb)
  • normalize span IDs to hex strings before BullMQ queue (2e54acb)
  • onboarding: prevent model provider credential inputs from resetting (#1060) (ca8b8ee)
  • prompts: default maxTokens to undefined for model-based defaults (4a36aee)
  • prompts: show Bedrock models in model selector dropdown (#1206) (2e49e01)
  • prompts: structured outputs with custom field names and types (#1112) (d1c0370)
  • prompts: use model's actual max_tokens for new prompts (5aaa234)
  • proper terminology on analytics and add linking button for the graph (15e3c2c)
  • properly handle clickhouse engine tag macros for replicated cluster configs (#1111) (6052374)
  • python-sdk: resolve name collision between Evaluation TypedDict and class alias (873909a)
  • react imports on deja view (#1160) (6514a1a)
  • remove duplicate evaluations unit test (already in integration) (#1177) (8ed9d28)
  • rework pie/donut data and colours (#1055) (8d50910)
  • scenario editor UX improvements and bug fixes (#1086) (1d44f72)
  • set correct ksuid environment in worker (#1173) (10ec064)
  • small project drawer title fix, make + Add clickable (9746fdb)
  • tests: align license router tests with RBAC middleware behavior (#1207) (1def54d)
  • tests: normalize column IDs to names in orchestrator integration test (dc7c2ea)
  • unit tests and typecheck (802ccc1)
  • various evaluations v3 fixes (#1122) (c9904fc)

Miscellaneous

  • ✨ new readme preview video 💅🏼 (#1036) (ba949c5)
  • eval pagination footer (#1044) (aaea14f)
  • fix all biome lint issues (#1121) (d83bb6e)
  • improve stressed+blessed event sourcing tooling (#1108) (82ccab6)
  • main: release python-sdk 0.10.0 (#1142) (749a977)
  • main: release python-sdk 0.9.0 (#1114) (0f24551)
  • migrate Cursor config to Claude Code system (#1147) (fc20384)
  • remove litellm enterprise deps, add license file generation (792243a)
  • standardize top-level rules on AGENTS.md, remove duplicate CLAUDE.md (#1150) (43ba172)
  • sync model registry (363 models) (#1138) (43bb203)
  • update where goose migration db is stored + improve handling (#1141) (e9265ed)

Documentation

  • add Repository + Service pattern documentation (#1190) (fa6a81e)
  • extract design principles from PR #1025 into searchable documentation (#1139) (41e57d2)
  • improve Claude Code agent configuration and BDD workflow (#1189) (e2a1e2d)
  • move TESTING.md to docs/TESTING_PHILOSOPHY.md (#1157) (c475c86)
  • standardize worktree and branch naming conventions (#1211) (c3ef006)

Code Refactoring

  • extract SSRF protection utils and remove cruft files (#1065) (12ece04)
  • sdk: fix evaluation API naming consistency (3aef2a3)
  • sdk: rename evaluation API to experiment for new terminology (f10326c)
  • sdk: rename internal evaluation classes to experiment (ff70cab)
