langwatch/langwatch langwatch@v2.0.0 on GitHub

2.0.0 (2026-01-28)

Features

add AI scenario generation (#1110) (7da469d)
add CI/CD execution support for evaluations v3 (#1118) (d28adac)
add COSS licensing enforcement for self-hosted deployments (#1170) (37c30ec)
add http agent (#1053) (02284be)
add link to setup evaluations from sdk (2130e30)
add orchestrator pattern for Claude Code context management (#1163) (7b3415b)
analytics: track onboarding progress metrics in PostHog (1533de5)
claude: add rogerio-cto-review agent and worktree command (#1192) (326196a)
claude: add workflow commands for worktrees and PR review (#1135) (8643e92)
clickhouse trace filtering (#1079) (12f4b03)
dev: add Docker Compose dev environment with profiles (#1188) (72e8df5)
evaluations v3 execution and new evaluations results page (#1113) (510f65d)
evaluations-v3: add lambda warmup for faster evaluation runs (cc95cca)
evaluations-v3: implement HTTP agent support (#1196) (7afb24e)
evaluations-v3: improve table column resizing and overflow handling (d1d3831)
evaluations-v3: major table performance improvements, prompts to experiment button and other bugfixes (#1181) (2cbf430)
evaluations-v3: support evaluators/{id} path for database evaluators (2f65327)
evaluators: add "Use via API" dialog with code snippets (58ccaf5)
event sourcing powered evaluations (#1090) (fd9898e)
improve trace/span event sourcing pipeline (#980) (d67854d)
integrate HTTP agent into scenario/simulations quick run (#1071) (3e3a8d4)
introduce first step towards dark mode (#1143) (426d776)
licensing: add centralized license enforcement with resource limits (#1208) (a511233)
llm-config: upgrade model registry with dynamic parameters and OpenRouter sync (#1115) (f03a283)
new online evaluations and guardrails setup (#1151) (7c8a804)
new simulation card design (#1106) (3a116af)
projects: add drawer-based project creation (#1068) (5620034)
prompts: show icon-only buttons with tooltips in compare mode (4f4ecfe)
refactor model providers UI to drawer-based (#1050) (8c8df73)
regenerate api key (#1083) (e09bf3f)
scenarios: add help text and tooltips to scenario form fields (#1128) (fec3e73)
sdk: add online evaluations API and ensureSetup for TypeScript (2209258)
traces: add reasoning tokens and effort support for LLM models (16f1d4a)
track events as spans for REST API (#1089) (ec8243e)
ui: revamp LLM parameter controls with button-based selects (9a42d93)
update onboarding for new go sdk shape (#1225) (ae6b6a2)
use programmatic langwatch config in scenario runner (#1074) (34a9d62)
walking skeleton for scenarios (#1047) (f6acbb8)

Bug Fixes

add vendor folder before installation to fix docker build (292fe83)
add z-index to tooltip (#1078) (1804329)
annotation highlight scroll (#1073) (7e3471d)
base64 markdown rendering (8017548)
check if graph exists (#1067) (eef4089)
ci: add pnpm-lock.yaml for agentic-e2e-tests (#1216) (761a1a0)
clickhouse replication issue with goose migrations + tables not replicating correctly (#1116) (db6638f)
cluster goose db (#1140) (2cc0e69)
config: disable HSTS and upgrade-insecure-requests in development (#1149) (f88086e)
dspy: capture full message output including reasoning_content (8257cae)
elasticsearch migrations for batch evals for new target fields (530bb73)
evaluations-v3: display Code Agent outputs with custom field names (#1226) (9a69c53)
evaluations-v3: fix type errors in httpAgentUtils and dslAdapter (3196cc7)
evaluations-v3: pass all LLM params including reasoning to targets (c786a73)
evaluations-v3: persist all LLM parameters in local prompt config (4b87561)
evaluations-v3: prevent autosave data loss on back navigation (335e571)
event sourcing improvements from testing (#1109) (2a400db)
fix emojis without breaking multiline prompt evaluators anymore (2d47925)
fix failing unit tests (9f9ad87)
goose migrate missing priming row (#1145) (5698c57)
goose migration directory was wrong in dockerfile (#1105) (ecd620b)
improve dedupe logic, and fix span dropping issue in span storage event handler (#1201) (3b43fae)
improve locking contention delay config and error handling (#1171) (5d84748)
light mode token changes + hide theme selector if no feature flag (#1152) (6729925)
litellm: fix Anthropic model integration issues (#1197) (1ed2c7f)
llm-config: smart max_tokens handling on model switch (7513131)
make otlp validation and parsing less strict, to support more otlp protocol versions (#1148) (dc1e1eb)
navigation to the same drawer url, get the trace id button on the conversation working again (f906f56)
normalize otlp ids to guaranteed otel ids (#1164) (2e54acb)
normalize span IDs to hex strings before BullMQ queue (2e54acb)
onboarding: prevent model provider credential inputs from resetting (#1060) (ca8b8ee)
prompts: default maxTokens to undefined for model-based defaults (4a36aee)
prompts: show Bedrock models in model selector dropdown (#1206) (2e49e01)
prompts: structured outputs with custom field names and types (#1112) (d1c0370)
prompts: use model's actual max_tokens for new prompts (5aaa234)
proper terminology on analytics and add linking button for the graph (15e3c2c)
properly handle clickhouse engine tag macros for replicated cluster configs (#1111) (6052374)
python-sdk: resolve name collision between Evaluation TypedDict and class alias (873909a)
react imports on deja view (#1160) (6514a1a)
remove duplicate evaluations unit test (already in integration) (#1177) (8ed9d28)
rework pie/donut data and colours (#1055) (8d50910)
scenario editor UX improvements and bug fixes (#1086) (1d44f72)
set correct ksuid environment in worker (#1173) (10ec064)
small project drawer title fix, make + Add clickable (9746fdb)
tests: align license router tests with RBAC middleware behavior (#1207) (1def54d)
tests: normalize column IDs to names in orchestrator integration test (dc7c2ea)
unit tests and typecheck (802ccc1)
various evaluations v3 fixes (#1122) (c9904fc)

Miscellaneous

✨ new readme preview video 💅🏼 (#1036) (ba949c5)
eval pagination footer (#1044) (aaea14f)
fix all biome lint issues (#1121) (d83bb6e)
improve stressed+blessed event sourcing tooling (#1108) (82ccab6)
main: release python-sdk 0.10.0 (#1142) (749a977)
main: release python-sdk 0.9.0 (#1114) (0f24551)
migrate Cursor config to Claude Code system (#1147) (fc20384)
remove litellm enterprise deps, add license file generation (792243a)
standardize top-level rules on AGENTS.md, remove duplicate CLAUDE.md (#1150) (43ba172)
sync model registry (363 models) (#1138) (43bb203)
update where goose migration db is stored + improve handling (#1141) (e9265ed)

Documentation

add Repository + Service pattern documentation (#1190) (fa6a81e)
extract design principles from PR #1025 into searchable documentation (#1139) (41e57d2)
improve Claude Code agent configuration and BDD workflow (#1189) (e2a1e2d)
move TESTING.md to docs/TESTING_PHILOSOPHY.md (#1157) (c475c86)
standardize worktree and branch naming conventions (#1211) (c3ef006)

Code Refactoring

extract SSRF protection utils and remove cruft files (#1065) (12ece04)
sdk: fix evaluation API naming consistency (3aef2a3)
sdk: rename evaluation API to experiment for new terminology (f10326c)
sdk: rename internal evaluation classes to experiment (ff70cab)
