mastra-ai/mastra @mastra/core@1.46.0 on GitHub

Highlights

Multi-session Harness architecture (Session-first APIs)

Harness is now a pure factory/shared-resource owner: create isolated sessions via harness.createSession() (get-or-create by resourceId), with run control, state, event bus, model/mode switching, permissions, OM, subagent settings, and thread lifecycle all moved onto Session domains (e.g. session.sendMessage(), session.thread.*, session.model.switch(), session.subscribe()), enabling safe concurrent multi-user/multi-thread hosting.

Drive Harness sessions over HTTP (and from the JS client)

Registered harnesses on Mastra can now be operated via new harness-scoped server routes (send/steer/abort/approve tool calls, manage threads, read state, and subscribe via SSE), and @mastra/client-js adds a first-class harness resource (client.getHarness(id).session(resourceId)) to control sessions remotely.

Cross-process signal delivery with distributed leasing + new `accepted` contract

Signal/message APIs now return an accepted promise resolving to a discriminated routing decision (wake/deliver/persist/discard) with authoritative runId when applicable, and introduce a LeaseProvider abstraction (in-memory and Redis Streams implementations) to ensure only one process owns/wakes a thread run in multi-instance/serverless deployments.

Deterministic agent experiments with item-level tool mocks + multi-tenant datasets/experiments

Dataset items can now declare ordered toolMocks (with strict/ignore arg matching) and experiments return a toolMockReport, enabling deterministic replay without running real tools; datasets and experiments also gain optional (organizationId, projectId) tenancy scoping plus dataset-level candidate identity (candidateKey, candidateId) and a new dataset item source type 'candidate-screener'.

New integrations & streaming ergonomics

New packages add major deployment/integration options: @mastra/next (Next.js App Router adapter), @mastra/tanstack-start (TanStack Start adapter), and @mastra/archil (Archil filesystem provider for workspaces); streaming gains a unified untilIdle option on DurableAgent.stream() and InngestAgent.stream() to keep streams open through background task continuations.

Breaking Changes

Harness no longer exposes a singleton harness.session; callers must use await harness.createSession() and operate on the returned Session.
Run control moved from Harness.* to Session (e.g. sendMessage, sendSignal, abort, respondToToolSuspension, etc. removed from Harness).
Event subscription moved from Harness.subscribe() to session.subscribe() (session-scoped event isolation).
Thread lifecycle APIs moved to session.thread.*; Harness no longer exposes createThread/switchThread/cloneThread/renameThread/detachFromCurrentThread or harness.memory.
Model/mode switching moved to session.model.switch() and session.mode.switch(); OM accessors moved to session.om.*; permissions to session.permissions.*; subagent model accessors to session.subagents.model.*.
Deprecated Harness.getState() / Harness.setState() compatibility wrappers removed (use session.state.get() / session.state.set()).

Changelog

@mastra/core@1.46.0

Minor Changes

Removed the deprecated Harness.getState() and Harness.setState() compatibility wrappers, along with the unused private updateState. Harness state has lived on the session for a while; these were thin proxies marked @deprecated. (#18200)
Before
```
const state = harness.getState();
await harness.setState({ count: 1 });
```
After
```
const state = harness.session.state.get();
await harness.session.state.set({ count: 1 });
```
This does not affect the tool-facing harness context, which continues to expose state / getState / setState / updateState alongside session.state.
mastracode is updated to set browser settings via session.state.set().
Moved the observational-memory model accessors off the Harness onto session.om. Reading and switching the observer/reflector models and reading observation/reflection thresholds now live on the session, next to the state they read and write. (#18200)
Before
```
const observer = harness.getObserverModelId();
await harness.switchObserverModel({ modelId: 'openai/gpt-4o' });
```
After
```
const observer = harness.session.om.observer.modelId();
await harness.session.om.observer.switchModel({ modelId: 'openai/gpt-4o' });
```
The accessors are grouped by role under session.om.observer and session.om.reflector, each exposing modelId(), threshold(), resolvedModel(), and switchModel({ modelId }).
Removed Harness.getObserverModelId, getReflectorModelId, getObservationThreshold, getReflectionThreshold, getResolvedObserverModel, getResolvedReflectorModel, switchObserverModel, and switchReflectorModel.
mastracode is updated to consume the new API: the /om command and status line now read and switch observer/reflector models via session.om.
Moved the run-control surface off the Harness onto the Session. Sending messages and signals, steering, following up, aborting, responding to tool suspensions, and saving system reminders now live on the session that owns the run state they drive, instead of being delegated through the Harness. This is the final step of the single-session extraction series and a prerequisite for the upcoming multi-session (createSession) work: every per-session operation now lives on Session, while the Harness retains only genuinely shared machinery (agent, config builders, storage/lock gateway), which it injects into each session via the SessionMachinery provider. (#18213)
Before
```
await harness.sendMessage({ content: 'hello' });
harness.sendSignal({ content: 'steer the run' });
harness.abort();
await harness.respondToToolSuspension({ toolCallId, approved: true });
```
After
```
const session = await harness.createSession();
await session.sendMessage({ content: 'hello' });
session.sendSignal({ content: 'steer the run' });
session.abort();
await session.respondToToolSuspension({ toolCallId, approved: true });
```
Removed Harness.sendMessage, Harness.sendSignal, Harness.sendNotificationSignal, Harness.steer, Harness.followUp, Harness.abort, Harness.respondToToolSuspension, Harness.saveSystemReminderMessage, and Harness.waitForCurrentThreadStreamIdle. The Session reaches Harness-owned machinery through the injected SessionMachinery provider, so the heavy run loop is still constructed and owned by the Harness while being parameterized by the session it runs on.
mastracode is updated to consume the new API: the TUI run loop, slash-command dispatch, goal lifecycle, prompt handlers, and headless entry points all drive run-control through the session returned by harness.createSession().
The Harness event bus now lives on the Session. Each Session owns its own listeners and emit pipeline (session.subscribe() / internal session.emit()), so events emitted on one session are delivered only to that session's subscribers — never to another session's. This is the isolation foundation for serving a single Harness to multiple concurrent sessions (e.g. one Harness backing many channel threads). (#18213)
Breaking (Harness is under active development): Harness.subscribe() is removed. Subscribe on the session instead:
```
- harness.subscribe(listener)
+ harness.session.subscribe(listener)
```
Session subsystems (mode/model/om/permissions/subagents/state) no longer receive an injected emit callback — they emit directly to their session's bus. mastracode is updated to subscribe via harness.session.subscribe().
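The isolation guarantee can be pictured with a small sketch. `MiniSession` and `SessionEvent` are hypothetical stand-ins rather than Mastra's actual classes, and returning an unsubscribe function from `subscribe()` is an assumption; only the "each session owns its listeners" idea comes from this entry.

```typescript
// Hypothetical stand-ins for illustration; not Mastra's real Session class.
type SessionEvent = { type: string; payload?: unknown };
type Listener = (event: SessionEvent) => void;

class MiniSession {
  private listeners = new Set<Listener>();

  // Assumption: subscribe returns an unsubscribe function.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Delivers only to this session's own listeners.
  emit(event: SessionEvent): void {
    this.listeners.forEach(listener => listener(event));
  }
}

const sessionA = new MiniSession();
const sessionB = new MiniSession();
const seenByA: string[] = [];
const seenByB: string[] = [];

sessionA.subscribe(event => {
  seenByA.push(event.type);
});
sessionB.subscribe(event => {
  seenByB.push(event.type);
});

// Emitted on A only; B's subscriber is never invoked.
sessionA.emit({ type: 'agent_end' });
```

Because the listener set is a per-instance field, there is no shared registry through which one session's events could reach another session's subscribers.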
Add multi-tenant filtering and candidate identity to the datasets domain. (#18314)
DatasetRecord, DatasetItem, DatasetItemRow, CreateDatasetInput, and the filters on ListDatasetsInput / ListDatasetItemsInput now expose optional organizationId and projectId, matching the per-row tenancy contract already used by the observability domain. Dataset items inherit tenancy from their parent dataset automatically — they cannot be set per-call.
DatasetRecord and CreateDatasetInput also gain two new optional identity fields, candidateKey and candidateId, for use cases that need a stable per-incident identity at the dataset level (such as auto-materialized candidate datasets).
The DatasetItemSource['type'] union now includes 'candidate-screener' so externally-materialized items can be distinguished from user-uploaded ones.
DATASETS_SCHEMA and DATASET_ITEMS_SCHEMA gain matching nullable columns, and DatasetsInMemory persists and filters on them.
DatasetsManager.create() accepts the new optional fields, and DatasetsManager.list() accepts an optional filters arg that forwards to the storage layer.
Before
```
const dataset = await storage.createDataset({ name: 'goldens/checkout' });
const items = await storage.listDatasets({ pagination: { page: 0, perPage: 20 } });
```
After
```
const dataset = await storage.createDataset({
  name: 'candidates/missing-tool-call/incident-123',
  organizationId: 'org_abc',
  projectId: 'project_xyz',
  candidateKey: 'missing-tool-call',
  candidateId: 'incident-123',
});

const items = await storage.listDatasets({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_abc', projectId: 'project_xyz' },
});
```

Added untilIdle option to DurableAgent.stream() — pass untilIdle: true or { maxIdleMs } to keep the stream open across background-task continuations. This is the same behavior as the now-deprecated streamUntilIdle() method, matching the consolidation done for the non-durable Agent in #17536. (#18349)

```
// Before (deprecated)
const result = await durableAgent.streamUntilIdle('Research topic', {
  memory: { thread: 't1', resource: 'u1' },
});

// After
const result = await durableAgent.stream('Research topic', {
  untilIdle: true,
  memory: { thread: 't1', resource: 'u1' },
});
```

Replaced Harness.switchModel() with harness.session.model.switch(). Model switching now lives on the session, alongside the active mode/model state it already owns. (#18197)
Before
```
await harness.switchModel({ modelId: 'openai/gpt-5' });
```
After
```
await harness.session.model.switch({ modelId: 'openai/gpt-5' });
```
Added multi-tenant scoping columns (organizationId, projectId) to the experiments domain so experiment records and per-item results inherit the tenancy bucket of their parent dataset. (#18388)
Experiment, ExperimentResult, CreateExperimentInput, and AddExperimentResultInput now carry optional organizationId / projectId fields. ListExperimentsInput and ListExperimentResultsInput gain a filters: ExperimentTenancyFilters block (mirrors DatasetTenancyFilters) for scoping queries within a (organizationId, projectId) bucket. Tenancy is hydrated from the parent dataset on createExperiment and denormalized onto each ExperimentResult for efficient tenancy-scoped queries.
The corresponding columns are also added to the mastra_experiments and mastra_experiment_results table schemas. Existing rows backfill to null, matching the rest of the dataset-tenancy surface.
This release also clarifies the targetType contract via JSDoc:
- CreateDatasetInput.targetType remains optional. Datasets without a targetType are not experiment-eligible: the experiment runner requires a non-null CreateExperimentInput.targetType to resolve an executor.
- Experiment.targetType / CreateExperimentInput.targetType stay required. An experiment by definition replays inputs against a specific target.
No behavior change for existing OSS-created experiments; the new fields are additive and optional.
Example:
```
// Create an experiment scoped to a tenancy bucket. When the parent dataset
// already carries `organizationId` / `projectId`, `runExperiment` hydrates
// these fields automatically from the dataset record.
const experiment = await storage.createExperiment({
  name: 'qa-regression',
  datasetId: 'ds_123',
  datasetVersion: 1,
  targetType: 'agent',
  targetId: 'agent_qa',
  totalItems: 10,
  organizationId: 'org_123',
  projectId: 'proj_123',
});

// List experiments within a tenancy bucket.
const experiments = await storage.listExperiments({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});

// List per-item results within the same bucket.
const results = await storage.listExperimentResults({
  experimentId: experiment.id,
  pagination: { page: 0, perPage: 50 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});
```
Replaced Harness.switchMode() with harness.session.mode.switch(). Switching modes now lives on the session, alongside the active mode/model state it already owns. (#18197)
Before
```
await harness.switchMode({ modeId: 'build' });
```
After
```
await harness.session.mode.switch({ modeId: 'build' });
```
mastracode is updated to consume the new API: the TUI and headless callers now invoke harness.session.mode.switch() instead of the removed harness.switchMode().
Made the Harness a pure factory + shared-resource owner and removed its singleton session. The Harness no longer holds a #session field or exposes a harness.session getter; instead, callers create fully isolated sessions via harness.createSession(). Each session owns its own mode, model, state, thread, run-control, event bus, and stream engine, so a single Harness can now serve many concurrent sessions (e.g. one per user/thread in a server or channel adapter) without cross-session state or event leakage. (#18213)
harness.createSession({ resourceId? }) constructs and wires a new Session, replays the current workspace status onto it, and selects or creates its thread before returning. Harness methods that previously read the singleton session are now parameterized by an explicit session argument (setResourceId, getKnownResourceIds, getCurrentModelAuthStatus, loadOMProgress, getObservationalMemoryRecord, destroy). harness.init() is now idempotent, so repeated calls reuse the same initialization instead of rebuilding internal state.
Before
```
const harness = new Harness(config);
await harness.init();
const session = harness.session; // singleton
await session.sendMessage({ content: 'hello' });
```
After
```
const harness = new Harness(config);
await harness.init();
const session = await harness.createSession({ resourceId });
await session.sendMessage({ content: 'hello' });

// A second, fully isolated session from the same Harness:
const other = await harness.createSession({ resourceId: otherUser });
```
Removed harness.session, harness.getSession(), the singleton #session field, and the deprecated harness.subscribe/harness.emit/harness.memory delegators.
mastracode is updated to consume the new API: composition roots call createSession() once at startup and store the result on state.session, and all per-session operations flow through that session object.
Register Harness instances on Mastra. (#18355)
Pass harnesses to new Mastra({ harnesses }) (keyed like agents and workflows) and look them up with mastra.getHarness(key), mastra.getHarnessById(id), or mastra.listHarnesses(). A registered Harness shares the parent Mastra — its storage, agents, gateways, and observability — instead of building its own internal one, and is torn down with mastra.shutdown(). A standalone Harness is unchanged. This is the foundation for serving Harness sessions over HTTP.
```
const code = new Harness({ id: 'code', modes });
const mastra = new Mastra({ harnesses: { code }, storage });

mastra.getHarness('code') === code; // by registration key
code.getMastra() === mastra; // shares the parent Mastra and its storage
```
Move the thread lifecycle onto session.thread. Creating, switching, cloning, renaming, and deleting a thread — plus loading a thread's persisted settings and managing the agent subscription — now live on the session's thread domain (session.thread.create/switch/clone/rename/delete/loadMetadata/ensureSubscription/detachFromCurrent). The host's storage, thread lock, and clone primitives are injected behind an expanded ThreadDataStore gateway, so SessionThread owns the full lifecycle while the Harness owns only the DB. (#18213)
Before
```
await harness.createThread();
await harness.switchThread({ threadId });
```
After
```
await harness.session.thread.create();
await harness.session.thread.switch({ threadId });
```
Breaking (Harness is under development): the Harness no longer exposes createThread, switchThread, cloneThread, renameThread, detachFromCurrentThread, or the memory accessor.
Collapsed the result of agent.sendSignal / sendMessage / queueMessage / sendStateSignal / sendNotificationSignal into a single accepted promise that resolves at decision-time to a discriminated union: { action: 'wake'; runId; output } when the signal started a run in this process, { action: 'deliver'; runId } when it was forwarded onto an existing run, or { action: 'persist' } / { action: 'discard' } when nothing ran. runId is present only on wake/deliver; for persist/discard correlate via result.signal.id. accepted resolves for routing decisions and rejects only when the signal couldn't be routed at all (e.g. misconfigured agent). This replaces the old accepted: true boolean and the best-effort top-level runId. result.persisted stays top-level. sendNotificationSignal's accepted is optional (a notification may be dropped by policy with no signal) — read result.decision for the policy verdict. (#17723)
On serverless platforms, when this process/Lambda is the one that woke the run (action === 'wake'), it can use the platform's waitUntil to keep itself alive until the run it started completes — otherwise the runtime may be frozen or torn down mid-run:
```
const result = agent.sendSignal(signal, { resourceId, threadId });
ctx.waitUntil(
  result.accepted.then(async accepted => {
    if (accepted.action === 'wake') await accepted.output.consumeStream();
  }),
);
```
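A caller that only needs a correlation id can branch on `action`. The sketch below models the union from the description above; the `Accepted` type name and the `correlationId` helper are hypothetical, and `output` is left opaque.

```typescript
// Shape modelled on the description above; `output` is left opaque here.
type Accepted =
  | { action: 'wake'; runId: string; output: unknown }
  | { action: 'deliver'; runId: string }
  | { action: 'persist' }
  | { action: 'discard' };

// Hypothetical helper: picks the id to log for a routing decision.
function correlationId(accepted: Accepted, signalId: string): string {
  switch (accepted.action) {
    case 'wake':
    case 'deliver':
      // runId is only present on these two branches.
      return `run:${accepted.runId}`;
    case 'persist':
    case 'discard':
      // Nothing ran, so fall back to the signal's own id.
      return `signal:${signalId}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new
      // action is added to the union without a matching case.
      const unreachable: never = accepted;
      return unreachable;
    }
  }
}
```

In real code the second argument would be `result.signal.id`, as noted above.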
Added a LeaseProvider capability (acquireLease / releaseLease / renewLease / getLeaseOwner / transferLease) — distributed leasing kept separate from event delivery (PubSub) — so processes racing to wake the same thread coordinate a single owner. The winner runs the stream; losers forward their signal to it. EventEmitterPubSub leases in-memory; RedisStreamsPubSub uses SET NX PX with owner-verified Lua scripts for release, renew, and transfer; the default NoopLeaseProvider always wins, preserving single-process behavior.
Fixed a cross-process race where, after a run finished, the thread's lease briefly went free before its queued follow-up work started — letting another process win the lease and start a competing run for the same thread. The owner now keeps the lease until all queued work is drained. The lease TTL and renewal interval are overridable for tests via MASTRA_AGENT_THREAD_LEASE_TTL_MS and MASTRA_AGENT_THREAD_LEASE_RENEW_INTERVAL_MS (production defaults unchanged).
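The owner-verified lease semantics can be sketched in memory. The method names follow this entry, but the signatures, the `InMemoryLeases` class, and the injectable clock are assumptions for illustration; this is not the EventEmitterPubSub or Redis implementation.

```typescript
// Hypothetical in-memory lease table. Method names follow the changelog;
// the signatures and the injectable clock are assumptions for illustration.
type Lease = { owner: string; expiresAt: number };

class InMemoryLeases {
  private leases = new Map<string, Lease>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Wins if the key is free, expired, or already held by the same owner.
  acquireLease(key: string, owner: string, ttlMs: number): boolean {
    const current = this.leases.get(key);
    if (current && current.expiresAt > this.now() && current.owner !== owner) return false;
    this.leases.set(key, { owner, expiresAt: this.now() + ttlMs });
    return true;
  }

  // Expired leases are reported as having no owner.
  getLeaseOwner(key: string): string | null {
    const current = this.leases.get(key);
    return current && current.expiresAt > this.now() ? current.owner : null;
  }

  // Renew, release, and transfer all verify the caller is the current owner.
  renewLease(key: string, owner: string, ttlMs: number): boolean {
    if (this.getLeaseOwner(key) !== owner) return false;
    this.leases.set(key, { owner, expiresAt: this.now() + ttlMs });
    return true;
  }

  releaseLease(key: string, owner: string): boolean {
    if (this.getLeaseOwner(key) !== owner) return false;
    return this.leases.delete(key);
  }

  transferLease(key: string, from: string, to: string, ttlMs: number): boolean {
    if (this.getLeaseOwner(key) !== from) return false;
    this.leases.set(key, { owner: to, expiresAt: this.now() + ttlMs });
    return true;
  }
}

let clock = 0;
const leases = new InMemoryLeases(() => clock);
const first = leases.acquireLease('thread-1', 'proc-a', 1000); // proc-a wins
const second = leases.acquireLease('thread-1', 'proc-b', 1000); // proc-b loses
clock = 1500; // proc-a's lease has expired without renewal
const third = leases.acquireLease('thread-1', 'proc-b', 1000); // proc-b wins
```

The expiry step at the end shows why the owner has to keep renewing until all queued work is drained: once the lease lapses, another process can win it.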
Made harness.createSession() get-or-create by resourceId and added harness.getSessionByResource() so notification delivery runs as the session that owns the target thread. (#18213)
A resourceId now maps to exactly one durable session per Harness: calling createSession({ resourceId }) twice returns the same session, so a user/thread always resumes their own session and the in-flight creation is shared by concurrent callers. This is the multi-session behavior a long-running / multiplayer server needs — work can be driven on a thread whether or not a human is currently attached, and it runs with that thread's own model/mode/state instead of an arbitrary session.
Before
```
// createSession was a pure factory: two calls for the same resource produced
// two independent sessions, and notification delivery had no way to find
// "the session that owns this resource".
const a = await harness.createSession({ resourceId: 'user-a' });
const b = await harness.createSession({ resourceId: 'user-a' }); // different object
```
After
```
const a = await harness.createSession({ resourceId: 'user-a' });
const b = await harness.createSession({ resourceId: 'user-a' }); // same session as a

const session = await harness.getSessionByResource('user-a'); // === a
```
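One way to get both properties (one session per resourceId, and a shared in-flight creation) is to cache the creation promise itself. This is a sketch of that general pattern, not the Harness source; `SessionRegistry` and `FakeSession` are hypothetical names.

```typescript
// Hypothetical registry illustrating get-or-create with a shared in-flight
// promise. `FakeSession` and `SessionRegistry` are not Mastra types.
type FakeSession = { resourceId: string };

class SessionRegistry {
  private sessions = new Map<string, Promise<FakeSession>>();
  created = 0;

  // Not declared `async` on purpose: returning the stored promise itself
  // means concurrent callers share one in-flight creation.
  getOrCreate(resourceId: string): Promise<FakeSession> {
    let pending = this.sessions.get(resourceId);
    if (!pending) {
      pending = this.build(resourceId);
      this.sessions.set(resourceId, pending);
    }
    return pending;
  }

  // Stands in for the real construction work (wiring, thread selection).
  private async build(resourceId: string): Promise<FakeSession> {
    this.created += 1;
    return { resourceId };
  }
}

const registry = new SessionRegistry();
const p1 = registry.getOrCreate('user-a');
const p2 = registry.getOrCreate('user-a'); // issued before p1 settles
const p3 = registry.getOrCreate('user-b');
```

A production version would likely also evict a cached promise that rejects, so a failed creation can be retried; that is omitted here.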

Added agent-level skills: attach skills directly to an Agent without a Workspace via createSkill() and the new skills config property. (#18360)

New skills property on Agent config

```
import { openai } from '@ai-sdk/openai';
import { Agent } from '@mastra/core/agent';
import { createSkill } from '@mastra/core/skills';

const agent = new Agent({
  id: 'reviewer',
  model: openai('gpt-4o'),
  instructions: 'You are a code review assistant.',
  skills: [
    './skills/review', // filesystem path
    createSkill({
      // inline, no filesystem needed
      name: 'release-checklist',
      description: 'Use when preparing a release.',
      instructions: '## Release Checklist\n1. Run tests...',
    }),
  ],
});
```

Key features:

- createSkill() factory for code-defined skills with validation
- Filesystem paths and inline skills can be mixed in the same array
- Dynamic skill resolution via function: skills: (ctx) => [...]
- When both skills and workspace.skills exist, they merge (agent-level wins on conflicts)
- agent.getSkill(name) and agent.listSkills() public API for programmatic access
- New @mastra/core/skills export path
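
The merge rule in the list above can be sketched as a last-writer-wins map. `SkillLike` and `mergeSkills` are hypothetical, and keying conflicts by skill name is an assumption based on the `agent.getSkill(name)` accessor.

```typescript
// Hypothetical shape: only the fields needed to show the merge rule.
type SkillLike = { name: string; instructions: string };

// Later entries overwrite earlier ones with the same name, so inserting
// agent-level skills last makes them win on conflicts.
function mergeSkills(workspaceSkills: SkillLike[], agentSkills: SkillLike[]): SkillLike[] {
  const byName = new Map<string, SkillLike>();
  for (const skill of workspaceSkills) byName.set(skill.name, skill);
  for (const skill of agentSkills) byName.set(skill.name, skill);
  return Array.from(byName.values());
}

const merged = mergeSkills(
  [
    { name: 'review', instructions: 'workspace version' },
    { name: 'deploy', instructions: 'workspace deploy steps' },
  ],
  [{ name: 'review', instructions: 'agent version' }],
);
```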

Moved the tool-permission rule accessors off the Harness onto session.permissions. Reading and writing the persisted per-category / per-tool approval policies now lives on the session, next to the state it reads and writes. (#18200)
Before
```
const rules = harness.getPermissionRules();
harness.setPermissionForCategory({ category: 'execute', policy: 'ask' });
harness.setPermissionForTool({ toolName: 'dangerous_tool', policy: 'deny' });
```
After
```
const rules = harness.session.permissions.getRules();
await harness.session.permissions.setForCategory({ category: 'execute', policy: 'ask' });
await harness.session.permissions.setForTool({ toolName: 'dangerous_tool', policy: 'deny' });
```
Removed Harness.getPermissionRules, Harness.setPermissionForCategory, and Harness.setPermissionForTool. The setters now return a promise that resolves once the change is persisted to session state, so callers that read the rules back can await the write. Tool-category resolution stays on the harness as harness.getToolCategory() since it reads harness config rather than session state.
mastracode is updated to consume the new API: the /permissions command reads and sets policies via session.permissions.
Replaced Harness.getModelName() and Harness.getFullModelId() with session accessors. The full model id is read via the existing harness.session.model.get(), and the short display name moves to a new harness.session.model.displayName(). (#18200)
Before
```
const name = harness.getModelName();
const fullId = harness.getFullModelId();
```
After
```
const name = harness.session.model.displayName();
const fullId = harness.session.model.get();
```
mastracode is updated to consume the new API: the TUI status line and message renderer now read the model id via harness.session.model.get().
Added item-level static tool mocks so agent experiments can run deterministically without calling real, side-effecting tools. (#18036)
A dataset item can now declare toolMocks. When the agent calls a mocked tool with matching arguments, the experiment serves the recorded output instead of executing the tool. Mocks for the same (toolName, args) are consumed in order, so repeated calls can return different outputs. If a mocked tool is called with arguments that do not match (or the mocks are exhausted), the item fails immediately and the agent is stopped so it cannot keep calling tools after a failure. Tools without a mock still run live.
```
await dataset.addItem({
  input: { question: 'What is the weather in Seattle?' },
  toolMocks: [
    {
      toolName: 'getWeather',
      args: { city: 'Seattle' },
      output: { temperatureF: 52 },
      // 'strict' (default) deep-compares args; 'ignore' matches on tool name only,
      // useful for sub-agent calls where the prompt is LLM-authored.
      matchArgs: 'strict',
    },
  ],
});
```
Each item result carries a toolMockReport describing which mocks were served, which went unconsumed, and which tools ran live, so you can see exactly how a run behaved.
Items that declare toolMocks run their tools sequentially (toolCallConcurrency: 1) within that item run to guarantee ordered consumption. Items without mocks are unaffected.
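The consumption rules can be sketched with a small replayer. `MockReplayer`, `ToolCallResult`, and `deepEqual` are hypothetical helpers written for this note; they model the described behavior and are not the experiment runner's code.

```typescript
// Hypothetical replay helper modelled on the behavior described above.
type ToolMock = {
  toolName: string;
  args: unknown;
  output: unknown;
  matchArgs?: 'strict' | 'ignore';
};
type ToolCallResult = { kind: 'mock'; output: unknown } | { kind: 'live' };

// Structural comparison for JSON-like values (objects, arrays, primitives).
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false;
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const aRec = a as Record<string, unknown>;
  const bRec = b as Record<string, unknown>;
  const aKeys = Object.keys(aRec);
  if (aKeys.length !== Object.keys(bRec).length) return false;
  return aKeys.every(key => key in bRec && deepEqual(aRec[key], bRec[key]));
}

class MockReplayer {
  private remaining: ToolMock[];
  private mockedTools: Set<string>;

  constructor(mocks: ToolMock[]) {
    this.remaining = mocks.slice();
    this.mockedTools = new Set(mocks.map(mock => mock.toolName));
  }

  call(toolName: string, args: unknown): ToolCallResult {
    // Tools with no declared mock still run live.
    if (!this.mockedTools.has(toolName)) return { kind: 'live' };
    // Serve the first unconsumed mock that matches, in declaration order.
    const served = this.remaining.find(
      mock => mock.toolName === toolName && (mock.matchArgs === 'ignore' || deepEqual(mock.args, args)),
    );
    // Mismatched args or exhausted mocks fail the item.
    if (!served) throw new Error(`No matching mock left for ${toolName}`);
    this.remaining = this.remaining.filter(mock => mock !== served);
    return { kind: 'mock', output: served.output };
  }
}

const replayer = new MockReplayer([
  { toolName: 'getWeather', args: { city: 'Seattle' }, output: { temperatureF: 52 } },
  { toolName: 'getWeather', args: { city: 'Seattle' }, output: { temperatureF: 55 } },
]);
const firstCall = replayer.call('getWeather', { city: 'Seattle' }); // first mock
const secondCall = replayer.call('getWeather', { city: 'Seattle' }); // second mock
const liveCall = replayer.call('sendEmail', {}); // not mocked, runs live
```

Ordered consumption only makes sense if calls arrive one at a time, which is why items with mocks run with toolCallConcurrency: 1.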
Moved the subagent model accessors off the Harness onto session.subagents.model. Reading and setting the global or per-agentType subagent model now lives on the session, next to the state it reads and writes, and is grouped under session.subagents to leave room for future subagent settings. (#18200)
Before
```
const modelId = harness.getSubagentModelId({ agentType: 'explore' });
await harness.setSubagentModelId({ modelId: 'anthropic/claude-sonnet-4', agentType: 'explore' });
```
After
```
const modelId = harness.session.subagents.model.get({ agentType: 'explore' });
await harness.session.subagents.model.set({ modelId: 'anthropic/claude-sonnet-4', agentType: 'explore' });
```
get() prefers the per-agentType value, then the global subagent model, returning null when neither is set. set() persists to thread settings and emits a subagent_model_changed event. Removed Harness.getSubagentModelId and Harness.setSubagentModelId.
mastracode is updated to consume the new API: the /subagents command, model-pack activation, and startup model restoration read and set subagent models via session.subagents.model.
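The get() precedence in this entry reduces to a two-step fallback. The settings shape and the `resolveSubagentModel` name below are hypothetical; only the lookup order is taken from the notes.

```typescript
// Hypothetical settings shape; only the lookup order comes from the notes above.
type SubagentModelSettings = {
  global?: string;
  byAgentType: Record<string, string | undefined>;
};

function resolveSubagentModel(settings: SubagentModelSettings, agentType?: string): string | null {
  const perType = agentType ? settings.byAgentType[agentType] : undefined;
  // Per-agentType value first, then the global model, otherwise null.
  return perType ?? settings.global ?? null;
}

const settings: SubagentModelSettings = {
  global: 'openai/gpt-4o',
  byAgentType: { explore: 'anthropic/claude-sonnet-4' },
};
const exploreModel = resolveSubagentModel(settings, 'explore'); // per-type value
const otherModel = resolveSubagentModel(settings, 'plan'); // falls back to global
const unsetModel = resolveSubagentModel({ byAgentType: {} }, 'explore'); // nothing set
```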

Patch Changes

Update provider registry and model documentation with latest models and providers (5bd72d2)
Added unit test coverage for agent/durable/run-registry.ts RunRegistry and ExtendedRunRegistry classes. No behaviour changes — purely additive test coverage. (#18265)
Guard CacheKeyGenerator.fromAIV4Part against reasoning parts with empty/undefined details text. Models that emit an empty reasoning summary (Anthropic Opus 4.7/4.8 with thinking display: omitted, OpenAI gpt-5.x via the Responses API with no summary) persist a reasoning part shaped { type: 'reasoning', reasoning: '', details: [{ type: 'text' }] } — the text detail has no text field. On the next turn, Observational Memory reloads that message and the cache-key generator crashed with TypeError: Cannot read properties of undefined (reading 'length'), killing the whole turn (PROCESSOR_WORKFLOW_FAILED). This is the reasoning-branch sibling of the tool-invocation guard (#16756 / #16773). The reduce now tolerates missing details and missing detail text; no behavior change for well-formed parts. Fixes #18280. (#18281)
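The shape of the guard can be sketched as follows. The types and the `reasoningTextLength` function are hypothetical and only illustrate a reduce that tolerates missing `details` and missing `text`; the actual key derivation in CacheKeyGenerator is not shown in these notes.

```typescript
// Hypothetical part shape mirroring the persisted reasoning part quoted above.
type ReasoningDetail = { type: string; text?: string };
type ReasoningPart = { type: 'reasoning'; reasoning: string; details?: ReasoningDetail[] };

// Sums detail text lengths while tolerating missing `details` and missing `text`.
function reasoningTextLength(part: ReasoningPart): number {
  return (part.details ?? []).reduce((total, detail) => total + (detail.text?.length ?? 0), 0);
}

// The problematic shape: a text detail with no `text` field.
const emptySummary: ReasoningPart = { type: 'reasoning', reasoning: '', details: [{ type: 'text' }] };
// A well-formed part is handled exactly as before.
const wellFormed: ReasoningPart = { type: 'reasoning', reasoning: 'x', details: [{ type: 'text', text: 'abc' }] };

const emptyLength = reasoningTextLength(emptySummary);
const wellFormedLength = reasoningTextLength(wellFormed);
```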
Introduce the SessionMachinery injection boundary on the Harness Session. This formalizes the narrow set of Harness-owned capabilities (resolve the current agent, build run/stream options + toolsets + request context, persist token usage, generate ids, open a thread subscription) that a Session leverages to drive an agent run. The Harness injects this machinery into each Session it constructs via session.setMachinery(...). (#18213)
This is the dependency-injection foundation for making the run loop, run state, and thread stream session-owned (so one Harness can serve many concurrent sessions). No behavior change in this step — the machinery is wired but not yet consumed.
Fix notification signals not waking idle threads (#18244)
Agents now receive rich text (markdown) from channel messages instead of stripped plain text. Links, bold, italic, code, blockquotes, and other formatting from Slack (and other platforms) are preserved as standard markdown that LLMs understand natively. Previously, a Slack message like 'Check out <https://example.com|Example>' would arrive as 'Check out Example', losing the URL. Now it arrives as 'Check out [Example](https://example.com)'. (#18109)
Added an optional delayMs retry delay to StreamErrorRetryProcessor. Consumers can now wait before retrying transient errors, accepting either a fixed number of milliseconds or a function evaluated with the error args. Existing default behavior is unchanged when the option is not supplied. (#18370)
```
import { StreamErrorRetryProcessor } from '@mastra/core/processors';

new StreamErrorRetryProcessor({
  maxRetries: 2,
  delayMs: ({ retryCount }) => Math.min(1000 * 2 ** retryCount, 30000),
  matchers: [error => error?.code === 'ECONNRESET'],
});
```
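For reference, the capped exponential formula used in the `delayMs` callback above produces the schedule below. The `backoffDelayMs` name is introduced here for illustration, and whether `retryCount` starts at 0 or 1 is not stated in these notes, so the table simply starts at 0.

```typescript
// Same formula as the `delayMs` callback above, extracted so it can be inspected.
function backoffDelayMs(retryCount: number): number {
  // Doubles each retry from a 1s base, capped at 30s.
  return Math.min(1000 * 2 ** retryCount, 30000);
}

// Delays for retryCount 0 through 5.
const schedule = [0, 1, 2, 3, 4, 5].map(count => backoffDelayMs(count));
// schedule is [1000, 2000, 4000, 8000, 16000, 30000]
```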
Added unit test coverage for channels/inline-media.ts and workspace/tools/output-helpers.ts. No behaviour changes — purely additive test coverage. (#18006)
Fixed agent channel initialization errors being silently swallowed. When an agent configured with channels failed to initialize during startup, the error was discarded by an un-awaited promise, leaving the channel dead with nothing logged. Initialization failures are now caught and logged through the Mastra logger so a misconfiguration surfaces clearly. (#17720)
Fixed an issue where publishing instruction-only or model-only overrides could remove tools from request-scoped createDurableAgent agents. (#18121)
Request-scoped agents now stay durable and preserve code-owned tools plus delegated behavior (model and memory).
Fix DurableAgent.prepare() ignoring options.runId. prepare() did not forward runId to prepareForDurableExecution() (unlike stream()), so it always registered a freshly minted run id. This made prepare() unusable for rehydrating a persisted, suspended run in a fresh process (e.g. after a server restart or registry eviction): a follow-up resume(runId) couldn't find the registry entry prepare() had built and threw "No registry entry found for run … Cannot resume." prepare() now forwards the caller-provided runId, so re-registering a known run id and resuming a durable snapshot across a restart works. (#18113)
Fixed durable agent input/output processor spans orphaning when an AGENT_RUN root was present. Following #18083, durable runs opened an AGENT_RUN span but prepareForDurableExecution and the durable agentic-loop output-processor step still passed {} as any as the observability context to runInputProcessors / runOutputProcessors. Agent-level processors (including the auto-injected MessageHistory when memory is configured) emitted processor_run spans with no parent — and their inner MEMORY_OPERATION children were dropped entirely because the processor bails out when currentSpan is undefined. The AGENT_RUN span is now opened before input processors run and the durable workflow's output-processor step forwards its step tracingContext to the runner, so processor and memory-operation spans nest under AGENT_RUN on every durable turn. (#18344)
Fixed a crash when the goal judge stream outlives the main agent stream. The emitJudgeActivity helper now uses safeEnqueue (try/catch guard) instead of raw controller.enqueue(), preventing TypeError: Controller is already closed when the ReadableStream controller closes before the fire-and-forget judge observer finishes draining. (#18196)
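A guard of this kind can be sketched as below. The boolean return value and the structural `EnqueueTarget` type are assumptions made so the example stands alone; the notes only say that safeEnqueue wraps the enqueue call in try/catch.

```typescript
// Structural type so the sketch does not depend on a particular stream library.
type EnqueueTarget<T> = { enqueue(chunk: T): void };

// Returns false instead of throwing when the controller is already closed.
function safeEnqueue<T>(controller: EnqueueTarget<T>, chunk: T): boolean {
  try {
    controller.enqueue(chunk);
    return true;
  } catch {
    return false;
  }
}

// Fake controllers standing in for an open and a closed stream controller.
const received: string[] = [];
const openController: EnqueueTarget<string> = {
  enqueue(chunk) {
    received.push(chunk);
  },
};
const closedController: EnqueueTarget<string> = {
  enqueue() {
    throw new TypeError('Controller is already closed');
  },
};

const deliveredToOpen = safeEnqueue(openController, 'judge-activity');
const deliveredToClosed = safeEnqueue(closedController, 'judge-activity');
```

Dropping the chunk is acceptable here because the consumer of the main stream has already gone away by the time the controller is closed.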
Fixed channel handlers so background tasks finish before responses are posted. (#16343)
The channel handler was calling agent.stream(), which closes as soon as the
model finishes generating text, so any agent.backgroundTask() calls scheduled
during the turn were silently abandoned before they could complete.
The call site now passes untilIdle: true, so the channel waits for all
background tasks to finish before posting the response and releasing the thread.
Fixes #16163
CompositeAuth now supports credentials-based authentication. When a credentials provider is included, signIn, signUp, and related methods are available on the composite — so Studio shows the sign-in form and the credentials endpoint responds correctly. (#17708)
Fixed follow-up messages being lost after interrupting a stream. When a user aborted a run (e.g. Ctrl+C) and then immediately sent a new message, the follow-up never received a response. (#18381)
Two issues were addressed in the harness session:
- When an aborted run terminated the subscribed-thread consumer loop, the live subscription was left attached but no longer drained. A follow-up signal would start a new run on that subscription, but its chunks were never processed. The run engine now detaches the subscription when the consumer loop breaks on abort, so the next signal re-subscribes and starts a fresh consumer.
- Session.sendSignal could dispatch a signal onto the dying run because abort() clears the AbortController immediately while the run id and active-run id linger until run.reset() runs after agent_end. sendSignal now detects the post-abort window (an abort was requested but the run has not reset) and waits for the stream to fully idle before starting a new run.
Fixed tool calls running in parallel even when a tool that requires approval or can suspend was available in a step. This could let tool calls bypass an approval step when the model didn't call the approval tool that turn. (#18275)
Tool calls now run one at a time whenever an approval or suspending tool is available in a step. Parallel tool calls are still allowed when no such tool is available.
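The rule can be read as a single decision made per step. The sketch below is illustrative only: `ToolLike`, its `requireApproval` and `canSuspend` flags, and `toolCallConcurrency` are hypothetical names, not the types Mastra uses internally.

```typescript
// Hypothetical shape; the real tool type in @mastra/core may differ.
interface ToolLike {
  name: string;
  requireApproval?: boolean; // assumed flag: tool needs approval before running
  canSuspend?: boolean; // assumed flag: tool may suspend mid-execution
}

// Decide how the tool calls of one step are executed.
function toolCallConcurrency(available: ToolLike[]): "sequential" | "parallel" {
  const gated = available.some(
    tool => tool.requireApproval === true || tool.canSuspend === true,
  );
  return gated ? "sequential" : "parallel";
}

const plainStep = toolCallConcurrency([{ name: "search" }, { name: "fetch" }]);
const gatedStep = toolCallConcurrency([
  { name: "search" },
  { name: "refund", requireApproval: true },
]);
```

The decision depends on which tools are available in the step, not on which ones the model actually called, which is what closes the bypass described above.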
Added unit test coverage for processors/step-schema.ts Zod validation schemas. No behaviour changes — purely additive test coverage. (#18264)
Fixed harness follow-up messages sent immediately after an abort so they wait for the aborted stream to finish cleaning up before starting the next run. (#18390)
Fixed Harness runs so they no longer send a default temperature when the caller did not configure model settings. (#18147)
Expose Harness sessions over HTTP. (#18358)
Adds a set of harness-scoped server routes that let a registered Harness be
driven over HTTP: create (get-or-create) a session by resourceId, send
messages, steer, abort, approve/decline tool calls, respond to tool
suspensions, switch mode/model, manage threads, read session state, and
subscribe to the session's event stream via SSE. Routes resolve the target
Harness through mastra.getHarness(id) and operate on the session returned by
harness.createSession(...).
A new harness permission resource is included (harness:read,
harness:execute).
The tool-approval route forwards the request's toolCallId so a stale or
delayed approval can only resolve the gate it targets, and the list-models
route no longer returns API key environment variable names.
Enforce resource ownership on Harness session thread operations. SessionThread.switch, delete, listMessages, and clone now verify the target thread belongs to the session's resourceId before acting, treating threads owned by another resource as not found. This prevents a session (e.g. an authenticated HTTP caller scoped to one resourceId) from reading, switching to, renaming, deleting, or cloning a thread owned by a different resource via an arbitrary threadId. When a switch is rejected for ownership, the session's previous thread lock is restored so it is never left bound but unlocked. (#18358)
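A minimal sketch of the ownership rule, assuming an in-memory thread map. `ThreadRecord` and `findOwnedThread` are hypothetical names used for illustration; the real checks live inside SessionThread.

```typescript
// Hypothetical record shape for illustration.
interface ThreadRecord {
  id: string;
  resourceId: string;
}

// Resolve a thread only if it belongs to the session's resource.
// A thread owned by someone else is reported exactly like a missing one,
// so callers cannot probe for the existence of other users' threads.
function findOwnedThread(
  threads: Map<string, ThreadRecord>,
  threadId: string,
  sessionResourceId: string,
): ThreadRecord | undefined {
  const thread = threads.get(threadId);
  if (!thread || thread.resourceId !== sessionResourceId) {
    return undefined;
  }
  return thread;
}

const threads = new Map<string, ThreadRecord>([
  ["t1", { id: "t1", resourceId: "user-1" }],
  ["t2", { id: "t2", resourceId: "user-2" }],
]);

const own = findOwnedThread(threads, "t1", "user-1");
const foreign = findOwnedThread(threads, "t2", "user-1");
const missing = findOwnedThread(threads, "nope", "user-1");
```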
Added BDD-style AIMock scenario tests for the core agentic loop to guard against multi-step composition regressions. Covers tool-result plumbing, cross-turn message ordering, stop conditions, tool approval/resume, structured output, active-tools filtering, output processors, memory/working-memory recall, input/output processors, prepare-step overrides, workspace integration, subagent delegation, dynamic instructions, isTaskComplete gating, text-streaming fidelity, provider errors, guardrail tripwires, background tasks (tool-level, agent-level, streamUntilIdle), goals (satisfied, budget exhausted), tool-level requireApproval, conditional requireToolApproval functions, supervisor delegation hooks (onDelegationStart prompt modification and rejection, messageFilter), onIterationComplete iteration tracking with early stop, multi-tool parallel execution with concurrent tool calls, fullStream chunk ordering (text-start/text-delta/text-end, step/finish lifecycle), abort signal mid-stream halting, runtime context (requestContext) passthrough to tools, per-step input/output processors, provider metadata passthrough, model settings request body override, request-level toolsets merge with agent-level tools, tool lifecycle hooks (onInputAvailable, onOutput), tool streaming with context.writer, observability context in tool execution, structured output schema validation failures with detailed ZodError messages, multiple isTaskComplete scorers with strategy semantics (all/any), processor sequencing and transformation, maxSteps boundary conditions with multiple stopWhen OR logic, error processors with retry count tracking and custom exhaustion logic, lifecycle callbacks (onStepFinish, onFinish), incremental message persistence (savePerStep), actor identity passthrough, workflows-as-tools integration with workflow tool execution and result flow, abort during tool execution with tool-level abort signal detection and early bail, error processor retry exhaustion with retryCount 
incrementation and state persistence across attempts, structured output validation repair with error chunk emission and partial JSON streaming, concurrent approval requests with independent approve/decline decisions, memory thread switching with conversation isolation across threads, nested agent delegation with multi-level agent-as-tools chains, structured output aggregation from multiple tool results, memory recall windowing with lastMessages configuration, empty/no-tool turn handling, abort signal interaction with structured output streaming, requestContext isolation across multiple tool execution steps, onError callback behavior (API errors vs tool errors), maxSteps/stopWhen interaction in long tool chains, structured output error strategies (strict emits error chunk, fallback returns fallbackValue, warn logs without error chunk), requestContext mutation behavior (tool mutations do NOT persist between executions, each tool sees original values), runtime tool suspension (tools calling suspend() mid-execution with resume via resumeStream()), delegation completion hooks (onDelegationComplete with bail() stopping loop immediately), supervisor context control (includeSubAgentToolResultsInModelContext controlling nested tool result pollution), auto-resume of suspended tools (autoResumeSuspendedTools detecting and resuming on next call), manual resume via resumeStream() with custom resumeData, and non-streaming generate() approval path (approveToolCallGenerate/declineToolCallGenerate). Extended the AIMock harness with createSharedAgent() helper and sharedAgent option to enable scenarios that require shared Mastra storage across multiple agent calls. Also covers thread signals (subscribeToThread/sendMessage/sendStateSignal), signal edge cases (multiple subscribers, unsubscribe cleanup, state-signal cache dedup), and signal delivery to idle, non-subscribed threads (sendMessage still wakes a run; sendStateSignal persists without waking). 
The test suite now includes 179 scenarios across 83 test files. A test-quality audit additionally hardened several abort and error-processor scenarios that previously could pass regardless of loop behavior: rewrote abort-structured-output and the abort-during-tool-execution bail test to assert deterministic outcomes (finishReason and suppressed post-abort requests) instead of swallowing errors, replaced a tautological >= 0 delegation assertion in nested-tool-calls with a real two-level delegation check, and pinned the error-processor retry-exhaustion chain to its exact call order. Each was verified by disabling the relevant loop wiring and confirming the scenario now fails. Internal test-only change with no runtime impact. (#18276)
Exported isBadRequestError matcher for detecting transient HTTP 400 errors that can be retried (#18384)
Make the evented workflow engine safe for agent streams that carry non-serializable runtime state. (#17836)
- Agent streams no longer drop or flatten Date, Error, Map, Set, RegExp, URL, BigInt, undefined, or registered class instances (e.g. GeneratedFile) when workflow events travel across the cross-process pubsub broker.
- New per-run RunScope on Mastra keeps live runtime handles (message lists, memory, tools, background tasks, transports, …) off the wire entirely. The scope is keyed by runId, never persisted, never published, and is released when the run ends.
- Migrated the agent's prepare-stream, agentic-execution, and agentic-loop workflow steps onto this scope. The legacy _internal field on streamVNext() options is still accepted as bootstrap input — it is hydrated into the scope once and marked @deprecated; no caller changes are required.
Resolves intermittent "getFullOutput is not a function" and "Workflow not found" errors on multi-instance deployments running the evented workflow engine.
Fixed browser state updates to attribute click-driven navigation to the assistant when the browser click result reports the new URL. (#18239)
Fixed subscribed thread streams so suspended tool resumes, same-run resumed streams, follow-up signals, and post-abort queued context are delivered through the authoritative subscription path without dropping or duplicating output. (#18183)
Add AIMock scenario tests for dynamic model resolution, client tools, and tool choice (#18276)
Added 3 new BDD-style scenario tests (7 test cases) to the AIMock regression test suite:
- dynamic-model.scenario.test.ts - Tests that model resolution functions receive requestContext and can select different models per-request (e.g., fast vs. smart model based on context flags)
- client-tools.scenario.test.ts - Tests that client tools passed to agent.stream() merge correctly with agent-level tools, both appear in model requests, and execute successfully
- tool-choice.scenario.test.ts - Tests that toolChoice option passes through to the model request with correct values for 'none', 'required', and specific tool selection ({ type: 'tool', toolName: 'name' })
All scenarios run against a real OpenAI provider pointed at an in-test AIMock HTTP server, providing regression coverage for tool resolution and model selection logic in the agentic loop.
Fixed parallel tool calls being serialized by unrelated suspendable tools. (#18243)
Fixed parallel sub-agent delegations that require approval. When a supervisor agent delegated the same sub-agent twice in a single step (for example, issuing two refunds in parallel), approving them one at a time only ran the first delegation. The second failed to resume with an "AGENT_RESUME_NO_SNAPSHOT_FOUND" error, and on a page refresh the second delegation's approval was lost entirely. (#18041)
Now each delegation tracks its own suspended run, so approving both parallel delegations runs both of them, both during a live session and after reloading. Studio also resolves each delegation's suspend payload by tool call id, so parallel approvals render the correct payload per delegation.
Before
```
// Supervisor delegates two refunds to the billing agent in one step
await supervisor.stream('Refund order A and order B in parallel.');

// Approving each one by one
await supervisor.approveToolCall({ runId, toolCallId: callA }); // runs refund A
await supervisor.approveToolCall({ runId, toolCallId: callB }); // error: AGENT_RESUME_NO_SNAPSHOT_FOUND, refund B never runs
```
After
```
await supervisor.approveToolCall({ runId, toolCallId: callA }); // runs refund A
await supervisor.approveToolCall({ runId, toolCallId: callB }); // runs refund B
```
Fixed DurableAgent ignoring the wrapped agent's defaultOptions. When wrapping an agent with createDurableAgent, the agent's configured defaultOptions (maxSteps, providerOptions, modelSettings, etc.) were silently dropped — maxSteps fell back to the durable default of 5 and provider settings like Anthropic thinking config were never sent. DurableAgent now merges the wrapped agent's defaultOptions under each per-request call, matching Agent.stream()/generate(), and delegates getDefaultOptions() to the wrapped agent. (#17794)
Before
```
const base = new Agent({ model, defaultOptions: { maxSteps: 250 } });
const agent = createDurableAgent({ agent: base });
// runs capped at 5 steps, defaultOptions.providerOptions dropped
```
After
```
// defaultOptions.maxSteps (250) and providerOptions are honored; per-request options still take precedence
```
Move the Harness agent run engine onto the Session. The stream loop that consumes an agent's event stream — folding chunks into display messages and token usage, driving tool approval/suspension, and finalizing the run — now lives in a per-session SessionRunEngine owned by the Session and driven through the injected SessionMachinery. The pure chunk→message content transforms move to a shared stream-content module. (#18213)
In the multi-user host the run loop, run state, and thread stream are per-session and cannot be shared, so they belong on the Session; how a run is produced (agent + config-backed builders) stays Harness-owned machinery. Behavior is unchanged: harness.session.processStream(...) and session.resolveToolApproval(...) replace the previously Harness-private equivalents.
Internal: extracted the Harness constructor's session wiring (thread-settings store, mode/model/om/permissions/subagents resolvers, thread data store, and initial mode/model seeding) into a private #wireSession(session, defaultMode) helper. No behavior or public API change — this is groundwork for wiring additional sessions to a single Harness. (#18213)

@mastra/archil@0.2.0

Minor Changes

Added @mastra/archil — an Archil filesystem provider for Mastra workspaces, backed by Archil's elastic, serverless filesystems for AI agents. Exposes ArchilFilesystem and the archilFilesystemProvider descriptor for MastraEditor, supporting creating disks, reading/writing files, running commands, and searching. (#18275)

Usage

```
import { archilFilesystemProvider } from '@mastra/archil';

// Register the provider with a MastraEditor, configuring it with either an
// existing disk or options to create one on init.
const provider = archilFilesystemProvider;

// { diskId: 'dsk-0123456789abcdef' } or { createDiskOptions: { ... } }
// apiKey falls back to the ARCHIL_API_KEY env var.
```

Patch Changes

@mastra/chroma@1.1.1

Patch Changes

Fixed similarity scoring in ChromaVector.query() for non-cosine indexes. (#18350)
- Euclidean indexes now return bounded, positive scores instead of unbounded negative ones.
- Dotproduct indexes now use the correct score conversion.
- A missing distance now scores 0 instead of a perfect 1.
- minScore filtering and rerank weighting now behave consistently with the other Mastra vector stores.

@mastra/client-js@1.27.0

Minor Changes

Add a harness resource to the client SDK. (#18358)
MastraClient now exposes listHarnesses() and getHarness(id). A
Harness scopes to a harness registered on the connected Mastra instance, and
harness.session(resourceId) returns a HarnessSession that can create/resume
a session, send messages, steer, abort, approve/decline tool calls, respond to
tool suspensions, switch mode/model, manage threads, send notifications, read
state, and subscribe to the session's event stream over SSE.
```
const client = new MastraClient({ baseUrl: 'http://localhost:4111' });
const harness = client.getHarness('code');
const session = harness.session('user-1');

const subscription = await session.subscribe({ onEvent: event => console.log(event) });

await session.create();
await session.sendMessage('Summarize this PR');

// later
subscription.unsubscribe();
```

Patch Changes

Extend DatasetItemSource['type'] with 'candidate-screener'. (#18314)
Mirrors the @mastra/core enum extension so externally-materialized dataset items round-trip through the client SDK without type errors.
Expose item-level tool mocks through the dataset API and client SDK. Dataset item create/update/batch endpoints accept a toolMocks array (toolName + args + output + optional matchArgs mode), experiment result responses include the toolMockReport, and the client-js types thread toolMocks and toolMockReport through the dataset item and experiment result types. (#18037)
```
// Author a dataset item with a tool mock the agent will replay during experiments
await client.addDatasetItem({
  datasetId,
  input: { question: 'What is the weather in Tokyo?' },
  toolMocks: [{ toolName: 'getWeather', args: { city: 'Tokyo' }, output: { tempC: 18 } }],
});
```

@mastra/deployer-vercel@1.2.1

Patch Changes

Fixed Vercel Studio deployments so the root URL serves the Studio app when Studio assets are enabled. (#18325)

@mastra/dsql@1.1.1

Patch Changes

Fixed SQL query failures when filtering threads by resourceId or metadata in Aurora DSQL storage (#18311)

@mastra/editor@0.13.1

Patch Changes

Fix editor.agent.update() to persist editable agent fields by creating and activating a new version. (#18096)
Fixed prompt block SDK updates to persist editable fields. (#17088)

@mastra/express@1.4.1

Patch Changes

Fixed workflow and agent HTTP streams silently dying when a stream chunk contained values that cannot be serialized to JSON (such as BigInt produced by zod coercions in structuredOutput schemas). In Studio this made workflow step nodes appear stuck in the "running" state even though the run completed successfully on the server. (#17843)
Unserializable values are now safely converted (BigInt to string, circular references to "[Circular]"). If a chunk still cannot be serialized at all, it is skipped with an error log that includes the route path and reason, instead of killing the stream and dropping all remaining chunks. Fixes #17821
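The conversion described above can be approximated with a `JSON.stringify` replacer. This is a sketch under assumptions, not the adapters' actual code: `safeStringify` is a hypothetical name, and the ancestor-tracking approach is the common replacer pattern for cycle detection.

```typescript
// Hypothetical helper illustrating BigInt and circular-reference handling.
function safeStringify(value: unknown): string {
  // Stack of objects on the path from the root to the current value.
  const ancestors: unknown[] = [];
  return JSON.stringify(value, function (this: unknown, _key: string, val: unknown) {
    if (typeof val === "bigint") {
      return val.toString(); // JSON has no BigInt; emit it as a string
    }
    if (typeof val !== "object" || val === null) {
      return val;
    }
    // `this` is the direct parent; drop entries that are no longer ancestors.
    while (ancestors.length > 0 && ancestors[ancestors.length - 1] !== this) {
      ancestors.pop();
    }
    if (ancestors.includes(val)) {
      return "[Circular]";
    }
    ancestors.push(val);
    return val;
  });
}

const chunk: Record<string, unknown> = { count: BigInt(42) };
chunk.self = chunk; // circular reference
const encoded = safeStringify(chunk);

// The same object referenced twice (not a cycle) is serialized normally.
const shared = { id: 1 };
const encodedShared = safeStringify({ a: shared, b: shared });
```

Tracking ancestors rather than every visited object keeps repeated, non-circular references intact.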

@mastra/fastify@1.4.1

Patch Changes

Fixed workflow and agent HTTP streams silently dying when a stream chunk contained values that cannot be serialized to JSON (such as BigInt produced by zod coercions in structuredOutput schemas). In Studio this made workflow step nodes appear stuck in the "running" state even though the run completed successfully on the server. (#17843)
Unserializable values are now safely converted (BigInt to string, circular references to "[Circular]"). If a chunk still cannot be serialized at all, it is skipped with an error log that includes the route path and reason, instead of killing the stream and dropping all remaining chunks. Fixes #17821
Fix crash on every request when deployed with @mastra/core < 1.42.0. The fastify, hono, and koa server adapters called this.mastra.getStudio() non-optionally during RBAC pre-checks. On older core versions that method doesn't exist on the Mastra class, so every request threw TypeError: this.mastra.getStudio is not a function and returned a 500 — even for projects with no auth configured. The call site now uses optional chaining (getStudio?.()), matching the pattern already applied in @mastra/server (#18075), and the adapters gracefully fall back to server-only auth. (#18319)
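The optional-call pattern can be shown on its own. In this sketch `MastraLike` and `resolveStudio` are hypothetical stand-ins; only the `getStudio?.()` call shape comes from the entry above.

```typescript
// Hypothetical minimal view of the Mastra instance: on older core versions
// the getStudio method does not exist at all.
interface MastraLike {
  getStudio?: () => unknown;
}

// Optional call: evaluates to undefined instead of throwing
// "getStudio is not a function" when the method is missing.
function resolveStudio(mastra: MastraLike): unknown {
  return mastra.getStudio?.();
}

const olderCore: MastraLike = {};
const newerCore: MastraLike = { getStudio: () => "studio" };

const fromOlder = resolveStudio(olderCore);
const fromNewer = resolveStudio(newerCore);
```

An undefined result is what allows the adapter to fall back to server-only auth.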

@mastra/github-signals@0.2.1

Patch Changes

Fix notification signals not waking idle threads (#18244)

@mastra/google-cloud-pubsub@1.1.1

Patch Changes

Honor the localOnly publish option so in-process subscribers can receive events without round-tripping through the broker. (#17836)
This matches the contract already implemented by UnixSocketPubSub in @mastra/core: when Mastra tags an internal workflow event as localOnly, the payload is delivered by reference to local subscribers and the broker is skipped entirely. Live runtime values like MastraModelOutput instances now keep their prototypes when the evented agent loop runs against a Redis Streams or Google Cloud Pub/Sub broker, fixing failures such as "output.consumeStream is not a function".
Fixed a startup race where concurrent subscribers to the same ungrouped topic could fail to attach. When a producer's agent.stream() and a consumer's agent.observe() subscribe to a fresh run topic within Google Cloud Pub/Sub's subscription-creation window, both raced to create the same subscription. The loser received an ALREADY_EXISTS error and, for ungrouped topics, fell through and threw Failed to subscribe to topic, killing the observe attach. Concurrent init() calls are now coalesced into a single create attempt, and an ALREADY_EXISTS result attaches to the existing subscription regardless of whether a group is set. (#18252)
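Coalescing concurrent init() calls can be sketched with a cached promise. Everything here is an assumption made for illustration: `createCoalescedInit`, `isAlreadyExists`, and the result labels are hypothetical, and the only external fact relied on is that gRPC's ALREADY_EXISTS status code is 6.

```typescript
type InitResult = "created" | "attached";

// gRPC status code 6 is ALREADY_EXISTS.
function isAlreadyExists(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === 6
  );
}

// Wrap a subscription-creating function so concurrent callers share one attempt.
function createCoalescedInit(
  create: () => Promise<void>,
): () => Promise<InitResult> {
  let pending: Promise<InitResult> | undefined;
  return () => {
    if (!pending) {
      pending = create().then(
        () => "created" as const,
        (err: unknown) => {
          // Someone else created it first: attach instead of failing.
          if (isAlreadyExists(err)) return "attached" as const;
          throw err;
        },
      );
      // A fuller implementation would likely clear `pending` on other
      // failures so that a later call can retry.
    }
    return pending;
  };
}

let createCalls = 0;
const init = createCoalescedInit(() => {
  createCalls += 1;
  return Promise.resolve();
});

const first = init(); // e.g. the producer side
const second = init(); // e.g. the observer side, racing the first call
```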

@mastra/hono@1.5.1

Patch Changes

Fixed workflow and agent HTTP streams silently dying when a stream chunk contained values that cannot be serialized to JSON (such as BigInt produced by zod coercions in structuredOutput schemas). In Studio this made workflow step nodes appear stuck in the "running" state even though the run completed successfully on the server. (#17843)
Unserializable values are now safely converted (BigInt to string, circular references to "[Circular]"). If a chunk still cannot be serialized at all, it is skipped with an error log that includes the route path and reason, instead of killing the stream and dropping all remaining chunks. Fixes #17821
Fix crash on every request when deployed with @mastra/core < 1.42.0. The fastify, hono, and koa server adapters called this.mastra.getStudio() non-optionally during RBAC pre-checks. On older core versions that method doesn't exist on the Mastra class, so every request threw TypeError: this.mastra.getStudio is not a function and returned a 500 — even for projects with no auth configured. The call site now uses optional chaining (getStudio?.()), matching the pattern already applied in @mastra/server (#18075), and the adapters gracefully fall back to server-only auth. (#18319)

@mastra/inngest@1.7.0

Minor Changes

Added untilIdle option to InngestAgent.stream() — pass untilIdle: true or { maxIdleMs } to keep the stream open across background-task continuations, matching the DurableAgent and non-durable Agent APIs. (#18349)
```
const result = await inngestAgent.stream('Research topic', {
  untilIdle: true,
  memory: { thread: 't1', resource: 'u1' },
});
```

Patch Changes

@mastra/koa@1.6.1

Patch Changes

Fixed workflow and agent HTTP streams silently dying when a stream chunk contained values that cannot be serialized to JSON (such as BigInt produced by zod coercions in structuredOutput schemas). In Studio this made workflow step nodes appear stuck in the "running" state even though the run completed successfully on the server. (#17843)
Unserializable values are now safely converted (BigInt to string, circular references to "[Circular]"). If a chunk still cannot be serialized at all, it is skipped with an error log that includes the route path and reason, instead of killing the stream and dropping all remaining chunks. Fixes #17821
Fix crash on every request when deployed with @mastra/core < 1.42.0. The fastify, hono, and koa server adapters called this.mastra.getStudio() non-optionally during RBAC pre-checks. On older core versions that method doesn't exist on the Mastra class, so every request threw TypeError: this.mastra.getStudio is not a function and returned a 500 — even for projects with no auth configured. The call site now uses optional chaining (getStudio?.()), matching the pattern already applied in @mastra/server (#18075), and the adapters gracefully fall back to server-only auth. (#18319)

@mastra/lance@1.1.1

Patch Changes

Fixed LanceVectorStore.query() returning a raw LanceDB distance in the score field, which inverted ranking compared to every other Mastra vector store. (#18104)
LanceDB's _distance is a distance (lower = more similar), while Mastra's score is a similarity (higher = more similar). Returning the distance unchanged meant the closest match got the lowest score, silently breaking Memory semantic recall, rerank() vector weighting, and any minScore/threshold filtering written against other stores (pg, Chroma, S3 Vectors, Pinecone, …).
query() now converts _distance into a similarity score consistent with the other stores and sets the search distance type to match the detected index metric, or an explicit query metric when no physical Lance index exists:
- cosine → 1 - distance (cosine similarity)
- dot product → 1 - distance (recovers the dot product, matching @mastra/pg)
- euclidean → 1 / (1 + sqrt(distance)) (Lance l2 returns squared L2, so this maps to Mastra's L2 similarity semantics)
The metric defaults to the table's vector index metric when one exists, otherwise cosine (matching createIndex's default). For small/unindexed tables where LanceDB has no physical index metadata to inspect, pass metric to query() when using a non-cosine metric. If a query metric conflicts with an existing Lance index metric, the index metric is used because Lance requires indexed searches to use the index's distance type:
```
// Before: `exact` got score 0, `far` got score 2 — ranking inverted.
// After:  `exact` gets the highest score and ranks first.
const results = await store.query({
  indexName: 'docs',
  queryVector: [1, 0, 0],
  topK: 2,
  metric: 'cosine', // optional; resolved from the index by default
});
```
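The three conversions listed above can be written as one small function. This is a sketch, not the store's source: `distanceToScore` and the metric string names are assumptions, while the formulas are taken directly from the list.

```typescript
// Assumed metric names for illustration.
type Metric = "cosine" | "dotproduct" | "euclidean";

// Convert a LanceDB distance (lower = closer) into a similarity score
// (higher = closer), using the per-metric formulas listed above.
function distanceToScore(metric: Metric, distance: number): number {
  switch (metric) {
    case "cosine":
    case "dotproduct":
      return 1 - distance;
    case "euclidean":
      // Lance l2 reports squared L2, hence the square root.
      return 1 / (1 + Math.sqrt(distance));
  }
}

const exactCosine = distanceToScore("cosine", 0); // identical direction
const farCosine = distanceToScore("cosine", 2); // opposite direction
const exactL2 = distanceToScore("euclidean", 0);
const farL2 = distanceToScore("euclidean", 4); // true L2 distance of 2
```

With these conversions the exact match always receives the highest score, so minScore thresholds written against other stores keep their meaning.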

@mastra/libsql@1.14.1

Patch Changes

Added multi-tenant scoping columns (organizationId, projectId) to the experiments domain so experiment records and per-item results inherit the tenancy bucket of their parent dataset. (#18388)
Experiment, ExperimentResult, CreateExperimentInput, and AddExperimentResultInput now carry optional organizationId / projectId fields. ListExperimentsInput and ListExperimentResultsInput gain a filters: ExperimentTenancyFilters block (mirrors DatasetTenancyFilters) for scoping queries within a (organizationId, projectId) bucket. Tenancy is hydrated from the parent dataset on createExperiment and denormalized onto each ExperimentResult for efficient tenancy-scoped queries.
The corresponding columns are also added to the mastra_experiments and mastra_experiment_results table schemas. Existing rows backfill to null, matching the rest of the dataset-tenancy surface.
This release also clarifies the targetType contract via JSDoc:
- CreateDatasetInput.targetType remains optional. Datasets without a targetType are not experiment-eligible, because the experiment runner requires a non-null CreateExperimentInput.targetType to resolve an executor.
- Experiment.targetType / CreateExperimentInput.targetType stay required. An experiment by definition replays inputs against a specific target.
No behavior change for existing OSS-created experiments; the new fields are additive and optional.
Example:
```
// Create an experiment scoped to a tenancy bucket. When the parent dataset
// already carries `organizationId` / `projectId`, `runExperiment` hydrates
// these fields automatically from the dataset record.
const experiment = await storage.createExperiment({
  name: 'qa-regression',
  datasetId: 'ds_123',
  datasetVersion: 1,
  targetType: 'agent',
  targetId: 'agent_qa',
  totalItems: 10,
  organizationId: 'org_123',
  projectId: 'proj_123',
});

// List experiments within a tenancy bucket.
const experiments = await storage.listExperiments({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});

// List per-item results within the same bucket.
const results = await storage.listExperimentResults({
  experimentId: experiment.id,
  pagination: { page: 0, perPage: 50 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});
```
Persist and filter dataset tenancy + candidate identity in storage adapters. (#18314)
createDataset now persists organizationId, projectId, candidateKey, and candidateId. listDatasets and listItems accept matching tenancy filters. Dataset items inherit organizationId / projectId from their parent dataset on insert, update, delete, and batch insert/delete; item tenancy is never settable per call because it follows dataset tenancy.
All new columns are nullable and added retroactively via each adapter's existing column-migration path; no breaking DDL. Existing rows continue to read and write fine; new writes can choose to stamp tenancy.
```
await storage.createDataset({
  name: 'candidates/missing-tool-call/incident-123',
  organizationId: 'org_abc',
  projectId: 'project_xyz',
  candidateKey: 'missing-tool-call',
  candidateId: 'incident-123',
});

await storage.listDatasets({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_abc', projectId: 'project_xyz' },
});
```
Fixed: mastra build output no longer hangs on the first storage-touching request when an app uses LibSQLStore, PostgresStore, or MySQLStore with observational memory. mastra dev was unaffected; only the bundled mastra start output deadlocked. No code changes or bundler.externals workaround required on the app side after upgrading. (#18302)
Added storage for item-level tool mocks. Dataset items persist their toolMocks and experiment results persist their toolMockReport, so mocks and run diagnostics survive across sessions. (#18036)

@mastra/mcp@1.12.0

Minor Changes

Added MCPClient.listToolsWithErrors() to return namespaced tools alongside per-server discovery errors. (#18030)

Example:

```
const { tools, errors } = await mcp.listToolsWithErrors();

new Agent({
  name: 'assistant',
  tools,
});

if (Object.keys(errors).length > 0) {
  console.error(errors);
}
```

Patch Changes

@mastra/memory@1.21.1

Patch Changes

Fixed direct memory recall so semantic recall score thresholds filter recalled messages. (#18211)
Fixed semantic recall failing on long unbroken content. When a memory-enabled agent received a message containing a single very long whitespace-free string — a base64 data URI, a minified JS/JSON blob, a long URL, or spaceless CJK text — embedding could throw a provider "maximum context length" error and break that turn's persistence or recall. (#18236)
chunkText now hard-splits any single word that is longer than the chunk budget so every chunk stays under the embedder's token limit, and it no longer emits an empty leading chunk when the first word is oversized.
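The hard-split behavior can be illustrated with a simplified chunker. This sketch is not the real chunkText: `chunkWords` is a hypothetical name, and it budgets by character count as a stand-in for the embedder's token limit.

```typescript
// Simplified, character-based stand-in for a token-budgeted chunker.
function chunkWords(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  let current = "";
  const words = text.split(/\s+/).filter(word => word.length > 0);

  for (const word of words) {
    if (word.length > maxChars) {
      // Flush what has accumulated, but never push an empty chunk.
      if (current) {
        chunks.push(current);
        current = "";
      }
      // Hard-split the oversized word into budget-sized pieces.
      for (let i = 0; i < word.length; i += maxChars) {
        chunks.push(word.slice(i, i + maxChars));
      }
      continue;
    }
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length > maxChars) {
      chunks.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// An oversized first word (think base64 blob) followed by a normal word.
const pieces = chunkWords("abcdefghij ok", 4);
const regular = chunkWords("a bb ccc", 4);
```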

@mastra/mongodb@1.11.0

Minor Changes

Fixed new observability span writes in MongoDB so startedAt, endedAt, createdAt, and updatedAt are stored as native BSON Date objects. Existing string-typed span dates remain readable and date filters support both old string values and new Date values. (#18366)

Patch Changes

Added multi-tenant scoping columns (organizationId, projectId) to the experiments domain so experiment records and per-item results inherit the tenancy bucket of their parent dataset. (#18388)
Experiment, ExperimentResult, CreateExperimentInput, and AddExperimentResultInput now carry optional organizationId / projectId fields. ListExperimentsInput and ListExperimentResultsInput gain a filters: ExperimentTenancyFilters block (mirrors DatasetTenancyFilters) for scoping queries within a (organizationId, projectId) bucket. Tenancy is hydrated from the parent dataset on createExperiment and denormalized onto each ExperimentResult for efficient tenancy-scoped queries.
The corresponding columns are also added to the mastra_experiments and mastra_experiment_results table schemas. Existing rows backfill to null, matching the rest of the dataset-tenancy surface.
This release also clarifies the targetType contract via JSDoc:
- CreateDatasetInput.targetType remains optional. Datasets without a targetType are not experiment-eligible, because the experiment runner requires a non-null CreateExperimentInput.targetType to resolve an executor.
- Experiment.targetType / CreateExperimentInput.targetType stay required. An experiment by definition replays inputs against a specific target.
No behavior change for existing OSS-created experiments; the new fields are additive and optional.
Example:
```
// Create an experiment scoped to a tenancy bucket. When the parent dataset
// already carries `organizationId` / `projectId`, `runExperiment` hydrates
// these fields automatically from the dataset record.
const experiment = await storage.createExperiment({
  name: 'qa-regression',
  datasetId: 'ds_123',
  datasetVersion: 1,
  targetType: 'agent',
  targetId: 'agent_qa',
  totalItems: 10,
  organizationId: 'org_123',
  projectId: 'proj_123',
});

// List experiments within a tenancy bucket.
const experiments = await storage.listExperiments({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});

// List per-item results within the same bucket.
const results = await storage.listExperimentResults({
  experimentId: experiment.id,
  pagination: { page: 0, perPage: 50 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});
```
Persist and filter dataset tenancy + candidate identity in storage adapters. (#18314)
createDataset now persists organizationId, projectId, candidateKey, and candidateId. listDatasets and listItems accept matching tenancy filters. Dataset items inherit organizationId / projectId from their parent dataset on insert, update, delete, and batch insert/delete; item tenancy cannot be set per call and always follows dataset tenancy.
All new columns are nullable and added retroactively via each adapter's existing column-migration path; no breaking DDL. Existing rows continue to read and write fine; new writes can choose to stamp tenancy.
```
await storage.createDataset({
  name: 'candidates/missing-tool-call/incident-123',
  organizationId: 'org_abc',
  projectId: 'project_xyz',
  candidateKey: 'missing-tool-call',
  candidateId: 'incident-123',
});

await storage.listDatasets({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_abc', projectId: 'project_xyz' },
});
```
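To illustrate the inheritance rule above, here is a minimal sketch rather than the adapters' actual code. `Tenancy`, `DatasetRecord`, `ItemRecord`, and `stampItemTenancy` are hypothetical names; the sketch only assumes that an item always takes organizationId / projectId from its parent dataset and that legacy datasets carry null.

```typescript
// Hypothetical shapes for illustration; the real adapter types may differ.
interface Tenancy {
  organizationId: string | null;
  projectId: string | null;
}

interface DatasetRecord extends Tenancy {
  id: string;
}

interface ItemRecord extends Tenancy {
  datasetId: string;
  input: unknown;
}

// Item tenancy always follows the parent dataset; there is no per-call override.
function stampItemTenancy(dataset: DatasetRecord, input: unknown): ItemRecord {
  return {
    datasetId: dataset.id,
    input,
    organizationId: dataset.organizationId,
    projectId: dataset.projectId,
  };
}

const scoped: DatasetRecord = { id: 'ds_1', organizationId: 'org_abc', projectId: 'project_xyz' };
const legacy: DatasetRecord = { id: 'ds_0', organizationId: null, projectId: null };

const scopedItem = stampItemTenancy(scoped, { question: 'hi' });
const legacyItem = stampItemTenancy(legacy, { question: 'hi' });
```

Because the helper accepts no tenancy arguments of its own, a caller cannot place an item in a different bucket than its dataset.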
Added storage for item-level tool mocks. Dataset items persist their toolMocks and experiment results persist their toolMockReport, so mocks and run diagnostics survive across sessions. (#18036)

@mastra/mysql@0.3.0

Minor Changes

The MySQL store now rejects item-level tool mocks with a clear error instead of silently dropping them. Tool mock persistence is not yet supported on MySQL, so saving a dataset item with toolMocks (or an experiment result with a toolMockReport) fails fast rather than discarding the data. (#18036)

Patch Changes

Added multi-tenant scoping columns (organizationId, projectId) to the experiments domain so experiment records and per-item results inherit the tenancy bucket of their parent dataset. (#18388)
Experiment, ExperimentResult, CreateExperimentInput, and AddExperimentResultInput now carry optional organizationId / projectId fields. ListExperimentsInput and ListExperimentResultsInput gain a filters: ExperimentTenancyFilters block (mirrors DatasetTenancyFilters) for scoping queries within a (organizationId, projectId) bucket. Tenancy is hydrated from the parent dataset on createExperiment and denormalized onto each ExperimentResult for efficient tenancy-scoped queries.
The corresponding columns are also added to the mastra_experiments and mastra_experiment_results table schemas. Existing rows backfill to null, matching the rest of the dataset-tenancy surface.
This release also clarifies the targetType contract via JSDoc:
- CreateDatasetInput.targetType remains optional. Datasets without a targetType are not experiment-eligible: the experiment runner requires a non-null CreateExperimentInput.targetType to resolve an executor.
- Experiment.targetType / CreateExperimentInput.targetType stay required. An experiment by definition replays inputs against a specific target.
No behavior change for existing OSS-created experiments; the new fields are additive and optional.
Example:
```
// Create an experiment scoped to a tenancy bucket. When the parent dataset
// already carries `organizationId` / `projectId`, `runExperiment` hydrates
// these fields automatically from the dataset record.
const experiment = await storage.createExperiment({
  name: 'qa-regression',
  datasetId: 'ds_123',
  datasetVersion: 1,
  targetType: 'agent',
  targetId: 'agent_qa',
  totalItems: 10,
  organizationId: 'org_123',
  projectId: 'proj_123',
});

// List experiments within a tenancy bucket.
const experiments = await storage.listExperiments({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});

// List per-item results within the same bucket.
const results = await storage.listExperimentResults({
  experimentId: experiment.id,
  pagination: { page: 0, perPage: 50 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});
```
Persist and filter dataset tenancy + candidate identity in storage adapters. (#18314)
createDataset now persists organizationId, projectId, candidateKey, and candidateId. listDatasets and listItems accept matching tenancy filters. Dataset items inherit organizationId / projectId from their parent dataset on insert, update, delete, and batch insert/delete; item tenancy cannot be set per call and always follows dataset tenancy.
All new columns are nullable and added retroactively via each adapter's existing column-migration path; no breaking DDL. Existing rows continue to read and write fine; new writes can choose to stamp tenancy.
```
await storage.createDataset({
  name: 'candidates/missing-tool-call/incident-123',
  organizationId: 'org_abc',
  projectId: 'project_xyz',
  candidateKey: 'missing-tool-call',
  candidateId: 'incident-123',
});

await storage.listDatasets({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_abc', projectId: 'project_xyz' },
});
```
Fixed: mastra build output no longer hangs on the first storage-touching request when an app uses LibSQLStore, PostgresStore, or MySQLStore with observational memory. mastra dev was unaffected; only the bundled mastra start output deadlocked. No code changes or bundler.externals workaround required on the app side after upgrading. (#18302)

@mastra/nestjs@0.2.1

Patch Changes

Fixed workflow and agent HTTP streams silently dying when a stream chunk contained values that cannot be serialized to JSON (such as BigInt produced by zod coercions in structuredOutput schemas). In Studio this made workflow step nodes appear stuck in the "running" state even though the run completed successfully on the server. (#17843)
Unserializable values are now safely converted (BigInt to string, circular references to "[Circular]"). If a chunk still cannot be serialized at all, it is skipped with an error log that includes the route path and reason, instead of killing the stream and dropping all remaining chunks. Fixes #17821

@mastra/next@0.2.0

Minor Changes

Added @mastra/next — a server adapter for Next.js App Router. Drop your Mastra instance into a catch-all route to expose all Mastra API endpoints without manually wiring routes. (#18230)

Usage

```
// app/api/[...mastra]/route.ts
import { createNextRouteHandler } from '@mastra/next';
import { mastra } from '../../../mastra';

export const { GET, POST, PUT, DELETE, PATCH, OPTIONS, HEAD } = createNextRouteHandler({ mastra });
```

Patch Changes

@mastra/observability@1.15.1

Patch Changes

Fixed auto-extracted metrics (duration, token usage, cost) being silently dropped when spans are filtered via excludeSpanTypes or spanFilter. Previously, excluding a span type to reduce per-span costs in platforms like Langfuse also suppressed its aggregate metrics. Metrics are now emitted independently of span export filtering. (#18253)

@mastra/otel-exporter@1.3.1

Patch Changes

Fixed RAG embedding spans so OpenTelemetry exports include embedding model, provider, usage, span name, and client span kind metadata. (#17917)

@mastra/pg@1.14.1

Patch Changes

Fixed PostgresStore.init() failing with "RoutingDbClient already has a pinned client" when a single store is shared across concurrent requests (for example, request-scoped Mastra instances reusing one store/pool). Concurrent init() calls are now coalesced into a single shared initialization instead of each pinning the client. (#18336)
Also, init() is now a no-op when disableInit: true, so apps that manage their database schema externally are no longer forced through the connect-and-pin path.
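The coalescing behavior described above can be sketched as follows. This is an illustration of the general pattern, not the PostgresStore source: `CoalescedInitStore`, `connectAndMigrate`, and `connectCalls` are hypothetical names, and the constructor flag only stands in for the disableInit option.

```typescript
// Hypothetical store; only the coalescing pattern is the point here.
class CoalescedInitStore {
  private initPromise: Promise<void> | null = null;
  public connectCalls = 0;

  constructor(private readonly disableInit = false) {}

  init(): Promise<void> {
    // With disableInit, init() does nothing and never touches the connection.
    if (this.disableInit) return Promise.resolve();
    // Every concurrent caller shares the first in-flight initialization.
    if (!this.initPromise) {
      this.initPromise = this.connectAndMigrate();
    }
    return this.initPromise;
  }

  private async connectAndMigrate(): Promise<void> {
    this.connectCalls += 1; // stands in for pinning a client and creating tables
    await Promise.resolve();
  }
}

const store = new CoalescedInitStore();
const first = store.init();
const second = store.init();

const external = new CoalescedInitStore(true);
void external.init();
```

Both calls receive the same promise, so the connect step runs once no matter how many requests race. A production version would likely also clear the cached promise on failure so that a later call can retry; that detail is left out of the sketch.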
Added multi-tenant scoping columns (organizationId, projectId) to the experiments domain so experiment records and per-item results inherit the tenancy bucket of their parent dataset. (#18388)
Experiment, ExperimentResult, CreateExperimentInput, and AddExperimentResultInput now carry optional organizationId / projectId fields. ListExperimentsInput and ListExperimentResultsInput gain a filters: ExperimentTenancyFilters block (mirrors DatasetTenancyFilters) for scoping queries within a (organizationId, projectId) bucket. Tenancy is hydrated from the parent dataset on createExperiment and denormalized onto each ExperimentResult for efficient tenancy-scoped queries.
The corresponding columns are also added to the mastra_experiments and mastra_experiment_results table schemas. Existing rows backfill to null, matching the rest of the dataset-tenancy surface.
This release also clarifies the targetType contract via JSDoc:
- CreateDatasetInput.targetType remains optional. Datasets without a targetType are not experiment-eligible: the experiment runner requires a non-null CreateExperimentInput.targetType to resolve an executor.
- Experiment.targetType / CreateExperimentInput.targetType stay required. An experiment by definition replays inputs against a specific target.
No behavior change for existing OSS-created experiments; the new fields are additive and optional.
Example:
```
// Create an experiment scoped to a tenancy bucket. When the parent dataset
// already carries `organizationId` / `projectId`, `runExperiment` hydrates
// these fields automatically from the dataset record.
const experiment = await storage.createExperiment({
  name: 'qa-regression',
  datasetId: 'ds_123',
  datasetVersion: 1,
  targetType: 'agent',
  targetId: 'agent_qa',
  totalItems: 10,
  organizationId: 'org_123',
  projectId: 'proj_123',
});

// List experiments within a tenancy bucket.
const experiments = await storage.listExperiments({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});

// List per-item results within the same bucket.
const results = await storage.listExperimentResults({
  experimentId: experiment.id,
  pagination: { page: 0, perPage: 50 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});
```
Persist and filter dataset tenancy + candidate identity in storage adapters. (#18314)
createDataset now persists organizationId, projectId, candidateKey, and candidateId. listDatasets and listItems accept matching tenancy filters. Dataset items inherit organizationId / projectId from their parent dataset on insert, update, delete, and batch insert/delete; item tenancy cannot be set per call and always follows dataset tenancy.
All new columns are nullable and added retroactively via each adapter's existing column-migration path; no breaking DDL. Existing rows continue to read and write fine; new writes can choose to stamp tenancy.
```
await storage.createDataset({
  name: 'candidates/missing-tool-call/incident-123',
  organizationId: 'org_abc',
  projectId: 'project_xyz',
  candidateKey: 'missing-tool-call',
  candidateId: 'incident-123',
});

await storage.listDatasets({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_abc', projectId: 'project_xyz' },
});
```
Fixed: mastra build output no longer hangs on the first storage-touching request when an app uses LibSQLStore, PostgresStore, or MySQLStore with observational memory. mastra dev was unaffected; only the bundled mastra start output deadlocked. No code changes or bundler.externals workaround required on the app side after upgrading. (#18302)
Added storage for item-level tool mocks. Dataset items persist their toolMocks and experiment results persist their toolMockReport, so mocks and run diagnostics survive across sessions. (#18036)

@mastra/playground-ui@36.0.0

Minor Changes

Added a CodeEditor option to disable line wrapping when callers need horizontally scrollable content. (#18303)
Improved command palette keyboard handling, input affordance/style slots, scroll-area-backed lists, shortcut labels, overlay behavior, and inline examples. (#18142)
Added a multiple-selection mode to the Combobox component and removed the separate MultiCombobox export. (#18166)
Use the shared Combobox API for both single and multiple selection:
```
<Combobox multiple value={selectedValues} onValueChange={setSelectedValues} options={options} />
```
Storybook now includes examples for the single and multiple selection flows. Command item icons now render without an extra icon background.

Added nested children support to MainSidebar.Sections navigation. Parent rows can now stay clickable while rendering child links as nested subitems. (#18224)

```
<MainSidebar.Sections
  sections={[
    {
      key: 'workspace',
      title: 'Workspace',
      links: [
        {
          name: 'Agents',
          url: '/agents',
          children: [{ name: 'Templates', url: '/agents/templates' }],
        },
      ],
    },
  ]}
/>
```

Removed the default DataList variant. DataList now uses the lined treatment when no variant is provided; use variant="striped" only when zebra rows are needed. (#18218)
Before
```
<DataList columns={columns} variant="default" />
```
After
```
<DataList columns={columns} />
```
Added helpers for working with remote and cloud-storage media URLs, used by the Studio agent chat composer so media can be attached by URL and forwarded to the model untouched instead of only being uploaded as inlined base64. (#18149)
- Recognizes cloud-storage URIs (gs://, s3://) so they are passed through and resolved server-side by the model provider.
- Recognizes video and audio URLs and renders them as a labeled file chip with the correct icon instead of a broken preview.
New exports:
```
import { isRemoteUrl, isBrowserFetchableUrl, isNonFetchableRemoteUrl } from '@mastra/playground-ui';

isRemoteUrl('gs://my-bucket/clip.mp4'); // true
isBrowserFetchableUrl('gs://my-bucket/clip.mp4'); // false (resolved server-side)
isBrowserFetchableUrl('https://example.com/clip.mp4'); // true
```

Patch Changes

Removed ScrollableContainer. Use ScrollArea with the scrollButtons prop for horizontal scroll controls. (#18220)

Before
```
import { ScrollableContainer } from '@mastra/playground-ui';

<ScrollableContainer>{items}</ScrollableContainer>;
```
After
```
import { ScrollArea } from '@mastra/playground-ui';

<ScrollArea orientation="horizontal" scrollButtons>
  {items}
</ScrollArea>;
```

Fixed DataList selection headers so grouped header cells no longer cover select-all checkboxes. (#18221)
Fixed line wrapping for JSON content in CodeEditor and tool result panels. Long JSON strings and tool outputs now wrap within the panel width instead of requiring horizontal scrolling. (#18214)
Fixed the Studio conversation copy button in browsers that block async clipboard writes. (#18268)
Added Studio support for authoring and viewing item-level tool mocks on dataset items. (#18038)
Added trace-derived mock creation with an editable preview before saving to a new or existing item.
Added tool mock propagation when creating new items and creating datasets from items.
Improved experiment results with a tool mock report (served, unconsumed, live calls, and mismatch details).
Author tool mocks on a dataset item as a JSON array:
```
[
  {
    "toolName": "refundUser",
    "args": { "user": "YJ", "amount": 100 },
    "output": { "refundId": "refund_1", "user": "YJ", "amount": 100, "newBalance": 100 }
  }
]
```

@mastra/redis-streams@0.2.0

Minor Changes

Collapsed the result of agent.sendSignal / sendMessage / queueMessage / sendStateSignal / sendNotificationSignal into a single accepted promise that resolves at decision-time to a discriminated union: { action: 'wake'; runId; output } when the signal started a run in this process, { action: 'deliver'; runId } when it was forwarded onto an existing run, or { action: 'persist' } / { action: 'discard' } when nothing ran. runId is present only on wake/deliver; for persist/discard correlate via result.signal.id. accepted resolves for routing decisions and rejects only when the signal couldn't be routed at all (e.g. misconfigured agent). This replaces the old accepted: true boolean and the best-effort top-level runId. result.persisted stays top-level. sendNotificationSignal's accepted is optional (a notification may be dropped by policy with no signal) — read result.decision for the policy verdict. (#17723)
On serverless platforms, when this process/Lambda is the one that woke the run (action === 'wake'), it can use the platform's waitUntil to keep itself alive until the run it started completes — otherwise the runtime may be frozen or torn down mid-run:
```
const result = agent.sendSignal(signal, { resourceId, threadId });
ctx.waitUntil(
  result.accepted.then(async accepted => {
    if (accepted.action === 'wake') await accepted.output.consumeStream();
  }),
);
```
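As a complement to the waitUntil example, the sketch below shows one way a caller might narrow the resolved value. The union shape is copied from the description above with `output` left opaque; `Accepted` and `correlationId` are hypothetical names, not exports of @mastra/core.

```typescript
// Shape taken from the release note; `output` is left opaque here.
type Accepted =
  | { action: 'wake'; runId: string; output: unknown }
  | { action: 'deliver'; runId: string }
  | { action: 'persist' }
  | { action: 'discard' };

// Hypothetical helper: pick an id to correlate logs with.
function correlationId(accepted: Accepted, signalId: string): string {
  switch (accepted.action) {
    case 'wake':
    case 'deliver':
      return accepted.runId; // a run exists, so runId is authoritative
    case 'persist':
    case 'discard':
      return signalId; // nothing ran, fall back to the signal id
  }
}

const woke = correlationId({ action: 'wake', runId: 'run_1', output: null }, 'sig_1');
const stored = correlationId({ action: 'persist' }, 'sig_2');
```

Switching on `action` lets the compiler reject any access to `runId` in the persist/discard branches, which is the main practical benefit of the discriminated union over the old boolean.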
Added a LeaseProvider capability (acquireLease / releaseLease / renewLease / getLeaseOwner / transferLease) — distributed leasing kept separate from event delivery (PubSub) — so processes racing to wake the same thread coordinate a single owner. The winner runs the stream; losers forward their signal to it. EventEmitterPubSub leases in-memory; RedisStreamsPubSub uses SET NX PX with owner-verified Lua scripts for release, renew, and transfer; the default NoopLeaseProvider always wins, preserving single-process behavior.
Fixed a cross-process race where, after a run finished, the thread's lease briefly went free before its queued follow-up work started — letting another process win the lease and start a competing run for the same thread. The owner now keeps the lease until all queued work is drained. The lease TTL and renewal interval are overridable for tests via MASTRA_AGENT_THREAD_LEASE_TTL_MS and MASTRA_AGENT_THREAD_LEASE_RENEW_INTERVAL_MS (production defaults unchanged).
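The leasing rules above (a single owner per thread, owner-verified release and renew, expiry by TTL) can be sketched in memory. The method names follow the release note, but the parameter lists, return types, and the `InMemoryLeases` class itself are assumptions for illustration; this is not the EventEmitterPubSub or Redis implementation, and transferLease is omitted.

```typescript
// Hypothetical in-memory lease table with an injectable clock for testing.
class InMemoryLeases {
  private leases = new Map<string, { owner: string; expiresAt: number }>();

  constructor(private readonly now: () => number = Date.now) {}

  // Succeeds only when the key is free or the previous lease has expired.
  acquireLease(key: string, owner: string, ttlMs: number): boolean {
    const current = this.leases.get(key);
    if (current && current.expiresAt > this.now()) return false;
    this.leases.set(key, { owner, expiresAt: this.now() + ttlMs });
    return true;
  }

  // Owner-verified: only the current, unexpired holder may extend the lease.
  renewLease(key: string, owner: string, ttlMs: number): boolean {
    const current = this.leases.get(key);
    if (!current || current.owner !== owner || current.expiresAt <= this.now()) return false;
    current.expiresAt = this.now() + ttlMs;
    return true;
  }

  // Owner-verified: a stale process cannot release someone else's lease.
  releaseLease(key: string, owner: string): boolean {
    const current = this.leases.get(key);
    if (!current || current.owner !== owner) return false;
    this.leases.delete(key);
    return true;
  }

  getLeaseOwner(key: string): string | undefined {
    const current = this.leases.get(key);
    return current && current.expiresAt > this.now() ? current.owner : undefined;
  }
}

let clock = 0;
const leases = new InMemoryLeases(() => clock);
const aWins = leases.acquireLease('thread-1', 'proc-a', 1000);
const bWins = leases.acquireLease('thread-1', 'proc-b', 1000);
const bRelease = leases.releaseLease('thread-1', 'proc-b');
clock = 1500; // proc-a never renewed, so its lease has expired
const bWinsLater = leases.acquireLease('thread-1', 'proc-b', 1000);
```

In this sketch the loser of the race simply gets `false` back; per the note above, a real loser would forward its signal to the owner instead of starting a competing run.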

Patch Changes

Honor the localOnly publish option so in-process subscribers can receive events without round-tripping through the broker. (#17836)
This matches the contract already implemented by UnixSocketPubSub in @mastra/core: when Mastra tags an internal workflow event as localOnly, the payload is delivered by reference to local subscribers and the broker is skipped entirely. Live runtime values like MastraModelOutput instances now keep their prototypes when the evented agent loop runs against a Redis Streams or Google Cloud Pub/Sub broker, fixing failures such as "output.consumeStream is not a function".

@mastra/server@1.46.0

Minor Changes

Expose Harness sessions over HTTP. (#18358)
Adds a set of harness-scoped server routes that let a registered Harness be driven over HTTP: create (get-or-create) a session by resourceId, send messages, steer, abort, approve/decline tool calls, respond to tool suspensions, switch mode/model, manage threads, read session state, and subscribe to the session's event stream via SSE. Routes resolve the target Harness through mastra.getHarness(id) and operate on the session returned by harness.createSession(...).
A new harness permission resource is included (harness:read, harness:execute).
The tool-approval route forwards the request's toolCallId so a stale or delayed approval can only resolve the gate it targets, and the list-models route no longer returns API key environment variable names.
Added the AUTO_BLOCK_EXTERNAL_PROVIDERS environment variable. When set to true or 1, Mastra Studio hides all external model providers (OpenAI, Anthropic, Gemini, etc.) and the built-in gateways, showing only the custom gateways you register. This lets enterprise deployments that route through their own gateway present just that gateway in the model picker. (#18153)
```
AUTO_BLOCK_EXTERNAL_PROVIDERS=true
```
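A minimal sketch of how such a flag could gate a provider list is shown below. Only the variable name and the accepted values (true, 1) come from the note above; `ProviderEntry`, its `kind` field, `isExternalProvidersBlocked`, and `visibleProviders` are hypothetical and do not reflect how Studio models providers internally.

```typescript
// Hypothetical provider shape for illustration only.
interface ProviderEntry {
  id: string;
  kind: 'external' | 'built-in-gateway' | 'custom-gateway';
}

// Mirrors the documented values: only "true" or "1" enable the block.
function isExternalProvidersBlocked(env: Record<string, string | undefined>): boolean {
  const raw = env.AUTO_BLOCK_EXTERNAL_PROVIDERS;
  return raw === 'true' || raw === '1';
}

// When blocked, keep only the gateways the deployment registered itself.
function visibleProviders(all: ProviderEntry[], env: Record<string, string | undefined>): ProviderEntry[] {
  return isExternalProvidersBlocked(env) ? all.filter(p => p.kind === 'custom-gateway') : all;
}

const providers: ProviderEntry[] = [
  { id: 'openai', kind: 'external' },
  { id: 'default-gateway', kind: 'built-in-gateway' },
  { id: 'acme-gateway', kind: 'custom-gateway' },
];

const blocked = visibleProviders(providers, { AUTO_BLOCK_EXTERNAL_PROVIDERS: '1' });
const open = visibleProviders(providers, {});
```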

Patch Changes

The HTTP signal/message routes adapt to the agent's new accepted contract: they await accepted to derive the authoritative runId (falling back to the caller's runId or the stored signal id for persist/discard) while preserving the { accepted: true; runId: string } wire shape. A setup/misconfig rejection tagged ErrorCategory.USER now maps to a 400 instead of a generic 500. (#18237)
Extend datasetItemSourceSchema enum with 'candidate-screener'. (#18314)
The server's Zod schema for dataset item sources mirrored the closed DatasetItemSource['type'] union from @mastra/core. Now that core extends the union with 'candidate-screener', the server schema follows so HTTP handlers can compile against the new core types and the API can round-trip externally-materialized items.
Expose item-level tool mocks through the dataset API and client SDK. Dataset item create/update/batch endpoints accept a toolMocks array (toolName + args + output + optional matchArgs mode), experiment result responses include the toolMockReport, and the client-js types thread toolMocks and toolMockReport through the dataset item and experiment result types. (#18037)
```
// Author a dataset item with a tool mock the agent will replay during experiments
await client.addDatasetItem({
  datasetId,
  input: { question: 'What is the weather in Tokyo?' },
  toolMocks: [{ toolName: 'getWeather', args: { city: 'Tokyo' }, output: { tempC: 18 } }],
});
```
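To make the replay idea concrete, the following sketch serves tool calls from an ordered mock list and tallies a report. The field names (toolName, args, output, matchArgs) follow the note above, but the matching semantics are assumptions: the next unconsumed mock for a tool is checked in order, matchArgs defaults to strict, and a mismatch is recorded without consuming the mock. `MockReplayer` and the report field names are hypothetical and not part of the Mastra API.

```typescript
// Hypothetical shapes; semantics are assumed, not taken from the runner.
interface ToolMock {
  toolName: string;
  args: unknown;
  output: unknown;
  matchArgs?: 'strict' | 'ignore';
}

interface ToolMockReport {
  served: number;
  liveCalls: number;
  unconsumed: number;
  mismatches: { toolName: string; expected: unknown; received: unknown }[];
}

class MockReplayer {
  private readonly consumed = new Set<ToolMock>();
  private served = 0;
  private liveCalls = 0;
  private readonly mismatches: ToolMockReport['mismatches'] = [];

  constructor(private readonly mocks: ToolMock[]) {}

  // Returns the mocked output, or undefined when no mock serves the call.
  call(toolName: string, args: unknown): unknown {
    const mock = this.mocks.find(m => !this.consumed.has(m) && m.toolName === toolName);
    if (!mock) {
      this.liveCalls += 1; // no mock left for this tool: a real call would run
      return undefined;
    }
    // Simplification: JSON comparison is sensitive to key order.
    const argsMatch = mock.matchArgs === 'ignore' || JSON.stringify(mock.args) === JSON.stringify(args);
    if (!argsMatch) {
      this.mismatches.push({ toolName, expected: mock.args, received: args });
      return undefined;
    }
    this.consumed.add(mock);
    this.served += 1;
    return mock.output;
  }

  report(): ToolMockReport {
    return {
      served: this.served,
      liveCalls: this.liveCalls,
      unconsumed: this.mocks.length - this.consumed.size,
      mismatches: this.mismatches,
    };
  }
}

const replayer = new MockReplayer([
  { toolName: 'getWeather', args: { city: 'Tokyo' }, output: { tempC: 18 } },
  { toolName: 'getWeather', args: { city: 'Osaka' }, output: { tempC: 20 }, matchArgs: 'ignore' },
]);

const firstOutput = replayer.call('getWeather', { city: 'Tokyo' }); // strict match
const secondOutput = replayer.call('getWeather', { city: 'Kyoto' }); // args ignored
const liveOutput = replayer.call('getTime', {}); // no mock for this tool
const mockReport = replayer.report();
```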
Added a serializeStreamChunk helper to @mastra/server/server-adapter that server adapters use to safely serialize stream chunks. It converts values that JSON cannot represent (BigInt to string, circular references to "[Circular]") and reports a serialization error instead of throwing, so one bad chunk can no longer terminate an HTTP stream. Part of the fix for #17821 (#17843)
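The conversion rules described above can be approximated with a JSON.stringify replacer. This is a sketch of the technique only, not the serializeStreamChunk source: `safeSerializeChunk` and `SerializeResult` are hypothetical names, and the actual helper's signature and return shape may differ.

```typescript
// Hypothetical result type: report a failure instead of throwing.
type SerializeResult = { ok: true; json: string } | { ok: false; error: string };

function safeSerializeChunk(chunk: unknown): SerializeResult {
  // Stack of objects on the path from the root to the value being visited.
  const ancestors: object[] = [];
  try {
    const json = JSON.stringify(chunk, function (this: unknown, _key: string, value: unknown) {
      if (typeof value === 'bigint') return value.toString();
      if (typeof value === 'object' && value !== null) {
        // `this` is the direct parent; drop entries that are no longer on the path.
        while (ancestors.length > 0 && ancestors[ancestors.length - 1] !== this) ancestors.pop();
        if (ancestors.includes(value)) return '[Circular]';
        ancestors.push(value);
      }
      return value;
    });
    // JSON.stringify yields undefined for values such as undefined or functions.
    if (typeof json !== 'string') return { ok: false, error: 'chunk is not representable as JSON' };
    return { ok: true, json };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

const loop: Record<string, unknown> = { id: BigInt(42) };
loop.self = loop;

const good = safeSerializeChunk(loop);
const bad = safeSerializeChunk(undefined);
```

A caller that receives `ok: false` can log the error and skip the chunk, which keeps the rest of the stream flowing. Tracking ancestors rather than every visited object means a value that merely appears twice is still serialized in full; only true cycles become "[Circular]".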

@mastra/spanner@1.2.1

Patch Changes

Added multi-tenant scoping columns (organizationId, projectId) to the experiments domain so experiment records and per-item results inherit the tenancy bucket of their parent dataset. (#18388)
Experiment, ExperimentResult, CreateExperimentInput, and AddExperimentResultInput now carry optional organizationId / projectId fields. ListExperimentsInput and ListExperimentResultsInput gain a filters: ExperimentTenancyFilters block (mirrors DatasetTenancyFilters) for scoping queries within a (organizationId, projectId) bucket. Tenancy is hydrated from the parent dataset on createExperiment and denormalized onto each ExperimentResult for efficient tenancy-scoped queries.
The corresponding columns are also added to the mastra_experiments and mastra_experiment_results table schemas. Existing rows backfill to null, matching the rest of the dataset-tenancy surface.
This release also clarifies the targetType contract via JSDoc:
- CreateDatasetInput.targetType remains optional. Datasets without a targetType are not experiment-eligible: the experiment runner requires a non-null CreateExperimentInput.targetType to resolve an executor.
- Experiment.targetType / CreateExperimentInput.targetType stay required. An experiment by definition replays inputs against a specific target.
No behavior change for existing OSS-created experiments; the new fields are additive and optional.
Example:
```
// Create an experiment scoped to a tenancy bucket. When the parent dataset
// already carries `organizationId` / `projectId`, `runExperiment` hydrates
// these fields automatically from the dataset record.
const experiment = await storage.createExperiment({
  name: 'qa-regression',
  datasetId: 'ds_123',
  datasetVersion: 1,
  targetType: 'agent',
  targetId: 'agent_qa',
  totalItems: 10,
  organizationId: 'org_123',
  projectId: 'proj_123',
});

// List experiments within a tenancy bucket.
const experiments = await storage.listExperiments({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});

// List per-item results within the same bucket.
const results = await storage.listExperimentResults({
  experimentId: experiment.id,
  pagination: { page: 0, perPage: 50 },
  filters: { organizationId: 'org_123', projectId: 'proj_123' },
});
```
Persist and filter dataset tenancy + candidate identity in storage adapters. (#18314)
createDataset now persists organizationId, projectId, candidateKey, and candidateId. listDatasets and listItems accept matching tenancy filters. Dataset items inherit organizationId / projectId from their parent dataset on insert, update, delete, and batch insert/delete; item tenancy cannot be set per call and always follows dataset tenancy.
All new columns are nullable and added retroactively via each adapter's existing column-migration path; no breaking DDL. Existing rows continue to read and write fine; new writes can choose to stamp tenancy.
```
await storage.createDataset({
  name: 'candidates/missing-tool-call/incident-123',
  organizationId: 'org_abc',
  projectId: 'project_xyz',
  candidateKey: 'missing-tool-call',
  candidateId: 'incident-123',
});

await storage.listDatasets({
  pagination: { page: 0, perPage: 20 },
  filters: { organizationId: 'org_abc', projectId: 'project_xyz' },
});
```
Added storage for item-level tool mocks. Dataset items persist their toolMocks and experiment results persist their toolMockReport, so mocks and run diagnostics survive across sessions. (#18036)

@mastra/tanstack-start@0.2.0

Minor Changes

Added @mastra/tanstack-start server adapter for mounting Mastra in TanStack Start apps via a catch-all server route. Uses the same Hono-based MastraServer pattern as @mastra/next. (#18270)

Patch Changes

@mastra/vercel@1.1.1

Patch Changes

Added warnings to VercelSandbox and VercelMicroVMSandbox class names. VercelSandbox will be renamed to VercelServerlessSandbox and VercelMicroVMSandbox will be renamed to VercelSandbox in a future release. (#18296)

@mastra/voice-google-gemini-live@0.14.0

Minor Changes

Added sendContext() method to GeminiLiveVoice for seeding conversation history into a fresh voice session without triggering a model response. This lets apps replay prior turns from Mastra Memory (or any external store) on a cold connect so the model has full context before the user speaks — enabling seamless handoff between text chat and voice on a shared thread. (#18286)
Usage:
```
await voice.connect();

await voice.sendContext([
  { role: 'user', content: 'What is the weather?' },
  { role: 'assistant', content: 'It is 72°F in San Francisco.' },
]);

// Model stays silent until the user actually speaks
await voice.send(micStream);
```

Patch Changes

Fix resumeSession() always timing out. Session resumption now works end-to-end: new sessions request server-issued tokens, inbound handles are stored and emitted, and resuming reconnects with the correct handle in the setup frame. (#18190)
Fix sendContext() being rejected (WS 1007) on gemini-3.1-flash-live-preview by emitting history_config: { initial_history_in_client_content: true } in the setup frame. Also exposes initialHistoryInClientContent on GeminiSessionConfig so callers can opt out explicitly. (#18368)
Fixed realtime audio streaming being immediately rejected by the Gemini Live API. Audio frames now use the current API format, replacing a deprecated payload shape that caused the connection to close on the first frame. (#18291)
The session event for disconnections now includes code and reason fields, so consumers can see why the server closed the connection.

Other updated packages

The following packages were updated with dependency changes only:

mastra-ai/mastra @mastra/core@1.46.0 June 24, 2026 on GitHub
