github microsoft/FluidFramework server_v5.0.0
Routerlicious v5.0.0

3 months ago

server-services-core: New configuration setting for ephemeral container soft delete

IDeliServerConfiguration defines a new optional property, ephemeralContainerSoftDeleteTimeInMs, that controls when ephemeral containers are soft-deleted.
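As a rough illustration of how such a setting might be consumed, the sketch below models only the new property. The interface DeliConfigSketch and the helper isDueForSoftDelete are hypothetical, and the rule "soft-delete once the container has been idle for the configured time" is an assumption, not taken from the pull request.

```typescript
// Hypothetical, trimmed-down stand-in for IDeliServerConfiguration.
// Only the property named in this release note is modeled.
interface DeliConfigSketch {
  ephemeralContainerSoftDeleteTimeInMs?: number;
}

// Hypothetical helper: is an ephemeral container due for soft delete?
// Assumption: an unset value means "never soft-delete".
function isDueForSoftDelete(
  config: DeliConfigSketch,
  lastActivityMs: number,
  nowMs: number,
): boolean {
  const limit = config.ephemeralContainerSoftDeleteTimeInMs;
  if (limit === undefined) {
    return false;
  }
  return nowMs - lastActivityMs >= limit;
}

const deliConfigExample: DeliConfigSketch = { ephemeralContainerSoftDeleteTimeInMs: 60000 };
console.log(isDueForSoftDelete(deliConfigExample, 0, 90000)); // true
console.log(isDueForSoftDelete({}, 0, 90000)); // false
```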

You can find more details in pull request #20731.

server-services-core: New optional dispose method

Adds an optional dispose method to IWebSocket for disposing event listeners on disconnect in the Nexus lambda.
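The sketch below shows the general pattern of an optional dispose method. WebSocketSketch is a reduced, hypothetical stand-in for IWebSocket (the real interface has more members), and TrackingSocket and handleDisconnect are invented for illustration.

```typescript
// Hypothetical, reduced shape of a web socket wrapper.
interface WebSocketSketch {
  on(event: string, listener: () => void): void;
  // Optional, so implementations without it still satisfy the interface.
  dispose?(): void;
}

class TrackingSocket implements WebSocketSketch {
  private registrations: { event: string; listener: () => void }[] = [];

  on(event: string, listener: () => void): void {
    this.registrations.push({ event, listener });
  }

  // Drops every registered listener.
  dispose(): void {
    this.registrations = [];
  }

  listenerCount(): number {
    return this.registrations.length;
  }
}

// Caller side: invoke dispose only when the implementation provides it.
function handleDisconnect(socket: WebSocketSketch): void {
  socket.dispose?.();
}

const trackedSocket = new TrackingSocket();
trackedSocket.on("disconnect", () => {});
trackedSocket.on("submitOp", () => {});
handleDisconnect(trackedSocket);
console.log(trackedSocket.listenerCount()); // 0
```

Because the method is optional, sockets that do not implement it are simply left untouched on disconnect.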

You can find more details in pull request #21211.

server-services-core: Reduce session grace period for ephemeral containers to 2 minutes (was 10 minutes)

For ephemeral containers, the session grace period is reduced from 10 minutes to 2 minutes when the cluster is draining. This ensures that an ephemeral container is cleaned up sooner after disconnection. Clients will not find old ephemeral containers and will need to create new containers. This logic only takes effect when draining is forced.
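A minimal sketch of the rule described above. The function and constant names are hypothetical; only the 10-minute and 2-minute values and the two conditions come from this note.

```typescript
const DEFAULT_GRACE_PERIOD_MS = 10 * 60 * 1000;
const DRAINING_EPHEMERAL_GRACE_PERIOD_MS = 2 * 60 * 1000;

// The shorter period applies only to ephemeral containers during forced draining.
function sessionGracePeriodMs(isEphemeralContainer: boolean, isForcedDraining: boolean): number {
  return isEphemeralContainer && isForcedDraining
    ? DRAINING_EPHEMERAL_GRACE_PERIOD_MS
    : DEFAULT_GRACE_PERIOD_MS;
}

console.log(sessionGracePeriodMs(true, true)); // 120000
console.log(sessionGracePeriodMs(true, false)); // 600000
```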

You can find more details in pull request #21010.

server-lambdas: Nexus client connections can now disconnect in batches

Added the option to make Nexus client connections disconnect in batches. The new options are within the socketIo element of the Nexus config:

  • gracefulShutdownEnabled (true or false)
  • gracefulShutdownDrainTimeMs (overall time for disconnection)
  • gracefulShutdownDrainIntervalMs (how long each batch has to disconnect)

Additionally, gracefulShutdownDrainTimeMs should be set to a value greater than shared:runnerServerCloseTimeoutMs, which governs how long Alfred and Nexus have to shut down.
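To make the three options concrete, the sketch below splits a set of connections into one batch per drain interval. The batching arithmetic and the names SocketIoShutdownSketch and planDrainBatches are assumptions for illustration; the actual Nexus implementation may divide the work differently.

```typescript
// Hypothetical shape holding the three options listed above.
interface SocketIoShutdownSketch {
  gracefulShutdownEnabled: boolean;
  gracefulShutdownDrainTimeMs: number;
  gracefulShutdownDrainIntervalMs: number;
}

// Assumed strategy: spread connections evenly over the available intervals.
function planDrainBatches(connectionIds: string[], cfg: SocketIoShutdownSketch): string[][] {
  if (connectionIds.length === 0) {
    return [];
  }
  if (!cfg.gracefulShutdownEnabled) {
    // Without graceful shutdown, everything disconnects at once.
    return [connectionIds.slice()];
  }
  const intervals = Math.max(
    1,
    Math.floor(cfg.gracefulShutdownDrainTimeMs / cfg.gracefulShutdownDrainIntervalMs),
  );
  const batchSize = Math.max(1, Math.ceil(connectionIds.length / intervals));
  const batches: string[][] = [];
  for (let i = 0; i < connectionIds.length; i += batchSize) {
    batches.push(connectionIds.slice(i, i + batchSize));
  }
  return batches;
}

const drainIds = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10"];
const drainPlan = planDrainBatches(drainIds, {
  gracefulShutdownEnabled: true,
  gracefulShutdownDrainTimeMs: 30000,
  gracefulShutdownDrainIntervalMs: 10000,
});
console.log(drainPlan.map((batch) => batch.length)); // [ 4, 4, 2 ]
```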

You can find more details in pull request #19938.

server-lambdas: Performance: Keep pending checkpoint message for future summaries

During a session there may be multiple client or service summary calls and, independently, multiple checkpoints. A checkpoint clears the messages stored in pendingCheckpointMessages, which is also used for writing summaries. Because of this cleanup, writing a new summary often requires requesting the ops from Alfred again, which is inefficient.

Now the pending messages are cached for improved performance.
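One way to picture the idea is a bounded cache that survives checkpoints and is consulted before refetching ops. This is only a sketch: PendingMessageCacheSketch, its size limit, and the range lookup are hypothetical and assume contiguous, unique sequence numbers.

```typescript
interface SequencedOpSketch {
  sequenceNumber: number;
}

class PendingMessageCacheSketch {
  private messages: SequencedOpSketch[] = [];
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  // Keeps the most recent ops; the oldest is evicted once the limit is exceeded.
  add(op: SequencedOpSketch): void {
    this.messages.push(op);
    if (this.messages.length > this.maxSize) {
      this.messages.shift();
    }
  }

  // Returns ops with from < sequenceNumber <= to, or undefined when the cache
  // does not cover the whole range and the caller must fetch the ops instead.
  getRange(from: number, to: number): SequencedOpSketch[] | undefined {
    const hits = this.messages.filter(
      (m) => m.sequenceNumber > from && m.sequenceNumber <= to,
    );
    return hits.length === to - from ? hits : undefined;
  }
}

const opCache = new PendingMessageCacheSketch(3);
for (let seq = 1; seq <= 5; seq++) {
  opCache.add({ sequenceNumber: seq });
}
console.log(opCache.getRange(2, 5)); // ops 3, 4, 5 served from the cache
console.log(opCache.getRange(0, 5)); // undefined: ops 1 and 2 were evicted
```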

You can find more details in pull request #20029.

server-services-core: Fix: Limit max length of validParentSummaries in checkpoints

Limits the maximum number of tracked valid parent (service) summaries to 10 by default. The limit is configurable via IScribeServerConfiguration in the scribe property of IServiceConfiguration.
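A minimal sketch of capping such a list. The helper trackValidParentSummary is hypothetical, and dropping the oldest entries first is an assumption; only the default of 10 comes from this note.

```typescript
// Appends a summary handle and keeps at most maxLength of the newest entries.
function trackValidParentSummary(
  tracked: string[],
  handle: string,
  maxLength: number = 10,
): string[] {
  const updated = tracked.concat(handle);
  return updated.slice(Math.max(0, updated.length - maxLength));
}

let trackedSummaries: string[] = [];
for (let i = 1; i <= 12; i++) {
  trackedSummaries = trackValidParentSummary(trackedSummaries, "summary-" + i);
}
console.log(trackedSummaries.length); // 10
console.log(trackedSummaries[0]); // summary-3
```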

You can find more details in pull request #20850.

server-lambdas: Fix: send correct connection scopes for client

When a client joins in "write" mode with only "read" scopes in their token, the connection message from the server will reflect a "read" client mode.
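The behavior can be sketched as follows. resolveConnectionMode is a hypothetical helper, and treating "doc:write" as the scope that permits write mode is an assumption about the token contents.

```typescript
type ConnectionModeSketch = "read" | "write";

// A requested "write" mode is honored only when the token carries a write scope.
function resolveConnectionMode(
  requestedMode: ConnectionModeSketch,
  tokenScopes: string[],
): ConnectionModeSketch {
  const canWrite = tokenScopes.indexOf("doc:write") !== -1;
  return requestedMode === "write" && canWrite ? "write" : "read";
}

console.log(resolveConnectionMode("write", ["doc:read"])); // read
console.log(resolveConnectionMode("write", ["doc:read", "doc:write"])); // write
```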

You can find more details in pull request #20312.

protocol-base: Fix: ensure immutability of quorum snapshot

Creates a deeper clone of the quorum members when snapshotting to make sure the snapshot is immutable.
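To illustrate why a deeper clone matters, the sketch below snapshots quorum members so that later changes to the live objects do not leak into the snapshot. The types and snapshotQuorumMembers are hypothetical, and the JSON round trip is just one simple cloning technique, not necessarily the one used in protocol-base.

```typescript
interface QuorumMemberSketch {
  sequenceNumber: number;
  client: { user: { id: string }; scopes: string[] };
}

type QuorumMembersSnapshotSketch = [string, QuorumMemberSketch][];

// Copies every member deeply so the snapshot shares no objects with live state.
function snapshotQuorumMembers(
  members: { [clientId: string]: QuorumMemberSketch },
): QuorumMembersSnapshotSketch {
  return Object.keys(members).map((clientId): [string, QuorumMemberSketch] => [
    clientId,
    JSON.parse(JSON.stringify(members[clientId])) as QuorumMemberSketch,
  ]);
}

const liveMember: QuorumMemberSketch = {
  sequenceNumber: 1,
  client: { user: { id: "user-a" }, scopes: ["doc:read"] },
};
const quorumSnapshot = snapshotQuorumMembers({ "client-1": liveMember });

// Mutating the live member afterwards leaves the snapshot unchanged.
liveMember.client.scopes.push("doc:write");
console.log(JSON.stringify(quorumSnapshot));
```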

You can find more details in pull request #20329.

server-lambdas: Fix: cover edge cases for scrubbed checkpoint users

Overhauled how the Scribe lambda handles invalid, missing, or outdated checkpoint data via fallbacks.

Before:

if (no global checkpoint)
  use Default checkpoint
else if (global checkpoint was cleared or global checkpoint quorum was scrubbed)
  use Summary checkpoint
else
  use latest DB checkpoint (local or global)

After:

if (no global and no local checkpoint and no summary checkpoint)
  use Default checkpoint
else if (
  global checkpoint was cleared and summary checkpoint ahead of local db checkpoint
  or latest DB checkpoint quorum was scrubbed
  or summary checkpoint ahead of latest DB checkpoint
)
  use Summary checkpoint
else
  use latest DB checkpoint (local or global)

Also: Updated CheckpointService with additional fallback logic for loading a checkpoint from local or global DB depending on whether the quorum information in the checkpoint is valid (i.e. does not contain scrubbed users).
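The "After" decision logic can be sketched in TypeScript as below. All names are hypothetical, "ahead" is assumed to mean a higher sequence number, and a missing checkpoint is assumed to count as behind the summary. The sketch mirrors the pseudocode only; it is not the Scribe lambda's actual code.

```typescript
interface CheckpointSketch {
  sequenceNumber: number;
  quorumScrubbed: boolean;
}

interface CheckpointInputsSketch {
  local?: CheckpointSketch;
  global?: CheckpointSketch;
  summary?: CheckpointSketch;
  globalCleared: boolean;
}

type CheckpointSourceSketch = "default" | "summary" | "latestDb";

// True when a summary checkpoint exists and is newer than the other checkpoint.
function isSummaryAhead(summary?: CheckpointSketch, other?: CheckpointSketch): boolean {
  if (summary === undefined) {
    return false;
  }
  return other === undefined || summary.sequenceNumber > other.sequenceNumber;
}

function chooseCheckpointSource(input: CheckpointInputsSketch): CheckpointSourceSketch {
  const localCp = input.local;
  const globalCp = input.global;
  const summaryCp = input.summary;
  if (!globalCp && !localCp && !summaryCp) {
    return "default";
  }
  // Latest DB checkpoint: whichever of local or global has the higher sequence number.
  let latestDb = localCp;
  if (globalCp && (!localCp || globalCp.sequenceNumber > localCp.sequenceNumber)) {
    latestDb = globalCp;
  }
  const latestDbScrubbed = latestDb !== undefined && latestDb.quorumScrubbed;
  if (
    (input.globalCleared && isSummaryAhead(summaryCp, localCp)) ||
    latestDbScrubbed ||
    isSummaryAhead(summaryCp, latestDb)
  ) {
    return "summary";
  }
  return "latestDb";
}

console.log(chooseCheckpointSource({ globalCleared: false })); // default
console.log(
  chooseCheckpointSource({
    local: { sequenceNumber: 10, quorumScrubbed: true },
    summary: { sequenceNumber: 5, quorumScrubbed: false },
    globalCleared: false,
  }),
); // summary
```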

You can find more details in pull request #20259.

server-routerlicious-base: Add support for custom tenant key generators

Added support for supplying a custom tenant key generator instead of using only the default 128-bit sha256 key.

You can find more details in pull request #20844.

server-routerlicious-base: Remove Riddler HTTP request for performance

The getOrderer workflow no longer calls getTenant when globalDb is enabled. This saves two HTTP calls to Riddler and will improve performance.

You can find more details in pull request #20773.

protocol-base: Fix: configure user data scrubbing in checkpoints and summaries

Note: This change is primarily internal to routerlicious.

  • When scribe boots from a checkpoint, it fails over to the latest summary checkpoint if the quorum is corrupted (i.e. user data is scrubbed).

  • When scribe writes a checkpoint to DB or a summary, it respects new IScribeServerConfiguration options (scrubUserDataInSummaries, scrubUserDataInLocalCheckpoints, and scrubUserDataInGlobalCheckpoints) when determining whether to scrub user data in the quorum.

  • Added an optional param, scrubUserData, to ProtocolOpHandler.getProtocolState(). When true, user data in the quorum is replaced with { id: "" }. Defaults to false. Previously, user data was always scrubbed.

  • Added the following configuration options for IScribeServerConfiguration:

    • scrubUserDataInSummaries
    • scrubUserDataInLocalCheckpoints
    • scrubUserDataInGlobalCheckpoints

    All default to false.
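The sketch below illustrates the scrubbing behavior and the three flags described above. The types and helpers are hypothetical stand-ins, not the real ProtocolOpHandler or IScribeServerConfiguration; only the option names, their false defaults, and the { id: "" } replacement come from this note.

```typescript
interface ScrubbableMemberSketch {
  sequenceNumber: number;
  client: { user: { id: string; name?: string } };
}

// Hypothetical subset of the scribe configuration.
interface ScribeScrubConfigSketch {
  scrubUserDataInSummaries?: boolean;
  scrubUserDataInLocalCheckpoints?: boolean;
  scrubUserDataInGlobalCheckpoints?: boolean;
}

type PersistTargetSketch = "summary" | "localCheckpoint" | "globalCheckpoint";

// Picks the flag for the write target; unset flags count as false.
function shouldScrub(config: ScribeScrubConfigSketch, target: PersistTargetSketch): boolean {
  if (target === "summary") {
    return config.scrubUserDataInSummaries === true;
  }
  if (target === "localCheckpoint") {
    return config.scrubUserDataInLocalCheckpoints === true;
  }
  return config.scrubUserDataInGlobalCheckpoints === true;
}

// Returns a deep copy of the members, with user data replaced when requested.
function getMembersForPersistence(
  members: [string, ScrubbableMemberSketch][],
  scrubUserData: boolean = false,
): [string, ScrubbableMemberSketch][] {
  return members.map(([clientId, entry]): [string, ScrubbableMemberSketch] => {
    const copy = JSON.parse(JSON.stringify(entry)) as ScrubbableMemberSketch;
    if (scrubUserData) {
      copy.client.user = { id: "" };
    }
    return [clientId, copy];
  });
}

const scrubMembers: [string, ScrubbableMemberSketch][] = [
  ["client-1", { sequenceNumber: 1, client: { user: { id: "user-a", name: "A" } } }],
];
const scrubConfig: ScribeScrubConfigSketch = { scrubUserDataInSummaries: true };
console.log(JSON.stringify(getMembersForPersistence(scrubMembers, shouldScrub(scrubConfig, "summary"))));
console.log(JSON.stringify(getMembersForPersistence(scrubMembers, shouldScrub(scrubConfig, "localCheckpoint"))));
```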

You can find more details in pull request #20150.

server-services-utils: Add support for custom authentication with Redis

Added support for custom authentication with Redis instead of only password-based authentication. This includes support for Microsoft Entra ID-based authentication for Redis.

You can find more details in pull request #20214.

server-services-client: Add optional internalErrorCode property to NetworkError and INetworkErrorDetails

NetworkErrors now include an optional property, internalErrorCode, which can contain additional information about the internal error.
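As an illustration of an optional error property, the sketch below uses stand-in types. NetworkErrorSketch, NetworkErrorDetailsSketch, and toNetworkErrorDetails are hypothetical, and the string | number type for internalErrorCode is an assumption; the real NetworkError has additional fields and a different constructor.

```typescript
interface NetworkErrorDetailsSketch {
  code: number;
  message: string;
  internalErrorCode?: string | number;
}

class NetworkErrorSketch extends Error {
  readonly code: number;
  readonly internalErrorCode: string | number | undefined;

  constructor(code: number, message: string, internalErrorCode?: string | number) {
    super(message);
    this.code = code;
    this.internalErrorCode = internalErrorCode;
  }
}

// Includes internalErrorCode in the details only when it was provided.
function toNetworkErrorDetails(error: NetworkErrorSketch): NetworkErrorDetailsSketch {
  const details: NetworkErrorDetailsSketch = { code: error.code, message: error.message };
  if (error.internalErrorCode !== undefined) {
    details.internalErrorCode = error.internalErrorCode;
  }
  return details;
}

const throttledError = new NetworkErrorSketch(429, "Too many requests", "ThrottledSketch");
console.log(JSON.stringify(toNetworkErrorDetails(throttledError)));
```

Callers that do not set the property keep producing the same details object as before.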

You can find more details in pull request #21429.

server-services-shared: Fixed the ordering in Nexus shutdown

Previously, the Redis Pub/Sub was disposed before the socket connections were closed. Now the socket connections are closed first, and Redis is disposed afterwards.

You can find more details in pull request #20429.

server-lambdas, server-services-core: SessionStartMetric removed from Scribe and Deli microservices

This change removes the SessionStartMetric from Scribe and Deli. The metric is a source of bugs and has been superseded by the restoreFromCheckpoint and RunService metrics.

You can find more details about the reasons for this change in pull request #21125.
