github dolthub/dolt v1.81.7
Merged PRs

dolt

  • 10457: gitblobstore: implement Concatenate for NBS compatibility
    Implements GitBlobstore.Concatenate with CAS-safe commits, chunked output support via MaxPartSize, and end-to-end tests to enable NBS blobstore persister/conjoin paths.
  • 10456: go: sqle/dsess: transactions.go: When serializing transaction commits against a working set, form the key with the normalized db name.
    Previously, this lock could accidentally allow concurrent writes to the database working set value, because a non-normalized database name like db/main\x00/refs/heads/main was granted access alongside a normalized database name like db\x00/refs/heads/main. This did not impact correctness, since working sets are safe for concurrent modification at the storage layer, but it could cause transient failures for a client whose optimistic lock retries failed enough times in a row.
    Here we fix the bug so that the txLocks serialize access to the ref heads as expected.
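    The idea behind the fix can be sketched as follows. This is a minimal illustration, not Dolt's actual code: baseDBName and lockKey are hypothetical helpers, and the real normalization logic in sqle/dsess differs in detail.

    ```go
    package main

    import (
    	"fmt"
    	"strings"
    )

    // baseDBName illustrates normalization: a revision-qualified name like
    // "db/main" reduces to its base name "db", so both spellings of the
    // same database produce the same lock key.
    func baseDBName(db string) string {
    	if i := strings.IndexByte(db, '/'); i >= 0 {
    		return db[:i]
    	}
    	return db
    }

    // lockKey forms the serialization key from the normalized database
    // name and the working-set ref, NUL-separated as in the example above.
    func lockKey(db, ref string) string {
    	return baseDBName(db) + "\x00" + ref
    }

    func main() {
    	a := lockKey("db/main", "refs/heads/main")
    	b := lockKey("db", "refs/heads/main")
    	fmt.Println(a == b) // prints true: both forms now contend for the same lock
    }
    ```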
  • 10455: Update DoltTable.ProjectedTags() to distinguish between no set projectedCols and zero projectedCols
    fixes #10451
    When getting ProjectedTags(), we were not distinguishing between a nil projectedCols (meaning no projections are set, so all columns should be returned) and a zero-length projectedCols (meaning the table has been pruned to zero columns but we still care about the number of rows), since in both cases projectedCols has length 0. This caused LEFT OUTER JOINs that project no left-side columns to return the wrong number of columns. The fix checks whether projectedCols is nil instead, which is what other functions like Projections() and HistoryTable.ProjectedTags() already do.
    Also includes some minor refactorings.
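    The nil-vs-empty distinction can be sketched like this. The names are illustrative simplifications; the real DoltTable.ProjectedTags operates on column tags, not plain uint64 slices.

    ```go
    package main

    import "fmt"

    // projectedTags returns all tags when no projection has been set, and
    // the (possibly empty) projection otherwise.
    func projectedTags(projectedCols, allTags []uint64) []uint64 {
    	// Checking len(projectedCols) == 0 would conflate "no projection
    	// set" with "explicitly pruned to zero columns"; a nil check
    	// keeps the two cases apart.
    	if projectedCols == nil {
    		return allTags
    	}
    	return projectedCols
    }

    func main() {
    	all := []uint64{10, 20, 30}
    	fmt.Println(len(projectedTags(nil, all)))        // 3: no projection set
    	fmt.Println(len(projectedTags([]uint64{}, all))) // 0: pruned to zero columns
    }
    ```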
  • 10444: Optimize BlockOnLock retry loop to eliminate per-iteration allocations
    Addresses feedback on #10442 to reduce GC churn in the BlockOnLock retry loop. The original implementation allocated a new timer on each iteration via time.After(), causing unnecessary memory pressure when locks are held for extended periods.

    Changes

    • Added a lockRetryInterval constant: extracts the hardcoded 10ms retry interval into a named constant for clarity and tunability
    • Replaced time.After with a lazily initialized time.Ticker: a single ticker instance is reused across all retries, eliminating per-iteration allocations
    • Optimized the fast path: ticker creation is deferred until the first lock failure, avoiding overhead when the lock is immediately available

    Before

    for {
        err = lock.TryLock()
        if err == nil {
            break
        }
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        case <-time.After(10 * time.Millisecond): // New allocation each iteration
        }
    }

    After

    var ticker *time.Ticker
    defer func() {
        if ticker != nil {
            ticker.Stop()
        }
    }()
    for {
        err = lock.TryLock()
        if err == nil {
            break
        }
        if ticker == nil {
            ticker = time.NewTicker(lockRetryInterval) // Allocate once, reuse
        }
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        case <-ticker.C:
        }
    }

  • 10433: fix variable name
  • 10430: add benchmark mini command
  • 10424: blobstore: add chunked-object mode to GitBlobstore
    This PR introduces GitBlobstore, a Blobstore implementation backed by a git repository’s object database (bare repo or .git dir). Keys are stored as paths in the tree of a commit pointed to by a configured ref (e.g. refs/dolt/data), enabling Dolt remotes to be hosted on standard git remotes.
    High-level design
    • Storage model
      • Each blobstore key maps to a git tree path under the ref’s commit.
      • Small objects are stored as a single git blob at .
      • Large objects (when chunking is enabled) are stored as a git tree at containing part blobs:
        • /00000001, /00000002, … (lexicographically ordered)
      • No descriptor header / no stored total size; size is derived by summing part blob sizes.
      • Roll-forward only: this PR supports the above formats; it does not include backward compatibility with any older descriptor-based chunking formats.
    • Per-key versioning
      • Get/Put/CheckAndPut return a per-key version equal to the object id at :
        • inline: blob OID
        • chunked: tree OID
    • Idempotent Put
      • For non-manifest keys, Put fast-succeeds if the key already exists (assuming the content-addressed semantics common in NBS/table files), returning the existing per-key version without consuming the reader.
      • manifest remains mutable and is updated via CheckAndPut.
    CheckAndPut semantics
    • CheckAndPut performs CAS against the current per-key version at (not against the HEAD commit hash).
    • The implementation uses a ref-level CAS retry loop:
      • re-checks the version at the current HEAD
      • only consumes/hashes the reader after the expected version matches
      • retries safely if the ref advances due to unrelated updates
    • Blob↔tree transitions
      • Handles transitions between inline blob and chunked tree representations by proactively removing conflicting index paths before staging new entries (avoiding git index file-vs-directory conflicts).
    Internal git plumbing additions
    Adds/uses a unified internal GitAPI abstraction to support:
    • resolving path objects and types (blob vs tree)
    • listing tree entries for chunked reads
    • removing paths from the index in bare repos
    • staging and committing new trees, with configurable author/committer identity fallback
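    The compare-and-swap behavior can be sketched with a tiny in-memory store. All names here (kvStore, CheckAndPut, errCASFailed) are illustrative, not GitBlobstore's actual API; the integer version stands in for the per-key git object id.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"sync"
    )

    type kvStore struct {
    	mu      sync.Mutex
    	val     string
    	version int // stands in for the per-key object id (blob/tree OID)
    }

    var errCASFailed = errors.New("expected version does not match")

    // CheckAndPut applies the write only when the caller's expected
    // version matches the current per-key version, mirroring the
    // ref-level CAS retry described above: on failure the caller
    // re-reads the current version and retries.
    func (s *kvStore) CheckAndPut(expected int, v string) (int, error) {
    	s.mu.Lock()
    	defer s.mu.Unlock()
    	if expected != s.version {
    		return s.version, errCASFailed
    	}
    	s.val = v
    	s.version++
    	return s.version, nil
    }

    func main() {
    	s := &kvStore{}
    	ver, err := s.CheckAndPut(0, "manifest-v1")
    	fmt.Println(ver, err) // 1 <nil>

    	// A stale expected version fails; the caller would then re-read
    	// the current version and retry, as in the retry loop above.
    	_, err = s.CheckAndPut(0, "manifest-v2")
    	fmt.Println(errors.Is(err, errCASFailed)) // true
    }
    ```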
  • 10419: GitBlobstore: implement CheckAndPut CAS semantics + add tests
    This PR adds the next write-path primitive to GitBlobstore: CheckAndPut with proper compare-and-swap behavior, and a focused test suite (including a concurrency/CAS-failure scenario).
  • 10418: Various bug fixes for checking foreign key constraints during merge
    This PR mainly addresses the need to perform type conversions during index lookups when determining whether a diff introduces a foreign key constraint violation. The old code assumed that key values were binary-identical between parent and child tables, which isn't always the case (especially in Doltgres).
    Also fixes a related bug in constructing the primary key from a secondary key, which occurs when a secondary index contains primary key columns.
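    The second bug can be illustrated with a hypothetical key layout. Assume the secondary index key stores its indexed columns followed by any primary-key columns not already present, so a PK column that is also indexed must be read from its indexed position rather than assumed to sit in a trailing section; pkFromSecondary and its ordinal mapping are invented for this sketch.

    ```go
    package main

    import "fmt"

    // pkFromSecondary rebuilds a primary key tuple from a secondary index
    // key using precomputed ordinals, so each PK column is read from
    // wherever it actually lives in the secondary key.
    func pkFromSecondary(secKey []string, pkOrdinals []int) []string {
    	pk := make([]string, len(pkOrdinals))
    	for i, ord := range pkOrdinals {
    		pk[i] = secKey[ord]
    	}
    	return pk
    }

    func main() {
    	// Secondary index over (email, id) where id is also the primary
    	// key: id appears once, at position 1, not appended again at the
    	// end, so the ordinal mapping is required.
    	sec := []string{"a@example.com", "42"}
    	fmt.Println(pkFromSecondary(sec, []int{1})) // [42]
    }
    ```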

go-mysql-server

  • 3419: Use placeholder ColumnIds for EmptyTable
    Fixes #10434
    EmptyTable implements TableIdNode, so it was using Columns() to get the ColumnIds. EmptyTable.WithColumns() is only ever called for testing purposes; as a result, the ColSet returned is empty. This causes the column-to-ColumnId mapping to be incorrectly offset, leading to the wrong index id being assigned.
    This fix adds a case for EmptyTable in columnIdsForNode to add placeholder ColumnId values so the mappings are correctly aligned. I considered setting the actual ColSet for EmptyTable but there's actually not a good way to do that. Regardless, the index id will be set either using the name of the column or using the Projector node that wraps the EmptyTable.
    Similar to SetOp, EmptyTable probably shouldn't be a TableIdNode (see #10443)
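    The placeholder-id idea can be sketched as below. The names are simplified stand-ins for go-mysql-server's internals, not its actual signatures.

    ```go
    package main

    import "fmt"

    // columnIdsForNode returns a node's column ids, handing out sequential
    // placeholder ids when the node's ColSet is empty (as for EmptyTable)
    // so downstream column-to-id mappings stay aligned with the schema.
    func columnIdsForNode(colSet []int, numCols int) []int {
    	if len(colSet) > 0 {
    		return colSet
    	}
    	ids := make([]int, numCols)
    	for i := range ids {
    		ids[i] = i + 1 // placeholder ids keep later offsets correct
    	}
    	return ids
    }

    func main() {
    	// An EmptyTable-like node with 3 columns but an empty ColSet still
    	// gets one id per column instead of an empty mapping.
    	fmt.Println(columnIdsForNode(nil, 3)) // [1 2 3]
    }
    ```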
  • 3417: Do not join remaining tables with CrossJoins during buildSingleLookupPlan
    fixes #10304
    Despite what the comment said, it's not safe to join remaining tables with CrossJoins during buildSingleLookupPlan. It is only safe to do so if every filter has been successfully matched to currentlyJoinedTables. Otherwise, we end up dropping filters.
    For example, we could have a query like select * from A, B inner join C on B.c0 <=> C.c0 where table A has a primary key and tables B and C are keyless. columnKey matches A's primary key column, so A is added to currentlyJoinedTables. Since the only filter references B and C and neither is part of currentlyJoinedTables, nothing is ever added to joinCandidates. However, it's unsafe to join all the tables with CrossJoins because we still need to account for the filter on B and C.
  • 3416: allow Doltgres to add more information schema tables
  • 3415: Simplify Between expressions for GetField arguments
    fixes #10284
    part of #10340

Closed Issues

  • 10451: Incorrect indexes in left outer lookup join
  • 9705: clone failed; dangling ref: found dangling references to HashSet
  • 10356: Verify and add tests for renaming/altering temporary tables
  • 10434: ColumnId's not added for EmptyTable during assignExecIndexes, causing offset when getting column index
  • 10304: Invalid CrossJoin in query plan
  • 10284: Avoid RangeHeapJoin when lower and upper bounds are the same field
