Merged PRs
dolt
- 10457: gitblobstore: implement Concatenate for NBS compatibility
Implements GitBlobstore.Concatenate with CAS-safe commits, chunked output support via MaxPartSize, and end-to-end tests to enable NBS blobstore persister/conjoin paths.
- 10456: go: sqle/dsess: transactions.go: When serializing transaction commits against a working set, form the key with the normalized db name.
Previously, this lock would accidentally allow concurrent access to writing the database working set value because a non-normalized database name like `db/main\x00/refs/heads/main` would allow access along with a normalized database name like `db\x00/refs/heads/main`. This did not impact correctness, since the working sets are safe for concurrent modification at the storage layer, but it could cause transient failures for a client if the optimistic lock retries failed enough times in a row.
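A minimal sketch of the idea, not the actual dolt code: if the lock key is built from a normalized database name, a revision-qualified name and the bare name map to the same lock. The helpers `baseDBName` and `lockKey` are hypothetical, and the rule "drop everything after the first `/`" is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// baseDBName is a hypothetical normalizer: it drops a revision suffix such as
// "/main" so that "db/main" and "db" name the same database.
func baseDBName(name string) string {
	if i := strings.Index(name, "/"); i >= 0 {
		return name[:i]
	}
	return name
}

// lockKey builds the key used to serialize commits against one working set.
// Normalizing first means both spellings of the database share one lock.
func lockKey(dbName, ref string) string {
	return baseDBName(dbName) + "\x00" + ref
}

func main() {
	a := lockKey("db/main", "refs/heads/main")
	b := lockKey("db", "refs/heads/main")
	fmt.Println(a == b) // true: both names contend on the same lock
}
```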
Here we fix the bug so that the txLocks serialize access to the ref heads as expected.
- 10455: Update `DoltTable.ProjectedTags()` to distinguish between no set `projectedCols` and zero `projectedCols`
fixes #10451
When getting `ProjectedTags()`, we were not distinguishing between when `projectedCols` was `nil` (meaning no projections are set so we should return all columns) and when `projectedCols` was an empty array of length 0 (meaning the table has been pruned to zero columns but we still care about the number of rows), since in both cases `projectedCols` would have a length of 0. This was causing `LEFT OUTER JOIN`s that didn't project any left-side columns to not return the correct number of columns. This was fixed by checking whether `projectedCols` was `nil` instead (which is what we do in other functions like `Projections()` and `HistoryTable.ProjectedTags()`).
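The distinction is easy to miss in Go because a `nil` slice and an empty slice both have length 0. The sketch below uses a hypothetical `table` type (not the real `DoltTable`) to show why the check has to be against `nil` rather than `len(...) == 0`.

```go
package main

import "fmt"

// table is a hypothetical stand-in for a table with optional projections.
type table struct {
	allTags       []uint64
	projectedCols []uint64 // nil: no projection set; empty: zero columns projected
}

// projectedTags returns every tag when no projection is set, and exactly the
// projected tags (possibly none) otherwise.
func (t table) projectedTags() []uint64 {
	if t.projectedCols == nil { // len(t.projectedCols) == 0 would also match the empty slice
		return t.allTags
	}
	return t.projectedCols
}

func main() {
	all := []uint64{1, 2, 3}
	unset := table{allTags: all}
	pruned := table{allTags: all, projectedCols: []uint64{}}
	fmt.Println(len(unset.projectedTags()), len(pruned.projectedTags())) // 3 0
}
```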
Also some minor refactorings:
  - renamed `getIta` to `getIndexedTableAccess`
  - removed unused variables returned by `getSourceKv`
Test added in dolthub/go-mysql-server#3424
- 10444: Optimize BlockOnLock retry loop to eliminate per-iteration allocations
Addresses feedback on #10442 to reduce GC churn in the `BlockOnLock` retry loop. The original implementation allocated a new timer on each iteration via `time.After()`, causing unnecessary memory pressure when locks are held for extended periods.
Changes
  - Added `lockRetryInterval` constant: Extracts hardcoded 10ms retry interval into a named constant for clarity and tunability
  - Replaced `time.After` with lazy-initialized `time.Ticker`: Single ticker instance reused across all retries, eliminating per-iteration allocations
  - Optimized fast path: Ticker creation deferred until first lock failure, avoiding overhead when lock is immediately available
Before

    for {
        err = lock.TryLock()
        if err == nil {
            break
        }
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        case <-time.After(10 * time.Millisecond): // New allocation each iteration
        }
    }

After

    var ticker *time.Ticker
    defer func() {
        if ticker != nil {
            ticker.Stop()
        }
    }()
    for {
        err = lock.TryLock()
        if err == nil {
            break
        }
        if ticker == nil {
            ticker = time.NewTicker(lockRetryInterval) // Allocate once, reuse
        }
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        case <-ticker.C:
        }
    }
- 10433: fix variable name
- 10430: add benchmark mini command
- 10424: blobstore: add chunked-object mode to GitBlobstore
This PR introduces `GitBlobstore`, a Blobstore implementation backed by a git repository's object database (bare repo or .git dir). Keys are stored as paths in the tree of a commit pointed to by a configured ref (e.g. refs/dolt/data), enabling Dolt remotes to be hosted on standard git remotes.
High-level design
• Storage model
  • Each blobstore key maps to a git tree path under the ref's commit.
  • Small objects are stored as a single git blob at the key's path.
  • Large objects (when chunking is enabled) are stored as a git tree at the key's path containing part blobs:
    • /00000001, /00000002, … (lexicographically ordered)
  • No descriptor header / no stored total size; size is derived by summing part blob sizes.
  • Roll-forward only: this PR supports the above formats; it does not include backward-compat for any older descriptor-based chunking formats.
• Per-key versioning
  • Get/Put/CheckAndPut return a per-key version equal to the object id at the key's path:
    • inline: blob OID
    • chunked: tree OID
• Idempotent `Put`
  • For non-`manifest` keys, Put fast-succeeds if the key already exists (assumes content-addressed semantics common in NBS/table files), returning the existing per-key version without consuming the reader.
  • manifest remains mutable and is updated via CheckAndPut.
• `CheckAndPut` semantics
  • CheckAndPut performs CAS against the current per-key version at the key's path (not against the HEAD commit hash).
  • Implementation uses a ref-level CAS retry loop:
    • re-checks version at current HEAD
    • only consumes/hashes the reader after the expected version matches
    • retries safely if the ref advances due to unrelated updates
• Blob↔tree transitions
  • Handles transitions between inline blob and chunked tree representations by proactively removing conflicting index paths before staging new entries (avoids git index file-vs-directory conflicts).
Internal git plumbing additions
Adds/uses a unified internal GitAPI abstraction to support:
• resolving path objects and types (blob vs tree)
• listing tree entries for chunked reads
• removing paths from the index in bare repos
• staging and committing new trees, with configurable author/committer identity fallback
- 10419: GitBlobstore: implement `CheckAndPut` CAS semantics + add tests
This PR adds the next write-path primitive to GitBlobstore: `CheckAndPut` with proper compare-and-swap behavior, and a focused test suite (including a concurrency/CAS-failure scenario).
- 10418: Various bug fixes for checking foreign key constraints during merge
This PR mainly addresses the need to convert types when performing index lookups to determine whether a diff introduces a foreign key constraint violation. The old code assumed that the key values were binary identical between the parent and child tables, which isn't always the case (especially in Doltgres).
Also fixes a related bug in constructing the primary key from a secondary key, which occurs when a secondary index contains primary key columns.
go-mysql-server
- 3419: Use placeholder `ColumnId`s for `EmptyTable`
Fixes #10434
`EmptyTable` implements `TableIdNode` so it was using `Columns()` to get the `ColumnId`s. `EmptyTable.WithColumns()` is only ever called for testing purposes; as a result, the `ColSet` returned is empty. This causes the column to `ColumnId` mapping to be incorrectly offset, leading to the wrong index id being assigned.
This fix adds a case for `EmptyTable` in `columnIdsForNode` to add placeholder `ColumnId` values so the mappings are correctly aligned. I considered setting the actual `ColSet` for `EmptyTable` but there's actually not a good way to do that. Regardless, the index id will be set either using the name of the column or using the Projector node that wraps the EmptyTable.
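A rough sketch of the alignment problem, using invented types rather than the go-mysql-server ones: `node`, `columnIDsForNode`, and `execIndex` here are hypothetical, and the placeholder value 0 assumes real ids start at 1. When a node outputs columns but reports no ids, every later column's position shifts left unless placeholders fill the gap.

```go
package main

import "fmt"

// node is a hypothetical plan node: it outputs len(schemaCols) columns and
// reports colIDs for them. An EmptyTable-like node reports none.
type node struct {
	schemaCols []string
	colIDs     []int
	isEmpty    bool
}

// columnIDsForNode returns one id per output column. With padEmpty set,
// an empty node contributes placeholder ids (0) so positions stay aligned.
func columnIDsForNode(n node, padEmpty bool) []int {
	if n.isEmpty && padEmpty {
		return make([]int, len(n.schemaCols))
	}
	return n.colIDs
}

// execIndex returns the position of id in the concatenated id list,
// which is the row offset the executor would read from.
func execIndex(nodes []node, padEmpty bool, id int) int {
	var ids []int
	for _, n := range nodes {
		ids = append(ids, columnIDsForNode(n, padEmpty)...)
	}
	for i, v := range ids {
		if v == id {
			return i
		}
	}
	return -1
}

func main() {
	nodes := []node{
		{schemaCols: []string{"a", "b"}, isEmpty: true},
		{schemaCols: []string{"x", "y"}, colIDs: []int{3, 4}},
	}
	// Row layout is [a, b, x, y], so column id 3 (x) belongs at index 2.
	fmt.Println(execIndex(nodes, false, 3), execIndex(nodes, true, 3)) // 0 2
}
```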
Similar to `SetOp`, `EmptyTable` probably shouldn't be a `TableIdNode` (see #10443)
- 3417: Do not join remaining tables with CrossJoins during buildSingleLookupPlan
fixes #10304
Despite what the comment said, it's not safe to join remaining tables with CrossJoins during `buildSingleLookupPlan`. It is only safe to do so if every filter has been successfully matched to `currentlyJoinedTables`. Otherwise, we end up dropping filters.
For example, we could have a query like `select from A, B, inner join C on B.c0 <=> C.c0` where table A has a primary key and tables B and C are keyless. `columnKey` matches A's primary key column and A would be added to `currentlyJoinedTables`. Since the only filter references B and C and neither is part of `currentlyJoinedTables`, nothing is ever added to `joinCandidates`. However, it's unsafe to join all the tables with CrossJoins because we still need to account for the filter on B and C.
- 3416: allow Doltgres to add more information schema tables
- 3415: Simplify Between expressions for GetField arguments
fixes #10284
part of #10340
benchmarks
Closed Issues
- 10451: Incorrect indexes in left outer lookup join
- 9705: clone failed; dangling ref: found dangling references to HashSet
- 10356: Verify and add tests for renaming/altering temporary tables
- 10434: `ColumnId`'s not added for `EmptyTable` during `assignExecIndexes`, causing offset when getting column index
- 10304: Invalid CrossJoin in query plan
- 10284: Avoid RangeHeapJoin when lower and upper bounds are the same field