Merged PRs
dolt
- 4408: reorder arguments of dolt_diff() table function
This PR updates the argument ordering of the `dolt_diff()` table function to match the `dolt_diff_summary()` table function, which matches the CLI command argument order.
- 4401: go/commands: Add sql-server option for validating query results against a MySQL instance
depends on dolthub/go-mysql-server#1279
- 4400: allow drop current database directory and fix drop db with '-'
This PR fixes two issues:
  - if the server is running from within the database directory itself, that database can be dropped regardless of whether it is the selected database.
  - drop database with '-' in dir name and '_' in database name.
- 4399: adding prepared tests for dolt stored procedures
Fix for: #4392
- 4397: print spatial types as hex in shells only
Related PR: dolthub/go-mysql-server#1278
Since the `SQL` methods for `Point`, `LineString`, `Polygon`, and `Geometry` no longer return the hex formatted string, the shells need to do a little extra work for prettier output.
- 4394: sql-server: Implement dolt_cluster_role, dolt_cluster_role_epoch variables, dolt_assume_cluster_role procedure.
This is just the variable persistence and some validation and rules around how they can transition.
- 4389: Fix race for concurrent dolt_commit calls
This fixes #4361
- 4387: remotestorage,remotesrv: Be less restrictive in repository paths for remotesapi repos that are not served through dolthub.
This uses the repo_path parameter to allow single and 3+ path component paths for repository names.
- 4386: go/cmd/dolt/commands/sqlserver: yaml_config.go: Parse and validate cluster: config.
- 4385: Updated bats README and a test to make it easier to get started
Also improved parquet tests
- 4384: Implement merge stats for the new format
- 4383: go/libraries/doltcore/sqle: ReadReplicaDatabase: Add variable for force pulling remote branches.
This adds `@@global.dolt_read_replica_force_pull`, which will turn read replica fetches into force pulls, allowing a read replica to true-up with a remote which has experienced a force push.
- 4363: support dolt_diff_summary_table_function
The `dolt_diff_summary` table function is the same as the CLI `dolt diff --summary` command. It takes fromCommit, toCommit and tableName, which is optional.
If tableName is not defined, the table function outputs all tables that have a data diff.
If the table does not have a data diff, it outputs an empty result.
If the table is keyless, then as with the CLI command, only rows added and rows deleted carry meaningful values, and the rest are nil.
Current issues:
  - drop-column causes row values to be modified into NULL; this should be distinguished from a user-defined NULL or a NULL introduced by add-column.
  - for a keyless table, an update deletes the old row and inserts a new row, which makes the rows added and rows deleted results incorrect.
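As a rough illustration of the argument order described for `dolt_diff_summary` (fromCommit, toCommit, optional tableName), the sketch below builds the query text in Go. The helper names `quote` and `diffSummaryQuery`, the `SELECT * FROM` wrapper, and the sample refs `main`, `feature`, and `mytable` are assumptions made for this example, not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// quote wraps a value in single quotes and doubles embedded quotes.
// Illustrative only: real code should prefer bound parameters.
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// diffSummaryQuery (hypothetical helper) builds a call to the
// dolt_diff_summary table function. fromCommit and toCommit are required;
// an empty tableName omits the optional third argument.
func diffSummaryQuery(fromCommit, toCommit, tableName string) string {
	args := []string{quote(fromCommit), quote(toCommit)}
	if tableName != "" {
		args = append(args, quote(tableName))
	}
	return fmt.Sprintf("SELECT * FROM dolt_diff_summary(%s)", strings.Join(args, ", "))
}

func main() {
	// All tables with a data diff between the two refs.
	fmt.Println(diffSummaryQuery("main", "feature", ""))
	// A single table only.
	fmt.Println(diffSummaryQuery("main", "feature", "mytable"))
}
```

The resulting strings can be sent through any MySQL-compatible client connected to a Dolt sql-server.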
- 4268: Feat/support aliyun oss store
support aliyun oss as backend store
go-mysql-server
- 1281: Alias resolution updates
This change moves our alias identification and column name qualification code closer to MySQL's behavior. The major changes are:
  - identify available alias, table, and column names on a per-scope level (instead of the previous "nestingLevel" that was based off of the node depth of the execution plan tree)
  - supply additional context to the `qualifyExpression` method so that the containing node can be used to switch on different alias visibility rules
There are still more nuanced edge cases in MySQL's alias behavior that we should continue working on matching (e.g. dolthub/go-mysql-server#1285, dolthub/go-mysql-server#1286), but this change moves us a nice step forward.
Dolt CI Tests:
  - #4410
Fixes:
  - dolthub/go-mysql-server#525
  - #4344
- 1280: More join types
Add `FullOuterJoin`, `SemiJoin`, and `AntiJoin`.
None of these new join nodes are safe for join ordering transformations yet. They are explicitly excluded from join planning.
The getField indexes for these three nodes' join conditions deserve more consideration. I excluded them from auto-fixup after manually correcting join condition get fields for the appropriate schemas.
`FullOuterJoin` uses a union distinct execution operator, which is correct but a lot slower than a merge join-esque operator.
`SemiJoin` and `AntiJoin` rearrange subquery expression scopes. I separate `resolve` and `finalizeSubqueryExpressions` to perform decorrelation before predicate pushdown (where we were panicking on FixUpExpressions) and join ordering (we want to decorrelate scopes before join planning).
Other:
  - query plan tests added for exist hoisting edge cases I did not catch on first pass
  - fixed bug with CTE stars
- 1279: Andy/mysql validator
- 1278: should not be returning hex on the wire for geometry types
Fix for: #4390
For improved display purposes, I changed the SQL method to convert the raw binary of spatial types to their hex equivalent (dolthub/go-mysql-server#1068).
DBeaver expects the binary in WKB format to generate their maps, so our `SQL` method must return them in binary.
- 1276: Apply sql_select_limit to first scope only
re: #4391 and #4353
- 1275: Throw error in alter table set default if default fails rule(s)
This PR makes `alter column set default` error when its default expression fails the default expression rules. Previously the DDL succeeded, and the invalid default then caused an error on every subsequent insert or other query.
`alter table t alter column col1 set default '{\"bye\":1}'` -- errors with: TEXT, BLOB, GEOMETRY, and JSON types may only have expression default values
- 1272: Removed exponential time complexity for foreign key analysis
This fixes dolthub/go-mysql-server#1268
Foreign key analysis created acyclical trees that were traversed during query execution to emulate cascade operations. This meant that cyclical foreign keys were converted to an acyclical tree. Normally this isn't possible as cyclical trees are infinitely traversable, but MySQL has a depth limit of 15, which allowed us to materialize an acyclic tree with a maximum height of 15 nodes. This, however, led to trees with an exponential number of nodes: roughly (number_of_fks)¹⁵ × 1.5 nodes in the tree. With just 3 foreign keys, we'd get a tree with roughly 22 million nodes, which would take forever to process.
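To make the growth concrete, here is a small sketch that evaluates the rough estimate quoted above, (number_of_fks)¹⁵ × 1.5, for a few foreign key counts. The function name `estimateNodes` and the integer rounding are assumptions of this example; the formula itself is only the approximation given in the PR description.

```go
package main

import "fmt"

// maxDepth mirrors the MySQL cascade depth limit of 15 mentioned above.
const maxDepth = 15

// estimateNodes (hypothetical helper) evaluates numFKs^15 * 1.5 with
// integer arithmetic, truncating any half node. It overflows int64 for
// large inputs, so it is only meant for small foreign key counts.
func estimateNodes(numFKs int64) int64 {
	nodes := int64(1)
	for i := 0; i < maxDepth; i++ {
		nodes *= numFKs
	}
	return nodes * 3 / 2
}

func main() {
	for n := int64(1); n <= 4; n++ {
		fmt.Printf("%d foreign keys -> ~%d nodes\n", n, estimateNodes(n))
	}
}
```

For 3 foreign keys this gives 21,523,360, which is the "roughly 22 million" figure; each additional foreign key multiplies the base of a 15th power, so the tree size climbs steeply.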
This PR completely changes the analysis step to now generate cyclical trees. In addition, depth checks are now properly implemented (during query execution rather than during analysis), being represented by a returned error once the depth limit has been reached. Interestingly, MySQL is supposed to process up to 15 operations (returning an error on the 16th), but cyclical foreign keys will error on the 15th operation. I think this is a bug in MySQL, but nonetheless the behavior has been duplicated here.
I also updated the `timestamp_test.go` file to grab an unused port. This prevents test failures due to requesting an already-in-use port. Not related to this PR in particular, but it was annoying to deal with so I fixed it.
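The notes do not say how the test picks its port. One common Go approach, shown here purely as an assumption about how this could be done, is to listen on port 0 and let the operating system assign a free port. The helper name `freePort` is hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// freePort (hypothetical helper) asks the OS for an unused TCP port by
// listening on port 0, then closes the listener and returns the port.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not find a free port:", err)
		return
	}
	fmt.Println("using port", port)
}
```

There is a small window between closing the listener and the test server binding the port in which another process could claim it, so this reduces collisions rather than eliminating them.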
vitess
- 194: adding missing keywords to token.go
I forgot to add some keywords used in the `ROW_FMT` table option, which caused a bats test to fail
- 193: adding support for `PARTITION` syntax for table creation
Parsing `table_options` for statements like `CREATE TABLE (column_definitions) table_options` better matches MySQL
Can parse `partition_options` based off of https://dev.mysql.com/doc/refman/8.0/en/create-table.html
Fix for: #4358
- 192: add full join parsing
- 191: go/mysql: query.go: Put strings into BindVars as copies, not slices of the recyclable wire packet buffers.
Cherry picks vitessio/vitess#5562.
- 190: Adding back support for using reserved keywords unquoted in InsertInto
I previously excluded `reserved_sql_id` and only included `sql_id` because I didn't think reserved keywords should be allowed as identifiers in insert into columns without being quoted, but it appears we have tests in Dolt that rely on that (e.g. sql-diff.bats calls: `INSERT INTO test (pk, int, string, boolean, float, uint, uuid) values ...`) and I figured if we can support those without the quoting, it seems like a nice feature for customers, so I switched back to `reserved_sql_id`.
It is confusing that we refer to identifiers as "reserved" keywords when they don't require backtick quoting though and would be nice to tidy that up in our grammar in the future.
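The BindVars change in 191 above concerns a general Go hazard: a sub-slice shares its backing array with the buffer it was cut from, so recycling the buffer silently changes the value. The sketch below shows that hazard in isolation; `aliasAndCopy` and the sample payloads are hypothetical and are not the vitess code.

```go
package main

import "fmt"

// aliasAndCopy (hypothetical helper) returns two views of the first n
// bytes of buf: one that shares buf's backing array, and one independent
// copy that survives later writes to buf.
func aliasAndCopy(buf []byte, n int) (alias, copied []byte) {
	alias = buf[:n]
	copied = append([]byte(nil), buf[:n]...)
	return alias, copied
}

func main() {
	// buf stands in for a recyclable wire packet buffer.
	buf := []byte("hello")
	alias, copied := aliasAndCopy(buf, len(buf))

	// The buffer is reused for the next packet.
	copy(buf, "world")

	fmt.Println(string(alias))  // prints "world": the slice followed the buffer
	fmt.Println(string(copied)) // prints "hello": the copy kept its value
}
```

Copying costs an allocation per value, but it keeps bound values stable after the packet buffer is returned for reuse.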
Closed Issues
- 4390: POLYGON column not working/decoding with DBeaver
- 4321: Bad error message on `DROP DATABASE` when called on a database that initialized the sql-server
- 4319: Can't drop a database with a `-` in it
- 4404: Dolt doesn't respect the original order of CHECK, UNIQUE constraints in schemas
- 2673: Reorder of columns in UNIQUE INDEX when re-create table
- 4043: Add `DOLT_DIFF_SUMMARY` table function
- 2636: `dolt diff --summary` with cell-wise statistics
- 4270: Diff shows rows that have no diffs when a column has been added in the past.
- 4392: Bindvars not filled in for procedure calls
- 4391: `sql_select_limit` should not apply to subqueries
- 4361: concurrency problems with dolt_commit()
- 4373: `dolt fetch upstream main` fails on VPS with 1GB RAM
- 4220: Merge not showing correct statistics for rows merged
- 4329: Doltpy v2: Lib to can use Dolt from python
- 4328: Publish libs that can use Dolt
- 4340: Feature Request: Generate JSON_TABLE from table column reference
- 525: Column alias in where clause should be an error
- 1268: Analyzing self-referential foreign keys triggering an infinite loop
Latency
Current Default Format (`__LD_1__`)
Read Tests | MySQL | Dolt | Multiple |
---|---|---|---|
covering_index_scan | 1.96 | 6.43 | 3.3 |
groupby_scan | 12.52 | 21.89 | 1.7 |
index_join | 1.18 | 16.41 | 13.9 |
index_join_scan | 1.14 | 15.55 | 13.6 |
index_scan | 30.26 | 71.83 | 2.4 |
oltp_point_select | 0.15 | 0.57 | 3.8 |
oltp_read_only | 2.97 | 9.56 | 3.2 |
select_random_points | 0.3 | 1.37 | 4.6 |
select_random_ranges | 0.35 | 1.37 | 3.9 |
table_scan | 30.81 | 68.05 | 2.2 |
types_table_scan | 68.05 | 215.44 | 3.2 |
reads_mean_multiplier | | | 5.1 |

Write Tests | MySQL | Dolt | Multiple |
---|---|---|---|
bulk_insert | 0.001 | 0.001 | 1.0 |
oltp_delete_insert | 2.76 | 19.65 | 7.1 |
oltp_insert | 1.61 | 7.98 | 5.0 |
oltp_read_write | 5.18 | 36.89 | 7.1 |
oltp_update_index | 1.52 | 9.39 | 6.2 |
oltp_update_non_index | 1.58 | 6.43 | 4.1 |
oltp_write_only | 2.43 | 26.2 | 10.8 |
types_delete_insert | 2.91 | 155.8 | 53.5 |
writes_mean_multiplier | | | 11.9 |

Overall Mean Multiple | 7.9 |
---|---|

New Format (`__DOLT__`)
Read Tests | MySQL | Dolt | Multiple |
---|---|---|---|
covering_index_scan | 1.93 | 2.76 | 1.4 |
groupby_scan | 12.52 | 17.32 | 1.4 |
index_join | 1.18 | 4.49 | 3.8 |
index_join_scan | 1.14 | 3.82 | 3.4 |
index_scan | 30.26 | 53.85 | 1.8 |
oltp_point_select | 0.15 | 0.47 | 3.1 |
oltp_read_only | 2.97 | 8.43 | 2.8 |
select_random_points | 0.3 | 0.73 | 2.4 |
select_random_ranges | 0.35 | 1.14 | 3.3 |
table_scan | 30.81 | 62.19 | 2.0 |
types_table_scan | 71.83 | 183.21 | 2.6 |
reads_mean_multiplier | | | 2.5 |

Write Tests | MySQL | Dolt | Multiple |
---|---|---|---|
bulk_insert | 0.001 | 0.001 | 1.0 |
oltp_delete_insert | 3.02 | 9.56 | 3.2 |
oltp_insert | 1.58 | 2.81 | 1.8 |
oltp_read_write | 5.18 | 16.71 | 3.2 |
oltp_update_index | 1.52 | 4.33 | 2.8 |
oltp_update_non_index | 1.47 | 4.65 | 3.2 |
oltp_write_only | 2.3 | 7.98 | 3.5 |
types_delete_insert | 3.07 | 10.84 | 3.5 |
writes_mean_multiplier | | | 2.8 |

Overall Mean Multiple | 2.6 |
---|---|
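The tables do not state how the Multiple and mean multiplier values are derived. The figures are consistent with Multiple being the Dolt latency divided by the MySQL latency, rounded to one decimal, and each mean multiplier being the arithmetic mean of the listed multiples. The sketch below checks that reading against the New Format read tests; the function names are hypothetical and the derivation is an inference, not a documented method.

```go
package main

import (
	"fmt"
	"math"
)

// round1 rounds a value to one decimal place.
func round1(x float64) float64 {
	return math.Round(x*10) / 10
}

// multiple returns the Dolt latency as a multiple of the MySQL latency.
// Assumption: this is how the Multiple column is computed.
func multiple(mysql, dolt float64) float64 {
	return round1(dolt / mysql)
}

// meanMultiple averages the per-test multiples.
// Assumption: this is how the *_mean_multiplier rows are computed.
func meanMultiple(multiples []float64) float64 {
	sum := 0.0
	for _, m := range multiples {
		sum += m
	}
	return round1(sum / float64(len(multiples)))
}

func main() {
	// index_join, New Format: MySQL 1.18, Dolt 4.49.
	fmt.Println(multiple(1.18, 4.49))

	// Multiple column of the New Format read tests, in table order.
	reads := []float64{1.4, 1.4, 3.8, 3.4, 1.8, 3.1, 2.8, 2.4, 3.3, 2.0, 2.6}
	fmt.Println(meanMultiple(reads))
}
```

Under these assumptions the program reproduces the 3.8 shown for index_join and the 2.5 shown for reads_mean_multiplier in the New Format table.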