Merged PRs
dolt
- 6434: Fix for adding a unique column
- 6349: Removing old schema merge guard
Before we added support for schema merge conflict reporting, we had a guard in place that would error out for schema changes in merges. This code prevented schema merges that Dolt couldn't handle, but it didn't provide much visibility into the exact schema conflicts. The new schema merge conflict reporting code is able to handle these same cases and provides better visibility into exactly what the schema conflicts are. We've already removed some of the checks from this guard, but there were still a few left in place. This PR completely removes the old schema merge guard code.
The code left in this guard was still protecting against a couple of cases that are now fixed and covered with tests:- Collation changes – we weren't properly handling merging changes to a table's default collation. Fixed and tested now.
- New, non-nullable columns with no default value – new columns that are non-nullable and don't have a default value can't be auto merged since we don't know what value to safely use in the new column for any existing rows, so this now throws an error with instructions on how to fix.
go-mysql-server
- 1914: More Full-Text fixes
More bugs found and fixed forFULLTEXT
indexes. The ordering bug is fairly major, as it could require a drop/readd of the index, so I want to get this out now rather than waiting until I've gotten more tests/fixes in. - 1913: Adjust range heap coster to ensure we don't choose it over a strong lookup join.
This improves the performance for a client query by ensuring that we don't us a HeapRangeJoin when there's a good LookupJoin candidate, but we do use a HeapRangeJoin when there isn't.
In larger queries, the time spent generating and evaluating candidate rows dominates. UsingseqIOCostFactor
feels like a bit of a misnomer, but cpuCostFactor is too low to reflect this behavior, and this is what costInnerJoin uses. - 1912: Normalized handling of ALTER TABLE statements
This PR normalizes the handling of all ALTER TABLE statements to use the same code path. Previously some used convertDDL instead.
Also changes DDL processing to generate multiple plan nodes for statements that need them. Currently tested withadd column .. unique
, but the same approach should also work with auto_increment and other clauses.
Relies on changes in dolthub/vitess#260 - 1909: Show variables panic
- 1908: Some Full-Text fixes
Adds a few additional tests, along with fixes. This is primarily for Sakila compatibility. More tests and fixes are forthcoming, and will arrive in a second PR. - 1906: Fix ErrBinaryCollation in queries with []byte argument
Binding arguments of type[]byte
have been converted to string literals with the default collation (utf8mb4_0900_bin) bytypes.CreateStringWithDefaults()
.
I thinktypes.CreateBinary()
should be used, which generates literals with binary collation. - 1905: When estimating the cardinality of an outer join, bound the cardinality below.
For instance, left outer joins can never have a smaller cardinality than their left child. This should let us get more accurate estimates on the cardinality of outer joins and choose better plans. - 1904: sql/mysql_db: Rework how we store and access table data for users, grants and replica_source_info.
In general, we keep the same structure of storing the data in structs, keyed in multimaps, and having converters for going to and from sql.Rows. We change the following important things:- We more explicitly model a multimap and an indexed set of entities with multiple keyers.
- We add read and write locking around edits and reads of the data table.
- We explicitly do not expose the raw indexed set or multimap from the MySQLDb instance itself. The unprincipled access of the various *Data instances in the old implementation was somewhat problematic.
- 1880: Add "Sliding Range Join" execution plan
(The original PR was accidentally merged. I fixed the history but there doesn't seem to be a way to "unmerge" the PR)
This is the draft implementation of the "Sliding Range Join" execution plan. This allows for more performant joins when the join condition checks that the column on one table is within a range specified by two columns on the other table. - 1871: Full-Text Indexes
This is a partial implementation of Full-Text indexes. There is still quite a bit to finish on the GMS side (as can be seen from the copious amount ofTODO
s), but this shows the broad strokes of how it's implemented, along with most of the "difficult" design choices being implemented. The major choice that has not yet been finalized is how to deal withFTS_DOC_ID
, as it's anAUTO_INCREMENT
column in MySQL, but that would not play well with Dolt merging. I already have ideas on how to handle that (taking into account remotes, etc.), but that would come from a later PR.
https://docs.google.com/document/d/1nGyYg461AhxQjFLzhEEj01XMz0VaTBaBaA44WNu0fc4/edit
Quite a few things have changed from the initial design doc, mostly based on feedback during the meeting, however some of it was post-meeting. There are three tables instead of 1: Config (stores table-specific information shared across all indexes), WordToPos (maps words to an ID and position, not fully used in the default search), and Count (used to calculate relevancy, also not fully used in the default search). I was planning on convertingMATCH ... AGAINST ...
to a join between the tables, which would work when fetching results, butMATCH ... AGAINST ...
may also be used as a result, which necessitated writing all of the functionality anyway, so the join plan was dropped.
Last thing to mention, is that I'm pretty sure that Full-Text indexes actually do a full table scan. It seems weird, but AFAICT the indexes are used to quickly calculate relevancy for each search mode. It seems that, for overly large tables, the search time increases even when other index operations continue to operate nearly instantaneously.
I've tagged two people for review to make it a bit easier. Of course, feel free to take a look at more if you desire.
@reltuk Thesql/fulltext/fulltext.go
file is an expansion of the file you've previously reviewed (all still kept to a single file for now). To complement it and see how it'll be implemented on the Dolt side, you can look atmemory/table.go
. Dolt's table editor will be similar, and the merge paths will only use theFulltextEditor
, which special logic to interface with it from those paths.
@max-hoffman Take a look at the analyzer changes, along with thesql/plan/ddl.go
file. You'll probably need to referencesql/fulltext/fulltext.go
as well. - 1869: Migrate most enginetests for new name resolution
This doubles most of the enginetests to add versions with new name resolution. As a result testing takes ~2x as long, temporarily. Gets majority of those enginetests working with a couple bigger exceptions:- plans projections are subtly different in a way that should be optimized but is probably not priority
- stored procedures need their custom resolution logic ported
- on duplicate update expressions are buggy, going to rewrite those for new format
- skipping one derived table alias test, where we do not have expression memoization or lateral joins to help us execute a resolved plan (related #6407)
- many tests throw "column not found" instead of "table not found" errors. I tried to bookkeep those with Skips, but the skipped suites may accumulate other differences in the meantime.
- I'll need to revert our prepared statement logic before the final switch
- Various validators work a bit differently, might end up skipping some error tests to get the final switch in sooner
Other suites: - TestJSONTableScripts_Experimental -- json_table still broken
- TestRenameColumn_Exp -- error test has different error
A couple other discoveries: - We have to hallucinate unknown table, column, and procedure names in trigger bodies on CREATE, and only fail on execution
- Column default expressions appear to be resolved at execution time
- Alter statements are only resolved at execution time
- The precedence for ASOF in nested views and procedure calls is a bit hairy
- json_array w/ and w/o distinct appears to be untested
- ORDER BY in UNION seems pretty flaky and lightly tested, we resolve names from the left
Dolt PR: #6414
vitess
- 260: Rename MultiAlterDDL to AlterTable and return it for all ALTER TABLE statements
This normalization makes it easier to handle building a plan - 259: Bug fix for round tripping parsed
cast
function calls
When formatting a parsedcast
node back into a SQL string, we were using the formCAST(<arg>, <arg>)
, but that won't roundtrip back to MySQL. It needs to be eitherCONVERT(<arg>, <arg>)
orCAST(<arg> as <arg)
. - 258: Adding new keywords to the non-reserved list so they don't need to be…
… quoted - 257: Support for all temporal query forms in SQL 2011
These are the parser changes necessary to support #6353
In addition to the 3 forms of temporal query in SQL 2011 (AS OF
,FROM .. TO
,BETWEEN
), this PR also implements support for the 2 extensions added by SQLServer,ALL
andCONTAINED IN
.
https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver16
Additionally,VERSION
is now a synonym forSYSTEM_TIME
, a Dolt pseudo-extension. - 256: Support
ANSI_QUOTES
parsing mode
TheANSI_QUOTES
SQL mode changes the behavior of the double quote character. By default in MySQL,ANSI_QUOTES
is not enabled and the double quote character is used to quote string literals. WhenANSI_QUOTES
is enabled, the double quote character may only quote identifiers. TheANSI_QUOTES
mode does not change the behavior for backtick quote chars (they always quote identifiers) or single quote chars (they always quote string literals).
MySQL Reference Docs for ANSI_QUOTES
Related to: #6305 (This is the first step towards supportingANSI_QUOTES
mode in Dolt/GMS)
GMS PR dolthub/go-mysql-server#1896 stubs out the newHandler
interface function, but actually usingANSI_QUOTES
mode in Dolt or GMS won't be possible until a few more changes to Dolt/GMS.
Closed Issues
Latency
Read Tests | MySQL | Dolt | Multiple |
---|---|---|---|
covering_index_scan | 2.07 | 3.02 | 1.5 |
groupby_scan | 13.22 | 17.95 | 1.4 |
index_join | 1.27 | 4.74 | 3.7 |
index_join_scan | 1.21 | 2.3 | 1.9 |
index_scan | 32.53 | 56.84 | 1.7 |
oltp_point_select | 0.14 | 0.48 | 3.4 |
oltp_read_only | 2.71 | 8.28 | 3.1 |
select_random_points | 0.3 | 0.81 | 2.7 |
select_random_ranges | 0.37 | 1.21 | 3.3 |
table_scan | 33.12 | 56.84 | 1.7 |
types_table_scan | 74.46 | 164.45 | 2.2 |
reads_mean_multiplier | 2.4 |
Write Tests | MySQL | Dolt | Multiple |
---|---|---|---|
bulk_insert | 0.001 | 0.001 | 1.0 |
oltp_delete_insert | 4.49 | 5.88 | 1.3 |
oltp_insert | 2.14 | 2.86 | 1.3 |
oltp_read_write | 5.99 | 15.55 | 2.6 |
oltp_update_index | 2.22 | 2.91 | 1.3 |
oltp_update_non_index | 2.3 | 2.91 | 1.3 |
oltp_write_only | 3.25 | 7.56 | 2.3 |
types_delete_insert | 4.57 | 6.79 | 1.5 |
writes_mean_multiplier | 1.6 |
Overall Mean Multiple | 2.1 |
---|