github dolthub/dolt v1.8.8
1.8.8

latest releases: v1.43.13, v1.43.12, v1.43.11...
15 months ago

Merged PRs

dolt

  • 6434: Fix for adding a unique column
  • 6349: Removing old schema merge guard
    Before we added support for schema merge conflict reporting, we had a guard in place that would error out for schema changes in merges. This code prevented schema merges that Dolt couldn't handle, but it didn't provide much visibility into the exact schema conflicts. The new schema merge conflict reporting code is able to handle these same cases and provides better visibility into exactly what the schema conflicts are. We've already removed some of the checks from this guard, but there were still a few left in place. This PR completely removes the old schema merge guard code.
    The code left in this guard was still protecting against a couple of cases that are now fixed and covered with tests:
    • Collation changes – we weren't properly handling merging changes to a table's default collation. Fixed and tested now.
    • New, non-nullable columns with no default value – new columns that are non-nullable and don't have a default value can't be auto merged since we don't know what value to safely use in the new column for any existing rows, so this now throws an error with instructions on how to fix.

go-mysql-server

  • 1914: More Full-Text fixes
    More bugs found and fixed for FULLTEXT indexes. The ordering bug is fairly major, as it could require a drop/readd of the index, so I want to get this out now rather than waiting until I've gotten more tests/fixes in.
  • 1913: Adjust range heap coster to ensure we don't choose it over a strong lookup join.
    This improves the performance for a client query by ensuring that we don't us a HeapRangeJoin when there's a good LookupJoin candidate, but we do use a HeapRangeJoin when there isn't.
    In larger queries, the time spent generating and evaluating candidate rows dominates. Using seqIOCostFactor feels like a bit of a misnomer, but cpuCostFactor is too low to reflect this behavior, and this is what costInnerJoin uses.
  • 1912: Normalized handling of ALTER TABLE statements
    This PR normalizes the handling of all ALTER TABLE statements to use the same code path. Previously some used convertDDL instead.
    Also changes DDL processing to generate multiple plan nodes for statements that need them. Currently tested with add column .. unique, but the same approach should also work with auto_increment and other clauses.
    Relies on changes in dolthub/vitess#260
  • 1909: Show variables panic
  • 1908: Some Full-Text fixes
    Adds a few additional tests, along with fixes. This is primarily for Sakila compatibility. More tests and fixes are forthcoming, and will arrive in a second PR.
  • 1906: Fix ErrBinaryCollation in queries with []byte argument
    Binding arguments of type []byte have been converted to string literals with the default collation (utf8mb4_0900_bin) by types.CreateStringWithDefaults().
    I think types.CreateBinary() should be used, which generates literals with binary collation.
  • 1905: When estimating the cardinality of an outer join, bound the cardinality below.
    For instance, left outer joins can never have a smaller cardinality than their left child. This should let us get more accurate estimates on the cardinality of outer joins and choose better plans.
  • 1904: sql/mysql_db: Rework how we store and access table data for users, grants and replica_source_info.
    In general, we keep the same structure of storing the data in structs, keyed in multimaps, and having converters for going to and from sql.Rows. We change the following important things:
    1. We more explicitly model a multimap and an indexed set of entities with multiple keyers.
    2. We add read and write locking around edits and reads of the data table.
    3. We explicitly do not expose the raw indexed set or multimap from the MySQLDb instance itself. The unprincipled access of the various *Data instances in the old implementation was somewhat problematic.
  • 1880: Add "Sliding Range Join" execution plan
    (The original PR was accidentally merged. I fixed the history but there doesn't seem to be a way to "unmerge" the PR)
    This is the draft implementation of the "Sliding Range Join" execution plan. This allows for more performant joins when the join condition checks that the column on one table is within a range specified by two columns on the other table.
  • 1871: Full-Text Indexes
    This is a partial implementation of Full-Text indexes. There is still quite a bit to finish on the GMS side (as can be seen from the copious amount of TODOs), but this shows the broad strokes of how it's implemented, along with most of the "difficult" design choices being implemented. The major choice that has not yet been finalized is how to deal with FTS_DOC_ID, as it's an AUTO_INCREMENT column in MySQL, but that would not play well with Dolt merging. I already have ideas on how to handle that (taking into account remotes, etc.), but that would come from a later PR.
    https://docs.google.com/document/d/1nGyYg461AhxQjFLzhEEj01XMz0VaTBaBaA44WNu0fc4/edit
    Quite a few things have changed from the initial design doc, mostly based on feedback during the meeting, however some of it was post-meeting. There are three tables instead of 1: Config (stores table-specific information shared across all indexes), WordToPos (maps words to an ID and position, not fully used in the default search), and Count (used to calculate relevancy, also not fully used in the default search). I was planning on converting MATCH ... AGAINST ... to a join between the tables, which would work when fetching results, but MATCH ... AGAINST ... may also be used as a result, which necessitated writing all of the functionality anyway, so the join plan was dropped.
    Last thing to mention, is that I'm pretty sure that Full-Text indexes actually do a full table scan. It seems weird, but AFAICT the indexes are used to quickly calculate relevancy for each search mode. It seems that, for overly large tables, the search time increases even when other index operations continue to operate nearly instantaneously.
    I've tagged two people for review to make it a bit easier. Of course, feel free to take a look at more if you desire.
    @reltuk The sql/fulltext/fulltext.go file is an expansion of the file you've previously reviewed (all still kept to a single file for now). To complement it and see how it'll be implemented on the Dolt side, you can look at memory/table.go. Dolt's table editor will be similar, and the merge paths will only use the FulltextEditor, which special logic to interface with it from those paths.
    @max-hoffman Take a look at the analyzer changes, along with the sql/plan/ddl.go file. You'll probably need to reference sql/fulltext/fulltext.go as well.
  • 1869: Migrate most enginetests for new name resolution
    This doubles most of the enginetests to add versions with new name resolution. As a result testing takes ~2x as long, temporarily. Gets majority of those enginetests working with a couple bigger exceptions:
    • plans projections are subtly different in a way that should be optimized but is probably not priority
    • stored procedures need their custom resolution logic ported
    • on duplicate update expressions are buggy, going to rewrite those for new format
    • skipping one derived table alias test, where we do not have expression memoization or lateral joins to help us execute a resolved plan (related #6407)
    • many tests throw "column not found" instead of "table not found" errors. I tried to bookkeep those with Skips, but the skipped suites may accumulate other differences in the meantime.
    • I'll need to revert our prepared statement logic before the final switch
    • Various validators work a bit differently, might end up skipping some error tests to get the final switch in sooner
      Other suites:
    • TestJSONTableScripts_Experimental -- json_table still broken
    • TestRenameColumn_Exp -- error test has different error
      A couple other discoveries:
    • We have to hallucinate unknown table, column, and procedure names in trigger bodies on CREATE, and only fail on execution
    • Column default expressions appear to be resolved at execution time
    • Alter statements are only resolved at execution time
    • The precedence for ASOF in nested views and procedure calls is a bit hairy
    • json_array w/ and w/o distinct appears to be untested
    • ORDER BY in UNION seems pretty flaky and lightly tested, we resolve names from the left
      Dolt PR: #6414

vitess

  • 260: Rename MultiAlterDDL to AlterTable and return it for all ALTER TABLE statements
    This normalization makes it easier to handle building a plan
  • 259: Bug fix for round tripping parsed cast function calls
    When formatting a parsed cast node back into a SQL string, we were using the form CAST(<arg>, <arg>), but that won't roundtrip back to MySQL. It needs to be either CONVERT(<arg>, <arg>) or CAST(<arg> as <arg).
  • 258: Adding new keywords to the non-reserved list so they don't need to be…
    … quoted
  • 257: Support for all temporal query forms in SQL 2011
    These are the parser changes necessary to support #6353
    In addition to the 3 forms of temporal query in SQL 2011 (AS OF, FROM .. TO, BETWEEN), this PR also implements support for the 2 extensions added by SQLServer, ALL and CONTAINED IN.
    https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver16
    Additionally, VERSION is now a synonym for SYSTEM_TIME, a Dolt pseudo-extension.
  • 256: Support ANSI_QUOTES parsing mode
    The ANSI_QUOTES SQL mode changes the behavior of the double quote character. By default in MySQL, ANSI_QUOTES is not enabled and the double quote character is used to quote string literals. When ANSI_QUOTES is enabled, the double quote character may only quote identifiers. The ANSI_QUOTES mode does not change the behavior for backtick quote chars (they always quote identifiers) or single quote chars (they always quote string literals).
    MySQL Reference Docs for ANSI_QUOTES
    Related to: #6305 (This is the first step towards supporting ANSI_QUOTES mode in Dolt/GMS)
    GMS PR dolthub/go-mysql-server#1896 stubs out the new Handler interface function, but actually using ANSI_QUOTES mode in Dolt or GMS won't be possible until a few more changes to Dolt/GMS.

Closed Issues

Latency

Read Tests MySQL Dolt Multiple
covering_index_scan 2.07 3.02 1.5
groupby_scan 13.22 17.95 1.4
index_join 1.27 4.74 3.7
index_join_scan 1.21 2.3 1.9
index_scan 32.53 56.84 1.7
oltp_point_select 0.14 0.48 3.4
oltp_read_only 2.71 8.28 3.1
select_random_points 0.3 0.81 2.7
select_random_ranges 0.37 1.21 3.3
table_scan 33.12 56.84 1.7
types_table_scan 74.46 164.45 2.2
reads_mean_multiplier 2.4
Write Tests MySQL Dolt Multiple
bulk_insert 0.001 0.001 1.0
oltp_delete_insert 4.49 5.88 1.3
oltp_insert 2.14 2.86 1.3
oltp_read_write 5.99 15.55 2.6
oltp_update_index 2.22 2.91 1.3
oltp_update_non_index 2.3 2.91 1.3
oltp_write_only 3.25 7.56 2.3
types_delete_insert 4.57 6.79 1.5
writes_mean_multiplier 1.6
Overall Mean Multiple 2.1

Don't miss a new dolt release

NewReleases is sending notifications on new releases.