dolthub/dolt v1.40.1

Previous releases 1.39.5 and 1.40.0 contained a bug that could produce incorrect data when updating float values. The change that caused this bug has been reverted in this release, and releases 1.39.5 and 1.40.0 have been deleted. If you are using either of those releases, we highly encourage you to upgrade to this one.

Note that only tables containing float types would be affected by the above bug, and then only if a value was updated. The affected releases were only in the wild for 48 hours, so we think the impact of this bug is small. If you are impacted by the bug, please come by our Discord and we will help further.
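
To check which release you are running before upgrading, you can use dolt version from the CLI, or the dolt_version() SQL function from a connected client:
SELECT dolt_version();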

The bug was caught by our nightly fuzzer testing.

https://github.com/dolthub/fuzzer

Merged PRs

dolt

  • 8024: Revert "[prolly] filteredIter optimization for exact prefix ranges (#7966)"
    This reverts commit 6ae4251.
  • 8018: Archive DDict cache and multi-file bug fixes
    Two primary issues addressed in the dolt admin archive command:
    1. Add caching to dictionaries. This improved performance significantly.
    2. Fix multiple bugs related to having multiple table files. That was a gap in testing, so a bats test was added for the command.
  • 8001: Feature: Support restore subcommand in dolt_backup()
    The dolt_backup() stored procedure now supports the restore subcommand. Customers can use this support to create a new database from an existing backup, or to sync an existing database from a backup. Note that the restore subcommand currently requires root/superuser access to execute, since it can change database state (particularly when the --force argument is used).
    Example usage to create a database named db1 from a backup on disk:
    call dolt_backup('restore', 'file:///opt/local/dolt-backups/db1', 'db1');
    Related to #7993
    Fixes #6074
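    As a further illustration, a sketch of overwriting an existing local database from the same backup. This assumes the restore subcommand accepts --force in the same position as the CLI's dolt backup restore command, which may not match the actual argument order:
    call dolt_backup('restore', '--force', 'file:///opt/local/dolt-backups/db1', 'db1');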
  • 7999: Generate TEMPORARY TABLE tags the same as normal TABLEs
    This PR fixes this particular collision and makes collisions with other temporary tables far less likely by using the same deterministic random number generator used to generate tags for normal persisted tables.
    fixes #7995
  • 7990: support auto_increment on temporary tables
    fixes #7972
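    A minimal example of what this enables; the table and column names are illustrative:
    CREATE TEMPORARY TABLE t (id INT AUTO_INCREMENT PRIMARY KEY, val TEXT);
    INSERT INTO t (val) VALUES ('a'), ('b');
    SELECT LAST_INSERT_ID(); -- returns 1, the first generated id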
  • 7988: /.github/scripts/fuzzer/get-fuzzer-job-json.sh: add app label to fuzzer
  • 7966: [prolly] filteredIter optimization for exact prefix ranges
    Index range iteration uses a callback that is arbitrarily flexible but expensive. I changed index table access to only perform partial index scans for complete prefixes; when the prefix fields are all equality conditions, the generality of the index range callback is overkill. We just need to scan from the partial key (field1, ..., fieldn, nil, ...) to one higher than the partial key (field1, ..., fieldn+1, nil, ...).
    This PR differentiates between the RangeField.StrictKey and .Equal attributes to distinguish a max-one-row restriction from an equality restriction.
    Still need to do follow-up tracing, but this is in response to the queries from TPC-C below. The string ones are much more common. Each of these uses a set of equality filters that only partially completes a secondary index prefix. All of them spend ~5ms of CPU time executing Range.Matches, which is mostly eliminated with this PR.
    SELECT o_entry_d FROM orders1 WHERE o_w_id = 1 AND o_d_id = 5 AND o_c_id = 1891 ORDER BY o_id DESC;
    SELECT c_id FROM customer1 WHERE c_w_id = 1 AND c_d_id = 6 AND c_last = 'ABLECALLYABLE' ORDER BY c_first;
    SELECT o_id, o_carrier_id, o_entry_d FROM orders1 WHERE o_w_id = 1 AND o_d_id = 9 AND o_c_id = 1709 ORDER BY o_id DESC;
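    For context, each query above filters on equality over a strict prefix of a secondary index. In the standard TPC-C schema those indexes look roughly like the following; the index names and exact column sets are assumptions, not taken from this PR:
    CREATE INDEX idx_customer ON customer1 (c_w_id, c_d_id, c_last, c_first);
    CREATE INDEX idx_orders ON orders1 (o_w_id, o_d_id, o_c_id, o_id);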
  • 7914: Feature: Binlog replication
    Initial support for Dolt to stream binlog events to a MySQL replica.
    In this initial iteration, binlog events are streamed directly to connected replicas, instead of being written to a log file first. This enables customers to test out the initial binlog replication support, but it means that replicas will only receive the events that happen while they are connected, since they are not persisted in a log file yet. The next iteration will persist binlog events to a log file and will enable replicas to receive events that happened while they were not connected.
    To enable binlog replication, you must persist the system variables below. Similar to Dolt's other replication formats, the Dolt server must start up with the replication system variables set in order for replication to be enabled. You cannot set these system variables on a running Dolt sql-server to turn on binlog replication; you must persist the values and then restart the sql-server.
    SET @@PERSIST.log_bin=1;
    SET @@PERSIST.enforce_gtid_consistency=ON;
    SET @@PERSIST.gtid_mode=ON;
    Related to #7512
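    On the MySQL side, a replica can then be pointed at the running Dolt sql-server with standard MySQL 8.0 replication commands; the host, port, and credentials below are placeholders:
    CHANGE REPLICATION SOURCE TO SOURCE_HOST='dolt-server.example.com', SOURCE_PORT=3306, SOURCE_USER='replicator', SOURCE_PASSWORD='replicator-password', SOURCE_AUTO_POSITION=1;
    START REPLICA;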
  • 7912: Add IndexedJsonDocument, a JSONWrapper implementation that stores JSON documents in a prolly tree with probabilistic hashing.
    tl;dr: We store a JSON document in a prolly tree, where the leaf nodes of the tree are blob nodes, each containing a fragment of the document, and the intermediate nodes are address map nodes whose keys describe a JSONPath.
    The new logic for reading and writing JSON documents is cleanly separated into the following files:
    IndexedJsonDocument - The new JSONWrapper implementation. It holds the root hash of the prolly tree.
    JsonChunker - A wrapper around a regular chunker. Used to write new JSON documents or apply edits to existing documents.
    JsonCursor - A wrapper around a regular cursor, with added functionality allowing callers to seek to a specific location in the document.
    JsonScanner - A custom JSON parser that tracks the current JSONPath.
    JsonLocation - A custom representation of a JSON path suitable for use as a prolly tree key.
    Each added file has additional documentation with more details about the individual components.
    Throughout every iteration of this project, the core idea has always been to represent a JSON document as a mapping from JSONPath locations to the values stored at those locations. We could then store that map in a prolly tree and get all the benefits that we currently get from storing tables in prolly trees: fast diffing and merging, fast point lookups and mutations, etc.
    This goal has three major challenges:
    • For deeply nested JSON documents, simply listing every JSONPath requires asymptotically more space than the original document.
    • We need to do this in a way that doesn't compromise performance on simply reading JSON documents from a table, which I understand is the most common use pattern.
    • Ideally, users should not need to migrate their databases, or update their clients in order to read newer dbs, or have to choose between different configurations based on their use case.
    This design achieves all three of these requirements:
    • While it requires additional storage, this additional storage cannot exceed the size of the original document, and is in practice much smaller.
    • It has indistinguishable performance for reading JSON documents from storage, while also allowing asymptotically faster diff and merge operations when the size of the changes is much smaller than the size of the document. (There is a cost: initial inserts of JSON documents are currently around 20% slower, but this is a one-time cost that does not impact subsequent reads and could potentially be optimized further.)
    • Documents written by the new JsonChunker are backwards compatible with current Dolt binaries and can be read back by existing versions of Dolt. (Although they will have different hashes than equivalent documents that those versions would write.)
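    To make the point-mutation benefit concrete, an update like the following only needs to rewrite the chunks along one path in the prolly tree rather than the whole document; the table and document shape are illustrative:
    UPDATE docs SET doc = JSON_SET(doc, '$.users[12].email', 'new@example.com') WHERE id = 1;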

go-mysql-server

  • 2551: unwrap parenthesized table references
    fixes #8009
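    One shape of query this unblocks, with illustrative table names:
    SELECT * FROM (t1) JOIN t2 ON t1.id = t2.id;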
  • 2546: Add support for tracking the Aborted_connects status variable
    Adds support for MySQL's Aborted_connects status variable.
    Depends on: dolthub/vitess#351
  • 2542: When casting json to a string, always call StringifyJSON.
    This ensures we match MySQL.
    We previously weren't calling StringifyJSON in ConvertToString because that same method was being used when printing JSON to the screen or a MySQL client, which favored speed over matching MySQL exactly. But for casts we must be precise.
    By adding an extra case to StringType.SQL we can distinguish between these cases and handle them properly.
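    For example, a cast like the following now produces MySQL's canonical JSON formatting, including the space after each key; the output shown is an assumption based on MySQL's behavior:
    SELECT CAST(JSON_OBJECT('a', 1) AS CHAR); -- {"a": 1}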
  • 2541: resolve default values for views
    This was somewhat of a regression caused by dolthub/go-mysql-server#2465.
    However, before that PR views always had NULL as their default values, which did not match MySQL.
    Now, we just resolve the default values in the schema, similar to ResolvedTables.
    fixes #7997
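    A sketch of the case from the linked issue, with an illustrative schema:
    CREATE TABLE t (a INT DEFAULT 42);
    CREATE VIEW v AS SELECT * FROM t;
    SHOW FULL COLUMNS FROM v; -- previously errored; now reports the resolved default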
  • 2540: [planbuilder] More update join table name validation
  • 2539: fix UPDATE IGNORE ... JOIN
    fixes: #7986
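    The failing query shape from the issue, with illustrative tables:
    UPDATE IGNORE t1 JOIN t2 ON t1.id = t2.id SET t1.val = t2.val;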
  • 2534: Implement row alias expressions (INSERT ... VALUES (...) AS new_tbl ON DUPLICATE x = new_tbl.x)
    When inserting values, the user can specify names for both the source table and its columns, which can then be used in ON DUPLICATE KEY UPDATE expressions. It looks like either of the below options:
    INSERT INTO tbl VALUES (1, 2) AS tbl_new ON DUPLICATE KEY UPDATE b = tbl_new.b;
    INSERT INTO tbl VALUES (1, 2) AS tbl_new(a_new, b_new) ON DUPLICATE KEY UPDATE b = b_new;
    This replaces the previous (now-deprecated) syntax:
    INSERT INTO tbl VALUES (1, 2) ON DUPLICATE KEY UPDATE b = VALUES(b);
    Supporting both syntaxes together was non-trivial because it means there are now two different ways to refer to the same column. While we had an existing way to "redirect" one column name to another, this only worked for unqualified names (no table name), and it overrode the normal name resolution rules, which meant we would fail to detect cases that should be seen as ambiguous.
    Previously, we would implement references to the inserted values by using a special table named "__new_ins". I implemented this by keeping that as the default, but using the row alias instead when one was provided. We then create a map from the destination table names to column aliases, and use that map to rewrite expressions that appear inside the VALUES() function.

vitess

  • 353: allow backticks in system and user variables
    This PR allows the use of backticks in system and user variables.
    We are more lenient than MySQL when it comes to backticks in set statements.
    For example, we allow set @abc.def = 10, while MySQL throws an error.
    This is because we treat this as a qualified column identifier and automatically strip the backticks.
    test bump dolthub/go-mysql-server#2548
    fixes #8010
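    Some forms now accepted by the parser; the variable names are illustrative:
    SET @`my_var` = 1;
    SET @@`autocommit` = 1;
    SET @`abc`.`def` = 10; -- treated as a single user variable once backticks are stripped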
  • 352: Add support for the CONSTRAINT keyword when adding a foreign key without a constraint name
    Customer issue: #8008
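    The previously rejected form, with an illustrative schema:
    ALTER TABLE child ADD CONSTRAINT FOREIGN KEY (parent_id) REFERENCES parent (id);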
  • 351: Add ConnectionAborted() callback to Handler interface
    In order to support the Aborted_connects status variable, GMS needs to be notified when a connection attempt is aborted in the Vitess layer. This change adds a ConnectionAborted() callback method to Vitess' Handler interface and calls it whenever a connection attempt errors out before it's fully established.
    Coordinated with dolthub/go-mysql-server#2546
  • 350: Refactoring BinlogStream type into BinlogMetadata
    The mysql.BinlogStream type from Vitess was a little awkward to use, and seems to have been mostly intended as test code. This gives it a more descriptive name and makes it a little easier to pass around struct copies without concurrency issues from a shared instance.

Closed Issues

  • 8011: Deprecated := assignment syntax in UPDATE queries causes syntax error in Dolt
  • 8009: Parenthesised table references in JOIN clauses cause syntax errors if not followed by nested JOINs
  • 8010: Backtick escaping doesn't work for variables
  • 8008: ADD CONSTRAINT FOREIGN KEY causes syntax error in Dolt
  • 7993: After Dolt CLI restore procedure database is not visible through the SQL client
  • 7638: Syntax Error Occurs When Using AS Clause with ON DUPLICATE KEY UPDATE
  • 6074: Support CALL DOLT_RESTORE() and support a -f force option
  • 7995: Creating temporary tables can cause tag collisions
  • 7997: error: plan is not resolved because of node '*plan.ShowColumns' when executing SHOW FULL COLUMNS or DESCRIBE for specific views
  • 7958: UPDATE ... JOIN fails for tables containing capital letters
  • 7972: Temporary tables don't support AUTO_INCREMENT
  • 7986: UPDATE IGNORE ... JOIN queries fail with "failed to apply rowHandler" error
  • 7973: dolt pull fails in the presence of ignored tables
  • 7961: error reading server preface: http2: frame too large
  • 7957: Dolt returns wrong number of affected rows for UPDATE ... JOIN with clientFoundRows=true
  • 7956: Foreign keys disappear after merge for tables created with FOREIGN_KEY_CHECKS=0
  • 7959: Auto-generated FK names don't match MySQL for renamed tables
  • 7960: Auto-generated index names don't match MySQL for composite keys
