This release contains backwards incompatible changes:
- The `dolt_docs` system table was updated so that it is created on SQL writes and returns an empty index on reads. Previously, reading or writing to the `dolt_docs` table in a SQL context would error if no docs existed, and users needed to manually create the table. Now reading and writing to `dolt_docs` will no longer error due to the table not existing, and trying to create a `dolt_docs` table will error.
Per Dolt’s versioning policy, this is a minor version bump because these changes may impact existing applications. Please reach out to us on GitHub or Discord if you have questions or need help with any of these changes.
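For applications that touch `dolt_docs`, the behavior change above can be pictured with a toy Python model. This is only a sketch of the semantics described in this release, not Dolt code: the `DocsCatalog` class, its method names, and the error type are all hypothetical.

```python
class DocsCatalog:
    """Toy model of the new dolt_docs semantics (hypothetical, not Dolt's code)."""

    SYSTEM_TABLE = "dolt_docs"

    def __init__(self):
        # None means the table has never been written to.
        self._rows = None

    def select_all(self):
        # Reads never fail: a table that was never written looks empty.
        return [] if self._rows is None else list(self._rows)

    def insert(self, name, text):
        # The first write creates the table implicitly.
        if self._rows is None:
            self._rows = []
        self._rows.append((name, text))

    def create_table(self, table_name):
        # Explicitly creating the system table is now rejected.
        if table_name == self.SYSTEM_TABLE:
            raise ValueError(f"{table_name} cannot be created manually")


catalog = DocsCatalog()
rows_before_any_write = catalog.select_all()  # empty, no error
catalog.insert("README.md", "hello")
```

Code that used to create the table before its first write would hit the `create_table` branch in this model, which is the case existing applications should check for.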
Merged PRs
dolt
- 6840: migrate dolt merge-base to use sql queries
  This change updates `dolt merge-base` to use the appropriate SQL engine to generate results.
  Related: #3922
- 6831: go: sqle: cluster: When performing a graceful transition to standby, take mysql and dolt_branch_control replication state into account.
  Graceful transitions to standby block on the primary until a certain number of replicas are trued up. They then return a status of whether each database on each replica is caught up, so that a control plane agent can, for example, pick a caught-up server to be the next primary.
- 6827: Automatically create `dolt_docs` table
  Currently, if you are using a SQL-only interface like Hosted, you have to manually create the `dolt_docs` system table to use it. This changes that functionality to be the same as the `dolt_ignore` system table, which is created on writes and returns an empty index on reads. The table will always exist as far as the MySQL engine is concerned.
  Related to #6809
- 6826: Support `AS OF` with `dolt_ignore` table.
  Fixes #6823
  To make this work cleanly, this PR creates a new interface, `VersionedTable`, which is implemented by DoltTable and classes that wrap it (like `WriteableDoltTable` and `IgnoreTable`).
- 6822: Add defaults for dolt table row counts
  We could do a better job keeping track of row counts for special tables. In the meantime, we want tables to implement `sql.StatisticsTable` and return row count values high enough that we use HASH_JOIN instead of INNER_JOIN.
- 6794: Prolly stats
  Add in-place `ANALYZE TABLE` support for Prolly trees:
  - every index prefix gets a histogram
  - the level of the tree with > 20 chunks (or level 0) is chosen for buckets
  - each chunk = a histogram bucket
  - full table scan to fill each bucket, no sampling or sketches

  Adds a variety of unit and enginetest guardrails for statistic values, data format, and serialization to the GMS side.
  This is not hooked into the costing logic yet, and there is no automatic refresh lifecycle.
  Dolt companion: dolthub/go-mysql-server#2071
- 6787: add `--all` flag for dolt push
  This PR adds the `--all` flag to the `dolt push` command and updates some error messages, as well as the messages returned on a successful push.
  Also includes:
  - removing some duplicate tests from `dolt-push.bats`
  - splitting remote dolt push and pull tests from `remotes.bats` into `remotes-push-pull.bats`
  - removing some duplicate tests from
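The bucket rule listed under 6794 (Prolly stats) can be sketched in a few lines of Python. This is a simplified illustration, not the Dolt implementation. It assumes that level 0 is the leaf level, that "the level with > 20 chunks" means the first such level found when walking down from the root, and that a bucket records a row count, a distinct count, and an upper bound. The fixed-size chunking and every name here (`Bucket`, `build_levels`, `pick_level`, `build_histogram`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    row_count: int       # rows covered by the chunk
    distinct_count: int  # distinct keys among those rows
    upper_bound: int     # largest key in the chunk


def build_levels(sorted_keys, leaf_size=4, fanout=4):
    """Build a toy tree: levels[0] holds leaf chunks, the last level is the root.

    Each chunk is stored as the sorted list of keys it covers. Fixed-size
    chunking is a simplification made for this illustration.
    """
    levels = [[sorted_keys[i:i + leaf_size]
               for i in range(0, len(sorted_keys), leaf_size)]]
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append([
            [key for chunk in below[i:i + fanout] for key in chunk]
            for i in range(0, len(below), fanout)
        ])
    return levels


def pick_level(levels, min_chunks=20):
    """Walk down from the root and return the first level with more than
    min_chunks chunks, falling back to level 0 (the leaves)."""
    for depth in range(len(levels) - 1, 0, -1):
        if len(levels[depth]) > min_chunks:
            return depth
    return 0


def build_histogram(levels):
    """One bucket per chunk at the chosen level, filled by scanning every key."""
    chunks = levels[pick_level(levels)]
    return [Bucket(len(c), len(set(c)), c[-1]) for c in chunks]


keys = sorted(i // 2 for i in range(1000))  # 1000 rows, 500 distinct keys
buckets = build_histogram(build_levels(keys))
small_buckets = build_histogram(build_levels(list(range(10))))
```

With these toy parameters the 1000-row table has 63 chunks one level above the leaves, so that level supplies the buckets. The 10-row table never exceeds 20 chunks at any level and falls back to its three leaf chunks.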
go-mysql-server
- 2088: fix off by one for `found_rows` when `limit > count(*)`
  We had a bug where we could increment the limit counter before receiving the EOF error.
  fixes #6829
  companion pr: dolthub/vitess#283
- 2086: Server handling parsed statements
  Initially this was going to be a bit more involved, as I was planning on having Dolt expose a new interface, and we'd directly pass in GMS ASTs rather than Vitess ASTs. The Dolt interface approach turned out to be a lot more involved than first anticipated, and the construction of GMS ASTs needs state that we will not have at higher layers, and exposing such state is also a lot more involved. Therefore, I've made a compromise by accepting Vitess ASTs instead, which makes this vastly simpler. It's not going to be quite as powerful, but I think it can still serve our purposes for the foreseeable future.
  This basically works by hijacking the fact that we'll sometimes process Vitess ASTs via the prepared cache. If we receive a Vitess AST, then we skip the cache; otherwise we access the cache like the normal workflow.
- 2084: Remove Dead Grant/Revoke code
- 2079: reverse filters for reverse lookups
  fixes #6824
- 2078: fix panic when order by column is out of range
  We used to blindly index ORDER BY values, which resulted in panics.
  Now, we throw the appropriate error.
  Additionally, this matches MySQL behavior when performing ORDER BY with indexes `< 1`:
  - `ORDER BY 0` = error
  - `ORDER BY -1` = noop
- 2077: sql.StatsTable->RowCount returns whether the estimate is exact
  Add a return argument for whether a RowCount can be a substitute for `count(*)`.
- 2076: pick float type if one side is non-number type
  If one side of a comparison is a non-number type and the other side is a number type, then convert to float to compare, since the non-number value can be a float.
- 2075: Fix information_schema row count regression
  If a table does not implement RowCount() we used to use 1000 as a default value for row costing. A recent refactor changed that to 0. This fixes information schema tables to report the 1000 value again, which is usually accurate for small databases because of default tables and columns. I also fixed some issues with database reporting for info schema tables.
  This regression probably still exists for some dolt tables and table functions. I will do a pass and see if I can add some more accurate values on the Dolt side.
- 2074: optimize `min(pk)` and `max(pk)`
  This PR adds an optimization to queries that have a `MIN` or `MAX` aggregation over a `PRIMARY KEY` column.
  Since indexes are already sorted (and we support a reverse iterator), we can look at the first/last row to answer queries of this form.
  The new analyzer rule, `replaceAgg`, converts queries of the format `select max(pk) ... from ...` to the equivalent `select pk ... from ... order by pk desc limit 1`. Then, we depend on `replacePkSort` to apply `IndexedTableAccess`.
  Additionally, this PR has the `replacePkSort` optimization apply to queries that have filters (specifically those that were pushed down to `IndexedTableAccess`).
  There is also some refactoring and tidying up.
  Fixes #6793
- 2071: Updates for Dolt stats
  - `json_value` and `json_length` added
  - `json_table` edited to support json document inputs
  - our custom json marshaller supports types that implement the `json.Marshaller` interface
  - increased recursive iter limit to 10,000 to more easily generate 3-level prolly trees for statistics testing

  Note: the `json_value` notation is different than mysql's. I accept the type as a third parameter, rather than expecting a RETURNING clause.
- 2067: adding sqllogictests
  This PR adds some utility scripts to convert CRDB testing files into SQLLogicTest format.
  Additionally, this adds tests focusing on joins and subqueries.
  Some notable tests are added as skipped enginetests.
- 1888: Remove filters from LookupJoins when they're provably not required.
  During joins, we still evaluate every filter for every candidate row. But based on the join implementation, some of those filters must necessarily be true, so we don't need to evaluate them.
  In most joins the performance cost of this isn't that bad, but this problem is most noticeable in LookupJoins where a point lookup is constructed from several columns, at which point the filter evaluation can dominate the runtime.
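The idea behind 1888 can be illustrated with a small Python model of a lookup join. It is a sketch under assumptions, not GMS code: rows are plain dicts, the "index" is a dict keyed on the lookup columns, and `build_index` and `lookup_join` are hypothetical helpers. The point is that equality filters on the lookup columns are already guaranteed by the point lookup, so only the remaining filter needs evaluating.

```python
from collections import defaultdict


def build_index(rows, key_cols):
    """Index rows by the tuple of values in key_cols."""
    index = defaultdict(list)
    for row in rows:
        index[tuple(row[c] for c in key_cols)].append(row)
    return index


def lookup_join(left, right_index, key_cols, filters):
    """Join via point lookups on key_cols, then apply the given filters.

    Returns the joined pairs and how many filter evaluations were performed.
    """
    evaluations = 0
    out = []
    for l in left:
        key = tuple(l[c] for c in key_cols)
        for r in right_index.get(key, []):
            keep = True
            for f in filters:
                evaluations += 1
                if not f(l, r):
                    keep = False
                    break
            if keep:
                out.append((l, r))
    return out, evaluations


left = [{"a": 1, "b": 1, "c": 0}, {"a": 1, "b": 2, "c": 5}, {"a": 2, "b": 1, "c": 1}]
right = [{"a": 1, "b": 1, "c": 3}, {"a": 1, "b": 2, "c": 4},
         {"a": 2, "b": 1, "c": 9}, {"a": 3, "b": 3, "c": 0}]
index = build_index(right, ["a", "b"])

# Join condition: l.a = r.a AND l.b = r.b AND l.c < r.c
implied = [lambda l, r: l["a"] == r["a"], lambda l, r: l["b"] == r["b"]]
residual = [lambda l, r: l["c"] < r["c"]]

all_rows, all_evals = lookup_join(left, index, ["a", "b"], implied + residual)
lean_rows, lean_evals = lookup_join(left, index, ["a", "b"], residual)
```

Both calls return the same pairs, but the second skips the two filters that the lookup on `(a, b)` already proves true.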
vitess
- 283: allow query options to appear in any order any number of times
  Allow statements like this to parse:
  `select distinct sql_calc_found_rows distinct * from t;`
  Fixes #6829
  Companion PR: dolthub/go-mysql-server#2088
- 282: Server handling parsed statements
  See dolthub/go-mysql-server#2086
- 281: Made generated column expressions parse to ParenExpr to match Default
- 280: Update the default server version to `8.0.33`
  The connection handshake was advertising a server version of `5.7.9-Vitess`, and some clients were using that info and trying to speak MySQL-5.7 to Dolt (example issue).
  This change updates the default advertised server version to `8.0.33-Dolt`.
  Dolt CI tests are running at: #6798
Closed Issues
- 6829: SQL_CALC_FOUND_ROWS doesn't work?
- 6824: Reverse indexed table walks generate incorrect ordering when using multiple filters ranges.
- 6460: Push/Pull/Fetch while sql-server is running
- 6406: `FOUND_ROWS()` returns incorrect results
- 6823: Panic using `AS OF` with `dolt_ignore` system table
- 6819: JSON numbers less than 0.5 is treated as 0
- 6626: Implement `dolt push --all`
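Issues 6829 and 6406 above trace back to the off-by-one described in go-mysql-server 2088, where the limit counter could be incremented before the EOF was seen. The Python toy below shows why that ordering matters. It is a model written for this note, not the GMS source: `LimitIter`, its `count_before_check` switch, and `run` are hypothetical.

```python
class LimitIter:
    """Toy LIMIT iterator that also tracks a FOUND_ROWS()-style total."""

    def __init__(self, rows, limit, count_before_check):
        self._child = iter(rows)
        self._limit = limit
        self._count_before_check = count_before_check
        self.found_rows = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.found_rows >= self._limit:
            # Past the limit: drain the child only to count the remaining rows.
            for _ in self._child:
                self.found_rows += 1
            raise StopIteration
        if self._count_before_check:
            # Buggy ordering: the counter moves even if the child is exhausted.
            self.found_rows += 1
            return next(self._child)
        # Fixed ordering: count only after a row was actually received.
        row = next(self._child)
        self.found_rows += 1
        return row


def run(rows, limit, count_before_check):
    it = LimitIter(rows, limit, count_before_check)
    returned = list(it)
    return returned, it.found_rows


buggy = run(["a", "b", "c"], 10, count_before_check=True)
fixed = run(["a", "b", "c"], 10, count_before_check=False)
```

In this model the miscount only appears when the limit is larger than the number of rows, because that is the only path where the child runs out before the limit is reached.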
Latency
Read Tests | MySQL | Dolt | Multiple |
---|---|---|---|
covering_index_scan | 2.07 | 2.71 | 1.3 |
groupby_scan | 12.98 | 17.32 | 1.3 |
index_join | 1.32 | 4.49 | 3.4 |
index_join_scan | 1.25 | 2.14 | 1.7 |
index_scan | 33.72 | 55.82 | 1.7 |
oltp_point_select | 0.17 | 0.4 | 2.4 |
oltp_read_only | 3.25 | 7.17 | 2.2 |
select_random_points | 0.32 | 0.67 | 2.1 |
select_random_ranges | 0.38 | 0.92 | 2.4 |
table_scan | 33.72 | 55.82 | 1.7 |
types_table_scan | 74.46 | 158.63 | 2.1 |
reads_mean_multiplier | 2.0 |
Write Tests | MySQL | Dolt | Multiple |
---|---|---|---|
bulk_insert | 0.001 | 0.001 | 1.0 |
oltp_delete_insert | 4.57 | 5.37 | 1.2 |
oltp_insert | 2.22 | 2.66 | 1.2 |
oltp_read_write | 6.67 | 13.46 | 2.0 |
oltp_update_index | 2.39 | 2.66 | 1.1 |
oltp_update_non_index | 2.3 | 2.61 | 1.1 |
oltp_write_only | 3.3 | 6.67 | 2.0 |
types_delete_insert | 4.82 | 5.77 | 1.2 |
writes_mean_multiplier | 1.4 |
Overall Mean Multiple | 1.7 |
---|
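The Multiple column appears to be the Dolt latency divided by the MySQL latency, rounded to one decimal place, and the mean multiplier appears to be the average of those values. That reading is an inference from the numbers, not something the release notes state. A short Python check against the read-test figures above:

```python
from statistics import mean

# (MySQL, Dolt) latencies copied from the read-test table.
READ_TESTS = {
    "covering_index_scan": (2.07, 2.71),
    "groupby_scan": (12.98, 17.32),
    "index_join": (1.32, 4.49),
    "index_join_scan": (1.25, 2.14),
    "index_scan": (33.72, 55.82),
    "oltp_point_select": (0.17, 0.4),
    "oltp_read_only": (3.25, 7.17),
    "select_random_points": (0.32, 0.67),
    "select_random_ranges": (0.38, 0.92),
    "table_scan": (33.72, 55.82),
    "types_table_scan": (74.46, 158.63),
}


def multiple(mysql, dolt):
    """Assumed definition: Dolt latency relative to MySQL, one decimal place."""
    return round(dolt / mysql, 1)


read_multiples = {name: multiple(m, d) for name, (m, d) in READ_TESTS.items()}

# Assumed definition: the mean of the per-test multiples, one decimal place.
reads_mean_multiplier = round(mean(list(read_multiples.values())), 1)
```

Under these assumptions the computed values match the Multiple column and the `reads_mean_multiplier` row of the read-test table.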