Merged PRs
dolt
- 9204: feat(import,csv,psv): Add support for importing CSV and PSV files without header rows
Summary
- Add --no-header flag to treat the first row in CSV/PSV files as data instead of column names
- Add --columns option to specify column names when importing files without headers
- Fix nil pointer panic when importing from stdin with --create-table
In short, this feature makes Dolt more compatible with MySQL/SQLite workflows and provides users with more flexibility when importing data.
Problem
Previously, Dolt always expected the first row of CSV/PSV files to contain column names. This differs from MySQL and SQLite which support importing files where the first row contains data. Users migrating from these systems or working with headerless data files couldn't import them without modifying their files.
Additionally, when users attempted to import data from stdin using --create-table, they would encounter a nil pointer panic instead of receiving an error message.
Solution
The implementation adds:
- A new --no-header flag that treats the first row as data instead of column headers
- A complementary --columns option to specify column names when headers aren't present
- Proper validation to ensure correct flag combinations
- Comprehensive error handling for stdin imports with clear error messages
- Integration tests for both CSV and PSV files
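The header handling described above can be illustrated with a small sketch. This is Python rather than Dolt's own Go code, and it is not Dolt's implementation: the function `read_rows` and its `no_header` and `columns` parameters are hypothetical names chosen to mirror the two flags. It assumes that column names are required whenever the header row is absent, and it does not model what Dolt does when --columns is given without --no-header.

```python
import csv
import io


def read_rows(text, delimiter=",", no_header=False, columns=None):
    """Parse delimited text into a list of dicts keyed by column name.

    Hypothetical helper: `columns` is only consulted when `no_header` is True.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if no_header:
        # Assumption: without a header row, the caller must supply the names.
        if not columns:
            raise ValueError("column names are required when there is no header row")
        names, data = list(columns), rows
    else:
        # Original behavior: the first row holds the column names.
        if not rows:
            raise ValueError("input is empty, so there is no header row to read")
        names, data = rows[0], rows[1:]
    for row in data:
        if len(row) != len(names):
            raise ValueError(f"expected {len(names)} fields, got {len(row)}")
    return [dict(zip(names, row)) for row in data]


# Headerless CSV: the first row is data, names come from the caller.
headerless_csv = "1,alice\n2,bob\n"
people = read_rows(headerless_csv, no_header=True, columns=["id", "name"])

# PSV with a header row: the same logic with "|" as the delimiter.
headed_psv = "id|name\n1|alice\n"
with_header = read_rows(headed_psv, delimiter="|")
```

In this sketch, CSV and PSV differ only in the delimiter, which is why a single code path can serve both formats.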
Testing
- Added integration tests for both CSV and PSV files that verify:
  - Importing files with --no-header and --columns options
  - Error cases when required options are missing
  - Original behavior is maintained when not using --no-header
  - Behavior of --columns with and without --no-header
  - Edge cases like stdin imports
go-mysql-server
- 2981: Unwrap wrapper values used in JSON aggregation functions, and un-skip accidentally-skipped tests for this behavior.
We were accidentally skipping most of the tests in TestJsonScripts. An error in the test harness meant that skipping one test in this suite would also skip all additional tests.
A few of the skipped tests were for JSON aggregation functions. The recent "Adaptive Encoding / Wrapper Values" optimization wasn't working properly with these functions because the wrapped values provided to these functions weren't being unwrapped before being inserted into JSON documents. These tests would have caught that issue, but didn't because they were disabled.
This PR fixes the issue and also re-enables the tests.
- 2979: fix indexing for GROUP BYs and WINDOWs in INSERT and REPLACE statements in TRIGGERS
Using aggregation and window functions inside a select statement inside an insert source inside a trigger was causing problems. For example, a trigger defined like so:
create trigger trig before insert on t1 for each row begin insert into t2 select max(id), first_value(id) over (partition by id order by id), ... from t3; end;
The issue involved the Projections over the Group Bys. The scope for the group bys already contained the trigger's columns, which are indexed uniquely, so we shouldn't include the trigger/parent scope.
Closed Issues
- 9222: JSON_OBJECT error on longtext columns - "unsupported type: *val.TextStorage"
- 7831: Allow importing CSVs without column names.
Performance
Read Tests | MySQL | Dolt | Multiple |
---|---|---|---|
covering_index_scan | 2.0 | 0.65 | 0.32 |
groupby_scan | 13.46 | 17.95 | 1.33 |
index_join | 1.47 | 2.39 | 1.63 |
index_join_scan | 1.42 | 1.5 | 1.06 |
index_scan | 34.33 | 30.26 | 0.88 |
oltp_point_select | 0.18 | 0.26 | 1.44 |
oltp_read_only | 3.43 | 5.28 | 1.54 |
select_random_points | 0.33 | 0.59 | 1.79 |
select_random_ranges | 0.37 | 0.61 | 1.65 |
table_scan | 34.33 | 32.53 | 0.95 |
types_table_scan | 75.82 | 125.52 | 1.66 |
reads_mean_multiplier | 1.3 |
Write Tests | MySQL | Dolt | Multiple |
---|---|---|---|
oltp_delete_insert | 8.9 | 6.32 | 0.71 |
oltp_insert | 4.1 | 3.07 | 0.75 |
oltp_read_write | 8.74 | 11.45 | 1.31 |
oltp_update_index | 4.18 | 3.19 | 0.76 |
oltp_update_non_index | 4.18 | 3.07 | 0.73 |
oltp_write_only | 5.67 | 6.32 | 1.11 |
types_delete_insert | 8.28 | 6.67 | 0.81 |
writes_mean_multiplier | 0.88 |
TPC-C TPS Tests | MySQL | Dolt | Multiple |
---|---|---|---|
tpcc-scale-factor-1 | 97.61 | 39.19 | 2.49 |
tpcc_tps_multiplier | 2.49 |
Overall Mean Multiple | 1.56 |
---|
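The summary figures can be cross-checked against the per-test rows. The sketch below is Python and rests on assumptions inferred from the numbers rather than stated in the tables: that each read and write multiple is Dolt divided by MySQL, that the TPC-C multiple is MySQL TPS divided by Dolt TPS, and that each mean is a plain arithmetic mean rounded to two decimals. The variable names are illustrative.

```python
from statistics import mean

# "Multiple" column values copied from the read and write tables above.
read_multiples = [0.32, 1.33, 1.63, 1.06, 0.88, 1.44, 1.54, 1.79, 1.65, 0.95, 1.66]
write_multiples = [0.71, 0.75, 1.31, 0.76, 0.73, 1.11, 0.81]

reads_mean = round(mean(read_multiples), 2)
writes_mean = round(mean(write_multiples), 2)

# TPC-C reports throughput, so the ratio is assumed to be MySQL / Dolt.
tpcc_multiple = round(97.61 / 39.19, 2)

# Assumed: the overall figure is the mean of the three summary multipliers.
overall = round(mean([reads_mean, writes_mean, tpcc_multiple]), 2)

print(reads_mean, writes_mean, tpcc_multiple, overall)
```

Under those assumptions the computed values match the reported reads_mean_multiplier, writes_mean_multiplier, tpcc_tps_multiplier, and Overall Mean Multiple.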