This is a bug fix release for various issues discovered after we released 0.10.1. There are no new features, just bug fixes. Database files created by DuckDB v0.10.* or v0.9.* can be read by DuckDB v0.10.2.
SQL Modifications
This release has a number of bug fixes that change SQL semantics in a few edge cases:
- Nested Boolean Comparisons now have consistent NULL comparison semantics - #11496
- Structs with non-matching keys require explicit casts when compared or combined - #11396
What's Changed
- Bump julia version & fix release script for sub-versions > 9 by @Mytherin in #11225
- Flatten Rewrite by @maiadegraaf in #11223
- ORDER BY ColumnNumber with Collations by @tiagokepe in #11139
- Fix differences to implementation for to_parquet, write_parquet, to_csv, write_csv, Expression.alias, DuckDBPyRelation.map by @binste in #11135
- Issue template: Ask for MWEs by @szarnyasg in #11192
- Cleaning up FSST: Remove unused AVX512 code by @hannes in #11222
- Fix #11211 - correctly fill in string_t padding for bit type by @Mytherin in #11231
- Fix #3391: Stop creating background threads if the thread constructor throws an exception by @Mytherin in #11236
- R_CMD_CHECK: Pin to duckdb/duckdb-r 0ed106a71c by @carlopi in #11245
- Add support for HEX(BLOB) by @Mytherin in #11243
- Remove no_vector_verification in Map Subscript Test by @maiadegraaf in #11242
- Update logos in README by @szarnyasg in #11256
- Ignore user defined parameters that change names or types of csv columns in sniffer's prompt. by @pdet in #11257
- [Python] Fix error caused by looking up a TypeCatalogEntry without an active transaction. by @Tishj in #11255
- [Fix] Fuzzer issue in list_select by @taniabogatsch in #11248
- [Parquet] Support for LZ4 Compression by @hannes in #11220
- Fix #11254: Add missing includes to terminal by @Mytherin in #11265
- Issue #10867: AsOf Predicate Pushdown by @hawkfish in #11233
- Fix plan cost runner regression script by @Tmonster in #11129
- Check if we need to throw any remaining errors at end of CSV scanning by @pdet in #11276
- Allow duplicate names in json objects when ignore_errors is true by @lnkuiper in #11271
- Do not surround JSON with quotes in sqlite shell output by @lnkuiper in #11268
- add TRIM support to virtual filesystem, and implementation on linux by @jkub in #11258
- Perform direct write operation if input data are larger than buffer size by @quentingodeau in #11203
- Fuzzer fixes by @lnkuiper in #11286
- Compile spatial also for rtools by @carlopi in #11291
- allow injecting custom BufferManager implementation by @jkub in #11215
- Default to RECORDS in JSON reader if more than one column is specified by @lnkuiper in #11295
- Add support for materialized CTEs in INSERT/UPDATE/DELETE statements by @kryonix in #10878
- Only throw exception if `je_mallctl` fails in DEBUG mode by @lnkuiper in #11303
- Fixing casting issue in generators by @hannes in #11304
- Rework `FileSystem::OpenFile` call, and add `FILE_FLAGS_NULL_IF_NOT_EXISTS` by @Mytherin in #11297
- Fix potential UB when `list()` aggregate is used in combination with other arena using aggregate functions by @Maxxen in #11306
- Fix #11293 - for ARRAY([subquery]) explicitly push the ORDER BY of the underlying subquery into the array aggregate by @Mytherin in #11316
- Fix #11281: explicitly select column types of information_schema tables for all columns, even if they are always NULL by @Mytherin in #11317
- Fixup py upload by @carlopi in #11308
- Issue #11279: TIMESTAMP => TIMESTAMPTZ by @hawkfish in #11320
- Fix null pointer exception when rolling back updates if the rollback was caused by an OOM by @Mytherin in #11309
- Fix #11283 - report consistent foreign key constraint name in information_schema by @Mytherin in #11318
- Fix #11294 - avoid applying Filter Pushdown optimization for UNION/EXCEPT without ALL by @Mytherin in #11315
- Fix #10695 - handle ? prepared statement parameters correctly for POSITION(x IN y) by @Mytherin in #11314
- Windows CLI - emit UTF8 directly using SetConsoleOutputCP(CP_UTF8) if possible by @Mytherin in #11324
- Fix #11319: use modulo when computing day of the week in excel extension by @Mytherin in #11328
- [CI] Fix bash syntax in TwineUpload by @carlopi in #11333
- Fix #11284: avoid adding the same column multiple times to a primary key/unique constraint name list by @Mytherin in #11325
- In ColumnData, limit scan to the current count in the column by @Mytherin in #11329
- Issue #11269: DISTINCT Sorted Aggregates by @hawkfish in #11321
- [Attach] Fix bug causing sequences to break attaching databases. by @Tishj in #11327
- Flatten hash vector before combining list hashes by @lnkuiper in #11340
- Make sniffer more consistent when nullpadding/ignore_errors are on by @pdet in #11313
- fix(arrow): union buffer count & handle schema errors by @Mause in #11326
- fix duckdb-r script by @Tmonster in #11345
- Fix regression_test_runner.py by @carlopi in #11346
- Issue #11234: IEJoin Scan Reset by @hawkfish in #11347
- TPC-H: Use BIGINT for ID fields schema where required by the specification by @szarnyasg in #11341
- Another round of polishing staged releases by @carlopi in #11342
- CI: Remove issue labeling workflow by @szarnyasg in #11355
- RE2 upgrade to version 2023-02-01 by @hannes in #11252
- File System: Add `optional_ptr<FileOpener>` to various calls, and add support for attaching DuckDB files over S3 by @Mytherin in #11343
- README: Display different logo for light/dark mode by @szarnyasg in #11366
- Fix bug in duckdb_bind_blob by @pfarndt in #11368
- Fix OSX CI by @samansmink in #11379
- Enable clang-tidy on headers and fix all headers to conform to our clang-tidy rules by @Mytherin in #11376
- Add logical_type to parameters of format_pg_type by @Flogex in #11369
- Issue #10965: RESPECT IGNORE NULLS by @hawkfish in #11372
- Fix building issues in WIN32, remove unnecessary modification. by @kindred77 in #11356
- Zero-initialize aggregate states with destructors immediately after allocating by @lnkuiper in #11360
- Update README.md by @jingshi-ant in #11357
- Issue #10885: Negative Window RANGEs by @hawkfish in #11390
- Issue #11377: Invertible TIMESTAMP_XXX Casts by @hawkfish in #11392
- Update `__init__.py` To export "extract_statements" function by @oomojola in #11394
- Internal #1657: Stricter STRUCT Casts by @hawkfish in #11396
- allow set readonly on attached db by @stephaniewang526 in #11397
- Give preference to FSSPEC defined FS by @pdet in #11400
- Default serialize `optional_idx`, add `skip_default` option to `json_serialize_sql()` by @Maxxen in #11405
- CI: Also label PRs as 'stale' and close them when there's no activity by @szarnyasg in #11420
- fix(jdbc): 1-index getBytes() by @Mause in #11421
- Remove redundant default descriptions by @szarnyasg in #11415
- clang-tidy: enable `cppcoreguidelines-pro-type-const-cast` by @Mytherin in #11414
- clang-tidy: enable `cppcoreguidelines-avoid-non-const-global-variables` by @Mytherin in #11424
- Issue #11419: Quantile Order By by @hawkfish in #11428
- [CSV Sniffer] Give preference to quoted candidates by @pdet in #11418
- clang-tidy: enable `cppcoreguidelines-virtual-class-destructor` by @Mytherin in #11437
- clang-tidy: enable `cppcoreguidelines-[interfaces-global-init|slicing|rvalue-reference-param-not-moved]` by @Mytherin in #11435
- Fix #11393 - improve error message when trying to use a lateral join column in a table function that does not support it by @Mytherin in #11436
- add readonly to duckdb_databases() by @stephaniewang526 in #11429
- Fix missing opener propagation by @quentingodeau in #11454
- Fix #11246: Use SetConsoleCP function to set input to UTF8 when reading by @Mytherin in #11452
- CLI: Add support for ".edit" or "\e" by @Mytherin in #11447
- Fix VS2022 Preview ClangCl build by @bodand in #11456
- Remove an unnecessary line from bind_insert.cpp by @huachaohuang in #11443
- [CI] Skip ccache for R.yml by @carlopi in #11459
- Improve binding of CTEs by @kryonix in #11399
- Move BindCreateIndex from Catalog to Binder by @philippmd in #11402
- [Substrait-ADBC] Fix for substrait plan execution via ADBC by @pdet in #11358
- Removing abort() from RE2 again because Google refuses to use exceptions by @hannes in #11458
- Defer allocation in read_json by @lnkuiper in #11378
- [ODBC] Add escape character to ParseStringFilter to support Power Query ('table_name' is escaped to 'table\_name') by @guenp in #11432
- Bump to post-portfile change for duckdb_azure by @carlopi in #11476
- Reduce memory usage of DELETE operations by @Mytherin in #11470
- Use `optional_idx` in more places by @Mytherin in #11466
- Revert "Move BindCreateIndex from Catalog to Binder" by @Mytherin in #11478
- [Arrow] Throw on invalid STRUCT type by @Tishj in #11464
- [Dev] Do not use CatalogEntry references inside Dependency objects. by @Tishj in #11408
- Fix extension builds by @carlopi in #11486
- [Fix] Throw BinderException for UNNEST expressions in WINDOW expressions by @taniabogatsch in #11247
- Check for IUTF8 flag defined before setting it by @patmaddox in #11488
- Fix #11445: correctly detect recursive aliases when using struct unnest by @Mytherin in #11497
- Fix #11444: avoid using recursion in string -> list parsing by @Mytherin in #11498
- Add serialization for `LogicalCopyDatabase` operator by @Flogex in #11401
- add support in Julia appender for missing and nothing values by @rdavis120 in #11508
- [Python] Produce `datetime.time` values when converting TIME columns to Pandas DataFrame by @Tishj in #11468
- [Fix][ADBC] Implement required ADBCConnectionGetObjects schema by @joellubi in #11446
- Add support for decimal modulo operation by @Mytherin in #11506
- Move `CompressedMaterialization` inside of `StatisticsPropagator` by @lnkuiper in #11495
- Bump stale bot version by @szarnyasg in #11509
- Rework issue workflow by @Mytherin in #11522
- [RE2] Add includes and remove potential throw from destructor by @carlopi in #11513
- Issue #11292: Nested Boolean Compares by @hawkfish in #11496
- [Dev] Initialize new buffers with garbage data if `DESTROY_UNPINNED_BLOCKS` is set by @Tishj in #11270
- Fix timeout in async workflow by @samansmink in #11525
- Move assertion in `json_scan.cpp` by @lnkuiper in #11530
- Issue #11518: TryParseTime by @hawkfish in #11519
- Fuzzer Bugfixes by @Maxxen in #11544
- [CI] Fix CI failure on `C Enum Integrity Check` by @Tishj in #11547
- [ICU] Use the correct lookup precedence for TimeZone settings by @Tishj in #11546
- [CI] Move from default GITHUB_TOKEN to specific one by @carlopi in #11556
- [CI] Fix Deploy step to execute only for duckdb organization by @carlopi in #11553
- Rework `RadixPartitionHashTable` task assignment in source phase by @lnkuiper in #11528
- Run new micro benchmarks in CI when they are added by @Tmonster in #11532
- Rework `vector_hash` for ARRAYs by @Maxxen in #11558
- [Dev] Add assertions around Uncompressed String storage by @Tishj in #11267
- python: Add missing global options to write_csv by @jzavala-gonzalez in #10382
- [Python] Fix issue with lists containing dictionaries of different sizes by @Tishj in #11095
- [Dev][Python] Add nightly test to execute all sqllogic tests using the python package by @Tishj in #11137
- Parquet Writer: Early out creating dictionary by @lnkuiper in #11461
- ODBC driver should ignore "driver" and "trusted_connection" keywords in connection string by @guenp in #11382
- [ODBC] Fix: Support loading UTF-8 encoded data with Power BI by @guenp in #11423
- Draft permissions - bot does not have permission for drafting by @Mytherin in #11575
- CI: Remove 'needs reproducible example' when 'reproduced' label is applied by @szarnyasg in #11576
- Various fixes & clean-up around STRUCT UNNEST by @Mytherin in #11580
- Update token by @Mytherin in #11592
- Update issue template by @szarnyasg in #11577
- [CI] Remove GITHUB_PAT variable from R-CMD-check by @carlopi in #11593
- Respect read-only mode in dbgen and dsdgen by @Mytherin in #11585
- Bump-back duckdb_azure to pre-lzma custom vcpkg-port by @carlopi in #11595
- Correctly handle database names with quotes in USE statement by @Mytherin in #11587
- Bump postgres version and build arrow also for windows by @carlopi in #11604
- Support reading gzipped files in the test runner by @chrisiou in #11600
- initializes unknown indexes on catalog lookup by @Maxxen in #11551
- Fix Progress Bar for many large CSV Files + Adjustment to not store buffers from compressed files over single threaded scans by @pdet in #11273
- CSV Rejects Tables 2.0 by @pdet in #11512
- Fix topn placement by @Tmonster in #11601
- Fix various issues found by oss-fuzz by @Mytherin in #11613
- [ODBC] Fix: timestamps and times are parsed as dates by Power Query by @guenp in #11610
- Fix various fuzzer issues, move fuzzer scripts into this repo, and expand `reduce_sql_statement` to improve test case reduction capabilities of fuzzer by @Mytherin in #11622
- [Dev] Make the `extension_entries.hpp` generation script more modular by @Tishj in #11623
- [Fix][ADBC] Don't filter system catalogs/schemas in ConnectionGetObjects by @joellubi in #11618
- Add pyodide wheel building github action by @cpcloud in #11531
- Move away from dynamic_cast to Cast<> infrastructure by @carlopi in #11619
- Extension Metadata by @carlopi in #11515
- [Dev] Regenerate query string for `IndexCatalogEntry`. by @Tishj in #11462
- Upload pyodide by @carlopi in #11626
- Add docker alpine build to check on builds by @carlopi in #11490
- Add Vector Similarity Search (VSS) Extension by @Maxxen in #11614
- Metadata fix by @carlopi in #11629
- Fix extension config for arrow, remove patch from sqlite by @carlopi in #11628
- [CSV Reader] Resets the buffer manager over recursive scans by @pdet in #11631
- Make path to append_metadata.cmake relative to top-level CMakeLists.txt by @Flogex in #11635
- [CSV Reader] Fixes an issue with conflicting strategies for buffer cleaning by @pdet in #11630
- Fix more issues found by the fuzzer, extend SQL reduction further by @Mytherin in #11642
- fix(jdbc): support non-string parameter types by @Mause in #11646
- Few more fuzzer fixes by @Mytherin in #11648
- Bump spatial by @Maxxen in #11650
- Avoid performing Apple codesign on extensions by @carlopi in #11652
- Filter out single relation predicates before join ordering by @wangxiaoying in #11645
- Fix `last_value` in the `duckdb_sequences` metadata function by @Tishj in #11465
- Limit batch insert threads based on available memory, similar to Parquet write by @Mytherin in #11655
- [Vacuum] Fix serialization and Copy of the VacuumStatement by @Tishj in #11656
- More index initialization by @Maxxen in #11659
- Skip tests with the unzip keyword in python and disable unzip.test for 32bit systems by @chrisiou in #11658
- Bump extension versions, remove patches by @carlopi in #11662
- Accept a list of multiple nullstring values for CSV Files by @pdet in #11616
- Include falloc to fix build on some Linux systems by @zmbc in #11663
- Fix #11469 - make unnest parameters case-insensitive by @Mytherin in #11667
- Fix #11467: correctly merge unnamed structs and structs in CombineEqualTypes by @Mytherin in #11668
- Skip ADBC tests if python version is not 3.9 or higher by @pdet in #11653
- Fix #11621 - correctly zero-initialize padding bits in bitpacking compression by @Mytherin in #11671
- Fix #11542 - correctly check if a column data segment has updates, and clean up the updates by @Mytherin in #11670
- Make UNION BY NAME also use ForceMaxLogicalType, similar to UNION by @Mytherin in #11665
- Fix extension_version propagation for external extensions by @carlopi in #11672
- Allow decimal type in CSV auto_type_candidates option by @pdet in #11675
- Fix #11484: support constant indexes in ARRAY - e.g. `ARRAY(SELECT .. ORDER BY 1)` by @Mytherin in #11674
- Improve hive type auto-casting so that it looks at all files instead of only the first file by @Mytherin in #11676
- Fix #11669: deduplicate column names in pivot correctly by @Mytherin in #11678
- Disable setting console pages by default, and add .utf8 setting by @Mytherin in #11682
- bump vss, handle reverting append when index is unknown by @Maxxen in #11681
Full Changelog: v0.10.1...v0.10.2