This preview release of DuckDB is named "Spectabilis" after the King Eider (Somateria spectabilis).
Binary builds are listed below. Feedback is very welcome.
Note: Again, this release introduces a backwards-incompatible change to the on-disk storage format. We suggest you use the EXPORT DATABASE command with the old version followed by IMPORT DATABASE with the new version to migrate your data. See the documentation for details.
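The migration described above can be sketched with the Python client. This is a minimal sketch, not an official migration script: the helper names, the file names `old.duckdb` and `new.duckdb`, and the directory `export_dir` are hypothetical, and `con` is assumed to be a connection returned by `duckdb.connect(...)`.

```python
def _quote(path):
    # Escape single quotes so the path is a valid SQL string literal.
    return path.replace("'", "''")


def export_database(con, target_dir):
    """Run with the OLD DuckDB version: dump schema and data into target_dir."""
    con.execute(f"EXPORT DATABASE '{_quote(target_dir)}'")


def import_database(con, source_dir):
    """Run with the NEW DuckDB version: reload the dump into a fresh database."""
    con.execute(f"IMPORT DATABASE '{_quote(source_dir)}'")


# Step 1, in an environment that still has the old duckdb package installed:
#   import duckdb
#   export_database(duckdb.connect("old.duckdb"), "export_dir")
#
# Step 2, after upgrading the duckdb package:
#   import duckdb
#   import_database(duckdb.connect("new.duckdb"), "export_dir")
```

The two steps have to run under different DuckDB versions, so they cannot share one process. The exported directory is the only thing carried across the upgrade.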
Below is a list of changes in this release.
Features
- #2393: Switch to Push-Based Execution Model
- #2417: Add support for GROUPING SETS, ROLLUP, CUBE and GROUPING/GROUPING_ID
- #2347: Support for ENUM Types & #2404: Native mapping between R factors and DuckDB ENUMs
- #2419: Allow WHERE clause referring aliases defined in SELECT clause
- #2473: Adding Compression Option for Column Definitions to SQL Parser
- #2489: Implement MAD (Median Absolute Deviation) Aggregate
- #2520: Add LIST_CONCAT and LIST_APPEND functions
- #2522: MSD (Most Significant Digit) Radix Sort
- #2529: Implement REGEXP_EXTRACT
- #2482: Add support for EVEN
- #2555: Adding BINARY_AS_STRING parameter to the parquet scan
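To illustrate what the new ROLLUP and CUBE clauses expand to, here is a small sketch in plain Python. It follows the standard SQL definition (ROLLUP yields every prefix of the column list, CUBE yields every subset) and is not DuckDB's implementation. The column names and sample rows are made up for the example.

```python
from itertools import combinations


def rollup(columns):
    """Grouping sets for ROLLUP(c1, ..., cn): each prefix, longest first, down to ()."""
    return [tuple(columns[:i]) for i in range(len(columns), -1, -1)]


def cube(columns):
    """Grouping sets for CUBE(c1, ..., cn): every subset of the columns."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets


def grouped_sums(rows, columns, value, grouping_sets):
    """Sum `value` once per grouping set; columns outside the set become None."""
    totals = {}
    for grouping_set in grouping_sets:
        for row in rows:
            key = tuple(row[c] if c in grouping_set else None for c in columns)
            # Keying on the grouping set as well keeps a rolled-up None apart
            # from a genuine NULL in the data (the job GROUPING() does in SQL).
            totals[(grouping_set, key)] = totals.get((grouping_set, key), 0) + row[value]
    return totals


# Hypothetical sample data.
sales = [
    {"country": "NL", "city": "Amsterdam", "amount": 10},
    {"country": "NL", "city": "Utrecht", "amount": 5},
    {"country": "DE", "city": "Berlin", "amount": 7},
]

print(rollup(["country", "city"]))
print(cube(["country", "city"]))
print(grouped_sums(sales, ["country", "city"], "amount", rollup(["country", "city"])))
```

In SQL terms, `GROUP BY ROLLUP (country, city)` is shorthand for `GROUP BY GROUPING SETS ((country, city), (country), ())`.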
Minor Changes & Bug Fixes
- #2377: Fix current_schema() and current_schemas()
- #2380: Add tests for "did you mean" error message
- #2381: Let FTS index creation respect the current schema.
- #2382: Dropping support for Python 2
- #2385: BLOB support for JDBC
- #2389: Replace getattr on PyBind11 classes with individual properties.
- #2390: Moving big categorical tests to tests_slow folder
- #2394: Add SET s3_endpoint support for in-house Ceph
- #2397: Add Python .pyi stubs
- #2402: GH Actions Upload: retry asset upload with timeout
- #2403: Not altering original DF when renaming columns in the binder
- #2405: Fix bug with enum::varchar cast on null values
- #2408: Rewrite Arrow table register for R using replacement scans
- #2409: Adding dependency on Enum Types -> Tables
- #2412: Updated src readme to state we use push-based execution.
- #2413: Fix for #2411
- #2421: Fix #2407: use correct template parameters for DATE in arg_min/arg_max
- #2423: Fix #2416: fix binding issues related to binding parameters, null values, etc in list_extract and array_length
- #2424: Issue #2388: QUANTILE_DISC for VARCHAR
- #2430: Fix for #2426
- #2431: More fixes for #2416
- #2434: Issue #2388: Moving VARCHAR QUANTILE_DISC
- #2437: implement GEN_RANDOM_UUID
- #2438: Issue #2432: PERCENTILE_XXXX ignores DESC
- #2439: fix: pass absolute path to System.load() in Java
- #2444: Fix #2440: correctly report run-time errors in Python client
- #2445: Several OSS Fuzz Fixes
- #2448: Move to codecov v2, and use add_library for vector operations for low RAM machines
- #2449: Enum to Enum Comparisons
- #2451: Hooold the loooock for Python Strings under Dataframe Object Columns
- #2453: Fix when getting single values from ENUMs
- #2455: Doc Improve: Trying to Update doc in QueryResult
- #2456: Add --test-dir parameter to unittest so we can test out-of-tree extensions with it
- #2462: Fix #2452: Implement Coalesce instead of rewriting to CASE chain
- #2463: More OSS Fuzz fixes
- #2474: Allow Fetch of chunks containing a multiple of vector sizes for the Arrow Record Batch Reader
- #2476: Really holding the lock this time
- #2477: Sorted aggregate: only re-order when ordering count > 0
- #2485: Benchmarks: Handle comparisons for values that do not have a VARCHAR -> TYPE cast (e.g. complex/list types)
- #2487: Conditionally define UNLIKELY in Thrift
- #2488: ODBC: fetching the first chunk in SQLExecute
- #2490: Clean up RadixSort code
- #2491: These tests are now passing with arrow 6
- #2494: Emit full vectors from VALUES lists instead of emitting individual tuples
- #2497: Upgrade Catch to v2.13.7
- #2500: Fix nested string order
- #2501: Add CIFuzz action
- #2503: Fix for py string conversion on large strings
- #2504: More precision for SUM and AVG
- #2517: Move Kahan sum to separate method (fsum, sum_kahan)
- #2521: Issue #2515: Windowed quantile list
- #2527: testing: Add oss-fuzz fuzzer
- #2536: Fix #2518: in read_csv_auto don't override names if names have been provided
- #2537: Fix #2531: in recursive CTE avoid waiting for events to finish if an event has thrown an error
- #2539: Restructuring CI Workflow
- #2542: Issue #2530: Reset windowed lists
- #2550: ODBC: Running PSQLODBC tests on Win64
- #2556: Fixed directory separator bug
- #2558: Fixing positional reference binding in ORDER BY clause
- #2559: Issue #2552: General ordered aggregates
- #2561: Fix master CI: Python workflow needs auth tokens for deployment
- #2563: Move ExtensionHelper into main DuckDB Class
- #2564: R Client: Moving UTF encoding to R to avoid multithreading issues
- #2567: Fix #2538: crash in CSV auto-detect when reading ZSTD data
- #2573: Move HTTPFS builds to separate test to avoid deploying them by default on Linux
- #2574: Naming optimizer-created aggregates so plans are interpretable
- #2579: Support for Arrow 6
- #2580: Check magic bytes before checksum when opening a DuckDB database file
- #2585: Fix #2584: correctly handle edge cases for bigger than 1 increments in range table function
- #2587: Replacement Scans for Arrow Objects
- #2592: Fix #2577: Rework case statement to avoid rewrite into nested binary cases
- #2593: Fix #2591: avoid using ungrouped aggregate for non-combineable aggregates
- #2594: Fix #2588: timestamp -> date cast is not invertible
- #2595: Fix #2543: case insensitive replacement scans
- #2598: OSS Fuzz Fixes
- #2603: The asset upload script for releases was broken somehow
- #2605: Clean up reupload by splitting it into two functions
- #2604: Fix #2599: maintain correct dependencies between UNION ALL nodes so that output is deterministic/in-line with what a sequential execution would produce
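Several entries above (#2504, #2517) concern floating-point summation precision. As background, the following is a textbook sketch of Kahan (compensated) summation in Python. It shows the general technique only and makes no claim about how DuckDB's `fsum` or `sum_kahan` are implemented internally. The input values are chosen for illustration.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan summation: carry the rounding error of each addition forward."""
    total = 0.0
    compensation = 0.0  # estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # apply the correction to the next term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


# Each 1e-16 is below half the spacing of doubles near 1.0, so plain
# accumulation discards every one of them and stays at exactly 1.0.
values = [1.0] + [1e-16] * 10

print("naive:", naive_sum(values))
print("kahan:", kahan_sum(values))
print("fsum: ", math.fsum(values))  # correctly rounded reference from the stdlib
```

The naive loop is written out explicitly because the built-in `sum()` in recent Python versions already applies compensated summation to floats.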