duckdb 0.4.0 on Python PyPI

This preview release of DuckDB is named "Ferruginea" after the Andean Duck.

Binary builds are listed below. Feedback is very welcome.

Note: This release should be backwards-compatible with respect to the on-disk storage format, but the next release may well be incompatible again, so please don't rely on this yet. To migrate your data, we suggest running the EXPORT DATABASE command with the old version, followed by IMPORT DATABASE with the new version. See the documentation for details.
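
As a rough illustration of the EXPORT DATABASE / IMPORT DATABASE migration suggested above, here is a minimal Python sketch. The helper names (migration_statements, export_with_old_version, import_with_new_version) and the paths are hypothetical and not part of DuckDB. The sketch assumes the export step runs in an environment with the old duckdb package installed and the import step in one with the new package.

```python
from typing import Tuple


def migration_statements(export_dir: str) -> Tuple[str, str]:
    """Build the EXPORT/IMPORT statement pair for one export directory."""
    # Double any single quotes so the path stays a valid SQL string literal.
    quoted = export_dir.replace("'", "''")
    return (f"EXPORT DATABASE '{quoted}'", f"IMPORT DATABASE '{quoted}'")


def export_with_old_version(db_path: str, export_dir: str) -> None:
    """Hypothetical helper: run where the OLD duckdb version is installed."""
    import duckdb  # imported lazily so the statement builder works without it

    export_sql, _ = migration_statements(export_dir)
    con = duckdb.connect(db_path)
    con.execute(export_sql)
    con.close()


def import_with_new_version(new_db_path: str, export_dir: str) -> None:
    """Hypothetical helper: run where the NEW duckdb version is installed."""
    import duckdb

    _, import_sql = migration_statements(export_dir)
    con = duckdb.connect(new_db_path)  # a fresh database file
    con.execute(import_sql)
    con.close()
```

The two helpers are kept separate because a single Python environment normally holds only one installed version of the duckdb package at a time.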

Also note: DuckDB is switching to semantic versioning. Version numbers have the form MAJOR.MINOR.PATCH, incrementing the:

MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards compatible manner, and
PATCH version when you make backwards compatible bug fixes.

However, note that MAJOR is currently 0, and the semantic versioning specification states: "Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable."
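
The versioning rules above can be sketched in a few lines of Python. This is only an illustration of the MAJOR.MINOR.PATCH scheme, not code from DuckDB. The names Version, parse_version, is_api_stable, and may_break_api are made up for this example, and pre-release or build suffixes are deliberately not handled.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into three integers."""
    parts = text.strip().split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return Version(major, minor, patch)


def is_api_stable(version: Version) -> bool:
    """Major version zero is initial development, so the API is not stable."""
    return version.major > 0


def may_break_api(old: Version, new: Version) -> bool:
    """Return True if moving from old to new may bring incompatible changes."""
    if old.major == 0 or new.major == 0:
        # With a 0.y.z version on either side, anything may change.
        return old != new
    # From 1.0.0 onwards, only a MAJOR change signals incompatibility.
    return new.major != old.major
```

Under these rules, is_api_stable(parse_version("0.4.0")) is False, which matches the caveat quoted above.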

Below is a list of changes in this release.

Major Changes & Features

#3767: Table function rework, parallel Julia DF scans & Python regression tests
#3749 & #3747: Query cancellation with CTRL-C for R and Python clients
#3700: Support Parallel Order-Preserving Result Set Materialization
#3696: Support WINDOW FILTER
#3620: HTTP read optimization
#3668: Adding alias type
#3435: Add support for reading newline-delimited JSON
#3783: Extension loading by statically linking DuckDB

Minor Changes & Bug Fixes

#3905: Fix SQLancer CI
#3904: Fix #3896: correctly compute GroupRowsAvailable in struct reader in case a child-entry is not just a list, but a struct with only list entries
#3902: Fuzzer: fix sanitization of address sanitizer error
#3901: R: Extract DetectLogicalType() function
#3899: R: Check query return type instead of query type in dbFetch()
#3898: Issue #3880: Rebind DATE_TRUNC dates
#3894: Purge concurrent queue when enqueueing entries to prevent entries from piling up
#3892: Fix for issue #3878
#3889: Fix TreeRenderer crash on invalid UTF8
#3888: Julia Table Functions: add stack trace to errors reported
#3887: Correctly reset interrupted flag so verification does not overwrite original error
#3886: Remove the check_tread from python connection
#3879: Avoid title is too long error in fuzzer issue submission
#3877: Fix use-after-free in create view with prepared statement parameter
#3872: Glob with search paths
#3871: [Python] Making new connections to cursors and adding lock on queries over same connection
#3869: Several OSSFuzz fixes
#3865: Fix #3860: add support for creating foreign keys on temporary tables, and for now disable support for cross-schema foreign keys
#3863: Out-of-tree Extensions for Windows
#3862: Rework of Struct <> Dictionary Vectors, and add test_vector_types function
#3852: Added support for generated columns to TableCatalogEntry->ToSQL()
#3850: Enable EXTENSION_STATIC_BUILD for Mac too
#3849: [Python] Unbundle Substrait
#3848: Parquet: fix for fixed length byte arrays in dictionary column reader
#3847: Expand oss-fuzz tests to run queries and check for internal errors
#3846: Pass through read only flag for node connector
#3845: Add queries over Arrow to Python regression tests, and time entirety of TPC-H
#3843: [JDBC] Pass through scale and precision for decimal types from DuckDBColumnTypeMetaData
#3842: Allow to use custom memory allocator through DuckDB API on Windows
#3837: Fix overflow in generate_series and overflow in abs operator
#3832: Issue #3816: Parquet Time Zones
#3831: s3fs decode keys correctly
#3828: Update testthat snapshots
#3818: Add SQLancer to CI Fuzzing Framework
#3815: Out-of-tree Extension Builds
#3812: Fix several issues found by Valgrind
#3810: DuckDB.jl Julia Package History
#3809: Add shell: bash everywhere
#3802: fix ci breaking from extension PR
#3799: Optimisation rule for regexp_matches with literal pattern
#3798: Substrait: Adding more compatibility with Substrait and Ibis
#3792: Issue #3790: Temporal IsFinite/IsInf
#3791: Issue #3721: Rightshift Negative Hugeint
#3786: Fix binding of fully qualified view reference
#3785: Python: Allowing cursor to set check threads flag
#3784: Improve speed of ALTER TABLE ADD COLUMN
#3778: More node types
#3777: Python: Updating Stubs and Bringing Stubs tests back
#3776: Simplify clangd target
#3775: Expose dbgen speed_seed functions on header file and add missing ones
#3771: Increment R package version
#3765: Issue #3759: Node Time Zone
#3764: Issue #3763: List Min/Max Problems
#3761: Fix .import not creating missing table in CLI
#3760: Requiring keys provided to map to be unique
#3757: Fix #3756: fix issue when running blockwise NL join on dictionary vectors of structs
#3752: Fixed error handling for node exec()
#3751: Decreasing the overallocation for list aggregates
#3750: Fix a bug in HyperLogLog
#3746: Check if replacement scans don't leak memory
#3745: Arrow/Pandas Case Insensitive Columns
#3744: Treating ENUM Case in pyresult describe
#3739: DuckDBPyRelation: support offset argument for limit()
#3738: Fix #3730: avoid modifying the payload in-place in aggregate hash table, because it might be used multiple times in case of grouping sets
#3736: JDBC better error handling
#3733: Progress bar clean-up: fix thread sanitizer issue, and move progress bar code to individual operators
#3720: Issue #3515: Add statistical rounding
#3707: Fix #3702: avoid assertion that we are not storing internal entries in the file
#3706: Implement sqlite3_file_control and sqlite3_sleep
#3705: Add support for ENUM converted types in the Parquet reader
#3699: Zero-copy scans for non-list uncompressed segments
#3695: Only rename pandas columns that have duplicates
#3692: Compatibility with dev dbplyr
#3691: Fix #3690: correctly assign catalog set to default objects to avoid crash when used as dependency
#3681: R: Fail CI/CD on NOTEs, check examples on UBSAN, log valgrind output
#3677: Fuzzer fix: avoid reporting non-internal errors
#3676: More ccache removal from OSX Extension Release
#3675: More extensive SQLLogicTest testing, and temporarily disable OR pushdown
#3667: Handling dataframes with repeated names in columns outside the bind. Now when registering df for scan.
#3665: Delete correct revision in pypi cleanup script
#3664: try/except in pypi cleanup
#3663: Return PY registered objects from temporary views
#3662: Remove CCache from the OSX Extensions Release build
#3661: Automatic PyPI cleanup in CI
#3653: Fixing enum comparison at where clause to TRY_CAST
#3652: to issue#3475 optimize CSG & CMP enumeration of join order optimizer
#3650: Issue #3610 mem leak
#3648: Julia DataFrame Scan Performance Improvements & TPC-H Tests
#3646: ODBC: adjustments because of ADO
#3643: Fix for #3639, don't use string copy and value api to fill factor vector
#3635: Avoid running approx quantile with vsize=2
#3634: Fix some issues with the fuzzer auto-closing issue behavior
#3633: Add default type generator, move built-in types to default type class and improve error reporting for types
#3632: Check for div by zero in distinct stats
#3630: Fix issue 3611
#3629: S3 Minio fix
#3628: Issue #3625: Adding canonical guards around Arrow CData Interface
#3624: Add interval to DBAPI description
#3615: Fix #1785: correctly copy constraints in ADD COLUMN of alter table
#3614: Correctly propagate what a statement returns from the binder
#3613: SQLSmith fuzzer fixes
#3612: SQLite UDF fixes for writefile and friends
#3609: Fix operator precedence of ** in the parser
#3608: Turn the expression depth limit into a configurable parameter
#3607: Implements enter and exit functions on pyconnection to allow the use of context managers
#3606: Use Python 3 for configuring R
#3604: Equal or null optimization
#3603: Fixing ascii bug in histogram strings
#3602: Support for Arrow Timezone
#3598: Add auto-commit off to JDBC Connection
#3594: Issue #3588: Half constant BETWEEN
#3592: Issue #3444: Approximate quantile lists
#3589: Issue #1187: Virtual Generated Columns
#3576: More compliant with substrait and upgrading version up to 0.1.2
#3575: Issue #3534: Remove TIMESTAMPTZ casts
#3574: Issue #3430: Temporal Infinity Values
#3571: Fixing JNI, matching function signature exactly
#3569: Implicit struct_pack
#3564: Fix for #3562
#3551: Issue #2309: Update benchmark info in README.
#3550: ICU Extension Rework: clangd for extensions
#3547: Issue #3273: support multi-statements for JDBC driver
#3546: Issue #2910: Support pandas boolean datatype
#3533: Exit with the correct exit code in the regression test runner
#3531: Correctly increment list offset on histogram aggregation
#3528: Julia Client - re-enable parallelism by executing tasks on dedicated Julia threads
#3524: Rework table-in-out function API, and move Unnest table function to table-in-out function
#3523: Improve HyperLogLog
#3519: Support in-place updates for unsigned integers
#3516: Issue #3497: Round DECIMAL casts
#3514: Issue #3453: Window Partition Collections
#3512: Issue #3418: Match Multiple Spaces
#3511: Fix #3505: Correctly handle Foreign Key syntax for when primary-key columns are not specified
#3507: Fix merge conflicts
#3504: ODBC: issue #3398
#3503: ODBC: issue #3478
#3502: Random-value generation clean-up, and move aux data in client context to separate ClientData class
#3500: Bug fixes for ENUMs
#3498: Relational API basics for R client
#3495: R: support structs
#3481: List distinct and list unique functionality
#3474: Unified BufferedCSVReaderOptions parsing
#3470: Force aggregates to have a Combine method, expose bind data in combine & general bind data clean up
#3469: Add duckdb.lib to Windows release package
#3467: ODBC: PowerBI showing column headers
#3464: CSVReader option 'ignore_errors'
#3456: Add C API functions to build list/map types and read map types
#3454: CMake install DLL file on Windows platform
#3442: ICU Extension Rework: No longer use ICU amalgamation, and update ICU data to 71
#3437: Implement JNI class, method and field caching
#3420: Expose get table names from conn to python
#3416: R extension loading
#3410: Turn SQLSmith into an extension, add CI fuzzing framework, and add automatic SQL test case reduce functionality
#3405: Issue 3403: Logical Type Append
#3389: Issue #3187: Implement strptime_icu
#3388: CI: Use ccache and clang-tidy-cache
#3386: Issue #3384: DATE_TRUNC for INTERVAL
#3382: Fixing python dependency memory leaks
#3375: Rebind prepared statements in case of type ambiguities, rather than default to VARCHAR
#3346: list_sort function support
