This preview release of DuckDB is named "Fulvigula" after the Mottled duck (Anas fulvigula), which lives in the Gulf of Mexico, where it is apparently highly prized amongst (heartless) hunters.
There are two SQL-level breaking changes in this release:
- #7174 The default sort order switched from `NULLS FIRST` to `NULLS LAST` because this is more intuitive, especially in conjunction with `LIMIT`.
- #7082 The division operator `/` will now always lead to a floating point result even with integer parameters. The new operator `//` retains the old semantics. This change is consistent with Python.
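The two breaking changes above can be modelled in plain Python, since the release notes describe the new division behaviour as consistent with Python. The following is only an illustrative sketch and does not call DuckDB: `order_by` is a hypothetical helper that imitates an ascending `ORDER BY`, with `None` standing in for `NULL`. The division part sticks to non-negative operands, because rounding of negative quotients can differ between Python's floor division and a database's integer division.

```python
# Pure-Python model of the two SQL-level changes; no DuckDB required.

# 1) Division: `/` always yields a float, `//` keeps integer semantics.
print(7 / 2)   # 3.5
print(7 // 2)  # 3
print(6 / 2)   # 3.0 (a float even when the quotient is whole)

# 2) Null ordering: model NULL as None and sort ascending.
values = [3, None, 1, None, 2]


def order_by(rows, nulls_last=True):
    """Hypothetical helper: ascending sort with None placed last or first."""
    non_null = sorted(v for v in rows if v is not None)
    nulls = [v for v in rows if v is None]
    return non_null + nulls if nulls_last else nulls + non_null


limit = 2
# Old default (NULLS FIRST): a LIMIT returns only NULLs.
print(order_by(values, nulls_last=False)[:limit])  # [None, None]
# New default (NULLS LAST): a LIMIT returns the smallest real values.
print(order_by(values, nulls_last=True)[:limit])   # [1, 2]
```

Queries that relied on the old behaviour can state the ordering explicitly with `NULLS FIRST`, or use `//` where integer division is intended.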
Note: Again, this release introduces a backwards-incompatible change to the on-disk storage format. We suggest you use the `EXPORT DATABASE` command with the old version followed by `IMPORT DATABASE` with the new version to migrate your data. See the documentation for details.
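As a rough sketch of that migration, the snippet below only builds the two command lines and leaves running them to the reader. Everything named here is an assumption for illustration: the binary paths `./duckdb-old` and `./duckdb-new`, the database file names, the export directory, and the use of the CLI's `-c` option to run a single statement. `sql_quote` and `migration_commands` are hypothetical helpers, not part of DuckDB.

```python
def sql_quote(path: str) -> str:
    """Wrap a path in single quotes, doubling embedded quotes for SQL."""
    return "'" + path.replace("'", "''") + "'"


def migration_commands(old_cli, new_cli, old_db, new_db, export_dir):
    """Build argv lists: export with the old binary, import with the new one."""
    target = sql_quote(export_dir)
    export_cmd = [old_cli, old_db, "-c", f"EXPORT DATABASE {target};"]
    import_cmd = [new_cli, new_db, "-c", f"IMPORT DATABASE {target};"]
    return export_cmd, import_cmd


# Hypothetical paths; replace them with your own binaries and files.
export_cmd, import_cmd = migration_commands(
    "./duckdb-old", "./duckdb-new", "data.db", "data_v080.db", "export_dir"
)
print(export_cmd)
print(import_cmd)
```

Each list could then be passed to `subprocess.run(cmd, check=True)`, export first and import second, so that a failed export stops the migration before the new file is written.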
What's Changed
- Issue 5984 #4 LogicalColumnIndex out of range Error by @Tmonster in #6303
- Implementing Integration with PyTorch by @pdet in #6295
- Implement #4941: Python client: for streaming fetches construct a streaming result (fetch_one, record_batch_reader, etc) by @Mytherin in #6346
- Implement sharable Buffer Pool across DatabaseInstances by @jkub in #6299
- Add table functions range and generate_series for TIMESTAMPTZ by @papparapa in #6285
- Add Initial DuckDB Swift API by @tcldr in #6351
- Integration with TensorFlow Tensors by @pdet in #6348
- Windows - remove delayload code and enable statically linking extensions by default by @Mytherin in #6399
- Add support for Pivot/Unpivot statements by @Mytherin in #6387
- [C-API] Add support for StreamQueryResult by @Tishj in #6318
- [Swift] add remaining non-composite types by @tcldr in #6422
- [Swift] Add Prepared Statements by @tcldr in #6459
- [Python] Exclude jemalloc files while pip install on Android OS by @papparapa in #6450
- CI: Swap cron for repository_dispatch by @carlopi in #6498
- CI improvements + add version badge to README by @carlopi in #6493
- Storage: store lists as uint64 offsets instead of as list_entry_t by @Mytherin in #6499
- two changes facilitating sending table/column stats over the wire (M… by @peterboncz in #6440
- Rework Value class internals to have a similar structure to LogicalType and others by @Mytherin in #6503
- Remove unswizzle flag from SortedData::Unswizzle by @lnkuiper in #6501
- [Swift] Add Appender by @tcldr in #6482
- JDBC: Remove DuckDBDatabase by @MariusVolkhart in #6426
- Add nan and inf arithmetic by @Tmonster in #6415
- Update `tools/rpkg` README.md by @Tishj in #6530
- Merge feature into master by @Mytherin in #6534
- Restrict threads for reliability. by @hawkfish in #6540
- Replace replace with format strings by @domoritz in #6542
- Add missing escape for " by @domoritz in #6543
- Blob <-> Bitstring casting by @LindsayWray in #6488
- Mapfunctions: map_entries, map_values, map_keys by @LindsayWray in #6522
- Issue #5920: Ordered Aggregate Buffering by @hawkfish in #6539
- Handle SQL-tagged strings correctly with dplyr::tbl, fixes #6506 by @rsund in #6536
- CI: Update Swift.yml by @carlopi in #6553
- Update SwiftRelease.yml by @carlopi in #6554
- Java: Implement JDBC 4.1 by @MariusVolkhart in #6376
- Bitstring aggregations by @LindsayWray in #6417
- Make our default `threads` setting Cgroup-aware on Linux by @Tishj in #6550
- [Swift] Add composite type support by @tcldr in #6557
- Statistics Rework: Switch to single BaseStatistics class, use separate static classes for methods on the stats instead by @Mytherin in #6560
- Introduce Syntax for SEMI and ANTI joins by @Tmonster in #6480
- Update storage_info with version 0.7.1 by @carlopi in #6572
- [Python] Add the ability to supply a DuckDBPyRelation instance to `register` by @Tishj in #6483
- [Python] `map` now defaults to original type when analyzed type at bind is NULL by @Tishj in #6571
- [Dev] Fix broken `test_filesystem.py` test by @Tishj in #6582
- CI: Node.js, add common NPM-setup step by @carlopi in #6590
- build: add builds for nodejs linux arm64 by @Mause in #6586
- CI: move to setup-node@v3 by @carlopi in #6596
- Issue #6604: TIMESTAMP <=> TIMESTAMPTZ by @hawkfish in #6605
- [Python] Add support for EXPLAIN ANALYZE to `explain` method by @Tishj in #6561
- Add ICU list functions generate_series and range by @papparapa in #6445
- feat(nodejs): add errorType attribute to DuckDbError by @Mause in #6434
- Fix TPC-DS date insertion by @ywelsch in #6591
- Fix #4016: Test amalgamation with --split param by @carlopi in #6587
- feat(python): throw HTTPExceptions instead of IOException for http errors by @Mause in #6533
- Add httpfs config to support packaging it as an extension by @ankrgyl in #6608
- Issue #6595: N-Ary Positional Joins by @hawkfish in #6598
- [Swift] inline documentation plus API tweaks by @tcldr in #6614
- Fix #6602: add inet extension to build/distribute script by @Mytherin in #6610
- CI remove amalgama x8 + swift release by @carlopi in #6615
- Fix too many open file handles during JSON schema detection by @lnkuiper in #6613
- Issue #6580: Parquet Int96 Timestamps by @hawkfish in #6601
- Exception_static_build defalt: Partial revert of dabbead by @carlopi in #6620
- Make DISTINCT ON respect the ORDER BY clause similar to Postgres + several ordered aggregate improvements by @Mytherin in #6616
- fix url encode issue for R2 by @samansmink in #6609
- [Swift] Database.Configuration type + documentation enhancements by @tcldr in #6617
- R: Avoid passing SEXP by reference by @krlmlr in #6475
- Test and fix preservation of class attribute in external pointers by @krlmlr in #6526
- Add support for lambda functions to `COLUMNS`, and allow COLUMNS to be used in the ORDER BY/WHERE clauses by @Mytherin in #6621
- [R] Remove duplicate occurrence of dependency by @Tishj in #6625
- Automatically Fully Download Files through HTTPFS if no length header is provided by @pdet in #6448
- Remove some function calls that can throw potential false positives in CI by @Tmonster in #6623
- [Python] Add `__getattr__` and `__getitem__` implementations for DuckDBPyRelation by @Tishj in #6624
- [Optimizer] Regex Optimization Rule fix by @Tishj in #6634
- [Bug Fix] Enum Serialization by @pdet in #6040
- Update interval for arrow by @handstuyennn in #6515
- SQLLogicTest - instead of moving prepared statements over avoid restarting database when there are prepared statements by @Mytherin in #6638
- Bind replace table function by @samansmink in #6639
- Fix #6630: correctly set bind_data->types in the Parquet scan when using union_by_name by @Mytherin in #6642
- [Python] `read_csv` can now read from a file-like object. by @Tishj in #6568
- Fix #6640: correctly throw an error on altering schemas by @Mytherin in #6643
- Support multiple aggregates in top-level pivot by @Mytherin in #6644
- [DEV]: Fix clangd errors by @hawkfish in #6650
- Issue #6635: FIRST LAST NULLS by @hawkfish in #6648
- [DEV]: Unreachable window alias by @hawkfish in #6649
- Fix IsRegularCharacter() by @lokax in #6654
- [Swift] add Xcode playground Example by @tcldr in #6629
- Fix #6651: correctly update UpdateSegment references after transferring from transaction-local to committed data by @Mytherin in #6657
- Fix #6656: correctly add casts to NULL values in list_concat, and add more safety around stats mismatches by @Mytherin in #6658
- Fixing some tidy warnings by @taniabogatsch in #6661
- Fix c053bc8, unguarded std::thread by @carlopi in #6663
- Fix class name in error message by @papparapa in #6679
- Fix many fuzzer issues by @Mytherin in #6681
- Fix #6676 and #6677: correctly instantiate local states for nested casts by @Mytherin in #6688
- WebAssembly testing against duckdb-wasm latest stable version by @carlopi in #6665
- Support reading from presigned url by @douenergy in #6467
- Fix #6668: correctly report errors that occur during index appends by @Mytherin in #6693
- R: Remove RProtector class by @krlmlr in #6637
- Fix #6684: in the aggregate hash table, when we have very wide rows, default to HtEntryType::HT_WIDTH_64 by @Mytherin in #6689
- ColumnDataCollection - copy strings if DISALLOW_ZERO_COPY is enabled by @Mytherin in #6700
- Fix ossfuzz assertion triggers by @Mytherin in #6699
- Fix #6690: correctly handle NULL values in CSV auto-detection when decimal separator option is specified by @Mytherin in #6701
- Don't try to process validity mask for arrow null type columns by @cpcloud in #6702
- Adding Children and Step Options to TPC-H generator for BIG DATA by @pdet in #6535
- Add `json_serialize_sql` and first step of new Format(De)Serialization infrastructure. by @Maxxen in #6647
- feat(nodejs): Expose HTTPException as HTTPError by @Mause in #6655
- Add `regexp_extract_all` scalar function by @Tishj in #6685
- Storage: Lazily Load Row Groups from Tables by @Mytherin in #6715
- Add support for function chaining and the dot syntax for function calls by @Mytherin in #6725
- Implement JDBC unwrap methods by @tom-s-powell in #6718
- [Swift] Add sub-repo README.md by @tcldr in #6734
- Fix #6433 - avoid double recursion in pushdown of single/mark join by @Mytherin in #6740
- Make more pieces of pivot clause optional, and fix pivot alias issue by @Mytherin in #6731
- Add date_add alias to interval arithmetic by @Mytherin in #6726
- Add --root-dir option to benchmark runner by @Maxxen in #6739
- Add .col option to duckbox rendering in the shell by @Mytherin in #6748
- Add support for CREATE OR REPLACE SEQUENCE and CREATE OR REPLACE SCHEMA by @Mytherin in #6730
- Support recursive unnesting and unnesting of structs by @Mytherin in #6755
- Add support for pivoting on expressions by @Mytherin in #6758
- arrowIPCStream should return a promise by @domoritz in #6744
- Bug report: Add duckdb-wasm as potential alternative by @carlopi in #6794
- Anti/Semi Join fixes by @Tmonster in #6790
- Julia - Add support for streaming query results by @Mytherin in #6770
- Adding the option for the user to specify the column types searched in the CSV Auto Detect by @pdet in #6756
- Add GCD and LCM numeric functions by @kryonix in #6766
- Release the GIL when getting chunks for arrow results by @pdet in #6810
- Add to_hex/from_hex functions by @lokax in #6579
- Fix duckdb_result_chunk_count return description. by @Giorgi in #6813
- Issue #3207: ASOF JOIN Compilation by @hawkfish in #6719
- Fix #6603/#6799 - Index join fixes + fix verification check by @Mytherin in #6807
- Fix #2743 by removing NotImplementedException in CreateUnionPipeline by @kryonix in #6789
- [Swift] SwiftUI example project and type conversion utils by @tcldr in #6795
- Fix issue #6822 by instantiating TryMultiplyOperator for hugeint_t by @kryonix in #6824
- Moving HTTPState initializer to CleanupInternal by @pdet in #6819
- Map extract now allows composite (nested) types as `key` by @Tishj in #6552
- Issue #6728: Constant Windowed Aggregation by @hawkfish in #6772
- Parquet reader - fixes for reading non-microsecond TIME columns and delta_binary_packed encoded times/timestamps by @Mytherin in #6836
- Register function for Polars DFs by @pdet in #6825
- Storage: Add lazy column meta data loading, and fix issue where RowGroup::InitializeScan was called many times unnecessarily by @Mytherin in #6841
- Add support for named parameters in the API by @dacort in #6575
- Issue #5290: Rewrite ordered LIST by @hawkfish in #6741
- [Python] Fix crash in Jupyter environment related to progress bars by @Tishj in #6831
- Issue #6764: add "null_padding" option to pad rows in a CSV file with missing columns with NULL values by @Mytherin in #6765
- Enable BuildPipelines for nested recursive CTEs by @kryonix in #6838
- 2023a Time Zones by @hawkfish in #6844
- Normalize comparisons and improve string_t operations by @carlopi in #6381
- Fix #6856: correctly check cast cost of child element of list during function binding by @Mytherin in #6857
- Hash aggregate - switch partitioning threshold to MAX(total_groups) instead of SUM(total_groups), and limit number of partitions by @Mytherin in #6851
- Fix Parquet writer regression + add Parquet writing to regression test suite by @Mytherin in #6852
- [Python] `tuple` now gets properly converted to LIST, instead of a VARCHAR by @Tishj in #6868
- Implement predicates in JDBC DB-Meta class by @pjarra in #6866
- [Dev]: ICU 2023b TimeZones by @hawkfish in #6855
- [Python/Dev] Add implicit conversion from None -> duckdb.default_connection by @Tishj in #6839
- Add specific version of `clang-format` to the contributing guidelines by @Tishj in #6849
- ** search (crawl) for files in subdirectories by @lverdoes in #6627
- Modify show tables pragma query to respect current catalog scope by @rjatwal in #6816
- Issue #5920: Ordered Aggregate Performance by @hawkfish in #6867
- Do not enable jemalloc unconditionally by @jeroen in #6864
- Parquet reader - millisecond times are stored as int32 by @Mytherin in #6879
- Aggregate HT: Move intermediate structures to a separate AggregateHTAppendState, and avoid unnecessary resizing when many hash tables are created by @Mytherin in #6877
- [Python] Respect strides in 'object' column (string) to DuckDB conversion by @Tishj in #6878
- [Python] Add implicit conversion from `pathlib.Path` to string by @Tishj in #6835
- Ci wasm by @carlopi in #6886
- Include necessary C++ header by @david-cortes in #6900
- Wasm loadable extensions wip by @carlopi in #6889
- [Dev]: 2023c TimeZone Data by @hawkfish in #6905
- Adding definition for missing extension exception by @Dtenwolde in #6903
- Export window function as expression in relational api by @Tmonster in #6757
- [Catalog] Improve error message on catalog-qualified catalog-entry lookup by @Tishj in #6911
- fix for ODBC driver issues #4887 and #3801 by @bucweat in #6875
- Add support for transforming boolean tests by @hannes in #6928
- Support for missing GZIP features (extra field in header and concatenated files) used in BGZF by @rsund in #6817
- MultiFileReader - Provide unified methods for multi-file reader functions (Parquet, CSV, JSON) by @Mytherin in #6912
- Fixes an issue where CDPATH causes make to fail. by @marhar in #6940
- Add duckdb::make_uniq by @carlopi in #6950
- [Dev] Lock Pandas version in CI by @Tishj in #6958
- Bump duckdb-wasm to support duckdb::make_uniq by @carlopi in #6957
- Support for the ** operator in s3 by @lverdoes in #6930
- Add rel_to_sql method to convert relations to SQL again by @hannes in #6952
- [Safety] Add safety checks to `unique_ptr` access to guard access by @Tishj in #6891
- [Dev] Add missing header guard for `concurrentqueue.hpp` by @Tishj in #6915
- [Python - Chore] Update name of pybind11 type caster for doc gen by @Tishj in #6963
- Remove unnecessary code from the Python client by @Mytherin in #6972
- Faster PIVOT statement by @Mytherin in #6961
- CREATE TYPE creates an alias to a type - not an actual new type by @Mytherin in #6969
- [Safety] Remove C Style Casts by @Mytherin in #6967
- [Python] Fix issue related to objects that derive from `builtin.str` by @Tishj in #6978
- [Dev] Make `copy/csv/test_union_by_name.test` result deterministic by @Tishj in #6987
- Fix #6232 - for SQL value functions, only convert them into functions if there is no column with the same name by @Mytherin in #6982
- Fix #6990: When type has both num_children and type set, prefer the num_children - plus more defensive code in Parquet reader by @Mytherin in #6992
- Issue #6881: Window Memory Segfault by @hawkfish in #6984
- Issue #3207: LogicalAsOfJoin Deserialize by @hawkfish in #6983
- Issue #6959: TRY_STRPTIME Implementation by @hawkfish in #6960
- [Safety] Add safety checks to `vector` indexing by @Tishj in #6927
- Add json->sql deserialisation and execution. by @Maxxen in #6919
- [Python] Enable `rel[name]` and `rel.name` syntax for struct fields by @Tishj in #6988
- LIST aggregate performance improvements by @Mytherin in #6995
- Treat MinGW as a different platform for extension loading purposes by @Mytherin in #7007
- Fixes #6775 Error scalar function by @ozdemircs in #6996
- feat(jdbc): stringify nested types by @Mause in #7000
- feat: standalone autocomplete extension by @Mause in #7010
- add support for scaning over numpy arrays by @vlowingkloude in #6523
- Rework Order Dependence Tracking in Pipelines by @Mytherin in #7006
- [Python] Fix crash related to file-like objects and `fsspec` by @Tishj in #7012
- Partially fixes #6936 - Avoid unnecessarily calling ToString in expression executor state by @Mytherin in #7018
- [Python] Fix datetime with tzinfo converting to naive TIMESTAMP by @Tishj in #7024
- Fix crash/error caused by importing an empty database. by @Tishj in #7025
- postgres_parser: use std::forward by @carlopi in #7038
- fixed an issue with ** operator by @lverdoes in #7040
- CI - Allow codecov uploads to fail by @Mytherin in #7043
- [DEV]: test_map_subscript reliability by @hawkfish in #7041
- Wasm loadable extensions by @carlopi in #7032
- WebAssembly.yml by @carlopi in #7030
- Issue #6959: ICU TRY_STRPTIME Lists by @hawkfish in #7031
- [External Buffer Manager] Step1: Split components from `buffer_manager.cpp` by @Tishj in #7028
- Issue #3207: ASOF Join Refactoring by @hawkfish in #7001
- [External Buffer Manager] Step2: Abstracting away the `atomic<idx_t>` counter by @Tishj in #7053
- Fix Julia BoundsError with arrays > 2048 by @frankier in #7055
- Issue #7013: Implement TRUNC by @hawkfish in #7036
- Add to_binary/from_binary functions by @lokax in #6848
- [Python] Extend `project` to accept a list of types + add DuckDBPyType class by @Tishj in #6777
- Ci wasm by @carlopi in #7072
- [Optimizer] Fix `regexp_matches` (again) by @Tishj in #7075
- [Safety] Remove many C-style pointers by @Mytherin in #7080
- [External Buffer Manager] Step3: `BufferManager` interface, `StandardBufferManager` implementation by @Tishj in #7078
- Issue #6882: REGEXP_EXTRACT Capture Groups by @hawkfish in #6918
- [feature] Add Damerau-Levenshtein string comparison function by @ADBond in #7035
- Logical Get children should be optimized as well by @Tmonster in #7046
- [BREAKING] Use Python-style division operator (/ is always floating point division, // is integer division) by @Mytherin in #7082
- Issue #6861: Index out of bound for all-NULL case. by @xuke-hat in #7070
- Issue #5920: Ordered Aggregate Sorting by @hawkfish in #6986
- Decode DuckDB blobs as buffers in Node UDF args by @matt-allan in #7059
- Partitioned file naming by @lverdoes in #6791
- fix: accept either AWS_REGION or AWS_DEFAULT_REGION by @OhmniD in #7090
- Kitchen sink related to duckdb-wasm WIP by @carlopi in #7074
- Pb/catch stacktrace by @peterboncz in #6991
- [Python] Fix nightly build failure by @Tishj in #7104
- Possibly fixing R strict barrier issue by @hannes in #6974
- Change chunk_size parameter to approx_rows_per_batch by @pdet in #6840
- Add interrupt() to jdbc by @zhangyt26 in #7058
- Bump Julia package to v0.7.1 by @Mytherin in #7109
- R: Add duckplyr tests by @krlmlr in #7097
- [Safety] More C-style pointer removal by @Mytherin in #7108
- Disable format_uuid for vsize=2 by @Mytherin in #7115
- Fix #7096 - allow specifying a column list for VACUUM without ANALYZE by @Mytherin in #7110
- Fix #7093 - correctly extract table names even when tables are present in the catalog by @Mytherin in #7111
- Tuple Data Collection by @lnkuiper in #6998
- Fix #7083 - correctly reset delta offset when reading a new delta byte array page by @Mytherin in #7112
- Bump wasm version by @carlopi in #7121
- Using Parallel CSV Reader as a Default Option by @pdet in #6977
- Upcast Enum to String in Coalesce Function by @pdet in #7114
- ADBC - Arrow Database Connectivity - Integration by @pdet in #7086
- Timestampformat also for timestamps with timezones by @pdet in #7130
- Remove dependency of arrow import with dataset module by @pdet in #6809
- [Safety] Even more C-style pointer removal by @Mytherin in #7131
- Accidentally pushed timestamp date with current_date instead of fixed… by @pdet in #7148
- string_t - rename GetDataUnsafe to GetData by @Mytherin in #7151
- Coalesce expression operator should propagate null by @douenergy in #7140
- Issue #7128: Fuzzer DATE_DIFF Overflow by @hawkfish in #7137
- Issue #7147: TIMESTAMPTZ to DATE by @hawkfish in #7150
- Fix floating point error in SKEW by @lnkuiper in #7146
- feat(jdbc): set{Schema,Catalog} by @Mause in #7158
- Split ** tests up into two files by @lverdoes in #7159
- Arrow Blob Filter Pushdown by @pdet in #7164
- Fix #7124 - correctly transform order by/limit in pivot/unpivot statements by @Mytherin in #7163
- [Safety] Replacing pointers with references/optional_ptr in the Binder by @Tishj in #7136
- Fix kurtosis on macOS by @lnkuiper in #7165
- Correctly zero-initialize all unused memory in storage blocks, plus add CI run to ensure all memory is correctly initialized by @Mytherin in #7175
- Fix rel to sql by @Tmonster in #7172
- Update swift CI run to always push & publish a tag by @Mytherin in #7179
- [BREAKING] Switch to NULLS LAST as default null sorting order, instead of NULLS FIRST by @Mytherin in #7174
- Issue #3207: ASOF Physical Joins by @hawkfish in #7153
- Run ADBC tests on windows by @pdet in #7185
- feat(jdbc): support TIME_TZ by @Mause in #7193
- Fix ASOF join test null ordering by @Mytherin in #7195
- [Python] Add support for Pandas 2.0.0 by @Tishj in #7005
- [Safety] Remove C-style pointers in Catalog, use references whenever possible by @Mytherin in #7203
- Default allow caps to false by @Tmonster in #7201
- Fix the `lineitem` table schema definition error of TPC-H by @r4ntix in #7099
- Fix #7219 - we cannot use the ungrouped aggregate if there are multiple grouping sets (even if they are all empty) by @Mytherin in #7234
- Move several tests to slow tests by @Mytherin in #7249
- [TPC-DS] Fix issues in data generator (#7222, #7223, #7225) by @Mytherin in #7247
- Issue #7230: Named Window Overrides by @hawkfish in #7243
- Correct license code in nodejs project by @whscullin in #7241
- Issue #7220 - add support for DEFAULT VALUES clause in INSERT INTO by @Mytherin in #7240
- Fix #7235 - correctly detect invalid statistics for decimal type by @Mytherin in #7238
- Fix #7119/#7120 - correctly do a case insensitive comparison in foreign key REFERENCES by @Mytherin in #7236
- [C-API] Add `duckdb_string_t` for use with the data chunk API by @Tishj in #7180
- [CSV Reader] Allow quoted nulls by @pdet in #7210
- Towards buffer managing the ART - no more tiny allocations by @taniabogatsch in #6951
- Implements #7118 - support REFERENCES syntax for single column references by @Mytherin in #7237
- Fix spurious CI failure by @Mytherin in #7257
- In the parallel CSV reader, prevent buffering of data unnecessarily when reading from compressed files by @Mytherin in #7253
- fix(JDBC): push down update count calculation into execute() method by @Mause in #7242
- Issue #7013: Implement getTimestamp Calendar by @hawkfish in #7276
- fix(jdbc): return valid class names from getColumnClassName by @Mause in #7262
- fix(adbc): crash when setting database option due to malloc by @zeroshade in #7268
- build: Node 20 builds by @Mause in #7286
- [Dev] Rename `ClientProperties` property `timezone` -> `time_zone` by @Tishj in #7258
- Add ExtraTests CI run that can be manually triggered to run all benchmarks and compare to last release by @Mytherin in #7287
- Reset parsed_chunk when figuring out new line in Parallel CSV Reader by @pdet in #7284
- fix: add catalog information to the serialization of a few logical operators by @stephaniewang526 in #7270
- [Python] Fix #7269 by @Tishj in #7301
- [Python] Add `by_name` option to `connection.append` method by @Tishj in #7300
- Fix affected row count returned from `INSERT .. ON CONFLICT (..)` statement by @Tishj in #7259
- Parquet metadata functions - correctly check for isset on various properties by @Mytherin in #7289
- Account for presence of varargs when casting table function arguments by @MarkRoddy in #7245
- [Python] Add optional `schema` option to `relation.map` method. by @Tishj in #7197
- Force parallelism in R dataframe scans. by @Tmonster in #7181
- [Python] Add `:default:` option to get the default connection through `duckdb.connect()` by @Tishj in #7144
- Rework function registration, and move most scalar/aggregate functions to "core_functions" directory by @Mytherin in #7310
- Add ExtensionUtil class and move function registration to ExtensionUtil by @Mytherin in #7312
- [swift] Change Int to Int32 in DatabaseType array documentation by @indragiek in #7318
- [swift] Make LogicalType public by @indragiek in #7319
- Segmented signing checks on extensions by @carlopi in #7311
- chore: add newer extensions to default extensions array by @Mause in #7322
- Extend `format` and `printf` to support printing thousand separators similar to SQLite by @Mytherin in #7323
- Issue #7315: LocalFileSystem Glob FileExists by @hawkfish in #7316
- add `dayname`/`monthname` functions for `timestamptz` type by @dylanscott in #7332
- [PythonDev] Fix Python regression test CI by @Tishj in #7338
- Simplifying initialization logic by @rjatwal in #7282
- More clear error message on mismatching files by @lverdoes in #7205
- Pivot - add support for custom subqueries in the IN clause of pivot entries by @Mytherin in #7333
- Improve error message when using pivot statement in views or macros by @Mytherin in #7328
- [swift] Make ResultSet.rowCount a public member by @indragiek in #7334
- [swift] Make Foundation extensions public by @indragiek in #7335
- Blocking Sink/Source operators by @samansmink in #7331
- Restore serialization of BaseStatistics distinct count by @bleskes in #7329
- Improve error message for unexpected constraint violations by @taniabogatsch in #7343
- Issue #3545: Fix Adar2 Crash by @hawkfish in #7346
- Extension signing: Fix #7311 by @carlopi in #7347
- TableCatalogEntry should allow customizing serialization but still be opinionated by @bleskes in #7350
- Add format_bytes function that formats bytes to a human readable size by @Mytherin in #7342
- Make `SQLLogicTestRunner::LoadDatabase` virtual by @Flogex in #7340
- [DEBUG] Add "debug_print_bindings" option to DBConfigOptions by @lnkuiper in #7288
- [Arrow] We always output the large buffers, for blobs, bytes, uuids and strings by @pdet in #7345
- Julia - Make method `destroy_data_chunk` public - streamed query results must be destroyed before the connection is destroyed by @Mytherin in #7361
- [Swift] add Int/UInt decoding to VectorElementDecoder by @tcldr in #7362
- Add Minimum Batch Index + Order Preserving Insertion Rework by @Mytherin in #7352
- Initialize HTTPFS state when extracting plans by @pdet in #7365
- Add support for parallel order-preserving CSV write by @Mytherin in #7368
- [Safety] Perform `vector` bounds checking on release builds by @Tishj in #7325
- [Dev] Fix some Minio boot problems + extend Makefile for use with extensions by @Tishj in #7363
- Add `BinarySerializer`, `EnumUtil::` and generator script by @Maxxen in #7351
- Capture database type in config by @bleskes in #7359
- Change file exist check to is_pipe and do it in the bind by @pdet in #7354
- Column function chaining alias by @douenergy in #7313
- Relational set operations coerce to richer type by @Tmonster in #7256
- Autodetect hive_partitioning by @lverdoes in #7344
- Add missing rowsort to test by @Mytherin in #7370
- Fold some DistinctFrom + add bloaty (?) by @carlopi in #7374
- [Python] Add null_padding option to read_csv by @pdet in #7364
- Add support for parallel order-preserving Parquet write by @Mytherin in #7375
- fix: update serialization for logical_delete and logical_update by @stephaniewang526 in #7382
- Issue #7353: Filtered Constant Aggregates by @hawkfish in #7381
- Add `map_concat` function by @Tishj in #7360
- Add catalog parameter to dbgen / dsdgen by @ywelsch in #7378
- Fix Ubuntu 16 action: first compile OpenSSL, then Python by @carlopi in #7397
- [Python] Add scalar UDF, using `pyarrow` by @Tishj in #7171
- Add github actions to contributing.md by @douenergy in #7404
- Avoid double rollback caused by a constraint violation by @taniabogatsch in #7380
- Addings Tests and Fixes for Multiple CSV Issues by @pdet in #7379
- feat(jdbc): native array reading support by @Mause in #7369
- Print Error Lines in the Parallel CSV Reader by @pdet in #7184
- SQLite - Fix SQLiteScanner#45 by applying correct extension alias and upgrade SQLite extension by @Mytherin in #7405
- Correctly concatenate ART prefixes during deletions by @taniabogatsch in #7410
- Add support in the parser for `PREPARE COPY ...` by @Tishj in #7409
- Fix elusive unrecognized ART node type bug by @taniabogatsch in #7372
- Change exception type for invalid parquet by @ccfelius in #7402
- [Optimizer] Fix issue with COMPARE NOT EQUAL and cast overflow by @Tishj in #7413
- CI NodeJS: build and publish nightly for M1 by @carlopi in #7429
- Issue #7426: DuckDBVector getTimestamp by @hawkfish in #7428
- [Julia] Fix #7420 - Don't use `unsafe_string` in `appender.jl` by @Tishj in #7427
- Correctly reset the ART keys during index joins by @taniabogatsch in #7425
- Remove FileOpener almost everywhere - instead wrap FileSystem in the ClientContext with an "OpenerFileSystem" by @Mytherin in #7423
- Make CSV error line numbers 1-indexed by @Maxxen in #7422
- Parquet: Check for valid UTF8 also in statistics by @carlopi in #7421
- Fix #7023 by @Tishj in #7419
- Fix #7263 by @carlopi in #7414
- [swift] Add a CodingUserInfoKey for accessing the LogicalType by @indragiek in #7371
- Implement JSON <-> Nested types casting by @lnkuiper in #7366
- CI - comment out failing CSV tests for now by @Mytherin in #7435
- Fix #7274 - correctly do a case insensitive comparison in UndoBuffer::Undo by @Mytherin in #7445
- Fix #7348 - In RowGroupCollection::RemoveFromIndexes - correctly account for the case where the row identifiers might not all be present in the same row group by @Mytherin in #7442
- Fix #6611 - List lambdas didn't support different vector types by @Tishj in #7424
- Swap Children of Logical ANY joins (or block nl joins) when possible by @Tmonster in #7437
- Add initialization of HTTPState to `TryBindRelation` by @Tishj in #7443
- Initialize the first two smallest plans when creating a cross product by @Tmonster in #7438
- Fix another index join bug and move to generated data by @taniabogatsch in #7441
- 7415 cross-product joins on parquet files by @Tmonster in #7455
- Support binding of ON CONFLICT clauses for extension tables by @Mytherin in #7447
- Add MAP {} syntax for easier map construction by @Mytherin in #7459
- Add support for `INTERVAL` type in `BETWEEN` expression by @Tishj in #7461
- Lipo macos extensions to reduce their size by @samansmink in #7469
- Fix fuzzer issue 132 by @lnkuiper in #7456
- Fix unnest rewriter bug by @taniabogatsch in #7467
- [Safety] Enable `unique_ptr` safety checks on release builds by @Tishj in #7449
- Add support for array_to_string as an alias to list_aggr with 'string_agg' by @Mytherin in #7476
- Fix #7377 - correctly account for memory allocated in reset buffer of CSVFileHandle, and remove unnecessary caching for gzip files by @Mytherin in #7466
- Fixes 7439 and 7433 by @carlopi in #7454
- Signing binaries and extensions for OSX by @hannes in #7484
- Add support for INSERT INTO tbl BY NAME by @Mytherin in #7475
- link Out-of-tree extensions in node/R/python build + fix arrow extension by @samansmink in #7458
- Fix (de)serialization + enable serialization verification for more operators by @Mytherin in #7468
- disable assertions in release node binaries by @samansmink in #7487
- feat(jdbc): Statement#cancel() by @Mause in #7489
- Fix #6234 - throw invalid input exception when attempting to create non-temp entry in temp database, and disallow SET SCHEMA to temp/system schemas by @Mytherin in #7483
- Simplify COLUMNS with lambda -> operate only on column names, instead of qualified names by @Mytherin in #7499
- Fix #6666 - when reading an index in the CheckpointReader directly use the table entry by @Mytherin in #7481
- Expand SHOW ALL to include schema/database name, and add SHOW ALL TABLES alias by @Mytherin in #7500
- Fix #5777 - always read free-list of database, also in read-only mode by @Mytherin in #7501
- Run old CSV reader when reading many files by @Mytherin in #7490
- Build extensions for R on Windows using MinGW by @hannes in #7440
- JSON reader improvements/fixes by @lnkuiper in #7478
- Windows File System Unicode Fixes and correctly expand home directory in ATTACH/DBInstanceCache by @Mytherin in #7503
- JSON: Fix missing std::move by @carlopi in #7507
- [Dev] Add SQLString and SQLIndentifier helpers for `ExceptionFormatValue` by @Tishj in #7486
- Unsupported .help options removal by @lverdoes in #7488
- Remove even more unsupported options from the shell's .help by @Mytherin in #7511
- Increment julia version to v0.8.0 by @Mytherin in #7517
- Fixes #7504 and other minor spurious CI issues by @Mytherin in #7509
- Fix amalgamation builds avoiding linking utf8proc by @carlopi in #7512
- Change HTTPState to a shared_ptr so it doesn't get invalidated in prepared statements by @pdet in #7523
- [Dev] `unique_ptr` helper renames by @Tishj in #7516
- un-exporting sql() in R by @hannes in #7525
- Alias replacement scans to table name if no explicit alias is provided by the replacement scan by @Mytherin in #7526
- [Python] `read_json` API changes by @Tishj in #7505
- Fix minor benchmark errors by @carlopi in #7510
- Fix spurious CI /2 by @carlopi in #7515
- [UPSERT] Check for conflict constraint errors within a transaction by @Tishj in #7407
- Fix #7356 by @Tishj in #7417
- [Python] Fix GIL issue in `sql` with multiple statements by @Tishj in #7534
- Changing platform define for mingw by @hannes in #7533
- set the min os x to 11.0 for universal by @aprock in #7497
- Correctly shift row IDs during ART deletions by @taniabogatsch in #7538
- Add internal option to export small buffers to arrow strings by @pdet in #7540
- fix: correct format specifier by @Mause in #7544
- Add spatial extension to CI by @Maxxen in #7545
Full Changelog: v0.7.1...v0.8.0