This release of DuckDB is named "Histrionicus" after the good-looking Harlequin duck (Histrionicus histrionicus) that inhabits "cold fast moving streams in North America, Greenland, Iceland and eastern Russia".
Please also refer to the announcement blog post: https://duckdb.org/2025/02/05/announcing-duckdb-120
What's Changed
- Optimise division by a constant at runtime for integer division by @JAicewizard in #10348
- Add cross join to Python Relational and PySpark API by @khalidmammadov in #13519
- Fix #13805: throw a more descriptive error message when an on-disk file is referenced using a replacement scan for an unsupported file format by @Mytherin in #13871
- Make sampling accept parameters at the parser/transformer layer by @Mytherin in #13903
- Fix #13867: use 64-bit random numbers to generate random numbers for `random()` by @Mytherin in #13920
- Fix #13769: when binding views, always first search in the schema that the view is defined in by @Mytherin in #13921
- Rework table bindings to be components (`catalog`, `schema`, `table`) instead of flat strings by @Mytherin in #14017
- Add auto-loadable extension settings to duckdb_config_count and duckdb_get_config_flag by @Mytherin in #14021
- Fix #10961 - in the HAVING clause - in case of column name conflicts, bind to aliases instead of to ungrouped columns by @Mytherin in #14023
- Enable filter pushdown through Logical Unnest by @Tmonster in #14008
- Allow duplicate table aliases in the table binder by @Mytherin in #14035
- Unify DESCRIBE [query] and DESCRIBE [table] by @Mytherin in #14039
- Support qualified identifiers in the `EXCLUDE` clause by @Mytherin in #14043
- Add `SMALLER_BINARY` flag to reduce binary size by @Mytherin in #14057
- Smaller Binary: remove more templates from arg_min_max by @Mytherin in #14071
- Unify entropy and mode aggregates - and skip specialized implementations for entropy with smaller binary by @Mytherin in #14080
- [Python] Add `set_default_connection` to the `duckdb` module by @Tishj in #13442
- Provide workaround for prefetching parquet files with incorrect page offsets by @samansmink in #13697
- Move `core_functions` to a separate extension by @Mytherin in #14149
- PySpark df.drop() to support expressions by @khalidmammadov in #14059
- add some RealNest benchmarks by @hmeriann in #13345
- feed table function into multifilereader initialization by @samansmink in #14112
- [Dev] Fix an issue causing ExecuteTask to do much more work than intended by @Tishj in #14034
- Overhaul Parquet dictionary handling by @hannes in #14194
- [Feature] Allow passing the catalog (database name) to appender by @taniabogatsch in #13692
- Add Taxi Dataset Benchmark by @pdet in #14197
- Feature #3036: Window Spooling by @hawkfish in #14181
- Small C Extension API changes by @samansmink in #13987
- Add HTML and Graphviz support for explain analyze by @abramk in #13942
- Fix #13064: offer more suggestions with same score by @Damon07 in #14048
- New Algorithm to find a new line on parallel execution by @pdet in #14260
- Making client context lock optional for relation binding by @pdet in #14093
- [Feature] Allow passing the catalog during C API appender creation by @taniabogatsch in #14256
- Make test random output ordered by @Damon07 in #14267
- Skip test_window_distinct by @Mytherin in #14309
- Taxi Benchmark by @pdet in #14301
- Switch to shared pointer for multfilelists by @samansmink in #14291
- Push #14298 to feature branch by @flashmouse in #14311
- Implement PullUp Empty Results optimizer by @Tmonster in #13524
- [Export/Import] Use the DependencyManager to (stable) sort the entries before export by @Tishj in #14196
- Partitioning-Aware Aggregation and Partitioning-Aware Infrastructure by @Mytherin in #14329
- Add df.unionByName to PySpark API by @khalidmammadov in #14063
- Or filter pushdown into zone maps by @Tmonster in #14313
- Get the current setting in the database file opener by @Mytherin in #14361
- [Feature + Fix] Support ALTER TABLE tbl ALTER col TYPE USING and fix null handling in struct_insert by @taniabogatsch in #14359
- [C API] Add table_description_create_ext and table_description_get_column_name by @taniabogatsch in #14285
- Move _rtools platform to be equivalent to _mingw by @carlopi in #14368
- Fix for accidental like skip in the CSV Buffer by @pdet in #14380
- Table locks - always grab table locks through the transaction interface by @Mytherin in #14379
- Implementing array_slice and [] for BLOB by @hannes in #14358
- Rework settings handling and implement auto-generation for new ones by @Mytherin in #14383
- Rework settings handling and implement auto-generation for new ones by @chrisiou in #14018
- Arrow list buffer - suggest setting `arrow_large_buffer_size` to true when regular list buffer size is exceeded by @Mytherin in #14384
- Fix incorrect merge conflict resolution in workflow file by @Mytherin in #14390
- Update Parquet Thrift to latest version by @hannes in #14258
- Reformat list functions by @c-herrewijn in #14372
- Tidy Check to do complete run also on feature by @carlopi in #14394
- [Python] Use an `ArrowQueryResult` in `FetchArrowTable` when possible. by @Tishj in #14319
- Make mysql_scanner auto-loadable, and add mysql/postgres secrets by @Mytherin in #14392
- Improve the speed of table sample systems by @continue-revolution in #12631
- Support defining column names in CTAS by @douenergy in #14327
- Fix pointer indirection in pyrelation.cpp by @carlopi in #14403
- Fix idx_t to int64_t implicit conversion flagged by clang-tidy by @carlopi in #14402
- Storage: make `ROW_GROUP_SIZE` configurable by @Mytherin in #14406
- [Dev] Update vendored ZSTD to v1.5.6 by @Tishj in #14360
- Top-N: Rework to use heap of sort keys by @Mytherin in #14424
- reformat string functions by @c-herrewijn in #14400
- Prefix Aliases in SQL by @hannes in #14436
- [Dev] Optimize `ValidityMask` when reading from a `ColumnDataCollection` by @Tishj in #14416
- [Dev] Further optimize the CDC ValidityMask deserialization by @Tishj in #14448
- Reformat date and map functions by @c-herrewijn in #14425
- Reformat generic functions by @c-herrewijn in #14423
- Push dynamically generated join filters through `UNION`, `UNNEST` and `AGGREGATE` by @Mytherin in #14453
- Try auto-casting for mismatching data chunks in the Appender API by @taniabogatsch in #14433
- Implement `DELTA_BINARY_PACKED` compression in Parquet writer by @lnkuiper in #14257
- Eviction Queue Partitioning by @lnkuiper in #14375
- Implement `map_extract_first` by @lnkuiper in #14175
- RowGroup no longer lives in format namespace by @Mytherin in #14469
- Convert the shell from C to C++ by @Mytherin in #14473
- Fixing an issue with parquet dictionary reading by @hannes in #14438
- Strip down unused/unsupported options from the CLI by @Mytherin in #14478
- [PySpark] Add withColumns, withColumnsRenamed, cos, acos, any_value, approx_count_distinct and various array functions by @binste in #14347
- CLI Code Cleanup: move all shell functions into the ShellState by @Mytherin in #14483
- CLI Code Cleanup: Move rendering logic into separate Renderer classes by @Mytherin in #14485
- Reformat compressed materialization functions by @c-herrewijn in #14470
- Internal #3273: Shared Window Expressions by @hawkfish in #14450
- CLI Code Cleanup: rework metadata commands in the shell by @Mytherin in #14503
- CSV Parallel Reading Validation by @pdet in #14439
- Avoid recompilations of duckdb when there are no actual changes by @carlopi in #14176
- Add `-safe` mode to shell which disables external access, and remove SQLite UDFs from the shell by @Mytherin in #14509
- [PySpark] Add functions covar_pop, covar_samp, call_functions, endswith, startswith, exp, factorial, log2, ln, degrees, radians, atan, atan2, tan, round, bround by @binste in #14454
- Reformat arithmetic operators by @c-herrewijn in #14489
- add attach with default tables by @samansmink in #14118
- Add duckdb_param_logical_type by @Giorgi in #14515
- Remove most BUILD_ options for extensions, using CORE_EXTENSIONS by @carlopi in #14531
- CLI: more code clean-up by @Mytherin in #14551
- Reformat nested and sequence functions by @c-herrewijn in #14495
- Parquet: Fixing selection vector calculation by @hannes in #14558
- CLI: Fix for .mode markdown rendering after refactor by @Mytherin in #14569
- Out-Of-Core Updates & Deletes by @Mytherin in #14559
- Manage `enable_external_access` at the FileSystem level, and add `allowed_paths` and `allowed_directories` option by @Mytherin in #14568
- feat(iejoin): use sort to replace binary search in iejoin by @my-vegetable-has-exploded in #14507
- Clean-up distinct statistics - add hashes cache add the Append and Vacuum layers, and remove unnecessary lock by @Mytherin in #14578
- [PySpark] Test Spark API with actual PySpark as backend by @binste in #14526
- Internal #3273: Shared Window Frames by @hawkfish in #14544
- Reformat aggregate functions by @c-herrewijn in #14530
- Expose threshold argument of Jaro-Winkler similarity by @zmbc in #12079
- No pushing filters below projections that cast to a lower logical type id by @Tmonster in #13617
- Implement `left_projection_map` for joins by @lnkuiper in #13729
- remove superfluous comment by @c-herrewijn in #14586
- [Dev] Make the `regression_test_runner` easier to replicate by @Tishj in #14557
- [PySpark] Add dataframe methods drop_duplicates, intersectAll, exceptAll, toArrow by @binste in #14458
- Internal #3381: Window Race Condition by @hawkfish in #14599
- Rework generated EnumUtil code by @Mytherin in #14391
- Force aggregate state to be `trivially_destructible`, unless `AggregateDestructorType::LEGACY` is used by @Mytherin in #14615
- AWS - remove expected error message by @Mytherin in #14633
- Temp directory compression by @lnkuiper in #14465
- Add support for SELECT * RENAME by @Mytherin in #14650
- [PySpark] Add autocompletion for column names to dataframes by @binste in #14577
- Force aggregate state to be `is_trivially_move_constructible` by @lnkuiper in #14640
- Correctly render EXPLAIN EXECUTE - use op.GetChildren() instead of hard-coding special cases by @Mytherin in #14651
- Buffer Manager - Make DestroyBufferUpon atomic by @Mytherin in #14656
- proposed enhancements to the query graphs by @peterboncz in #14637
- Sampling respects seed from random number generator if no seed is given. by @Tmonster in #14374
- Blockwise NL Join: Return control on every iteration in `ExecuteInternal` by @Mytherin in #14658
- feature(spark): add hex and unhex functions by @spenrose in #14573
- Support `SELECT * LIKE '%col%'` syntax by @Mytherin in #14662
- feature(spark): add base64 and unbase64 function by @spenrose in #14561
- Fix #14663: correctly propagate null values in list concat operator by @Mytherin in #14675
- `ALTER TABLE ADD PRIMARY KEY` by @taniabogatsch in #14419
- Merge feature into main by @Mytherin in #14690
- Support for CSV Encoding (UTF-16 and Latin-1) by @pdet in #14560
- Fix #14699 - Correctly handle SHOW TABLES in views by @Mytherin in #14705
- Fix #14701 - avoid flattening in-place in ColumnData Append method by @Mytherin in #14708
- Use TryCastAs instead of DefaultTryCastAs in comparison_simplification by @Mytherin in #14711
- Value interface & serialization clean-up by @Mytherin in #14710
- Fix various nightly CI issues by @Mytherin in #14720
- CLI: Add support for `.thousand_sep` and `.decimal_sep` by @Mytherin in #14721
- Propagate collations through functions in a generic manner by @Mytherin in #14717
- Add functions for handling null duckdb_values by @Giorgi in #14687
- adaptive filters should not reorder filters that can throw by @Tmonster in #14672
- [Python] Add `LambdaExpression` to the Python Expression API by @Tishj in #14713
- Add fallback for thread count if jemalloc cannot identify by @lnkuiper in #14688
- csv: parse escape character in unquoted fields by @fanyang01 in #14464
- [Python][Expression API] Add the `between` method on the `Expression` class by @Tishj in #14726
- [Attach][Macro] Fix issues identified with an attached macro by @Tishj in #14715
- Don't quote strings on csv files if quote='' by @pdet in #14731
- sqlite3_api_wrapper: avoid nullptr dereference by @ProjectMutilation in #14748
- Rework `BlockHandle` to no longer have friend classes, and rework `ConvertToPersistent` so it fails if there are active outstanding pins by @Mytherin in #14746
- Revert "CMake: Avoid dependency-inducing codegeneration of extension headers" by @carlopi in #14723
- [PySpark] Add more functions such as ascii, asin, btrim, char, corr, ... and fix differences in ordering of null values between PySpark and DuckDB by @binste in #14738
- Added list value getters duckdb_get_list_child and duckdb_get_list_size by @prashanthellina in #14714
- [Python][Expression API] Add `collate` to create a `CollateExpression` by @Tishj in #14749
- copy to operator still write schema for empty rows by @wenjun93 in #14524
- [Python] Use nullable dtypes in Pandas `DataFrame` creation when possible by @Tishj in #14377
- Update metrics generation script and include it in CI run by @taniabogatsch in #14756
- Add support for projection pushdown into struct fields by @Mytherin in #14750
- Optimistic writes: flush the last row group in all scenarios by @Mytherin in #14759
- Improve SqlStatement::ToString for UPDATE and DELETE statement to include alias of RETURNING clause by @HarshLunagariya in #14765
- Add JSON Logical Type metadata to parquet writer by @niger-prequel in #14747
- [Python] Add support for `Expression` to `values` to create a ValueRelation by @Tishj in #14757
- Add missing global options to Python's `write_parquet` by @fr3fou in #14766
- Add operator name to profiling output by @ywelsch in #14744
- Detect catalog changes on DROP IF EXISTS by @ywelsch in #14742
- Correctly deal with continued operation after reading a truncated WAL, and clean up WAL handling logic in storage manager by @Mytherin in #14785
- [Fix] Error message in transaction manager by @taniabogatsch in #14788
- Initialize the grouping sets when there is a group by all to enable filter pushdown by @Tmonster in #14660
- Merge feature into main again by @Mytherin in #14793
- [Python][Expression API] Add `update` to `DuckDBPyRelation`, accepting `Expression` objects | Add `DefaultExpression` by @Tishj in #14780
- Fix #14540: fix unnest rewriter by @flashmouse in #14784
- [PySpark] Add approxCountDistinct, add_months, and various array functions by @binste in #14620
- Add syntax highlighting support for errors in the CLI by @Mytherin in #14799
- Implement #14787: allow expressions in the aggregate clause of a PIVOT statement, as long as the aggregate clause only modifies the aggregate result and does not contain other columns by @Mytherin in #14800
- When repeatable is set, set ParallelSink to false by @Tmonster in #14797
- [Catalog] Fix issue related to uncaught problems during a COMMIT by @Tishj in #14150
- [Upsert] Support non-distinct values in the inserted data by @Tishj in #14293
- Fix issue copying a TABLE that references a SEQUENCE by @Tishj in #14693
- fix duckdb_extension.h macros for C by @samansmink in #14808
- LTO CMake setting was not working anymore on MacOS, fixing that by @carlopi in #14811
- Add syntax highlighting support to the DuckBox query result by @Mytherin in #14820
- Avoiding unnecessary rebinding by @samansmink in #14616
- Support struct projection pushdown in Parquet files by @Mytherin in #14839
- Internal #3263: Window Distinct Deadlock by @hawkfish in #14775
- Issue #14737: DISTINCT ORDER Dependency by @hawkfish in #14840
- [Python][Dev] Skip `test_pandas_selection` on Python3.8 by @Tishj in #14851
- [Python][Dev] Fix issues with new/updated tests in the python sqllogictest implementation by @Tishj in #14850
- add function ends_with back by @Damon07 in #14859
- Require `capacity` in ValidityMask by @Mytherin in #14846
- Issue #11557: DECIMAL Downcast Rounding by @hawkfish in #14860
- Increase map inference threshold by @lnkuiper in #14848
- Output exception message on parse exception by @ackxolotl in #14852
- Use `LogicalTypeId::Unknown` instead of `LogicalTypeId::SQLNULL` for macro binding by @lnkuiper in #14809
- return InsertionOrderPreservingMap from TableFunction to_string by @samansmink in #14835
- Support default values when appending data chunks by @taniabogatsch in #14733
- [PySpark] Add a lot more functions incl. some regexp ones by @binste in #14761
- Added getters for enum and struct type values by @prashanthellina in #14831
- Fix write partition columns false by @ykskb in #14871
- Generate In-Clause filters from hash joins by @Mytherin in #14864
- Move FTS extension out-of-tree by @lnkuiper in #14872
- [C API] More tests and nits by @taniabogatsch in #14758
- Issue #14885: DATEPART Cache Bounds by @hawkfish in #14891
- Fix arrow table filters by @Tmonster in #14893
- [Python] Fix various issues uncovered by #12959 by @Tishj in #13149
- Remove some Snappy definitions by @lnkuiper in #14897
- [Fix] Binder exception when creating a foreign key on a view by @taniabogatsch in #14882
- [C API] Implement AddColumn and ClearColumns for the Appender by @taniabogatsch in #14880
- python: use PyUnicode_FromStringAndSize() by @methane in #14895
- Top-N: Improve performance with large heaps, and correctly call Reduce by @Mytherin in #14900
- Append to child column first in list column append by @Mytherin in #14902
- Update cardinality during limit pushdown by @jeewonhh in #14901
- Add `struct_concat` by @Maxxen in #14853
- [Compression] Add ZSTD compression by @Tishj in #14514
- Improve timestamp functionality by @taniabogatsch in #14818
- Fix #14833: split_part follow pg by @flashmouse in #14875
- C API: Add Value Relation constructor with RelationContextWrapper and ParsedExpression as argument by @anshuldata in #14892
- Issue #14734: Wrap Parquet TIMETZ by @hawkfish in #14908
- [Fix] release shared connection pointer before it goes out of scope by @roj516 in #14926
- [Fix] Nightly async build by @taniabogatsch in #14913
- [Tests] Re-enable test for vector verification run by @taniabogatsch in #14911
- Return timestamp with timezone in `read_text`/`read_blob` by @Maxxen in #14925
- Fix several CLI issues by @Mytherin in #14929
- improve ReadAheadBuffer::AddReadHead error message by @stephaniewang526 in #14940
- Skip Dynamic Join Ordering Algorithm if there are many relations by @Tmonster in #14943
- remove failing benchmark by @hmeriann in #14945
- Typo in csv UnterminatedQuotesError how_to_fix_it by @bradleybuda in #14951
- Pullup empty results through delim joins as well by @Tmonster in #14920
- Fix getting named parameter type information. by @Giorgi in #14952
- Fix casting long to int via explicit cast in parquet by @carlopi in #14959
- Fix script/regression/benchmark.py rework by @carlopi in #14958
- Explicit install of pkg-config broke, removing it by @carlopi in #14965
- Improve code generation of storage and serialization version infos by @carlopi in #14947
- C API support for non-standard timestamp values by @jraymakers in #14954
- Implement Logical Compaction in Hash Join Operator by @YimingQiao in #14956
- Disable row group size bytes default initialization by @lnkuiper in #14974
- [Swift.yml] Bump to macos-14, and switch simulation targets by @carlopi in #14984
- Use IOException for failed fstat calls by @ywelsch in #14975
- Logical Sample requires child to have separate join order optimization by @Tmonster in #14969
- Properly register successful dialect runs by @pdet in #14977
- Run containerized builds requiring deprecated ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION only on main/feature by @carlopi in #14998
- Fuzzer #3297: Nth Value Indexing by @hawkfish in #14997
- [Arrow] Filter pushdown decimal fix by @Tishj in #14995
- Support multiple function descriptions by @c-herrewijn in #14838
- Join Filter Pushdown does not push down in filters when nulls are present by @Tmonster in #14970
- [Fix] Throw on invalid MAP input in Value::MAP by @taniabogatsch in #14916
- Rely on extension-ci-tools workflow to build linux_amd64_gcc4 extensions by @carlopi in #14987
- Rework Auto-Complete To Work Based On PEG grammar by @Mytherin in #15003
- for-loop-erase bugfix in filter pushdown by @peterboncz in #15008
- Internal #861: Window Code Refactoring by @hawkfish in #15007
- Internal #3574: INTERVAL Normalisation Carries by @hawkfish in #15009
- [Arrow] Fix scan of an object providing the PyCapsuleInterface when projection pushdown is possible. by @Tishj in #14993
- [PySpark] - Add extra str functions to pyspark api by @mariotaddeucci in #14921
- [PySpark] - Add .isNull and .isNotNull methods to Column class by @mariotaddeucci in #14960
- DuckDB Arrow Non Canonical Extensions to use arrow.opaque by @pdet in #15002
- Autocomplete test fix by @Mytherin in #15019
- Add check_peg_parser to extension_entries by @carlopi in #15021
- Re-enable jemalloc on ARM by @lnkuiper in #14810
- Dynamically decide whether to do a Perfect Hash Join by @lnkuiper in #14971
- No salt for Android by @lnkuiper in #14923
- Fixup linux_arm64 extension builds by @carlopi in #15016
- Issue #14834: INTERVAL Collations by @hawkfish in #15022
- SUM(x + C) rewrite by @Mytherin in #15017
- Spell NULL with uppercase in configuration description and comments by @szarnyasg in #15006
- Force download doesn't require to do a head request by @pdet in #14979
- CSV Reader - 4 byte delimiters by @pdet in #14670
- More regression tests by @lnkuiper in #14973
- [PySpark] Add more functions such as slice, split, standard deviations, etc. by @binste in #14863
- Fix extension entries by @Mytherin in #15027
- Speed up scans of RLE compressed data by @Mytherin in #15023
- Speed up scans of Uncompressed strings by @Mytherin in #15024
- Internal #3583: IGNORE NULLS Race by @hawkfish in #15032
- [Regression.yml] Add icu, needed for external regression tests by @carlopi in #15044
- Fix internal error of list_zip with only truncate argument provided by @Damon07 in #15039
- Avoid sum rewrite for hugeint/uhugeint since it could introduce overflow errors by @Mytherin in #15040
- BarScalarFunction needs to keep track of width != string.size() by @carlopi in #15041
- Add SUM(BOOL) overload by @Mytherin in #15042
- Add virtual callback to get dependency manager to the catalog by @Mytherin in #15043
- Flip OR filter comparison if constant is on the other side by @Mytherin in #15045
- Fix #15010: in map cast only access validity when child elements were not fully converted by @Mytherin in #15046
- Various fixes for vector size = 2 CI by @Mytherin in #15047
- Add `require ram` to test runner, and use to limit distinct_grouping_tpch.test by @Mytherin in #15048
- [pystubs] Fix type of `proto` parameter in `from_substrait` methods. by @ingomueller-net in #15004
- CLI: Add -f [FILE] argument that allows execution of a file by @Mytherin in #15050
- max_temp_directory_size - print "90% of available disk space" as value if temp directory is not initialized by @Mytherin in #15057
- Interrupt query on error in `ClientContext::Query` by @Mytherin in #15058
- Turn count_if into an actual aggregate function by @Mytherin in #15061
- CLI: Add .safe_mode as a dot command as well by @Mytherin in #15064
- Pushdown inequality filters by @Mytherin in #15059
- Restore support for DEBUG_STACKTRACE by @carlopi in #15053
- Shell: Provide a summary of numbers if we are rendering only a single row by @Mytherin in #15031
- Issue #15067: Postgres Age Compatibility by @hawkfish in #15070
- add duckdb_append_value to C API by @jraymakers in #15065
- Speed up Main CI workflow by @Mytherin in #15071
- [CSV Reader] Being more flexible with unescaped quotes in quoted values. by @pdet in #15018
- IEJoin GetProgress: Normalize to 0-100 by @carlopi in #15081
- Avoid building for Python 3.7 on Windows by @carlopi in #15085
- Allow inputting a base hash in Regression workflow by @lnkuiper in #15082
- Top-N: Perform global boundary checking before doing sort-key conversion by @Mytherin in #15087
- Fix aggregate regression by @lnkuiper in #15025
- Sum Rewriter: correctly match only Sum aggregations in sum rewriter by @Mytherin in #15088
- new answers for some JOIN benchmarks by @hmeriann in #15090
- Ensure checkpoint tasks complete on IO exceptions by @ywelsch in #15089
- Internal #3615: Quantile Cursor Allocation by @hawkfish in #15102
- Issue #15056: DISTINCT Insensitive Aggregation by @hawkfish in #15066
- Bloom Filter Support in Parquet Reader/Writer by @hannes in #14597
- Dynamically push table filters from Top-N operator by @Mytherin in #15099
- Fix for #15080 cgroup v2 memory limit not being read correctly by @nickzoic in #15103
- Provide support for continuous profiling by @ywelsch in #14972
- fix bundle-library step by @samansmink in #15093
- Internal #861: Value Function SubFrames by @hawkfish in #15100
- Parquet reader: correctly reset vector in between calls to read when skipping by @Mytherin in #15107
- Do not swap `RIGHT` joins to `LEFT` when `BuildProbeSideOptimizer` is disabled by @lnkuiper in #15105
- Fix CheckMarkToSemi conversion in FilterPushdown optimizer by @kryonix in #15104
- Setting `temp_directory` to `NULL` should be same as setting it to `''` by @lnkuiper in #15113
- Add make_date(INT) function, similar to make_timestamp(BIGINT) by @Mytherin in #15109
- Support unlimited precision in JSON by using yyjson "raw" values by @lnkuiper in #15112
- Avoid repartitioning out-of-core hash joins if very, very skewed by @lnkuiper in #15114
- Linux dockerized extensions: invoke script in right folder by @carlopi in #15122
- Operator's GetProgress to return ProgressData instead of double by @carlopi in #15084
- Skip building azure extension due to problems installing libxml by @carlopi in #15126
- Test now passes due to flexible quote on CSVs by @pdet in #15133
- Avoid cleaning up past releases if we have not just uploaded a new one by @carlopi in #15134
- Implement struct projection pushdown for JSON reads by @lnkuiper in #15116
- Update the `.clang-format` file to disable sorting includes by @Tishj in #15131
- `rowid` filter pushdown by @Maxxen in #15020
- Parquet Reader: use aligned unpack in RleBpDecoder when possible by @Mytherin in #15106
- Fixup deployment of extensions for build_extensions_dockerized by @carlopi in #15136
- Parquet reader: rename metadata cache setting to `parquet_metadata_cache`, and avoid using it for stats by @Mytherin in #15129
- Fix `BLOB` conversion in `parquet_metadata` by @lnkuiper in #15132
- Add explicit_cardinality to ParquetOptions [de]serialized fields by @carlopi in #15135
- Issue #15138: Friendlier ICU Settings by @hawkfish in #15139
- Execute test/sql/aggregate/aggregates/first_memory_usage.test_slow single threaded by @carlopi in #15146
- Store transactions that have deletes on a table with indexes by @Mytherin in #15144
- Only generate physical plan for LogicalPrepare when it is going to be used by @ywelsch in #15145
- [CSV Sniffer] Selecting file to sniff from Glob and List by @pdet in #13703
- Refactor signing linux extensions by @carlopi in #15159
- FileHandle should retain the FileOpenFlags it was opened with by @jkub in #15153
- transform modifiers in pivot with no columns by @Damon07 in #15155
- Fix wrong type append of base vector by @pdet in #15157
- Add cross-version testing CI by @carlopi in #15161
- Fix wrongly used github.ref to sha by @carlopi in #15167
- Fix Regression workflow running against itself by @lnkuiper in #15165
- Add dictionary size, and use dictionary/constant vectors in the aggregate hash table to speed up finding groups by @Mytherin in #15152
- Fix ci issues by @carlopi in #15169
- Do not assume constant comparison is compare equal by @Tmonster in #15164
- Read file in 100mb chunks in read_file/read_blob by @Maxxen in #15160
- [Compression] Add RoaringBitmap Compression by @Tishj in #14878
- bump spatial + VCPKG Commit by @Maxxen in #15158
- Fix various ci nightly by @carlopi in #15178
- Keep track of compression function in ColumnData, and add dedicated `select` call to compression function by @Mytherin in #15186
- Push dynamic Top-N filters for VARCHAR columns as well by @Mytherin in #15188
- Issue #15069: Postgres CURRENT_XXX Compatibility by @hawkfish in #15125
- Add DictionaryId that can be used to uniquely identify dictionaries, and use this in the aggregate HT to cache look-ups by @Mytherin in #15196
- NULL check for invalid result by @MonkeybreadSoftware in #15194
- Rework `TableFilterType::CONSTANT_COMPARISON` to work identically to constant comparisons in SQL by @Mytherin in #15197
- Fix spelling mistakes in some comments by @tomhanks2024 in #14982
- Move httpfs to external repository by @carlopi in #14727
- Review no_extension_autoloading requires in tests: either remove, add FIXME or add EXPECTED by @carlopi in #15191
- Fix for row-id pushdown, and remove unnecessarily complicated method by @Mytherin in #15216
- Fix deserialization of approx decimal quantile aggregate by @Mytherin in #15215
- LoadInfo::Copy needs to copy the version by @Mytherin in #15214
- RE2: reduce unnecessary allocations in BitState by @Mytherin in #15210
- Increase stale bot's timeout by @szarnyasg in #15224
- [Python][Dev] Remove noisy / faulty test by @Tishj in #15220
- Add dedicated `filter` method to compression algorithms by @Mytherin in #15209
- [Python][Dev] Fix `test_filter_pushdown.py` by @Tishj in #15225
- Rework Wasm extensions CI, and use out_of_tree_extensions.cmake by @carlopi in #15223
- Fix #15221: use TryCast when converting Parquet stats - and fallback to not having stats by @Mytherin in #15233
- Fix #15177 - detect corruption in dictionary compressed strings by @Mytherin in #15227
- Fix #15175 - use case-insensitive comparison when de-duplicating hive columns from files by @Mytherin in #15228
- CLI: only render large numbers if ALL values are large numbers by @Mytherin in #15229
- Fix several fuzzer issues by @Mytherin in #15226
- In Clause Rewriter: add mark column to the filter projection map to avoid projecting it upwards which can cause issues with set operations by @Mytherin in #15213
- Skip spatial on MinGW, given otherwise mingw extensions CI will fail by @carlopi in #15237
- Add filter on repository_dispatch to Regression nightly run by @carlopi in #15241
- Rework upload ci via reusable workflow by @carlopi in #15243
- Bump excel extension by @Maxxen in #15222
- `OR`/`IN` filter pushdown for `VARCHAR` by @lnkuiper in #15219
- Issue #15054: Windowed Aggregate Macros by @hawkfish in #15181
- feat: Add CHECK expression to error message on constraint failure by @rustyconover in #15148
- Linenoise: make Ctrl+G execute the query by @Mytherin in #15244
- Fix #15072 and #15073: propagate aliases correctly in * SIMILAR TO, and forbid `RENAME` as well by @Mytherin in #15247
- Fix #15051: support ORDER BY rowid in ARRAY by @Mytherin in #15248
- CLI: improve quote handling in syntax highlighting of errors and don't throw in shell renderer by @Mytherin in #15249
- Histogram: convert decimals to doubles for histogram binning function by @Mytherin in #15250
- Revert "RE2: reduce unnecessary allocations in BitState" by @Mytherin in #15252
- Grouped aggregation performance improvements by @lnkuiper in #15251
- Add run_benchmark.py script by @Mytherin in #15253
- Fix #15012: transform large literals in the range of > HUGEINT_MAX < UHUGEINT_MAX to uhugeint by @Mytherin in #15255
- Clang-tidy fixes in Parquet writer by @Mytherin in #15256
- Fix for VERIFY_VECTOR=Dictionary in Window Bounds by @Mytherin in #15257
- Fix #15239: detect subqueries in lateral join conditions and throw an explicit error when encountered by @Mytherin in #15260
- Implement #14513: implement support for (a, b) IN (SELECT ...) for uncorrelated subqueries by @Mytherin in #15259
- Various nightly CI fixes by @Mytherin in #15258
- Fix #15261: turn AuxInfo into a safe optional_ptr instead of a raw pointer, and check whether or not enum is complete in pivot by @Mytherin in #15263
- Fix for JSONSerializer of BLOB by @Mytherin in #15274
- Do not exclude nulls when multiple mark join conditions by @Tmonster in #15275
- [Python] Fix an issue with double quotes in `getattr` of DuckDBPyRelation by @Tishj in #15277
- Fixup extension selection by @carlopi in #15272
- [Fix] Bind the ALTER TABLE ADD PK code into the duck catalog by @taniabogatsch in #15231
- Realnest HEP benchmarks by @hmeriann in #14468
- Feature #12699: XXX_VALUE Secondary Sorts by @hawkfish in #15270
- Join order bugs by @Tmonster in #15230
- Various nightly CI test fixes by @Mytherin in #15273
- Bugfixes by @lnkuiper in #15276
- Ungrouped aggregate gets cardinality of 1 by @jeewonhh in #15279
- More nightly tidy fixes by @Mytherin in #15280
- bump vss by @Maxxen in #15283
- Fix #15183: correctly handle NULL values in generic GREATEST implementation by @Mytherin in #15287
- Issue #15246: Negative Nanosecond Timestamps by @hawkfish in #15289
- Don't decode special characters on redirect by @lcostantino in #15101
- remove O_SYNC from O_DIRECT by @jkub in #15294
- Fix #14938: when combining ENUM with SQLNULL/UNKNOWN types, preserve the ENUM type by @Mytherin in #15297
- allow positional access in named structs by @peterboncz in #15151
- Fix 2 unconnected small problems in CI by @carlopi in #15304
- Add value creation and accessor functions to the C API for VARINT, DECIMAL, BIT, and UUID by @jraymakers in #15212
- Only create information_schema/pg_catalog catalogs in the system catalog by @Mytherin in #15286
- Add `get_partition_stats` callback to TableFunction to get a list of all row group metadata, and use this to speed up `COUNT(*)` by @Mytherin in #15301
- Fix for bitstring_agg on empty result - only complain about missing stats when we actually process rows by @Mytherin in #15305
- Correctly propagate child-types in MAP to the internal struct values, and test httpfs in LATEST_STORAGE by @Mytherin in #15303
- remove conditional around fsync in single_file_block_manager by @jkub in #15306
- Move away from upload-artifacts@v3 / download-artifacts@v3 by @carlopi in #15309
- Fix update_extensions_ci test by @carlopi in #15310
- Addressing over-eager constraint checking with delete indexes by @taniabogatsch in #15092
- Fix internal issue #3740 by @hannes in #15320
- EXPLAIN/EXPLAIN ANALYZE - limit max lines of each extra info element, instead of truncating the entire node by @Mytherin in #15317
- Minor nightly test fixes by @Mytherin in #15318
- Bump Extension C API to stable by @samansmink in #14992
- Pass down DUCKDB_EXTENSION_SIGNING_PK as env by @carlopi in #15324
- Bump to latest sqlsmith and re-enable wasm by @carlopi in #15323
- Skipping lookups in `GroupedAggregateHashTable` if (almost) everything is unique by @lnkuiper in #15321
- Add automatic sampling regression fix 2 by @Tmonster in #14914
- [Dev] Fix Roaring compression bug on appending small vectors by @Tishj in #15326
- Fix JSON reader hang by @lnkuiper in #15328
- [Dev] Clean up `Dictionary` compression code by @Tishj in #15300
- Adjustments on test to bypass sniffing limitation on vector_size by @pdet in #15330
- Enable stack traces by default, split into getting the frame pointers and resolve symbols only when the error is finalized, and add support for demangling by @Mytherin in #15337
- Use correct element rename_list_el in grammar by @Mytherin in #15339
- Unified use of constant MainHeader: FLAG_COUNT by @guoxiangCN in #15338
- Append default to appender by @Giorgi in #15121
- add core functions make_timestamp_ns(nanos) and epoch_ns(timestamp_ts) by @andreimatei in #14930
- feat: support create_on_conflict in create_table_relation by @scgkiran in #15245
- Fix error message checking in test concurrent index by @Mytherin in #15340
- CI: Use mirror for Spark binaries by @szarnyasg in #15372
- Fix skip CSV Rejects test by @pdet in #15359
- Vectorize lookups in `GroupedAggregateHashTable` by @lnkuiper in #15368
- Bump azure and remove patches by @carlopi in #15382
- Fix conditional jump or move depends on uninitialised value(s) by @pdet in #15367
- Start encapsulating `BaseExpression` by @Maxxen in #15360
- [Python] Allow use of `DuckDBPyType` as child objects in implicit conversions by @Tishj in #15346
- [Dev] Made `reference<CompressionFunction> function` private in `ColumnSegment` by @Tishj in #15347
- [Dev] Fix erroneous assert in `ZSTD` scan for `LogicalTypeId::VARCHAR` by @Tishj in #15357
- [Dev] Reset to the vector cache so the vectors are clean for the scan by @Tishj in #15383
- Fix tests not to use compatibility version latest by @carlopi in #15361
- Fix Test introduced by new sampling by @Tmonster in #15378
- Feature #12699: RANK Secondary Sorts by @hawkfish in #15331
- [Fix] Uninitialised values in list_reverse by @taniabogatsch in #15387
- [Dev] Check in `insert` if the InsertionOrderPreservingMap contains the key, do nothing in that case by @Tishj in #15385
- AFL++ Fuzzer Tests and Fixes by @pdet in #15329
- Fix RelationStatisticsHelper to estimate table filters correctly by @Tmonster in #15308
- [PySpark] - Add broadcast function by @mariotaddeucci in #15037
- feat: refactor getting tie_break_offset in SelectBestMatch by @stephaniewang526 in #15235
- Added dashes to test case csv_buffer_size_rejects.test_slow by @hannes in #15398
- [Dev] Split last part of `ColumnDataCheckpointer::Checkpoint` into `FinalizeCheckpoint` by @Tishj in #15388
- Fix JSON reader hang found by fuzzer by @lnkuiper in #15397
- Better partition selection for external hash joins by @lnkuiper in #15389
- fix arm extensions ci by @samansmink in #15400
- Feature #12699: ROW_NUMBER Secondary Sorts by @hawkfish in #15403
- Improve hash combining by @lnkuiper in #15408
- allow multifilereaders to delete entire chunks in FinalizeChunk by @samansmink in #15401
- Fix issue #14659 by @pdet in #15411
- Fix for issue #14648 by @pdet in #15409
- Re-enable some tests, removing `mode skip` or moving it later by @carlopi in #15488
- [Fix] Adjust reclaim space test to smaller block size nightly by @taniabogatsch in #15414
- Feature #12699: CUME_DIST Secondary Sorts by @hawkfish in #15413
- Fix issue with cleanup of buffers when reading same file multiple times by @pdet in #15358
- [Fix] Track correct allocation size of evicted memory by @taniabogatsch in #15433
- Fix internal issue 3813 by @lnkuiper in #15427
- Exploit RFC_4180 to be more strict with newline settings by @pdet in #15426
- Adds comment to Python Object + small adjustment to sniffer with comment detection. by @pdet in #15425
- Fix more nightly test errors due to sampling by @Tmonster in #15423
- Type mismatch set operation by @Tmonster in #15422
- Making the names option of CSV Files more restrictive when reading one file. by @pdet in #15431
- [Python][Dev] Lock `mypy` at 1.13 by @Tishj in #15448
- Fix InFilter::ToString, visible via EXPLAIN ANALYZE for example by @carlopi in #15487
- Mention configuration option that avoids total string size error in error message by @soerenwolfers in #15489
- Fix the seed of RandomLocalState to be 64bit instead of 32bits by @carlopi in #15482
- Fix ADBC Leak when reusing statements by @pdet in #15475
- chore: Add physical type translations for new timestamp types by @krlmlr in #15472
- [Dev] Slight cleanup of `assert.hpp` by @Tishj in #15453
- Retain join partition order by @lnkuiper in #15460
- Use system threads for parallelism on read_csv if reading from pipe by @pdet in #15461
- C API header generation for Go bindings by @taniabogatsch in #14944
- Move InitSegment into roaring namespace (nit) by @arjenpdevries in #15495
- chore: Add header for g++15 compatibility by @krlmlr in #15509
- Functions can throw errors by @Tmonster in #15166
- Improve candidate error message and relax constraint of rfc_4180 = false on quotes by @pdet in #15371
- Implement Union By Name on read csv relation by @pdet in #15452
- Add behaviour to remove unescaped quotes of unquoted values by @pdet in #15454
- [CSV Sniffer] If a column with Time/Date/Timestamp values encounter any other value, immediately go to VARCHAR by @pdet in #15494
- Introduce 2 new platforms: `musllinux_arm64` and `musllinux_amd64` by @carlopi in #15429
- 15128: failed to bind column reference for function under unnest. by @Tmonster in #15421
- Setting descriptions grammar by @szarnyasg in #15500
- Feature #12699: LEAD/LAG Secondary Sorts by @hawkfish in #15497
- Replace funcs copies with moves in sorted_aggregate_function.cpp by @ttsugriy in #15442
- Re-enable iceberg extension by @carlopi in #15456
- Fix a binder issue with type aliases and foreign key constraints by @Tishj in #15517
- Properly ignore empty spaces after end of quotes by @pdet in #15522
- Fix window/test_window_wide_frame.test_slow after random() changes by @carlopi in #15524
- Add atomic ptr class, use in ColumnData to protect Compression function by @Mytherin in #15518
- fix broken link by @alexravenna in #15532
- Various fuzzer fixes by @Mytherin in #15531
- remove duplicate FastMem copies in binary by @xuke-hat in #15470
- [Julia] Improves Julia support for scalar UDFs by @tqml in #15430
- Update year in license file to 2025 by @szarnyasg in #15545
- [Python] Align the behavior between `sql` and `execute` for `.pl()` call by @Tishj in #15537
- Don't create config folder on extension listing by @lcostantino in #15530
- Add `ExtensionTypeInfo` to `ExtraTypeInfo` by @Maxxen in #15373
- Fix some external join benchmark specifications by @lnkuiper in #15561
- Update progress_bar.cpp / drop DUCKDB_DISABLE_PRINT macro by @meztez in #15560
- Fix answers for benchmarks containing `random()` function by @hmeriann in #15562
- Make `max_temp_directory_size` round-trip by @carlopi in #15549
- Allow databases with table_macros to be copyable via COPY FROM DATABASE by @carlopi in #15548
- Allow a variable type `rowid` pseudocolumn in tables by @rustyconover in #14674
- Internal #3860: Deserialise Secondary Orderings by @hawkfish in #15541
- Throw IO exception on 1.1.3 database file with incorrect dependency order by @taniabogatsch in #15568
- Use ISNULL in conjunction or filters by @Tmonster in #15529
- Avoid fast fail: change defaults to run all tests in more cases by @carlopi in #15558
- Asof join adds rows in specific case by @Tmonster in #15567
- [Julia]: Auto-generate api.jl (requires duckdb v1.2?) by @tqml in #15474
- Implicit STRUCT to STRUCT cast for mismatching member names by @taniabogatsch in #15477
- make test always fail in case of internal exception by @c-herrewijn in #15569
- CI: Bump container for Android build by @szarnyasg in #15577
- Fix #15526: CTE use operator type modified by intersect_all by @flashmouse in #15575
- [Julia]: Auto-generate api.jl with new order by @tqml in #15580
- [Dev] `ColumnDataCheckpointer` can now checkpoint column data and validity data together by @Tishj in #15566
- Feature #12699: Secondary Sort Framing by @hawkfish in #15523
- [Test] More STRUCT cast tests by @taniabogatsch in #15578
- Making RFC4180=True more restrictive when it comes to newline delimiters by @pdet in #15581
- In PhysicalInsert call FinalFlush before merging row groups into local storage by @Mytherin in #15583
- HTTPFS test - no longer check for IS NOT NULL filter as this is no longer necessary by @Mytherin in #15585
- Clean-up stack traces on MacOS, fix demangling on Linux, and add `EXPORT_DYNAMIC_SYMBOLS` flag which enables stack traces on Linux by @Mytherin in #15587
- InternalException should only invalidate database when encountered during execution by @Mytherin in #15588
- DuckFuzz Fix on Null parameters for both read_csv and sniff_csv by @pdet in #15565
- Annotate errors in table macros with the call position of the table macro by @Mytherin in #15590
- Line dependent buffer by @pdet in #14512
- More bugfixes by @lnkuiper in #15605
- csv_scanner: fix order of evaluation of arguments to a function by @ProjectMutilation in #15609
- [Dev] Fix an unnecessary copy in Dictionary compression by @Tishj in #15594
- [Julia] Fix incorrect types by @tqml in #15612
- Update URLs by @szarnyasg in #15617
- [Compression] `Dictionary` compression data now also includes the validity data by @Tishj in #15591
- Issue #14996: Aggregate Secondary Orderings by @hawkfish in #15592
- Backport ENTRY_VISIBILITY from duckdb/extension-template-c by @carlopi in #15611
- Adjust list_reduce to use a 1-based indexing by @szarnyasg in #15614
- window: fix nullptr dereference by @ProjectMutilation in #15610
- Improve reading duplicate column names in JSON by @lnkuiper in #15615
- Build/test/distribute linux_amd64_musl core extensions by @carlopi in #15607
- Implement `simple_update` for `first` aggregate function by @lnkuiper in #15619
- Issue #15596: Infinite Value Checks by @hawkfish in #15620
- Issue #15610: Wide Secondary Ordering by @hawkfish in #15625
- Issue template updates by @szarnyasg in #15618
- [C API] Expose the DB instance cache in the C API by @taniabogatsch in #15579
- File url scheme by @samansmink in #15563
- Fix an if statement that is always True by @cclauss in #15630
- Issue #15597: Temporal Error Messages by @hawkfish in #15635
- Parallel `union_by_name` for `read_json` by @lnkuiper in #15593
- [Tidy] create index info in static function for reusability by @taniabogatsch in #15633
- [Arrow] Fix a bug related to ArrowArray lifetimes in the arrow scan code by @Tishj in #15632
- Not using Random Device in DuckDB Core by @pdet in #15540
- Initialize random_seed to silence warning on uninitialized variable by @carlopi in #15649
- Cleanup the GitHub Action for Python by @cclauss in #15643
- Added `weighted_avg` function using macro by @gropaul in #15616
- Skip tests, cleaning up known failures in CI by @carlopi in #15651
- Extension type modifier followup by @Maxxen in #15638
- Index scan on (dynamic) table filters by @taniabogatsch in #15410
- Feature #12699: Windowed Aggregate Ordering by @hawkfish in #15634
- Make MemoryStream non-copyable by @Maxxen in #15656
- Implement `DELTA_LENGTH_BYTE_ARRAY` and `BYTE_STREAM_SPLIT` encodings for Parquet writer by @lnkuiper in #15653
- Arrow Extension Type to be registered in DuckDB Extensions by @pdet in #15285
- fix creating VARINT logical type in C API by @jraymakers in #15670
- Allow switching to a different catalog from a detached catalog by @jeewonhh in #15624
- Logging by @samansmink in #15119
- TableCatalogEntry instead of DuckTableEntry in TableScanBindData by @taniabogatsch in #15668
- Throw on unknown logging_storage set by @carlopi in #15681
- Remove mention to not existing logging_disabled_thread_local.benchmark by @carlopi in #15680
- [Dev] Remove the `CompressionValidity::NO_VALIDITY_REQUIRED` from `Dictionary` by @Tishj in #15636
- [Dev] Fix wrong result reported by Roaring Compression `FinalAnalyze` by @Tishj in #15677
- Fix Python dictionary key is repeated by @cclauss in #15663
- Fix sign-compare compilation warning by @dentiny in #15672
- Deploy bundled static libraries for OSX arm64 and amd64 by @taniabogatsch in #15682
- Varint to varchar optimization by @Damon07 in #15521
- Nightly Fixes by @pdet in #15690
- Reduce test size so CI is less likely to fail by @lnkuiper in #15689
- Clean up temporary test directory in `run_tests_one_by_one.py` even if test segfaults by @lnkuiper in #15688
- Late Materialization Optimizer by @Mytherin in #15692
- [Fix] Make next_batch_index atomic by @taniabogatsch in #15699
- Add late_materialization_max_rows setting that allows you to configure the threshold at which late materialization is triggered by @Mytherin in #15697
- Default to BOOL on csv sniffer for files with only a header by @pdet in #15701
- DatabaseInstance's destructor: avoid throwing (and not cleaning up) by @carlopi in #15707
- Bugfixes by @lnkuiper in #15704
- Remove iceberg, again by @carlopi in #15716
- Allow shift-tab to be used to revert auto-complete suggestion, and implement SHOW [table] auto-completion by @Mytherin in #15708
- [Dev] Fix alignment issue in Roaring compression method by @Tishj in #15711
- Minor fixes by @Mytherin in #15715
- Move the DatabaseCacheEntry into the DBConfig, and set it before the constructor is called by @Mytherin in #15714
- Patching comparison operators in ICU to actually return bool by @hannes in #15700
- Preserve stack trace information when re-throwing by @NiclasHaderer in #15709
- [MultiFileReader] Extend support for column mapping from local -> global column by @Tishj in #15446
- Fix Arrow extension type Locks by @pdet in #15705
- Don't encode + on URL by @pdet in #15693
- Print an error when using "duckdb -f [file]" on a file that does not exist by @Mytherin in #15718
- Implement `parquet_version` parameter for Parquet writer by @lnkuiper in #15684
- [Testing] Temporarily skip tests by @taniabogatsch in #15727
- Add NATIVE_ARCH option to compile using -march=native, and in the CLI time queries that are sent through "-c" by @Mytherin in #15726
- Remove httpfs patch by @lnkuiper in #15729
- Fix #15659: VARCHAR parameters now count as STRING_LITERAL again by @Mytherin in #15724
- Parquet reader: fix for filter on file_row_number column by @Mytherin in #15736
- Scan validity from dictionary vectors directly, and skip scanning validity when we encounter a dictionary vector by @Mytherin in #15737
- Make entries field non-nullable for Arrow map type by @samansmink in #15733
- Properly set `external` flag again in `RadixPartitionedHashTable` by @lnkuiper in #15728
- Storage version 65 by @carlopi in #15702
- Enable index scan for dynamic IN filter by @taniabogatsch in #15665
- Ignore pushes to version branches by @Mytherin in #15743
- Move changes in v1.2 to main by @Mytherin in #15744
- Initialize create_index_info.catalog by @philippmd in #15738
- Feature #15717: Window GROUPS by @hawkfish in #15739
- Fetch only required columns in physical delete by @taniabogatsch in #15746
- Add duckdb secret types function by @samansmink in #15564
- First round of extension bumps by @Maxxen in #15655
- Move core_functions to use unity builds by @Mytherin in #15753
- Add `disabled_compression_methods` setting that can be used to disable certain compression methods by @Mytherin in #15754
- Add support for deserializing a list of SetOperations in the SetOperationNode by @Mytherin in #15755
- Feature #15717: Window GROUPS by @hawkfish in #15761
- Check for mark join indexes in aggregate and group by by @Tmonster in #15691
- Default end of binding to varchar and not bool in CSV Reader by @pdet in #15747
- If arrow extension is not registered, use format information instead of failing by @pdet in #15749
- Merge 1.2 into main by @Mytherin in #15769
- Fix CI for Linux Release Building by @hannes in #15748
- Merge changes in main into v1.2 by @Mytherin in #15770
- When loading LogicalDependency from a database file or WAL file, modify the catalog to the catalog that we are loading into by @Mytherin in #15767
- Fix minor DuckDB-Wasm problem with stacktraces, that would be shown twice by @carlopi in #15765
- Move the instance cache entry when configuring by @Mytherin in #15768
- nitpick: Sequence Scan -> Sequential Scan by @Mytherin in #15772
- Bundle MingW static library with the default extension configuration by @taniabogatsch in #15774
- [Fix] Fix truncate + FK internal exception and another index bug by @taniabogatsch in #15771
- Switch logging to macros by @samansmink in #15751
- Add back Iceberg extension by @carlopi in #15780
- Internal #4002: SQLite EXCLUDE Tests by @hawkfish in #15785
- Skip 3 tests, to be reviewed on a side by @carlopi in #15790
- Add MD to autoload list by @Mytherin in #15797
- Connection manager: make count available without a lock by keeping track of it with an atomic by @Mytherin in #15798
- Add `STORAGE_VERSION` option that allows you to specify the target storage version when serializing a database by @Mytherin in #15794
- Fix some memory/storage issues in CI by @lnkuiper in #15795
- Fix map_extract backwards compatibility by @Maxxen in #15799
- Fixes for vsize=2 tests by @Mytherin in #15809
- Fix tests for storage 65 by @carlopi in #15807
- Enable tests using no_alternative_verify by @ywelsch in #15806
- V1.2 histrionicus by @Mytherin in #15812
- Fix dependency conflict in PK FK benchmark by @taniabogatsch in #15800
- Remove shuffle from sampling by @Tmonster in #15703
- bump inet by @Maxxen in #15804
- Fix `map_inference_threshold` issue in JSON reader by @lnkuiper in #15802
- [CI] Invert operations for Linux CLI: first deploy, then test by @carlopi in #15820
- Fixup shell & autocomplete versioning information by @carlopi in #15823
- Skip end of test/sql/storage/parallel/insert_many_compressible_batches.test_slow by @carlopi in #15814
- Attempted parquet warning fix by @Mytherin in #15827
- Issue #15758: Streaming LEAD Buffering by @hawkfish in #15834
- Removing all core code and CI related to the substrait extension by @pdet in #15810
- CSV AFL Tests by @pdet in #15805
- improve error messages for mismatching versions of extensions by @samansmink in #15829
- dbgen: correctly join threads in case an error is thrown while generating data in parallel by @Mytherin in #15840
- Do not change type of empty files, if the types were manually set by @pdet in #15841
- Fix #15760 - when a SQL value function conflicts with an alias in the WHERE clause, prefer the alias by @Mytherin in #15842
- Fix #15570: preserve alias when using bind_replace in table functions by @Mytherin in #15843
- Fix CAPI chunk tests by @pdet in #15846
- fix: Fix compiler warning for uninitialized access by @krlmlr in #15849
- Relax RFC_4180=False a bit more flexible by @pdet in #15832
- More lenient test limits by @Mytherin in #15845
- bump delta, remove patches by @samansmink in #15824
- enable autoloading for iceberg and delta for storage by @samansmink in #15822
- Fix get_current_time, today, current_date backwards compatibility by @Maxxen in #15803
- Reset buffer before allocating a new one in `ResizableBuffer` by @lnkuiper in #15838
- V1.2 histrionicus by @Mytherin in #15851
- [tpch] dbgen: Avoid throwing interrupt that can't be caught by @carlopi in #15856
- Add CI run testing also slow tests on PRs by @carlopi in #15854
- More memory for external aggregate test by @Mytherin in #15861
- Fixes for nightly tests related to the CSV Parser by @pdet in #15855
- Fix latest storage tests CI by @Mytherin in #15863
- Fix duckdb_extensions() listing by @carlopi in #15858
- Use const T& and T instead of const T&& and T&& in (de)serializer by @Mytherin in #15866
- Make tests more lenient for smaller block sizes by @Mytherin in #15872
- Remove default in MultiFileReaderColumnDefinition constructor by @Mytherin in #15871
- Fix spurious test/sql/copy/partitioned/partitioned_write_tpch.test_slow:53 error by @pdet in #15869
- BindLogicalType should return a new type, instead of modifying an existing type in-place by @Mytherin in #15868
- V1.2 histrionicus by @Mytherin in #15875
- Issue #15877: CUME_DIST Moving Frame by @hawkfish in #15878
- Nightly CI fixes by @Mytherin in #15885
- Disable the RealNest benchmark nightly by @hmeriann in #15839
- disable iceberg tests by @samansmink in #15883
- [Linux CI] Remove examples, already tested as part of OSX Release by @carlopi in #15879
- Fix fuzzer issue found by the DuckFuzzer by @pdet in #15886
- Avoid unnecessarily reading the string dictionary size when scanning uncompressed strings by @Mytherin in #15887
- GCC-4.8 fixes by @Mytherin in #15884
- Several nightly CI fixes by @Mytherin in #15889
- Merge main into v1.2 by @Mytherin in #15895
- When Deserializing, Sample Selection Vectors should be initialized to `FIXED_SAMPLE_SIZE` by @Tmonster in #15890
- Faster re-builds by @hannes in #15891
- Add missing ExpressionType::COMPARE_NOTEQUAL no arrow pushdown by @pdet in #15892
- Fix race/deadlock in `FixedSizebuffer::Get()` by @Maxxen in #15893
- Call ProcessError also for PendingQueries by @carlopi in #15899
- Removed unused variable in LoggingContext by @NiclasHaderer in #15898
- CI: Handle 'fixed on nightly' label by @szarnyasg in #15900
- CheckMagicBytes: zero initialise buffer by @carlopi in #15902
- Rename RFC_4180 to STRICT_MODE. Change default to true. Use the same option in the sniffer as the parser. by @pdet in #15896
- Fix Arrow Type Registration on Extensions by @pdet in #15901
- V1.2 histrionicus by @Mytherin in #15909
- Use Arrow extension GetType() implementation when converting Arrow arrays by @paleolimbot in #15813
Full Changelog: v1.1.3...v1.2.0