What's Changed
🚨 Breaking Changes
- Undeprecate the byte-pair-encoding APIs by @davidwendt in #21760
- [Multi-GPU Polars] Introduce Ray mode for multi-GPU cudf-polars execution by @madsbk in #21746
- Get rid of relaxed constexpr across libcudf by @PointKernel in #21703
- [Multi-GPU Polars] Use current rmm resource in SPMD mode by @madsbk in #21842
- Remove obsolete statistics infrastructure by @rjzamora in #21857
- [Multi-GPU Polars] Create engines directly instead of factory functions by @madsbk in #21898
- Handle integers in floor division and power AST operators by @mhaseeb123 in #21831
- Enforce cudf_polars `cardinality_factor` and `scheduler` deprecations by @mroeschke in #21988
- [Multi-GPU Polars] Unify streaming engine options by @madsbk in #21930
- [Multi-GPU Polars] Split PDSH utils into legacy and new frontend paths by @madsbk in #21941
- Remove CUDAStreamPolicy enum and simplify CUDA stream policy by @vyasr in #22086
- [Multi-GPU Polars] Bind workers to topology-local hardware by @madsbk in #22113
- [FEA] Support Multi-Output JIT Transforms by @lamarrr in #21704
- Migrate RMM usage to CCCL MR design by @bdice in #22008
- Refactor cudf-polars plugin for Polars' test suite by @madsbk in #22301
- Remove legacy Dask-based streaming backends by @madsbk in #22358
- Make RapidsMPF the default runtime for cudf_polars streaming executor by @mroeschke in #22281
- Bump minimum Polars version to 1.35 by @mroeschke in #22459
- Introduce a process-wide singleton engine for `.collect(engine="gpu")` by @madsbk in #22410
- Remove cudf-polars[rapidsmpf] pip extra & numpy as a [test] dependency; add [dask] pip extra by @mroeschke in #22480
- Untangle `target_partition_size` and `broadcast_join_limit` by @rjzamora in #22411
- Replace `--executor` with extended `--frontend` choices in cudf-polars benchmarks by @madsbk in #22504
- Clean up legacy test scaffolding in cudf-polars by @madsbk in #22535
- [cudf_polars] Reorganize package layout by @madsbk in #22491
- Move `collectives` module by @rjzamora in #22578
🐛 Bug Fixes
- Fix TypeError when gathering on empty indices by @jberg5 in #21705
- CPU-only importable pdsh benchmark file by @TomAugspurger in #21791
- Change more Rapidsmpf Shuffler.wait_on to Shuffler.wait by @mroeschke in #21798
- Add missing headers to reader_impl_chunking_utils.cu by @bdice in #21784
- Fix TPC-DS query validation failures due to nulls_last mismatch by @Matt711 in #21814
- IWYU to fix latest CCCL compilation by @vyasr in #21839
- Fix additional dictionary tests to handle unordered keys by @davidwendt in #21773
- Add missing includes for cuda::std::abs by @PointKernel in #21845
- Fix segment calculation in TPC-DS Q54 by @Matt711 in #21829
- Workaround `sum(nulls)` difference between DuckDB and Polars in TPC-DS Q64 by @Matt711 in #21826
- Fix ambiguous stream constructor by @bdice in #21881
- Ensure cudf.pandas proxy object tests populate test-local type maps by @mroeschke in #21879
- Don't allow rtxpro6000 runners to pick up CI jobs by @Matt711 in #21954
- Fix union actor deadlock when input branches share a fanout by @Matt711 in #21949
- Fix expression decomposition when mixing fusable and non-fusable reductions by @Matt711 in #21822
- Fix type mismatch in groupby-count with multiple partitions by @Matt711 in #21934
- Return null instead of nan for pl.Expr.mean with rapidsmpf by @mroeschke in #21805
- Fix stream-ordering bugs related to pool streams by @vuule in #21908
- Fix `data_alloc_size` query bugs by @rjzamora in #21955
- Skip pinned memory tests on unsupported systems by @rjzamora in #21976
- Fix null_count incorrectly marked as pointwise by @vyasr in #21995
- Add sort_keys to benchmark validation for complex sort expressions by @Matt711 in #21817
- Avoid invalid pwise join when dynamic-planning is enabled by @rjzamora in #21977
- Validate PDS-DS Queries Q24, 47, 49, 94 by @Matt711 in #22007
- Fix validation failures in TPC-DS Q70 and Q79 by @Matt711 in #21820
- Fix OOM in PDS-DS Q78 by @Matt711 in #22009
- Workaround unsupported unary function in a groupby context in PDS-DS Q94 by @Matt711 in #22013
- Fix unreachable else branch in gather bitmask logic by @eternallyproud in #21946
- Fix cuda error when sorting empty pl.concat result by @jberg5 in #21825
- Cast groupby sum of integers result to schema in cudf_polars by @mroeschke in #21990
- Fix RTX PRO 6000 Blackwell CI by @bdice in #21999
- Exclude `value_counts` as a pointwise UnaryFunction in cudf_polars by @mroeschke in #22001
- Fix ast return_type_functor to handle decimal types with non-zero scale by @davidwendt in #21996
- Fix remote IO in cudf-polars pdsh benchmark by @ncclementi in #22090
- Fix deprecation warnings for set_as_build_table by @davidwendt in #22087
- Fix libcudf gather segfault in set_all_valid_null_masks by @davidwendt in #22092
- Fix stable ID for `DataFrameScan` by @rjzamora in #22091
- Fix PDS-H decimal validation failures by @Matt711 in #22107
- Optimize PDS-DS Q74 by @Matt711 in #22109
- Pass BufferResource for stream lifetime in rapidsmpf integration layer by @vyasr in #22110
- Pass required `br` argument to `TableChunk.from_pylibcudf_table` by @pentschev in #22116
- Fix missing rapidsmpf hiding real `ImportError` in benchmark scripts by @pentschev in #22114
- Fix partitioning metadata preservation for `GroupBy` by @rjzamora in #22111
- Expand CSE placeholders during HStack lowering by @rjzamora in #21796
- Fix lists::segmented_gather to return empty for empty input by @davidwendt in #22115
- Ensure insert_finished() is called on error paths for streaming collectives by @Matt711 in #22142
- Skip flaky upstream polars deadlock test by @Matt711 in #22182
- Fix CSE HStack lowering to respect with_columns semantics by @Matt711 in #22184
- CUDA 13.2 support: prefer `__syncthreads()` to `block.sync()` for shared memory fencing, fix compiler errors in C++ tests by @jameslamb in #22152
- Prevent memory corruption in ORC reader by @vuule in #22186
- Pin to `pyarrow<24` in type checking environment by @TomAugspurger in #22230
- Pin PyArrow to <24 by @KyleFromNVIDIA in #22236
- Increase tolerance in `test_groupby_categorical_key` by @pentschev in #22249
- Set memory limit for DaskEngine by @TomAugspurger in #22242
- Revert Date casts in pdsh benchmarks by @TomAugspurger in #22232
- Fix flaky tracing test in cudf-polars by @TomAugspurger in #22012
- Pin polars version in type-checking environment by @TomAugspurger in #22256
- Fix nvbench handling of memory-resource objects by @davidwendt in #22257
- Temporarily increase max days without success to 40 days by @pentschev in #22264
- fix(cmake): exclude zstd, roaring, and cuco from install by @vyasr in #22263
- Rescale timestamp stats to the target precision in parquet predicate pushdown by @mhaseeb123 in #22166
- Multi-rank sinks: enforce directory output for streaming engines by @madsbk in #22285
- Prevent potential overflow errors in the CSV reader by @vuule in #22237
- Unsnap throws for malformed copy element by @mhaseeb123 in #22283
- Preserve LIST element field ids in Parquet output by @res-life in #22143
- Hardcode disabled network bindings by @pentschev in #22253
- Fix cudaErrorIllegalAddress in concatenate_list_elements when inner list has 0 rows by @wjxiz1992 in #22147
- Add rapidsmpf as a test dependency of py_test_cudf_polars by @Matt711 in #22316
- Fix malformed pages in PQ byte stream split decoder by @mhaseeb123 in #22280
- Revise `Sort` lowering and `sort_actor` assumptions by @rjzamora in #22315
- Fix use-after-free of host_vector when used with cuda_memcpy_async by @davidwendt in #22321
- Fix use-after-free in `memory_stats_logger` by @PointKernel in #22333
- Fix CCCL compilation errors by @bdice in #22349
- Fix CSV reader `delim_whitespace` header handling to match Pandas by @vuule in #22239
- Fix more use-after-free cases found in libcudf by @davidwendt in #22332
- Handle 0-row input in `ContainsAny`, `JsonDecode`, and `JsonEncode` by @madsbk in #22362
- Fix StatsCollector.serialize to use value equality instead of object identity by @Matt711 in #22366
- Pass managed pool MR explicitly in NDSH parquet data generation by @vuule in #22344
- Fix compile warnings in libcudf examples by @davidwendt in #22335
- Multi-rank fixes for cudf-polars streaming by @madsbk in #22361
- Fix reading of large CSV files (>64MB) by @vuule in #22375
- Validate PDS-DS Q1 by @Matt711 in #22389
- Fix a crash in the ORC reader with malformed stripe footers by @vuule in #22383
- Correctly handle blocks with "block byte size" fields in the Avro reader by @vuule in #22387
- Fix `to_array` to return non-corrupted data by @galipremsagar in #22342
- Use thread pool to submit hybrid scan host IO tasks by @mhaseeb123 in #21992
- Fix pdsh script dropping records by @galipremsagar in #22412
- Handle sign-extension while decoding Parquet decimal stats by @pramodsatya in #22402
- Fix MERGE_M2 for extreme finite partial means by @wjxiz1992 in #22393
- [JAVA] Fix ColumnWriterOptions parquet field id placement on outer list/binary/map by @res-life in #22422
- Align pdsh benchmarks and library defaults by @TomAugspurger in #22399
- Zero-initialize `is_quoted_flags` buffer in the CSV reader by @vuule in #22386
- Include find_package_root everywhere it's used by @vyasr in #22460
- Fix race condition in page header decoder by @mhaseeb123 in #22458
- Deprecate the multi-patterns cudf::strings::replace_re API by @davidwendt in #22380
- Move replicated-output dedup to the Dask and Ray frontends by @Matt711 in #22394
- Fix assertion failures in `assert_tpch_result_equal` due to float sort ambiguity by @Matt711 in #22378
- Validate TPC-DS Q8 by @Matt711 in #22473
- Revert PR #22490 (Split PR devcontainer CI into pip and conda jobs) by @bdice in #22497
- Fix ORC reader 1-second error for negative timestamps with non-UTC writer timezone by @vuule in #22179
- Fix `to_cupy(dtype=...)` on non-numeric columns by @galipremsagar in #22485
- Avoid allocating over the batch size limit in the JSON reader by @vuule in #22481
- Guarantee `insert_finished()` on bulk AllGather by @Matt711 in #22516
- Fix potential malformed headers in parquet delta decoder by @mhaseeb123 in #22275
- Fix JSON reader guards for scatter validity, validation, and max nesting depth by @karthikeyann in #22452
- [release/26.06] Remove stale import of deleted `assert_collect_raises` by @madsbk in #22558
- Fix use-after-destroy and stream ordering in Parquet IO utils by @mhaseeb123 in #22529
- Serialize engine config in new pdsh benchmark CLI by @TomAugspurger in #22572
- Fix memcheck error in json checked-token-level utility by @davidwendt in #22571
- Patch Arrow to set CMAKE_POLICY_VERSION_MINIMUM for RapidJSON by @KyleFromNVIDIA in #22582
- Remove unnecessary max token count check in JSON tokenizer by @shrshi in #22589
- Fix silent row drops in multi-GPU joins with computed key expressions by @Matt711 in #22318
- Adapt ast conversion for literals by @wence- in #22623
- Backport #22551 by @wence- in #22636
- Fix default `target_partition_size` singleton engine by @rjzamora in #22638
📖 Documentation
- Temporarily `nitpick_ignore_regex` pandas sphinx references by @mroeschke in #21774
- Fix Doxygen `@param` entries in `/src` by @vuule in #21764
- Fix Doxygen `@param` entries in `/include` by @vuule in #21762
- Add developer guideline for constexpr and device code by @PointKernel in #21965
- Update dictionary section in developer guide by @davidwendt in #21979
- Fix apostrophe in CHANGELOG.md by @davidwendt in #22055
- Fix doxygen format for contains and datetime functions by @davidwendt in #22151
- Fix associativity example in pandas-comparison docs by @vyasr in #22196
- Overhaul cudf-polars docs for new streaming multi-GPU engines by @madsbk in #22252
- Update cudf-polars benchmarks for new default engine by @btepera in #22619
🚀 New Features
- Create public libcudf gather API with a negative_index_policy parameter by @davidwendt in #21739
- Add join selectivity benchmarks by @PointKernel in #21775
- Parquet readers support case-insensitive column names by @mhaseeb123 in #21700
- Add agent skills for cudf by @mhaseeb123 in #21737
- Add JIT cache management functions to pylibcudf by @Matt711 in #21795
- Add a `rebind_stream` API to set streams for all buffers in a column by @vuule in #21940
- Add `sort_actor` to cudf-polars + rapidsmpf by @rjzamora in #21690
- Add hybrid scan API to construct row group passes by @mhaseeb123 in #21895
- Add support for ignorecase flag in regex functions by @davidwendt in #21861
- Add pre-filtering support for mark join by @PointKernel in #21865
- Fuse drop_nulls into n_unique by @vyasr in #22014
- Upgrade to nvcomp 5.2.0.10 (and 5.2.0.13 for wheels) by @bdice in #22127
- Support null_count decomposition in multi-partition Select by @Matt711 in #22126
- Add API to count number of deleted rows across deletion vector(s) by @mhaseeb123 in #21963
- Expose getters for ColumnWriterOptions isBinary and Parquet field id by @res-life in #22188
- Implement cudf roaring bitmap by @mhaseeb123 in #22133
- Python bindings for hybrid scan API to construct row group passes by @mhaseeb123 in #21918
- Support multi-partition groupby variance / standard deviation aggregations by @Matt711 in #21962
- Implement `cudf::apply_deletion_mask` API by @mhaseeb123 in #22144
- Add cudf::strings::count API for literal strings by @davidwendt in #22288
- Add CodeRabbit configuration and AI review guidelines by @bdice in #22176
- JNI bindings for `strings::contains(column)` by @mythrocks in #22003
- Support for `cudf::strings::replace()` where the `targets` and `repls` are columns by @mythrocks in #22132
- Add skip axis to all join benchmarks by @PointKernel in #22241
- Add decimal128 to groupby_max_cardinality benchmark by @PointKernel in #22162
- Python bindings and pytests for `cudf::apply_deletion_mask` by @mhaseeb123 in #22145
- Add basic support for VARIANT type to the Parquet reader by @vuule in #22310
- Add `LocalRepartitioner` utility by @rjzamora in #22439
- Add arrow bloom filter policy by @PointKernel in #22415
- JNI support for SUM_WITH_OVERFLOW aggregation by @mythrocks in #22404
- Implement streaming window functions in cudf-polars by @Matt711 in #22191
- Add `streaming_groupby` for stateful streaming aggregation by @PointKernel in #21924
- Add partitioned probe support for hash joins by @PointKernel in #22108
🛠️ Improvements
- Adapt to rapidsmpf async shuffle changes by @wence- in #21787
- Remove CSV reader warnings emitted in unit tests by @vuule in #21794
- Remove test_infer_objects_no_reference from cudf_pandas xfail list by @mroeschke in #21818
- Change to use non-detail APIs in some libcudf benchmarks by @davidwendt in #21821
- Deduplicate libcudf examples CMake files by @mhaseeb123 in #21809
- Merge release/26.04 into main by @davidwendt in #21815
- Support drop_nulls unary function in expression decomposition by @quasiben in #21837
- [Multi-GPU Polars] Ray mode in PDSH benchmarks by @madsbk in #21811
- Improve build time using transform instead of tabulate by @davidwendt in #21793
- Use conda packages instead of pip packages in test_narwhals & remove xpassing test_series_setitem from pandas tests by @mroeschke in #21862
- Ensure Lineariser channels in scan_node are shutdown on error in cudf_polars by @mroeschke in #21854
- Add missing includes for `<cuda/functional>` and `<cuda/iterator>` by @bdice in #21859
- [Multi-GPU Polars] SPMD mode works without `rrun` by @madsbk in #21851
- cudf-polars tracing improvements by @TomAugspurger in #21789
- Fix utf8-to-codepoint utility to handle out-of-range unicode by @davidwendt in #21823
- Use `PyBuffer_FillInfo` for `HostBuffer`'s buffer by @jakirkham in #21855
- Fix deprecation warning in JNI for parquet_reader_options::builder.names() by @davidwendt in #21868
- Add noarch Python channel to cudf.pandas third party tests conda solve by @mroeschke in #21873
- [Multi-GPU Polars] Introduce `StreamingEngine` base class and `SPMDEngine` by @madsbk in #21867
- Rewrite TPC-DS Q14 plan to workaround Polars optimizer CSE limitation by @quasiben in #21885
- Fix CPU PDS* runs by @quasiben in #21899
- Remove unneeded CUDF_EXPORT from some cudf/detail headers by @davidwendt in #21693
- Increase cpp-memcheck test timeout by @davidwendt in #21901
- Run cudf_polars unit tests with RapidsMPF by @mroeschke in #21807
- Maintain column sorted metadata in result groupby keys in cudf_polars by @mroeschke in #21871
- Consolidate/simplify cudf.pandas unit testing script by @mroeschke in #21892
- Add scoped_range to cudf::benchmark for nvtx ranges in benchmarks by @davidwendt in #21902
- Fix mypy pinning on rmm (26.06) by @bdice in #21935
- Merge release/26.04 into main by @mroeschke in #21958
- Remove paths from cudf-polars Scan Trace properties by @TomAugspurger in #21927
- Use pip index to fetch polars versions in CI scripts by @mroeschke in #21950
- Explicitly cancel outstanding tasks in `fanout_node_unbounded` by @mroeschke in #21853
- Ensure nodes are `del`'d during errors in `run_actor_network` in cudf_polars by @mroeschke in #21850
- Ensure rapidsmpf RmmResourceAdaptor is unset after actor network is run in cudf_polars by @mroeschke in #21856
- [cudf_polars] Enabling pinned memory for pdsh runs by @nirandaperera in #21880
- Optimize Polars TPC-DS q50 implementation by @beckernick in #21884
- Use public gather in libcudf benchmarks and gtests by @davidwendt in #21903
- Support dynamic error messages in CUDF_EXPECTS and CUDF_FAIL macros by @kingcrimsontianyu in #21900
- Remove unconditional large left table skip in mark join benchmarks by @PointKernel in #21909
- [cudf_polars] [MINOR] Configurable dask worker memory in benchmarks by @nirandaperera in #21972
- Use asyncio.TaskGroup instead of .gather in cudf_polars by @mroeschke in #21858
- Replace thrust counting iterators with cuda::counting_iterator by @PointKernel in #21718
- `review-cudf` skill checks for functions defined in headers by @mhaseeb123 in #21971
- Expose ability to hash_partition based on a separate key table by @wence- in #21730
- Use ruff to disallow asyncio.gather in cudf_polars by @mroeschke in #21985
- Remove parquet statistics filter validation logic by @Matt711 in #21736
- Remove unneeded include of some detail headers in non-internal libcudf code by @davidwendt in #21907
- Support regex named capture groups in contains, count, match, findall by @davidwendt in #21848
- [WIP] Better path handling for s3 by @quasiben in #22005
- Remove cudf::detail::target_type from groupby gtests by @davidwendt in #21932
- Clean up log messages in Parquet and ORC unit tests by @vuule in #21797
- Optimize TPC-DS query plans for streaming executor (q4, q23, q64, q75, q78) by @vyasr in #22011
- [Multi-GPU Polars] Introduce a new Dask frontend by @madsbk in #21812
- Disable "native" rapidsmpf parquet reader by default by @rjzamora in #22023
- Pass memory resource explicitly to remove implicit default mr usage (Part 2) by @karthikeyann in #22029
- Pass memory resource explicitly to remove implicit default mr usage (Part 3) by @karthikeyann in #22030
- Pass memory resource explicitly to remove implicit default mr usage (Part 1) by @karthikeyann in #22028
- [Multi-GPU Polars] Fix SPMD bootstrap race with session-scoped communicator by @madsbk in #22015
- Split hash join definitions to reduce build time by @PointKernel in #21804
- Deduplicate parquet pass construction by @mhaseeb123 in #21923
- Remove "using namespace cudf" from row_ir jit gtest by @davidwendt in #22069
- Deprecate build side option for filtered_join by @PointKernel in #21982
- Pass memory resource to exec_policy_nosync in lists module by @bdice in #22041
- Pass memory resource to exec_policy_nosync in groupby module by @bdice in #22038
- Remove more unneeded detail header includes from libcudf benchmarks by @davidwendt in #22068
- [Multi-GPU Polars] New frontends accept RMM config by @madsbk in #22052
- Avoid storing all executor options in `StreamingSink` by @rjzamora in #22079
- Update to clang 20.1.8 by @bdice in #22093
- Use cudf::test::iterator utilities instead of make_counting_transform_iterator as appropriate by @davidwendt in #22071
- Adopt workflow dispatch pattern for compute sanitizer workflows by @davidwendt in #22054
- [Multi-GPU Polars] Remove all `pytest.skip` calls for unavailable GPU. by @madsbk in #22099
- Optimize TPC-DS query plans for streaming executor (q80, q31, q11) by @vyasr in #22070
- Preserve partitioning metadata for `HStack` nodes by @rjzamora in #22103
- Pass memory resource to exec_policy_nosync in join module by @bdice in #22039
- Fix unsanitized nulls from strings_column_wrapper inputs in gtests by @davidwendt in #22088
- Use simpler iterators instead of make_counting_transform_iterator by @davidwendt in #22119
- Remove redundant aggregation identity logic in shared memory groupby by @PointKernel in #22010
- Optimize TPC-DS query plans for streaming executor (q9, q74) by @vyasr in #22121
- Add more AST gtests for supported decimal operations by @davidwendt in #22097
- Rename GroupedRollingWindow as GroupedWindow by @Matt711 in #22135
- Pass memory resource to exec_policy_nosync in io module by @bdice in #22035
- Pass memory resource to exec_policy_nosync in text module by @bdice in #22037
- Pass memory resource to exec_policy_nosync in reductions and quantiles modules by @bdice in #22040
- Pass memory resource to exec_policy_nosync in copying, rolling, and merge modules by @bdice in #22043
- Pass memory resource to exec_policy_nosync in dictionary, interop, and replace modules by @bdice in #22044
- Pass memory resource to exec_policy_nosync in sort, search, stream_compaction, and partitioning modules by @bdice in #22042
- Pass memory resource to exec_policy_nosync in strings module by @bdice in #22036
- Remove clang-format-off from nth_element_tests.cpp by @davidwendt in #22100
- Add multiple_of utilities to cudf::test::iterators by @davidwendt in #22078
- [Multi-GPU Polars] Add `--num-gpus` for the benchmarks by @madsbk in #22149
- [Multi-GPU Polars] Unify `num_py_executors` default by @madsbk in #22168
- Remove verbose from bind() in favor of exceptions in cudf-polars by @pentschev in #22169
- [Multi-GPU Polars] Add GPU sharing detection by @madsbk in #22148
- [Multi-GPU Polars] Reduce DaskEngine local cluster log verbosity by @madsbk in #22167
- Use cudf::sequence instead of make_counting_transform for large gtests columns by @davidwendt in #22106
- Add cudf-polars-codeowners to CI by @Matt711 in #22192
- Run Polars unit tests with RapidsMPF by @mroeschke in #21677
- Implement basic bloom pre-filtering in shuffle join by @wence- in #21931
- Pass BufferResource to SpillableMessages by @vyasr in #22164
- Add dedicated stream testing job by @KyleFromNVIDIA in #22150
- Automatically pin `numba-cuda` upper bound at release time in `update-version.sh` by @brandon-b-miller in #21533
- Enable --collect-traces with new frontends by @TomAugspurger in #22199
- Support decimal operators with different scales in AST by @davidwendt in #22122
- fix(ci): remove `needs` dependency on job that doesn't exist in test yaml by @gforsyth in #22211
- Remove unused device_memory_resource includes by @bdice in #22187
- Deduplicate lower_ir_graph in cudf-polars by @TomAugspurger in #22220
- Pass `dask.datasets.timeseries(seed=)` in tests by @mroeschke in #22223
- [Multi-GPU Polars] Gather statistics by @madsbk in #22210
- Clean up numba extension code generation by @brandon-b-miller in #22270
- Ignore ResourceWarning from stumpy for Python 3.14 by @mroeschke in #22272
- Update deselected polars tests by @wence- in #22265
- Skip Python/pandas versions without supported wheels by @vyasr in #22273
- Generalize `NormalizedPartitioning` class by @rjzamora in #22246
- Use the new compute-matrix workflow for stream testing by @KyleFromNVIDIA in #22240
- Agent skill to build and test cudf java by @mhaseeb123 in #21894
- Refactor DataSourceInfo by @TomAugspurger in #22254
- Record `engine_name` in cudf-polars benchmarks by @TomAugspurger in #22269
- [MINOR] Remove redundant vectors in `memcpy_batch_async` fast path by @nirandaperera in #22125
- fix(cmake): exclude kvikio from install, build static, and fix some export issues introduced in 22263 by @vyasr in #22277
- Skip two more tests by @vyasr in #22294
- Refactor cudf-polars test suite onto pytest fixtures by @madsbk in #22212
- Import nvcomp CMake configuration from rapids-cmake into cudf by @vyasr in #22306
- Reset check-nightly-ci max days without success by @davidwendt in #22303
- Lower the query plan on workers in cudf-polars by @TomAugspurger in #22287
- Resolve timezone alias links via tzdata.zi when loading transition tables by @vuule in #22293
- Skip zero-sized default pinned pool by @bdice in #22292
- RayEngine: support GPU oversubscription in tests by @madsbk in #22302
- Address cuml RandomForestClassifier(max_depth=) deprecation by @mroeschke in #22324
- Add DuckDB resource-limit options to benchmark runner by @Matt711 in #22266
- Refactor cudf-polars test fixtures away from indirect parametrization by @madsbk in #22325
- Add call to reset_current_device_resource in gtests fixtures by @davidwendt in #22267
- cudf-polars: add `RayEngine._reset()` by @madsbk in #22348
- `StreamingEngine._reset()` by @madsbk in #22364
- Improve hstack lowering by @rjzamora in #22353
- Replace `LD_PRELOAD` hack with compute-sanitizer by @KyleFromNVIDIA in #22290
- Run all nvbench benchmarks with timeout in smoketest by @bdice in #20538
- Rename build/probe to right/left in hash_join and distinct_hash_join by @PointKernel in #22382
- Use `token.rapids.nvidia.com` when issuing S3 bucket creds in devcontainers by @trxcllnt in #22338
- Use static cudart by default by @KyleFromNVIDIA in #22397
- Use cudaStream_t instead of cuda_stream_view in pylibcudf Cython by @vyasr in #22368
- Use `language: script` for cudf-polars-ir-signatures pre-commit hook by @vyasr in #22384
- Fix potential errors in Parquet page header decode by @mhaseeb123 in #22274
- Refactor `sort_actor` to prepare for `OrderScheme` changes by @rjzamora in #22350
- Run the cudf-polars test suite against `DaskEngine` and `RayEngine` by @madsbk in #22381
- Move table_device_view function definitions from .cuh to .cu by @davidwendt in #22354
- Fallback to `async-mr` for the multithreaded parquet example by @mhaseeb123 in #22245
- fix(ci): resolve all zizmor findings and add zizmor pre-commit checks by @gforsyth in #22343
- Adopt `OrderScheme` metadata in cudf-polars by @rjzamora in #22291
- Consolidate `evaluate_rapidsmpf` into `evaluate_streaming` in cudf_polars by @mroeschke in #22417
- Add ray run_constraints in cudf_polars conda recipe by @mroeschke in #22414
- Improve installation hygiene of built and header-only dependencies by @vyasr in #22341
- Run conda, cudf_polars CI tests with Ray by @mroeschke in #22420
- Support `Buffer`'s in `HybridScanReader` methods needing `bytes`-like data by @jakirkham in #22345
- Implement equality of two table_views by @wence- in #22319
- fix(ci): use sha for the only allowlisted version of action-add-assignees by @gforsyth in #22453
- Update default memory resource for cudf-polars by @TomAugspurger in #22426
- fix(ci): add explicit `actions: write` permission for `telemetry-summarize` by @gforsyth in #22479
- Split PR devcontainer CI into pip and conda jobs by @bdice in #22490
- Undo some CCCL workarounds fixed in the latest update by @davidwendt in #22475
- Build and test with CUDA 13.2.0 by @bdice in #22463
- Remove `wheel-tests-cudf-polars-with-rapidsmpf` in favor of existing `wheel-tests-cudf-polars` by @mroeschke in #22467
- Remove anonymous namespaces from cudf headers by @PointKernel in #22418
- Relax `NormalizedPartitioning.from_keys` by @rjzamora in #22483
- Reduce peak footprint of cudf-polars test memory usage by @wence- in #22493
- Use basic `OrderScheme` metadata in `sort_actor` by @rjzamora in #22477
- Add `pinned_max_pool_size` and `unbounded_file_read_cache` to `StreamingOptions` by @madsbk in #22501
- Implement our own to_thread offload for cudf-polars streaming execution by @wence- in #22474
- Remove `--broadcast-join-limit` by @rjzamora in #22499
- Revert `FD_GROUPBY_REWRITE` in TPC-DS benchmark queries by @Matt711 in #22525
- Bump polars upper bound to <1.40 by @Matt711 in #22048
- More Polars plan optimizations for TPC-DS by @Matt711 in #22395
- Remove bad algorithmic behaviour when reserving collective IDs by @wence- in #22604
- Add configuration hints when `run_actor_network` raises a memory error by @rjzamora in #22561
- Simplify validation in cudf-polars benchmark by @wence- in #22600
- skip CuPy 14.1.0 by @jameslamb in #22702
New Contributors
- @eternallyproud made their first contribution in #21946
- @pramodsatya made their first contribution in #22402
Full Changelog: v26.06.00a...v26.06.00