github rapidsai/cudf v26.06.00

latest release: v26.06.01
6 hours ago

What's Changed

🚨 Breaking Changes

  • Undeprecate the byte-pair-encoding APIs by @davidwendt in #21760
  • [Multi-GPU Polars] Introduce Ray mode for multi-GPU cudf-polars execution by @madsbk in #21746
  • Get rid of relaxed constexpr across libcudf by @PointKernel in #21703
  • [Multi-GPU Polars] Use current rmm resource in SPMD mode by @madsbk in #21842
  • Remove obsolete statistics infrastructure by @rjzamora in #21857
  • [Multi-GPU Polars] Create engines directly instead of factory functions by @madsbk in #21898
  • Handle integers in floor division and power AST operators by @mhaseeb123 in #21831
  • Enforce cudf_polars cardinality_factor and scheduler deprecations by @mroeschke in #21988
  • [Multi-GPU Polars] Unify streaming engine options by @madsbk in #21930
  • [Multi-GPU Polars] Split PDSH utils into legacy and new frontend paths by @madsbk in #21941
  • Remove CUDAStreamPolicy enum and simplify CUDA stream policy by @vyasr in #22086
  • [Multi-GPU Polars] Bind workers to topology-local hardware by @madsbk in #22113
  • [FEA] Support Multi-Output JIT Transforms by @lamarrr in #21704
  • Migrate RMM usage to CCCL MR design by @bdice in #22008
  • Refactor cudf-polars plugin for Polars' test suite by @madsbk in #22301
  • Remove legacy Dask-based streaming backends by @madsbk in #22358
  • Make RapidsMPF the default runtime for cudf_polars streaming executor by @mroeschke in #22281
  • Bump minimum Polars version to 1.35 by @mroeschke in #22459
  • Introduce a process-wide singleton engine for .collect(engine="gpu") by @madsbk in #22410
  • Remove cudf-polars[rapidsmpf] pip extra & numpy as a [test] dependency; add [dask] pip extra by @mroeschke in #22480
  • Untangle target_partition_size and broadcast_join_limit by @rjzamora in #22411
  • Replace --executor with extended --frontend choices in cudf-polars benchmarks by @madsbk in #22504
  • Clean up legacy test scaffolding in cudf-polars by @madsbk in #22535
  • [cudf_polars] Reorganize package layout by @madsbk in #22491
  • Move collectives module by @rjzamora in #22578

🐛 Bug Fixes

📖 Documentation

🚀 New Features

🛠️ Improvements

  • Adapt to rapidsmpf async shuffle changes by @wence- in #21787
  • Remove CSV reader warnings emitted in unit tests by @vuule in #21794
  • Remove test_infer_objects_no_reference from cudf_pandas xfail list by @mroeschke in #21818
  • Change to use non-detail APIs in some libcudf benchmarks by @davidwendt in #21821
  • Deduplicate libcudf examples CMake files by @mhaseeb123 in #21809
  • Merge release/26.04 into main by @davidwendt in #21815
  • Support drop_nulls unary function in expression decomposition by @quasiben in #21837
  • [Multi-GPU Polars] Ray mode in PDSH benchmarks by @madsbk in #21811
  • Improve build time using transform instead of tabulate by @davidwendt in #21793
  • Use conda packages instead of pip packages in test_narwhals & remove xpassing test_series_setitem from pandas tests by @mroeschke in #21862
  • Ensure Lineariser channels in scan_node are shutdown on error in cudf_polars by @mroeschke in #21854
  • Add missing includes for <cuda/functional> and <cuda/iterator> by @bdice in #21859
  • [Multi-GPU Polars] SPMD mode works without rrun by @madsbk in #21851
  • cudf-polars tracing improvements by @TomAugspurger in #21789
  • Fix utf8-to-codepoint utility to handle out-of-range unicode by @davidwendt in #21823
  • Use PyBuffer_FillInfo for HostBuffer's buffer by @jakirkham in #21855
  • Fix deprecation warning in JNI for parquet_reader_options::builder.names() by @davidwendt in #21868
  • Add noarch Python channel to cudf.pandas third party tests conda solve by @mroeschke in #21873
  • [Multi-GPU Polars] Introduce StreamingEngine base class and SPMDEngine by @madsbk in #21867
  • Rewrite TPC-DS Q14 plan to workaround Polars optimizer CSE limitation by @quasiben in #21885
  • Fix CPU PDS* runs by @quasiben in #21899
  • Remove unneeded CUDF_EXPORT from some cudf/detail headers by @davidwendt in #21693
  • Increase cpp-memcheck test timeout by @davidwendt in #21901
  • Run cudf_polars unit tests with RapidsMPF by @mroeschke in #21807
  • Maintain column sorted metadata in result groupby keys in cudf_polars by @mroeschke in #21871
  • Consolidate/simplify cudf.pandas unit testing script by @mroeschke in #21892
  • Add scoped_range to cudf::benchmark for nvtx ranges in benchmarks by @davidwendt in #21902
  • Fix mypy pinning on rmm (26.06) by @bdice in #21935
  • Merge release/26.04 into main by @mroeschke in #21958
  • Remove paths from cudf-polars Scan Trace properties by @TomAugspurger in #21927
  • Use pip index to fetch polars versions in CI scripts by @mroeschke in #21950
  • Explicitly cancel outstanding tasks in fanout_node_unbounded by @mroeschke in #21853
  • Ensure nodes are del'd during errors in run_actor_network in cudf_polars by @mroeschke in #21850
  • Ensure rapidsmpf RmmResourceAdaptor is unset after actor network is run in cudf_polars by @mroeschke in #21856
  • [cudf_polars] Enabling pinned memory for pdsh runs by @nirandaperera in #21880
  • Optimize Polars TPC-DS q50 implementation by @beckernick in #21884
  • Use public gather in libcudf benchmarks and gtests by @davidwendt in #21903
  • Support dynamic error messages in CUDF_EXPECTS and CUDF_FAIL macros by @kingcrimsontianyu in #21900
  • Remove unconditional large left table skip in mark join benchmarks by @PointKernel in #21909
  • [cudf_polars] [MINOR] Configurable dask worker memory in benchmarks by @nirandaperera in #21972
  • Use asyncio.TaskGroup instead of .gather in cudf_polars by @mroeschke in #21858
  • Replace thrust counting iterators with cuda::counting_iterator by @PointKernel in #21718
  • review-cudf skill checks for functions defined in headers by @mhaseeb123 in #21971
  • Expose ability to hash_partition based on a separate key table by @wence- in #21730
  • Use ruff to disallow asyncio.gather in cudf_polars by @mroeschke in #21985
  • Remove parquet statistics filter validation logic by @Matt711 in #21736
  • Remove unneeded include of some detail headers in non-internal libcudf code by @davidwendt in #21907
  • Support regex named capture groups in contains, count, match, findall by @davidwendt in #21848
  • [WIP] Better path handling for s3 by @quasiben in #22005
  • Remove cudf::detail::target_type from groupby gtests by @davidwendt in #21932
  • Clean up log messages in Parquet and ORC unit tests by @vuule in #21797
  • Optimize TPC-DS query plans for streaming executor (q4, q23, q64, q75, q78) by @vyasr in #22011
  • [Multi-GPU Polars] Introduce a new Dask frontend by @madsbk in #21812
  • Disable "native" rapidsmpf parquet reader by default by @rjzamora in #22023
  • Pass memory resource explicitly to remove implicit default mr usage (Part 2) by @karthikeyann in #22029
  • Pass memory resource explicitly to remove implicit default mr usage (Part 3) by @karthikeyann in #22030
  • Pass memory resource explicitly to remove implicit default mr usage (Part 1) by @karthikeyann in #22028
  • [Multi-GPU Polars] Fix SPMD bootstrap race with session-scoped communicator by @madsbk in #22015
  • Split hash join definitions to reduce build time by @PointKernel in #21804
  • Deduplicate parquet pass construction by @mhaseeb123 in #21923
  • Remove "using namespace cudf" from row_ir jit gtest by @davidwendt in #22069
  • Deprecate build side option for filtered_join by @PointKernel in #21982
  • Pass memory resource to exec_policy_nosync in lists module by @bdice in #22041
  • Pass memory resource to exec_policy_nosync in groupby module by @bdice in #22038
  • Remove more unneeded detail header includes from libcudf benchmarks by @davidwendt in #22068
  • [Multi-GPU Polars] New frontends accept RMM config by @madsbk in #22052
  • Avoid storing all executor options in StreamingSink by @rjzamora in #22079
  • Update to clang 20.1.8 by @bdice in #22093
  • Use cudf::test::iterator utilities instead of make_counting_transform_iterator as appropriate by @davidwendt in #22071
  • Adopt workflow dispatch pattern for compute sanitizer workflows by @davidwendt in #22054
  • [Multi-GPU Polars] Remove all pytest.skip calls for unavailable GPU. by @madsbk in #22099
  • Optimize TPC-DS query plans for streaming executor (q80, q31, q11) by @vyasr in #22070
  • Preserve partitioning metadata for HStack nodes by @rjzamora in #22103
  • Pass memory resource to exec_policy_nosync in join module by @bdice in #22039
  • Fix unsanitized nulls from strings_column_wrapper inputs in gtests by @davidwendt in #22088
  • Use simpler iterators instead of make_counting_transform_iterator by @davidwendt in #22119
  • Remove redundant aggregation identity logic in shared memory groupby by @PointKernel in #22010
  • Optimize TPC-DS query plans for streaming executor (q9, q74) by @vyasr in #22121
  • Add more AST gtests for supported decimal operations by @davidwendt in #22097
  • Rename GroupedRollingWindow as GroupedWindow by @Matt711 in #22135
  • Pass memory resource to exec_policy_nosync in io module by @bdice in #22035
  • Pass memory resource to exec_policy_nosync in text module by @bdice in #22037
  • Pass memory resource to exec_policy_nosync in reductions and quantiles modules by @bdice in #22040
  • Pass memory resource to exec_policy_nosync in copying, rolling, and merge modules by @bdice in #22043
  • Pass memory resource to exec_policy_nosync in dictionary, interop, and replace modules by @bdice in #22044
  • Pass memory resource to exec_policy_nosync in sort, search, stream_compaction, and partitioning modules by @bdice in #22042
  • Pass memory resource to exec_policy_nosync in strings module by @bdice in #22036
  • Remove clang-format-off from nth_element_tests.cpp by @davidwendt in #22100
  • Add multiple_of utilities to cudf::test::iterators by @davidwendt in #22078
  • [Multi-GPU Polars] Add --num-gpus for the benchmarks by @madsbk in #22149
  • [Multi-GPU Polars] Unify num_py_executors default by @madsbk in #22168
  • Remove verbose from bind() in favor of exceptions in cudf-polars by @pentschev in #22169
  • [Multi-GPU Polars] Add GPU sharing detection by @madsbk in #22148
  • [Multi-GPU Polars] Reduce DaskEngine local cluster log verbosity by @madsbk in #22167
  • Use cudf::sequence instead of make_counting_transform for large gtests columns by @davidwendt in #22106
  • Add cudf-polars-codeowners to CI by @Matt711 in #22192
  • Run Polars unit tests with RapidsMPF by @mroeschke in #21677
  • Implement basic bloom pre-filtering in shuffle join by @wence- in #21931
  • Pass BufferResource to SpillableMessages by @vyasr in #22164
  • Add dedicated stream testing job by @KyleFromNVIDIA in #22150
  • Automatically pin numba-cuda upper bound at release time in update-version.sh by @brandon-b-miller in #21533
  • Enable --collect-traces with new frontends by @TomAugspurger in #22199
  • Support decimal operators with different scales in AST by @davidwendt in #22122
  • fix(ci): remove needs dependency on job that doesn't exist in test yaml by @gforsyth in #22211
  • Remove unused device_memory_resource includes by @bdice in #22187
  • Deduplicate lower_ir_graph in cudf-polars by @TomAugspurger in #22220
  • Pass dask.datasets.timeseries(seed=) in tests by @mroeschke in #22223
  • [Multi-GPU Polars] Gather statistics by @madsbk in #22210
  • Clean up numba extension code generation by @brandon-b-miller in #22270
  • Ignore ResourceWarning from stumpy for Python 3.14 by @mroeschke in #22272
  • Update deselected polars tests by @wence- in #22265
  • Skip Python/pandas versions without supported wheels by @vyasr in #22273
  • Generalize NormalizedPartitioning class by @rjzamora in #22246
  • Use the new compute-matrix workflow for stream testing by @KyleFromNVIDIA in #22240
  • Agent skill to build and test cudf java by @mhaseeb123 in #21894
  • Refactor DataSourceInfo by @TomAugspurger in #22254
  • Record engine_name in cudf-polars benchmarks by @TomAugspurger in #22269
  • [MINOR] Remove redundant vectors in memcpy_batch_async fast path by @nirandaperera in #22125
  • fix(cmake): exclude kvikio from install, build static, and fix some export issues introduced in 22263 by @vyasr in #22277
  • Skip two more tests by @vyasr in #22294
  • Refactor cudf-polars test suite onto pytest fixtures by @madsbk in #22212
  • Import nvcomp CMake configuration from rapids-cmake into cudf by @vyasr in #22306
  • Reset check-nightly-ci max days without success by @davidwendt in #22303
  • Lower the query plan on workers in cudf-polars by @TomAugspurger in #22287
  • Resolve timezone alias links via tzdata.zi when loading transition tables by @vuule in #22293
  • Skip zero-sized default pinned pool by @bdice in #22292
  • RayEngine: support GPU oversubscription in tests by @madsbk in #22302
  • Address cuml RandomForestClassifier(max_depth=) deprecation by @mroeschke in #22324
  • Add DuckDB resource-limit options to benchmark runner by @Matt711 in #22266
  • Refactor cudf-polars test fixtures away from indirect parametrization by @madsbk in #22325
  • Add call to reset_current_device_resource in gtests fixtures by @davidwendt in #22267
  • cudf-polars: add RayEngine._reset() by @madsbk in #22348
  • StreamingEngine._reset() by @madsbk in #22364
  • Improve hstack lowering by @rjzamora in #22353
  • Replace LD_PRELOAD hack with compute-sanitizer by @KyleFromNVIDIA in #22290
  • Run all nvbench benchmarks with timeout in smoketest by @bdice in #20538
  • Rename build/probe to right/left in hash_join and distinct_hash_join by @PointKernel in #22382
  • Use token.rapids.nvidia.com when issuing S3 bucket creds in devcontainers by @trxcllnt in #22338
  • Use static cudart by default by @KyleFromNVIDIA in #22397
  • Use cudaStream_t instead of cuda_stream_view in pylibcudf Cython by @vyasr in #22368
  • Use language: script for cudf-polars-ir-signatures pre-commit hook by @vyasr in #22384
  • Fix potential errors in Parquet page header decode by @mhaseeb123 in #22274
  • Refactor sort_actor to prepare for OrderScheme changes by @rjzamora in #22350
  • Run the cudf-polars test suite against DaskEngine and RayEngine by @madsbk in #22381
  • Move table_device_view function definitions from .cuh to .cu by @davidwendt in #22354
  • Fallback to async-mr for the multithreaded parquet example by @mhaseeb123 in #22245
  • fix(ci): resolve all zizmor findings and add zizmor pre-commit checks by @gforsyth in #22343
  • Adopt OrderScheme metadata in cudf-polars by @rjzamora in #22291
  • Consolidate evaluate_rapidsmpf into evaluate_streaming in cudf_polars by @mroeschke in #22417
  • Add ray run_constraints in cudf_polars conda recipe by @mroeschke in #22414
  • Improve installation hygiene of built and header-only dependencies by @vyasr in #22341
  • Run conda, cudf_polars CI tests with Ray by @mroeschke in #22420
  • Support Buffers in HybridScanReader methods needing bytes-like data by @jakirkham in #22345
  • Implement equality of two table_views by @wence- in #22319
  • fix(ci): use sha for the only allowlisted version of action-add-assignees by @gforsyth in #22453
  • Update default memory resource for cudf-polars by @TomAugspurger in #22426
  • fix(ci): add explicit actions: write permission for telemetry-summarize by @gforsyth in #22479
  • Split PR devcontainer CI into pip and conda jobs by @bdice in #22490
  • Undo some CCCL workarounds fixed in the latest update by @davidwendt in #22475
  • Build and test with CUDA 13.2.0 by @bdice in #22463
  • Remove wheel-tests-cudf-polars-with-rapidsmpf in favor of existing wheel-tests-cudf-polars by @mroeschke in #22467
  • Remove anonymous namespaces from cudf headers by @PointKernel in #22418
  • Relax NormalizedPartitioning.from_keys by @rjzamora in #22483
  • Reduce peak footprint of cudf-polars test memory usage by @wence- in #22493
  • Use basic OrderScheme metadata in sort_actor by @rjzamora in #22477
  • Add pinned_max_pool_size and unbounded_file_read_cache to StreamingOptions by @madsbk in #22501
  • Implement our own to_thread offload for cudf-polars streaming execution by @wence- in #22474
  • Remove --broadcast-join-limit by @rjzamora in #22499
  • Revert FD_GROUPBY_REWRITE in TPC-DS benchmark queries by @Matt711 in #22525
  • Bump polars upper bound to <1.40 by @Matt711 in #22048
  • More Polars plan optimizations for TPC-DS by @Matt711 in #22395
  • Remove bad algorithmic behaviour when reserving collective IDs by @wence- in #22604
  • Add configuration hints when run_actor_network raises a memory error by @rjzamora in #22561
  • Simplify validation in cudf-polars benchmark by @wence- in #22600
  • skip CuPy 14.1.0 by @jameslamb in #22702

New Contributors

Full Changelog: v26.06.00a...v26.06.00
