rapidsai/cudf v25.10.00 on GitHub

🚨 Breaking Changes

Remove UCX-Py (#19979) @pentschev
Revert "Migrate mixed join to use multiset #19660" (#19933) @PointKernel
Fill missing values in Series/Index.values for numeric types with np.nan by default (#19923) @mroeschke
Remove deprecated DataFrame.apply_rows, deprecate DataFrame.apply_chunks and Groupby.apply_grouped (#19896) @mroeschke
Move prefetching out of experimental and simplify the API (#19875) @vyasr
Add join *_match_context APIs to hash join (#19835) @PointKernel
Vendor libnvcomp in libcudf (#19743) @bdice
Migrate mixed join to use multiset (#19660) @PointKernel
Separate row mask and page mask computation and usage (#19537) @mhaseeb123
[FEA] Implement null-aware transforms and filters (#19502) @lamarrr
Support output-type for MEDIAN/QUANTILE aggregation in cudf::reduce (#19267) @davidwendt

🐛 Bug Fixes

Fix edge cases in statistics collection (#20094) @rjzamora
Fix multi-partition Filter bug (#20075) @rjzamora
Fix reindex to fill only the reindexed values with fill_value (#20063) @galipremsagar
Fix arrow arrays + numpy ufunc interaction (#20047) @galipremsagar
Fix race conditions in ORC reader decimal decoding (#20044) @vuule
Keep mr alive along with arrow tables and columns (#20028) @vyasr
Fix value_counts missing nan bug (#20026) @galipremsagar
Compatibility for rapidsmpf's unspill_partitions (#20020) @TomAugspurger
Fix type metadata preservation in shift (#20017) @galipremsagar
Fix incorrect type propagation in dataframe assignment (#20010) @galipremsagar
Fix OOB memory read in decode_page_data_generic kernel (#19995) @davidwendt
Fix data_type creation in ast::operation::instantiate (#19994) @davidwendt
Skip Narwhals pandas get_dtype_backend[pyarrow] tests after ArrowDtype proxy changes (#19992) @Matt711
Make cudf.pandas callables usable with inspect.getfullargspec (#19988) @mroeschke
Align decimal dtypes to schema after parquet IO scan (#19974) @Matt711
Avoid undefined numpy protocols on cudf.pandas proxy objects (#19968) @mroeschke
Skip failing polars iceberg test (#19955) @Matt711
Revert "Migrate mixed join to use multiset #19660" (#19933) @PointKernel
Define FrozenList proxy independently in cudf.pandas (#19931) @mroeschke
Ignore scalars when broadcasting for horizontal string concatenation in cudf-polars (#19893) @Matt711
Fix is_valid_rolling_aggregation for STD aggregation (#19888) @davidwendt
Fix a decompression parameter in the chunked ORC reader (#19882) @vuule
Skip flaky stats tests pending follow up (#19881) @brandon-b-miller
Require list type for is_valid_aggregation and MERGE_LISTS/SETS (#19876) @davidwendt
Temporary solution to ensure data-source/sink stream ordering (#19874) @kingcrimsontianyu
Check for integer overflow in cudf::strings::find_multiple (#19867) @davidwendt
Fix missing stream from cudf::top_k_order (#19866) @davidwendt
Disallow loc.setitem with list-like indexer when list elements not in index (#19851) @mroeschke
Fix .str.replace ignoring n for single character replacements (#19848) @mroeschke
Fix strings::find_instance warp parallel logic (#19845) @davidwendt
Add changed-files to the needs of every job that requires it (#19830) @Matt711
xfail polars decimal(precision=None) test (#19821) @Matt711
Fix empty column returned by cudf::from_arrow_stream_column (#19812) @davidwendt
Filter pandas warning in dask_cudf test (#19808) @TomAugspurger
Update identify_stream_usage CUDA runtime hooks to CUDA 13 (#19807) @robertmaynard
When bundling libnvcomp.so.X only append the major version value (#19786) @robertmaynard
Improvements to pylibcudf.from_iterable_of_py (#19781) @Matt711
Avoid using multiple Cache nodes with the same hash (#19769) @rjzamora
Fix window var() test failures from float rounding (#19761) @Matt711
Use is_compressed field from Parquet V2 data page headers to determine if they are compressed (#19755) @mhaseeb123
Fix bug in eval function with nvtx-0.2.11 (#19754) @galipremsagar
Fix ndsh benchmarks nvtx range usage (#19753) @davidwendt
Support nan in non-floating point column in cudf-polars (#19742) @Matt711
Fix filter call in benchmark (#19732) @vyasr
Suppress NVRTC warning from stdint.h (#19712) @davidwendt
Correctly decode boolean lists in chunked parquet reader (#19707) @mhaseeb123
Add new xfails for xarray release (#19705) @vyasr
Fix "--executor" pytest parameter for cudf-polars (#19703) @rjzamora
Match polars semantics for rolling-sum with all-null windows (non-empty) (#19680) @Matt711
[BUG] Set query_set arg when validating/running cudf-polars PDS-DS benchmarks (#19674) @Matt711
Fix group_by().agg() on non-aggregatable dtypes (#19669) @Matt711
Fix broken links in 10min notebook (#19665) @Matt711
Skip managed memory test if managed memory not supported in cudf-polars (#19653) @Matt711
Fix integer overflow in warp-per-row grid calculation (#19638) @davidwendt
Propagate exceptions thrown in async IO operations (#19628) @vuule
Make DataFrame.dtypes not fallback to CPU always (#19627) @galipremsagar
Set scalar to valid in range_window_bounds unbounded/current_row (#19622) @davidwendt
Enable data page mask computation for nullable list and struct columns (#19617) @mhaseeb123
Fix cudf::sequence() to throw exception for invalid scalar inputs (#19612) @davidwendt
Fix uninitialized variable and misaligned write in parquet generic decoder (#19601) @mhaseeb123
Compatibility with rapidsmpf 25.10.0 (#19591) @TomAugspurger
Avoid querying device memory on systems without it in dask-cudf (#19577) @Matt711
Avoid querying device memory on systems without it in cudf-polars benchmarks (#19575) @Matt711
Increase alignment requirement for parquet bloom filter to 256 (#19573) @mhaseeb123
Fix strftime with non-exact %a, %A, %b, %B (#19570) @mroeschke
Fix OOB memcheck error in group_rank_to_percentage utility (#19567) @davidwendt
Fix logic for number of unique values generated by data profile in benchmarks (#19540) @shrshi
Fix contiguous-split nvbench cmake build (#19534) @davidwendt
Fix value counts expression when the column has nulls (#19524) @Matt711
Prefer Column.astype over plc.unary.cast in the fill null unary function expression (#19479) @Matt711
Fix missing return in StringFunction.Strptime strict=True path (#19464) @Matt711
Make dividing a boolean column return f64 dtype in cudf-polars (#19443) @Matt711
branch-25.10-merge-branch-25.08 (#19429) @davidwendt
Replace sprintf with std::format in libcudf parquet tests (#19364) @davidwendt

📖 Documentation

Update missing docs (#19925) @vyasr
Add examples of null handling to doxygen for cudf::rank (#19774) @davidwendt
Fix cudf-polars dependency list docs (#19750) @pentschev
Update cuDF classic testing documentation regarding testing organization (#19745) @mroeschke
Improve documentation around why we need no_gc_clear on pylibcudf Scalars (#19661) @vyasr

🚀 New Features

Add memory resource parameters to interop, merge, and transpose (#20007) @vyasr
Add mixed join benchmark with complex AST operators (#20004) @PointKernel
Add memory resource arguments to join, round, and labeling (#20001) @vyasr
cudf-polars strptime format inference (#19997) @brandon-b-miller
Filter parquet row groups using byte offset bounds (#19991) @mhaseeb123
Add memory resource arguments to concatenate (#19943) @vyasr
Use column statistics to generate the physical plan in cuDF-Polars (#19940) @rjzamora
Add all missing stream parameters (#19922) @vyasr
Remote IO support in cudf-polars (#19921) @Matt711
Add streams to io/timezone and io/text modules (#19913) @vyasr
Add stream support to all nvtext modules (#19911) @vyasr
Add streams to all top-level strings modules (#19910) @vyasr
Update strings split APIs with stream parameters (#19909) @vyasr
Support ordered grouped windows in cudf-polars (#19891) @Matt711
Add local row-count and unique-count estimates to explain(... logical=True) (#19864) @rjzamora
Add join *_match_context APIs to hash join (#19835) @PointKernel
Support rank(...).over(...) expressions in cudf-polars (#19803) @Matt711
Add strings to/from encoded integer APIs (#19789) @davidwendt
Add to_arrow method to pylibcudf core types (#19787) @Matt711
Add streams to strings convert APIs (#19780) @vyasr
Add an option to support reading ORC timestamp column as UTC time (#19773) @res-life
Support null_count in groupby/rolling context (#19739) @Matt711
Collect join-key information in cudf-polars (#19736) @rjzamora
Add count aggregation support to cudf::reduce (#19734) @davidwendt
[FEA] Implement AST Expression - JIT codegen (#19733) @lamarrr
Add streams to all scalar factories (#19729) @vyasr
Add streams to reshape (#19728) @vyasr
Add streams to null mask APIs (#19727) @vyasr
Add streams to column APIs (#19726) @vyasr
Construct next-gen parquet reader with pre-populated footer (#19724) @mhaseeb123
Require numba-cuda>=0.19.0,<0.20.0a0 (#19711) @brandon-b-miller
Support over expression (window mapping) in cudf-polars (#19684) @Matt711
Add streams support to all list APIs (#19683) @vyasr
[FEA] Add Filter Benchmark (#19678) @lamarrr
Add streams to pylibcudf join APIs (#19672) @vyasr
Add streams to sorting APIs (#19671) @vyasr
[FEA] Remove excessive copies of JITIFY's ProgramData during JIT kernel launch (#19667) @lamarrr
Add streams to hashing APIs (#19663) @vyasr
Use a more robust metric for sorting (de)compression tasks (#19656) @vuule
Add streams support to datetime APIs (#19654) @vyasr
Add streams to stream_compaction (#19651) @vyasr
Enable casting pl.Datetime to integer types in cudf-polars (#19647) @brandon-b-miller
Add Java JNI interface to get GPU UUID (#19646) @res-life
Add reduction with overflow detection (#19641) @PointKernel
Upgrade to nvCOMP 5.0.0.6 (#19636) @vuule
Use the nvCOMP 5.0 API to better estimate decompression memory requirements (#19616) @vuule
Add streams to transform and unary (#19613) @vyasr
Add streams to all modules with 4-5 functions (#19609) @vyasr
Enable casting integer dtypes to pl.Datetime via cudf-polars (#19607) @brandon-b-miller
Add fast path for Parquet reading with predicate pushdown via AST filters (#19605) @Matt711
Add streams to all modules with three or fewer functions (#19600) @vyasr
Add libcudf top_k_segmented APIs (#19597) @davidwendt
Update Arrow bounds to >=15,<22 (#19592) @bdice
Update cudf to handle CUDA 13 changes (#19585) @robertmaynard
Support hash-based workflow for M2 groupby aggregation (#19569) @ttnghia
Expose filter and columns parquet reader builder options to python (#19566) @Matt711
[FEA] Switch to NVIDIA's JITIFY2 (#19561) @lamarrr
Add streams to all single-function modules (#19559) @vyasr
Add support for streams to all copying APIs (#19553) @vyasr
Benchmarks comparing Arrow string formats (#19552) @davidwendt
Compile libcudf_kafka and cudf_kafka with C++20 (#19543) @vuule
RapidsMPF "single" shuffle integration (#19530) @rjzamora
Make nvCOMP ZLIB (de)compression available by default (#19528) @vuule
Implement chunking in the next-gen parquet reader (#19526) @mhaseeb123
Add primitive row dispatch support for semi/anti join and cudf::contains (#19518) @PointKernel
Derive and use page mask at subpass level for chunked reads (#19515) @mhaseeb123
[FEA] Implement null-aware transforms and filters (#19502) @lamarrr
Add PDS-DS queries 2 through 10 to cudf-polars benchmarks (#19488) @Matt711
Add API to "initialize" column statistics (#19447) @rjzamora
Implement top k expression in cudf-polars using cudf::top_k (#19431) @Matt711
Add hash-based SUM_WITH_OVERFLOW aggregation for INT64 values (#19403) @PointKernel
Support rank expression in cudf-polars (#19340) @Matt711
Support fill_null with fill strategy in cudf-polars (#19318) @Matt711
Support output-type for MEDIAN/QUANTILE aggregation in cudf::reduce (#19267) @davidwendt
Support ternary expression inside groupby/rolling context (#19242) @Matt711
Experimental API to read a parquet table, build a custom index column, and apply roaring bitmap deletion vector (#19237) @mhaseeb123
Support cudf-polars str.zfill (#19081) @brandon-b-miller
[FEA] Add chunked Parquet sink support using the libcudf writer (#19015) @Matt711
Add multi-column support for primitive row operator dispatch (#18940) @tgujar

🛠️ Improvements

Fix CI failures for pandas-2.3.3 (#20146) @galipremsagar
Skip passing failures for latest numexpr version (#20092) @galipremsagar
Empty commit to trigger a build (#20084) @msarahan
Update the reason to skip for parquet bloom filter test (#20043) @mhaseeb123
Remove test_scan_hf_url_raises (#20035) @mroeschke
xfail(strict=False) test_scan_hf_url_raises due to rate limiting (#20027) @mroeschke
Deprecate left semi- and anti- join functional APIs (#20014) @shrshi
Use to_arrow methods throughout pylibcudf and cudf (#20013) @Matt711
Fix chunked reads of list of bools (#20000) @pmattione-nvidia
Raise more exceptions for invalid or unsupported cuDF arguments (#19990) @mroeschke
Configure repo for automatic release notes generation (#19984) @AyodeAwe
Pin duckdb<1.4 in test_python_narwhals (#19982) @mroeschke
Default to False if CUDA_ENABLE_NRT isn't set in config (#19981) @brandon-b-miller
Remove UCX-Py (#19979) @pentschev
Add support for attrs (#19978) @galipremsagar
Run pytest-benchmarks in CI with --benchmark-disable (#19969) @mroeschke
Change target type so we can test on workflows (#19963) @vyasr
Update to actions/labeler v5 (#19962) @vyasr
Revert "ci(labeler): update labeler action to @v5" (#19961) @vyasr
Add ArrowDtype proxy class (#19960) @galipremsagar
Add missing type stub (#19958) @vyasr
Add missing Styler attributes (#19956) @galipremsagar
Allow newer CMake in Java tests (#19949) @bdice
Make stream a required parameter for from_libcudf methods (#19945) @vyasr
Return False instead of NA for comparison ops against NA in cudf.pandas (#19942) @mroeschke
Don't fall back in Series.describe in cudf.pandas for numeric types (#19941) @mroeschke
Move groupby benchmarks to nvbench (#19930) @davidwendt
Perform more input validation in cuDF classic APIs (#19929) @mroeschke
Update nvidia-ml-py (>=12), use cuda-toolkit wheels (#19927) @jameslamb
Fill missing values in Series/Index.values for numeric types with np.nan by default (#19923) @mroeschke
Add rmm-release-threshold to pdsh benchmarks CLI (#19918) @TomAugspurger
Also use the CUDA 12 container for nightlies (#19917) @vyasr
Move test_binops.py to new cuDF classic directory structure (#19914) @mroeschke
Eagerly load nvCOMP library in cudf::initialize() (#19906) @vuule
Pin to CUDA 12 image for integration tests (#19903) @vyasr
Use branch-25.10 again (#19902) @jameslamb
Disable test on non-default stream (#19901) @vyasr
Use cupy array instead of numba device array as inputs to jit routines (#19897) @mroeschke
Remove deprecated DataFrame.apply_rows, deprecate DataFrame.apply_chunks and Groupby.apply_grouped (#19896) @mroeschke
Move test_dataframe.py to new cuDF classic directory structure (#19890) @mroeschke
Make sure conftest fixture data is valid on exit (#19889) @vyasr
Move test_index/multiindex/indexing.py to new cuDF classic directory structure (#19887) @mroeschke
[FEA] Build CUDF with CCCL 3.1.0 (#19886) @lamarrr
Coalesce IO of chunks with different compression when reading Parquet files (#19884) @vuule
Update boost version to 1.79 for JNI dockerfile (#19883) @pxLi
Move test_categorical/dask/serialize.py to new cuDF classic test directory structure (#19877) @mroeschke
Move prefetching out of experimental and simplify the API (#19875) @vyasr
Remove diff.sh and merge diff generation into run.sh (#19871) @galipremsagar
Remove pyarrow upper bound (#19870) @vyasr
Prevent installation of pytest-rerunfailures 16.0.0 (#19863) @pentschev
Use 'nvidia-ml-py' package for 'pynvml' module (#19862) @jameslamb
Avoid more direct construction of cuDF classic columns (#19858) @mroeschke
Bump pandas supported version to 2.3.2 (#19856) @galipremsagar
Use cupy arrays instead of numba device arrays for cuDF classic intermediates (#19855) @mroeschke
Move row operators to detail and deprecate legacy (#19849) @PointKernel
Fix flaky DataFrame to_string test (#19847) @brandon-b-miller
Pin pytest-rerunfailures<16 (#19846) @mroeschke
Revert numba CUDA 13 workaround (#19842) @jameslamb
Avoid CategoricalColumn constructors in cuDF classic (#19837) @mroeschke
Construct cuDF classic Decimal32/64Columns from RMM buffers (#19834) @mroeschke
Avoid direct construction of cuDF classic columns (#19829) @mroeschke
Support input filename in ndsh q01 benchmark (#19820) @davidwendt
Run cudf-polars-polars-tests on changes in test_python file group (#19819) @mroeschke
Remove test_mvc.py (#19816) @mroeschke
Pin oldest numpy in dask-cudf tests, update dependency floors (cuda-python 12.9.2, cupy 13.6.0, numba 0.60.0) (#19806) @jameslamb
Remove iterative nan & nat inefficient checks in as_column constructor (#19804) @galipremsagar
Simplify/consolidate from_arrow logic (#19801) @mroeschke
Refactor column_empty to use only pylibcudf APIs (#19800) @mroeschke
Use more cached_property where possible for Index and subclasses (#19799) @mroeschke
Update rapids-dependency-file-generator (#19796) @KyleFromNVIDIA
Rearrange dependencies.yaml, other small changes (#19794) @jameslamb
Update exception handling in pdsh benchmarks (#19793) @TomAugspurger
Fix how nvcomp major version is extracted (#19791) @KyleFromNVIDIA
Use KvikIO's unified interface to create remote I/O endpoints (#19788) @kingcrimsontianyu
Add object-oriented APIs for left semi- and anti- join (Part I) (#19778) @shrshi
Add nvbench benchmark for cudf::encode API (#19777) @davidwendt
Some clarifications, improvements to GroupedRollingWindows in cudf-polars (#19776) @Matt711
Remove validation on import (#19775) @vyasr
Move more test_dataframe.py tests to new cudf classic testing directory (#19770) @mroeschke
Build and test with CUDA 13.0.0 (#19768) @jameslamb
Skip polars CPU perf test for with_columns (#19763) @Matt711
Optionally capture Shuffle Stats in cudf-polars pdsh benchmarks (#19762) @TomAugspurger
Expand compression codec coverage in ORC and Parquet benchmarks (#19760) @vuule
Add ColumnSourceInfo convenience layer (#19752) @rjzamora
Support decimal columns in cudf_polars (#19749) @mroeschke
Skip third-party tests when possible (#19747) @vyasr
Revert "Support decimal columns in cudf_polars" (#19746) @mroeschke
Vendor libnvcomp in libcudf (#19743) @bdice
Remove outdated numba workarounds (#19738) @bdice
Move test_buffer/column/column_accessor/cuda_apply.py to new cudf classic testing directory (#19737) @mroeschke
Move more test_dataframe.py tests to new cudf classic testing directory (#19731) @mroeschke
Move test_udf_masked_ops/test_dropna to new cudf classic testing directory (#19730) @mroeschke
Move test_numerical/{numpy|pandas}_interop/setitem.py to new cudf classic testing directory (#19725) @mroeschke
Move test_timedelta/string/sorting/list/datetime.py to new cudf classic directory structure (#19723) @mroeschke
Warn on fallback in the streaming tests in cudf-polars (#19721) @Matt711
Optionally print shuffle stats in pdsh benchmarks (#19719) @TomAugspurger
Move test_{io}.py files to new cudf classic test directory (#19709) @mroeschke
Move pyarrow and numpy to run_constrained (#19706) @vyasr
Remove unreachable code in rapidsmpf shuffle (#19704) @TomAugspurger
Move test_options to cudf testing directory and clean up old, stubbed testing files in directory (#19698) @mroeschke
Move (most of) test_index.py to new cudf classic directory structure (#19696) @mroeschke
Improve M2, VARIANCE and STD hash-based groupby aggregations (#19694) @ttnghia
Move quantiles libcudf benchmark to nvbench (#19692) @davidwendt
Handle TIMESTAMP_DAYS in rolling window offsets (#19689) @Matt711
Move test_groupby to new cudf classic directory structure (#19688) @mroeschke
Move some of test_dataframe.py to new cudf classic directory structure (#19687) @mroeschke
Change nvtext::character_tokenize to return a list column (#19685) @davidwendt
Split up rolling.cuh into separate headers (#19682) @davidwendt
Move test_factorize/drop_duplicates.py to new cudf classic test directory (#19681) @mroeschke
Move test_offset/repr.py to new cudf classic testing directory (#19677) @mroeschke
Move test_stats/reductions/quantile and misc to new cudf classic testing directory (#19675) @mroeschke
Cache hash values to improve hash-based groupby performance with wide/complex table keys (#19670) @ttnghia
Move test_interval/test_dtypes/test_rank.py to new cudf directory structure (#19668) @mroeschke
Clean and move test_join_order/interpolate/onehot.py to new cudf classic test directory structure (#19662) @mroeschke
Migrate mixed join to use multiset (#19660) @PointKernel
Run pylibcudf tests without its optional dependencies (#19657) @vyasr
Use build cluster in devcontainers (#19652) @trxcllnt
Use rapids_cuda_enable_fatbin_compression (#19650) @robertmaynard
Re-enable Disabled Join Tests (#19649) @PointKernel
Use public Arrow functions for TDigest in PercentileApproxInputTypesTests (#19648) @davidwendt
Use cudaDeviceGetAttribute to get ComputeMode for CUDA13 (#19645) @GaryShen2008
Remove initial memset of values in parquet reader (#19643) @pmattione-nvidia
Move ~half of test_groupby.py to new cudf classic test directory structure (#19640) @mroeschke
Move test_csv/feather/json.py to new cudf classic test directory structure (#19639) @mroeschke
Move test_array_function/ufunc to new cudf classic test directory structure (#19637) @mroeschke
Fix anchor naming conventions in dependencies.yaml (#19635) @KyleFromNVIDIA
Require --scale for PDS-DS benchmarks (due to nonlinear scaling) (#19631) @Matt711
Move test_replace.py to new cudf classic directory structure (#19629) @mroeschke
Move test_concat/test_reductions.py to new cudf classic directory structure (#19626) @mroeschke
Update rapids_config to handle user defined branch name (#19623) @robertmaynard
Add nvtx ranges to public APIs of the experimental parquet reader (#19618) @mhaseeb123
Move test_resampling/query/pickling to new cudf classic directory structure (#19615) @mroeschke
Move test_reshape.py to new cudf classic directory structure, remove reshape._merge_sorted (#19614) @mroeschke
Move test_rolling/ewm.py to new cudf classic directory structure (#19611) @mroeschke
Simplify cudf::scalar usage in reduce utility (#19608) @davidwendt
Update to numba-cuda>=0.18.0,<0.19.0 (#19604) @bdice
Update spark-rapids-jni action to use PR's base.ref and fix issue of ccache version in dockerfile (#19603) @pxLi
Multithreaded CPU algorithm for data page mask computation (#19602) @mhaseeb123
Move test_cuda_array_interface/cut/dataframe_copy.py to new cudf classic test directories (#19599) @mroeschke
Support decimal columns in cudf_polars (#19589) @mroeschke
Preserve decimal precision in cudf::interop::column_metadata (#19587) @mroeschke
Always use strict zipping (#19584) @vyasr
Pin polars version to <1.33 (#19582) @Matt711
ci(labeler): update labeler action to @v5 (#19581) @gforsyth
Update rapids-build-backend to 0.4.0 (#19580) @KyleFromNVIDIA
Move (most of) test_list.py to new cudf classic test directories (#19574) @mroeschke
Move test_monotonic.py to new cudf classic test directory structure (#19572) @mroeschke
Additional gtests error checks for string/timestamp convert libcudf APIs (#19562) @davidwendt
Avoid cudf.pandas fallback for pandas.array.NumpyExtensionArray of strings (#19558) @mroeschke
Move str accessor tests in test_string.py to new cudf classic test directory structure (#19557) @mroeschke
Rework fill/repeat benchmark to use nvbench (#19556) @davidwendt
Use no_validity() instead of null_probability(0) in benchmarks profile (#19554) @davidwendt
Move (most of) test_timedelta.py and test_struct.py to new cudf classic test directory structure (#19551) @mroeschke
Capture commit hashes in pdsh benchmarks (#19548) @TomAugspurger
Simplify clang dependency spec (#19546) @vyasr
Move timeout in cudf.pandas pandas unit tests script to ci script (#19542) @mroeschke
[FEA] Refactor AST operator_functors for use in JIT-compiled CUDA (#19541) @lamarrr
Construct cuDF classic columns with array_interface through pylibcudf (#19538) @mroeschke
Separate row mask and page mask computation and usage (#19537) @mhaseeb123
Get rid of CG logic in the mixed semi-join kernel (#19536) @PointKernel
Construct more cuDF classic Columns with pylibcudf instead of using Buffers (#19535) @mroeschke
Fix clang-tools version pinning (#19529) @wence-
Add cudf_polars unit test for is_in([]) expr (#19525) @mroeschke
Expose nvtext::letter_type to python (#19520) @Matt711
Remove c++ stringview interop example (#19516) @davidwendt
Remove cudf/_fuzz_testing directory (#19510) @mroeschke
Add missing import of pyarrow.parquet when reading specified row_groups (#19509) @bdice
Don't run serial cudf_pandas tests when testing multiple pandas versions (#19507) @mroeschke
Clean testing/_utils.py (#19506) @mroeschke
Move some test_datetime.py tests to new cudf classic test directory structure (#19505) @mroeschke
Move test_joining to new cudf classic test directory structure (#19501) @mroeschke
Upgrade gcc-toolset for Java/JNI build to version 14 (#19500) @ttnghia
Remove deprecated subword-tokenizer APIs (#19498) @davidwendt
Move some test_multiindex.py to new cudf classic test directory structure (#19496) @mroeschke
Add nvtx ranges and minor fix for lists types in the next-gen parquet reader (#19493) @mhaseeb123
Move test_search/test_scan/test_seriesmap.py to new cudf classic test directory structure (#19492) @mroeschke
Improve support for sliced input on from_arrow_host APIs (#19491) @davidwendt
Move test_avro/test_api_types.py and some DataFrame tests to new cudf classic test directory structure (#19490) @mroeschke
Move test_series.py to new cudf classic test directory structure (#19485) @mroeschke
Move test_testing.py to new cudf classic test directory structure (#19481) @mroeschke
Allow latest OS in devcontainers (#19480) @bdice
Move test_unaops/test_unique/test_transform.py to new cudf classic test directory structure (#19477) @mroeschke
Branch 25.10 merge branch 25.08 (#19475) @davidwendt
Use more pytest fixtures and clean data files cuDF classic tests subdirectories (#19474) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in test_binops/column/column_accessor/contains.py and more (#19473) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in test_csv/cuda_*/cut.py and more (#19463) @mroeschke
Improve readability when printing pylibcudf enums (#19451) @Matt711
Use more pytest fixtures and avoid GPU parameterization in cuDF classic tests (#19450) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in test_dropna/factorize.py and more (#19449) @mroeschke
Update build infra to support new branching strategy (#19445) @robertmaynard
Update libcudf-example conda package to preserve directory structure (#19440) @Avinash-Raj
Use more pytest fixtures and avoid GPU parameterization in test_groupby/index.py (#19438) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in test_indexing/joining/monotonic/multiindex.py (#19437) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in cuDF classic tests (#19436) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in test_query/rank/reduction/repr.py (#19434) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in test_replace/reshape/rolling.py (#19426) @mroeschke
Update s3 Bucket fixture creation in test_s3 (#19424) @mroeschke
Use more pytest fixtures and avoid GPU parameterization in cuDF classic tests (#19419) @mroeschke
Fix various pandas test failures in cudf.pandas (#19372) @galipremsagar
Pin Narwhals to 1.47 (#19358) @Matt711
Run cudf-polars tests with all supported polars versions (#19353) @Matt711
Update pandas-tests-diff to only display GPU/CPU usage metrics (#19210) @galipremsagar
Use GCC 14 in conda builds (#19192) @vyasr
Use KvikIO's implementation of file-backed memory mapping (#19164) @kingcrimsontianyu
Replace rmm::device_scalar with cudf::detail::device_scalar due to unnecessary synchronization (Part 3 of miss-sync) (#19119) @JigaoLuo
Implement distributed sorted for cudf_polars (#18912) @seberg