🚨 Breaking Changes
- Remove UCX-Py (#19979) @pentschev
- Revert "Migrate mixed join to use multiset #19660" (#19933) @PointKernel
- Fill missing values in `Series/Index.values` for numeric types with np.nan by default (#19923) @mroeschke
- Remove deprecated `DataFrame.apply_rows`, deprecate `DataFrame.apply_chunks` and `Groupby.apply_grouped` (#19896) @mroeschke
- Move prefetching out of experimental and simplify the API (#19875) @vyasr
- Add join `*_match_context` APIs to hash join (#19835) @PointKernel
- Vendor libnvcomp in libcudf (#19743) @bdice
- Migrate mixed join to use multiset (#19660) @PointKernel
- Separate row mask and page mask computation and usage (#19537) @mhaseeb123
- [FEA] Implement null-aware transforms and filters (#19502) @lamarrr
- Support output-type for MEDIAN/QUANTILE aggregation in cudf::reduce (#19267) @davidwendt
🐛 Bug Fixes
- Fix edge cases in statistics collection (#20094) @rjzamora
- Fix multi-partition `Filter` bug (#20075) @rjzamora
- Fix `reindex` to fill only the reindexed values with `fill_value` (#20063) @galipremsagar
- Fix arrow arrays + numpy ufunc interaction (#20047) @galipremsagar
- Fix race conditions in ORC reader decimal decoding (#20044) @vuule
- Keep mr alive along with arrow tables and columns (#20028) @vyasr
- Fix `value_counts` missing `nan` bug (#20026) @galipremsagar
- Compatibility for rapidsmpf's unspill_partitions (#20020) @TomAugspurger
- Fix type metadata preservation in `shift` (#20017) @galipremsagar
- Fix incorrect type propagation in dataframe assignment (#20010) @galipremsagar
- Fix OOB memory read in decode_page_data_generic kernel (#19995) @davidwendt
- Fix data_type creation in ast::operation::instantiate (#19994) @davidwendt
- Skip Narwhals pandas get_dtype_backend[pyarrow] tests after ArrowDtype proxy changes (#19992) @Matt711
- Make cudf.pandas callables usable with inspect.getfullargspec (#19988) @mroeschke
- Align decimal dtypes to schema after parquet IO scan (#19974) @Matt711
- Avoid undefined numpy protocols on cudf.pandas proxy objects (#19968) @mroeschke
- Skip failing polars iceberg test (#19955) @Matt711
- Revert "Migrate mixed join to use multiset #19660" (#19933) @PointKernel
- Define FrozenList proxy independently in cudf.pandas (#19931) @mroeschke
- Ignore scalars when broadcasting for horizontal string concatenation in cudf-polars (#19893) @Matt711
- Fix is_valid_rolling_aggregation for STD aggregation (#19888) @davidwendt
- Fix a decompression parameter in the chunked ORC reader (#19882) @vuule
- Skip flaky stats tests pending follow up (#19881) @brandon-b-miller
- Require list type for is_valid_aggregation and MERGE_LISTS/SETS (#19876) @davidwendt
- Temporary solution to ensure data-source/sink stream ordering (#19874) @kingcrimsontianyu
- Check for integer overflow in cudf::strings::find_multiple (#19867) @davidwendt
- Fix missing stream from cudf::top_k_order (#19866) @davidwendt
- Disallow loc.setitem with list-like indexer when list elements not in index (#19851) @mroeschke
- Fix .str.replace ignoring n for single character replacements (#19848) @mroeschke
- Fix strings::find_instance warp parallel logic (#19845) @davidwendt
- Add changed-files to the needs of every job that requires it (#19830) @Matt711
- xfail polars `decimal(precision=None)` test (#19821) @Matt711
- Fix empty column returned by cudf::from_arrow_stream_column (#19812) @davidwendt
- Filter pandas warning in dask_cudf test (#19808) @TomAugspurger
- Update identify_stream_usage CUDA runtime hooks to CUDA 13 (#19807) @robertmaynard
- When bundling `libnvcomp.so.X` only append the major version value (#19786) @robertmaynard
- Improvements to `pylibcudf.from_iterable_of_py` (#19781) @Matt711
- Avoid using multiple `Cache` nodes with the same hash (#19769) @rjzamora
- Fix window var() test failures from float rounding (#19761) @Matt711
- Use `is_compressed` field from Parquet V2 data page headers to determine if they are compressed (#19755) @mhaseeb123
- Fix bug in `eval` function with `nvtx-0.2.11` (#19754) @galipremsagar
- Fix ndsh benchmarks nvtx range usage (#19753) @davidwendt
- Support `nan` in non-floating point column in cudf-polars (#19742) @Matt711
- Fix filter call in benchmark (#19732) @vyasr
- Suppress NVRTC warning from stdint.h (#19712) @davidwendt
- Correctly decode boolean lists in chunked parquet reader (#19707) @mhaseeb123
- Add new xfails for xarray release (#19705) @vyasr
- Fix "--executor" pytest parameter for cudf-polars (#19703) @rjzamora
- Match polars semantics for rolling-sum with all-null windows (non-empty) (#19680) @Matt711
- [BUG] Set `query_set` arg when validating/running cudf-polars PDS-DS benchmarks (#19674) @Matt711
- Fix `group_by().agg()` on non-aggregatable dtypes (#19669) @Matt711
- Fix broken links in 10min notebook (#19665) @Matt711
- Skip managed memory test if managed memory not supported in cudf-polars (#19653) @Matt711
- Fix integer overflow in warp-per-row grid calculation (#19638) @davidwendt
- Propagate exceptions thrown in async IO operations (#19628) @vuule
- Make `DataFrame.dtypes` not fallback to CPU always (#19627) @galipremsagar
- Set scalar to valid in range_window_bounds unbounded/current_row (#19622) @davidwendt
- Enable data page mask computation for nullable `list` and `struct` columns (#19617) @mhaseeb123
- Fix cudf::sequence() to throw exception for invalid scalar inputs (#19612) @davidwendt
- Fix uninitialized variable and misaligned write in parquet generic decoder (#19601) @mhaseeb123
- Compatibility with rapidsmpf 25.10.0 (#19591) @TomAugspurger
- Avoid querying device memory on systems without it in dask-cudf (#19577) @Matt711
- Avoid querying device memory on systems without it in cudf-polars benchmarks (#19575) @Matt711
- Increase alignment requirement for parquet bloom filter to 256 (#19573) @mhaseeb123
- Fix strftime with non-exact %a, %A, %b, %B (#19570) @mroeschke
- Fix OOB memcheck error in group_rank_to_percentage utility (#19567) @davidwendt
- Fix logic for number of unique values generated by data profile in benchmarks (#19540) @shrshi
- Fix contiguous-split nvbench cmake build (#19534) @davidwendt
- Fix value counts expression when the column has nulls (#19524) @Matt711
- Prefer `Column.astype` over `plc.unary.cast` in the fill null unary function expression (#19479) @Matt711
- Fix missing return in StringFunction.Strptime strict=True path (#19464) @Matt711
- Make dividing a boolean column return f64 dtype in cudf-polars (#19443) @Matt711
- branch-25.10-merge-branch-25.08 (#19429) @davidwendt
- Replace sprintf with std::format in libcudf parquet tests (#19364) @davidwendt
📖 Documentation
- Update missing docs (#19925) @vyasr
- Add examples of null handling to doxygen for cudf::rank (#19774) @davidwendt
- Fix cudf-polars dependency list docs (#19750) @pentschev
- Update cuDF classic testing documentation regarding testing organization (#19745) @mroeschke
- Improve documentation around why we need no_gc_clear on pylibcudf Scalars (#19661) @vyasr
🚀 New Features
- Add memory resource parameters to interop, merge, and transpose (#20007) @vyasr
- Add mixed join benchmark with complex AST operators (#20004) @PointKernel
- Add memory resource arguments to join, round, and labeling (#20001) @vyasr
- `cudf-polars` `strptime` format inference (#19997) @brandon-b-miller
- Filter parquet row groups using byte offset bounds (#19991) @mhaseeb123
- Add memory resource arguments to concatenate (#19943) @vyasr
- Use column statistics to generate the physical plan in cuDF-Polars (#19940) @rjzamora
- Add all missing stream parameters (#19922) @vyasr
- Remote IO support in cudf-polars (#19921) @Matt711
- Add streams to io/timezone and io/text modules (#19913) @vyasr
- Add stream support to all nvtext modules (#19911) @vyasr
- Add streams to all top-level strings modules (#19910) @vyasr
- Update strings split APIs with stream parameters (#19909) @vyasr
- Support ordered grouped windows in cudf-polars (#19891) @Matt711
- Add local row-count and unique-count estimates to `explain(... logical=True)` (#19864) @rjzamora
- Add join `*_match_context` APIs to hash join (#19835) @PointKernel
- Support `rank(...).over(...)` expressions in cudf-polars (#19803) @Matt711
- Add strings to/from encoded integer APIs (#19789) @davidwendt
- Add to_arrow method to pylibcudf core types (#19787) @Matt711
- Add streams to strings convert APIs (#19780) @vyasr
- Add an option to support reading ORC timestamp column as UTC time. (#19773) @res-life
- Support null_count in groupby/rolling context (#19739) @Matt711
- Collect join-key information in cudf-polars (#19736) @rjzamora
- Add count aggregation support to cudf::reduce (#19734) @davidwendt
- [FEA] Implement AST Expression - JIT codegen (#19733) @lamarrr
- Add streams to all scalar factories (#19729) @vyasr
- Add streams to reshape (#19728) @vyasr
- Add streams to null mask APIs (#19727) @vyasr
- Add streams to column APIs (#19726) @vyasr
- Construct next-gen parquet reader with pre-populated footer (#19724) @mhaseeb123
- Require `numba-cuda>=0.19.0,<0.20.0a0` (#19711) @brandon-b-miller
- Support `over` expression (window mapping) in cudf-polars (#19684) @Matt711
- Add streams support to all list APIs (#19683) @vyasr
- [FEA] Add Filter Benchmark (#19678) @lamarrr
- Add streams to pylibcudf join APIs (#19672) @vyasr
- Add streams to sorting APIs (#19671) @vyasr
- [FEA] Remove excessive copies of JITIFY's ProgramData during JIT kernel launch (#19667) @lamarrr
- Add streams to hashing APIs (#19663) @vyasr
- Use a more robust metric for sorting (de)compression tasks (#19656) @vuule
- Add streams support to datetime APIs (#19654) @vyasr
- Add streams to stream_compaction (#19651) @vyasr
- Enable casting `pl.Datetime` to integer types in `cudf-polars` (#19647) @brandon-b-miller
- Add Java JNI interface to get Gpu UUID (#19646) @res-life
- Add reduction with overflow detection (#19641) @PointKernel
- Upgrade to nvCOMP 5.0.0.6 (#19636) @vuule
- Use the nvCOMP 5.0 API to better estimate decompression memory requirements (#19616) @vuule
- Add streams to transform and unary (#19613) @vyasr
- Add streams to all modules with 4-5 functions (#19609) @vyasr
- Enable casting integer dtypes to `pl.Datetime` via `cudf-polars` (#19607) @brandon-b-miller
- Add fast path for Parquet reading with predicate pushdown via AST filters (#19605) @Matt711
- Add streams to all modules with three or fewer functions (#19600) @vyasr
- Add libcudf top_k_segmented APIs (#19597) @davidwendt
- Update Arrow bounds to >=15,<22 (#19592) @bdice
- Update cudf to handle CUDA 13 changes (#19585) @robertmaynard
- Support hash-based workflow for `M2` groupby aggregation (#19569) @ttnghia
- Expose `filter` and `columns` parquet reader builder options to python (#19566) @Matt711
- [FEA] Switch to NVIDIA's JITIFY2 (#19561) @lamarrr
- Add streams to all single-function modules (#19559) @vyasr
- Add support for streams to all copying APIs. (#19553) @vyasr
- Benchmarks comparing Arrow string formats (#19552) @davidwendt
- Compile `libcudf_kafka` and `cudf_kafka` with C++20 (#19543) @vuule
- RapidsMPF "single" shuffle integration (#19530) @rjzamora
- Make nvCOMP ZLIB (de)compression available by default (#19528) @vuule
- Implement chunking in the next-gen parquet reader (#19526) @mhaseeb123
- Add primitive row dispatch support for semi/anti join and cudf::contains (#19518) @PointKernel
- Derive and use page mask at subpass level for chunked reads (#19515) @mhaseeb123
- [FEA] Implement null-aware transforms and filters (#19502) @lamarrr
- Add PDS-DS queries 2 through 10 to cudf-polars benchmarks (#19488) @Matt711
- Add API to "initialize" column statistics (#19447) @rjzamora
- Implement top k expression in cudf-polars using `cudf::top_k` (#19431) @Matt711
- Add hash-based SUM_WITH_OVERFLOW aggregation for INT64 values (#19403) @PointKernel
- Support rank expression in cudf-polars (#19340) @Matt711
- Support fill_null with fill strategy in cudf-polars (#19318) @Matt711
- Support output-type for MEDIAN/QUANTILE aggregation in cudf::reduce (#19267) @davidwendt
- Support ternary expression inside groupby/rolling context (#19242) @Matt711
- Experimental API to read a parquet table, build a custom index column, and apply roaring bitmap deletion vector (#19237) @mhaseeb123
- Support `cudf-polars` `str.zfill` (#19081) @brandon-b-miller
- [FEA] Add chunked Parquet sink support using the libcudf writer (#19015) @Matt711
- Add multi-column support for primitive row operator dispatch (#18940) @tgujar
🛠️ Improvements
- Fix CI failures for `pandas-2.3.3` (#20146) @galipremsagar
- Skip passing failures for latest `numexpr` version (#20092) @galipremsagar
- Empty commit to trigger a build (#20084) @msarahan
- Update the reason to skip for parquet bloom filter test (#20043) @mhaseeb123
- Remove test_scan_hf_url_raises (#20035) @mroeschke
- xfail(strict=False) test_scan_hf_url_raises due to rate limiting (#20027) @mroeschke
- Deprecate left semi- and anti- join functional APIs (#20014) @shrshi
- Use to_arrow methods throughout pylibcudf and cudf (#20013) @Matt711
- Fix chunked reads of list of bools. (#20000) @pmattione-nvidia
- Raise more exceptions for invalid or unsupported cuDF arguments (#19990) @mroeschke
- Configure repo for automatic release notes generation (#19984) @AyodeAwe
- Pin duckdb<1.4 in test_python_narwhals (#19982) @mroeschke
- Default to False if `CUDA_ENABLE_NRT` isn't set in config (#19981) @brandon-b-miller
- Remove UCX-Py (#19979) @pentschev
- Add support for `attrs` (#19978) @galipremsagar
- Run pytest-benchmarks in CI with --benchmark-disable (#19969) @mroeschke
- Change target type so we can test on workflows (#19963) @vyasr
- Update to actions/labeler v5 (#19962) @vyasr
- Revert "ci(labeler): update labeler action to @v5" (#19961) @vyasr
- Add `ArrowDtype` proxy class (#19960) @galipremsagar
- Add missing type stub (#19958) @vyasr
- Add missing `Styler` attributes (#19956) @galipremsagar
- Allow newer CMake in Java tests (#19949) @bdice
- Make stream a required parameter for from_libcudf methods (#19945) @vyasr
- Return False instead of NA for comparison ops against NA in cudf.pandas (#19942) @mroeschke
- Don't fall back in Series.describe in cudf.pandas for numeric types (#19941) @mroeschke
- Move groupby benchmarks to nvbench (#19930) @davidwendt
- Perform more input validation in cuDF classic APIs (#19929) @mroeschke
- update nvidia-ml-py (>=12), use cuda-toolkit wheels (#19927) @jameslamb
- Fill missing values in `Series/Index.values` for numeric types with np.nan by default (#19923) @mroeschke
- Add `rmm-release-threshold` to pdsh benchmarks CLI (#19918) @TomAugspurger
- Also use the CUDA 12 container for nightlies (#19917) @vyasr
- Move test_binops.py to new cuDF classic directory structure (#19914) @mroeschke
- Eagerly load nvCOMP library in `cudf::initialize()` (#19906) @vuule
- Pin to CUDA 12 image for integration tests (#19903) @vyasr
- Use branch-25.10 again (#19902) @jameslamb
- Disable test on non-default stream (#19901) @vyasr
- Use cupy array instead of numba device array as inputs to jit routines (#19897) @mroeschke
- Remove deprecated `DataFrame.apply_rows`, deprecate `DataFrame.apply_chunks` and `Groupby.apply_grouped` (#19896) @mroeschke
- Move test_dataframe.py to new cuDF classic directory structure (#19890) @mroeschke
- Make sure conftest fixture data is valid on exit (#19889) @vyasr
- Move test_index/multiindex/indexing.py to new cuDF classic directory structure (#19887) @mroeschke
- [FEA] Build CUDF with CCCL 3.1.0 (#19886) @lamarrr
- Coalesce IO of chunks with different compression when reading Parquet files (#19884) @vuule
- Update boost version to 1.79 for JNI dockerfile (#19883) @pxLi
- Move test_categorical/dask/serialize.py to new cuDF classic test directory structure (#19877) @mroeschke
- Move prefetching out of experimental and simplify the API (#19875) @vyasr
- Remove `diff.sh` and merge diff generation into `run.sh` (#19871) @galipremsagar
- Remove pyarrow upper bound (#19870) @vyasr
- Prevent installation of pytest-rerunfailures 16.0.0 (#19863) @pentschev
- use 'nvidia-ml-py' package for 'pynvml' module (#19862) @jameslamb
- Avoid more direct construction of cuDF classic columns (#19858) @mroeschke
- Bump pandas supported version to `2.3.2` (#19856) @galipremsagar
- Use cupy arrays instead of numba device arrays for cuDF classic intermediates (#19855) @mroeschke
- Move row operators to detail and deprecate legacy (#19849) @PointKernel
- Fix flaky DataFrame `to_string` test (#19847) @brandon-b-miller
- Pin pytest-rerunfailures<16 (#19846) @mroeschke
- revert numba CUDA 13 workaround (#19842) @jameslamb
- Avoid CategoricalColumn constructors in cuDF classic (#19837) @mroeschke
- Construct cuDF classic Decimal32/64Columns from RMM buffers (#19834) @mroeschke
- Avoid direct construction of cuDF classic columns (#19829) @mroeschke
- Support input filename in ndsh q01 benchmark (#19820) @davidwendt
- Run cudf-polars-polars-tests on changes in test_python file group (#19819) @mroeschke
- Remove test_mvc.py (#19816) @mroeschke
- pin oldest numpy in dask-cudf tests, update dependency floors (cuda-python 12.9.2, cupy 13.6.0, numba 0.60.0) (#19806) @jameslamb
- Remove iterative `nan` & `nat` inefficient checks in `as_column` constructor (#19804) @galipremsagar
- Simplify/consolidate from_arrow logic (#19801) @mroeschke
- Refactor column_empty to use only pylibcudf APIs (#19800) @mroeschke
- Use more cached_property where possible for Index and subclasses (#19799) @mroeschke
- Update rapids-dependency-file-generator (#19796) @KyleFromNVIDIA
- rearrange dependencies.yaml, other small changes (#19794) @jameslamb
- Update exception handling in pdsh benchmarks (#19793) @TomAugspurger
- Fix how nvcomp major version is extracted (#19791) @KyleFromNVIDIA
- Use KvikIO's unified interface to create remote I/O endpoints (#19788) @kingcrimsontianyu
- Add object-oriented APIs for left semi- and anti- join (Part I) (#19778) @shrshi
- Add nvbench benchmark for cudf::encode API (#19777) @davidwendt
- Some clarifications, improvements to GroupedRollingWindows in cudf-polars (#19776) @Matt711
- Remove validation on import (#19775) @vyasr
- Move more test_dataframe.py tests to new cudf classic testing directory (#19770) @mroeschke
- Build and test with CUDA 13.0.0 (#19768) @jameslamb
- Skip polars CPU perf test for with_columns (#19763) @Matt711
- Optionally capture Shuffle Stats in cudf-polars pdsh benchmarks (#19762) @TomAugspurger
- Expand compression codec coverage in ORC and Parquet benchmarks (#19760) @vuule
- Add `ColumnSourceInfo` convenience layer (#19752) @rjzamora
- Support decimal columns in cudf_polars (#19749) @mroeschke
- Skip third-party tests when possible (#19747) @vyasr
- Revert "Support decimal columns in cudf_polars" (#19746) @mroeschke
- Vendor libnvcomp in libcudf (#19743) @bdice
- Remove outdated numba workarounds (#19738) @bdice
- Move test_buffer/column/column_accesor/cuda_apply.py to new cudf classic testing directory (#19737) @mroeschke
- Move more test_dataframe.py tests to new cudf classic testing directory (#19731) @mroeschke
- Move test_udf_masked_ops/test_dropna to new cudf classic testing directory (#19730) @mroeschke
- Move test_numerical/{numpy|pandas}_interop/setitem.py to new cudf classic testing directory (#19725) @mroeschke
- Move test_timedelta/string/sorting/list/datetime.py to new cudf classic directory structure (#19723) @mroeschke
- Warn on fallback in the streaming tests in cudf-polars (#19721) @Matt711
- Optionally print shuffle stats in pdsh benchmarks (#19719) @TomAugspurger
- Move test_{io}.py files to new cudf classic test directory (#19709) @mroeschke
- Move to pyarrow and numpy to run_constrained (#19706) @vyasr
- Remove unreachable code in rapidsmpf shuffle (#19704) @TomAugspurger
- Moves test_options to cudf testing directory, clean up old, stubbed testing files in directory (#19698) @mroeschke
- Move (most of) test_index.py to new cudf classic directory structure (#19696) @mroeschke
- Improve `M2`, `VARIANCE` and `STD` hash-based groupby aggregations (#19694) @ttnghia
- Move quantiles libcudf benchmark to nvbench (#19692) @davidwendt
- Handle `TIMESTAMP_DAYS` in rolling window offsets (#19689) @Matt711
- Move test_groupby to new cudf classic directory structure (#19688) @mroeschke
- Move some of test_dataframe.py to new cudf classic directory structure (#19687) @mroeschke
- Change nvtext::character_tokenize to return a list column (#19685) @davidwendt
- Split up rolling.cuh into separate headers (#19682) @davidwendt
- Move test_factorize/drop_duplicates.py to new cudf classic test directory (#19681) @mroeschke
- Move test_offset/repr.py to new cudf classic testing directory (#19677) @mroeschke
- Move test_stats/reductions/quantile and misc to new cudf classic testing directory (#19675) @mroeschke
- Cache hash values to improve hash-based groupby performance with wide/complex table keys (#19670) @ttnghia
- Move test_interval/test_dtypes/test_rank.py to new cudf directory structure (#19668) @mroeschke
- Clean and move test_join_order/interpolate/onehot.py to new cudf classic test directory structure (#19662) @mroeschke
- Migrate mixed join to use multiset (#19660) @PointKernel
- Run pylibcudf tests without its optional dependencies (#19657) @vyasr
- Use build cluster in devcontainers (#19652) @trxcllnt
- Use rapids_cuda_enable_fatbin_compression (#19650) @robertmaynard
- Re-enable Disabled Join Tests (#19649) @PointKernel
- Use public Arrow functions for TDigest in PercentileApproxInputTypesTests (#19648) @davidwendt
- Use cudaDeviceGetAttribute to get ComputeMode for CUDA13 (#19645) @GaryShen2008
- remove initial memset of values in parquet reader (#19643) @pmattione-nvidia
- Move ~half of test_groupby.py to new cudf classic test directory structure (#19640) @mroeschke
- Move test_csv/feather/json.py to new cudf classic test directory structure (#19639) @mroeschke
- Move test_array_function/ufunc to new cudf classic test directory structure (#19637) @mroeschke
- Fix anchor naming conventions in dependencies.yaml (#19635) @KyleFromNVIDIA
- Require `--scale` for PDS-DS benchmarks (due to nonlinear scaling) (#19631) @Matt711
- Move test_replace.py to new cudf classic directory structure (#19629) @mroeschke
- Move test_concat/test_reductions.py to new cudf classic directory structure (#19626) @mroeschke
- Update rapids_config to handle user defined branch name (#19623) @robertmaynard
- Add nvtx ranges to public APIs of the experimental parquet reader (#19618) @mhaseeb123
- Move test_resampling/query/pickling to new cudf classic directory structure (#19615) @mroeschke
- Move test_reshape.py to new cudf classic directory structure, remove reshape._merge_sorted (#19614) @mroeschke
- Move test_rolling/ewm.py to new cudf classic directory structure (#19611) @mroeschke
- Simplify cudf::scalar usage in reduce utility (#19608) @davidwendt
- Update to numba-cuda>=0.18.0,<0.19.0 (#19604) @bdice
- Update spark-rapdis-jni action to use PR's base.ref and fix issue of ccache version in dockerfile (#19603) @pxLi
- Multithreaded CPU algorithm for data page mask computation (#19602) @mhaseeb123
- Move test_cuda_array_interface/cut/dataframe_copy.py to new cudf classic test directories (#19599) @mroeschke
- Support decimal columns in cudf_polars (#19589) @mroeschke
- Preserve decimal precision in `cudf::interop::column_metadata` (#19587) @mroeschke
- Always use strict zipping (#19584) @vyasr
- Pin polars version to <1.33 (#19582) @Matt711
- ci(labeler): update labeler action to @v5 (#19581) @gforsyth
- Update rapids-build-backend to 0.4.0 (#19580) @KyleFromNVIDIA
- Move (most of) test_list.py to new cudf classic test directories (#19574) @mroeschke
- Move test_monotonic.py to new cudf classic test directory structure (#19572) @mroeschke
- Additional gtests error checks for string/timestamp convert libcudf APIs (#19562) @davidwendt
- Avoid cudf.pandas fallback for `pandas.array.NumpyExtensionArray` of strings (#19558) @mroeschke
- Move str accessor tests in test_string.py to new cudf classic test directory structure (#19557) @mroeschke
- Rework fill/repeat benchmark to use nvbench (#19556) @davidwendt
- Use no_validity() instead of null_probability(0) in benchmarks profile (#19554) @davidwendt
- Move (most of) test_timedelta.py and test_struct.py to new cudf classic test directory structure (#19551) @mroeschke
- Capture commit hashes in pdsh benchmarks (#19548) @TomAugspurger
- Simplify clang dependency spec (#19546) @vyasr
- Move timeout in cudf.pandas pandas unit tests script to ci script (#19542) @mroeschke
- [FEA] Refactor AST `operator_functor`s for use in JIT-compiled CUDA (#19541) @lamarrr
- Construct cuDF classic columns with array_interface through pylibcudf (#19538) @mroeschke
- Separate row mask and page mask computation and usage (#19537) @mhaseeb123
- Get rid of CG logic in the mixed semi-join kernel (#19536) @PointKernel
- Construct more cuDF classic Columns with pylibcudf instead of using Buffers (#19535) @mroeschke
- Fix clang-tools version pinning (#19529) @wence-
- Add cudf_polars unit test for `is_in([])` expr (#19525) @mroeschke
- Expose `nvtext::letter_type` to python (#19520) @Matt711
- Remove c++ stringview interop example (#19516) @davidwendt
- Remove cudf/_fuzz_testing directory (#19510) @mroeschke
- Add missing import of pyarrow.parquet when reading specified row_groups. (#19509) @bdice
- Don't run serial cudf_pandas tests when testing multiple pandas versions (#19507) @mroeschke
- Clean testing/_utils.py (#19506) @mroeschke
- Move some test_datetime.py tests to new cudf classic test directory structure (#19505) @mroeschke
- Move test_joining to new cudf classic test directory structure (#19501) @mroeschke
- Upgrade `gcc-toolset` for Java/JNI build to version 14 (#19500) @ttnghia
- Remove deprecated subword-tokenizer APIs (#19498) @davidwendt
- Move some test_multiindex.py to new cudf classic test directory structure (#19496) @mroeschke
- Add nvtx ranges and minor fix for `lists` types in the next-gen parquet reader (#19493) @mhaseeb123
- Move test_search/test_scan/test_seriesmap.py to new cudf classic test directory structure (#19492) @mroeschke
- Improve support for sliced input on from_arrow_host APIs (#19491) @davidwendt
- Move test_avro/test_api_types.py and some DataFrame tests to new cudf classic test directory structure (#19490) @mroeschke
- Move test_series.py to new cudf classic test directory structure (#19485) @mroeschke
- Move test_testing.py to new cudf classic test directory structure (#19481) @mroeschke
- Allow latest OS in devcontainers (#19480) @bdice
- Move test_unaops/test_unique/test_transform.py to new cudf classic test directory structure (#19477) @mroeschke
- Branch 25.10 merge branch 25.08 (#19475) @davidwendt
- Use more pytest fixtures and clean data files cuDF classic tests subdirectories (#19474) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in test_binops/column/column_accessor/contains.py and more (#19473) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in test_csv/cuda_*/cut.py and more (#19463) @mroeschke
- Improve readability when printing pylibcudf enums (#19451) @Matt711
- Use more pytest fixtures and avoid GPU parameterization in cuDF classic tests (#19450) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in test_dropna/factorize.py and more (#19449) @mroeschke
- Update build infra to support new branching strategy (#19445) @robertmaynard
- Updated libcudf-example conda package to preserve directories structure (#19440) @Avinash-Raj
- Use more pytest fixtures and avoid GPU parameterization in test_groupby/index.py (#19438) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in test_indexing/joining/monotonic/multiindex.py (#19437) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in cuDF classic tests (#19436) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in test_query/rank/reduction/repr.py (#19434) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in test_replace/reshape/rolling.py (#19426) @mroeschke
- Update s3 Bucket fixture creation in test_s3 (#19424) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in cuDF classic tests (#19419) @mroeschke
- Fix various pandas test failures in `cudf.pandas` (#19372) @galipremsagar
- Pin Narwhals to 1.47 (#19358) @Matt711
- Run cudf-polars tests with all supported polars versions (#19353) @Matt711
- Update `pandas-tests-diff` to only display GPU/CPU usage metrics (#19210) @galipremsagar
- Use GCC 14 in conda builds. (#19192) @vyasr
- Use KvikIO's implementation of file-backed memory mapping (#19164) @kingcrimsontianyu
- Replace `rmm::device_scalar` with `cudf::detail::device_scalar` due to unnecessary synchronization (Part 3 of miss-sync) (#19119) @JigaoLuo
- Implement distributed sorted for `cudf_polars` (#18912) @seberg