This release contains significant upgrades to Modin's documentation,
support for pandas 1.4, new algebra and partitioning layer APIs, and some bugfixes.
Key Features and Updates
- Stability and bugfixes
- Support for subscripting Resampler (1a1edfd)
- Fix groupby with column name for `by` (a04d7b7)
- Add workaround for groupby with `sort=False` and categorical keys (c67a7c5)
- Align default value of `REDIS_PASSWORD` with Ray's `DEFAULT_REDIS_PASSWORD` (f79cb85)
- Fix groupby dictionary aggregation when `by` and columns to aggregate overlap (d42c070)
- Fix `read_csv` when callables are provided for the `skiprows` parameter (7c84758)
- Ensure address is not passed to `ray.init` when running Ray in local mode (02a23d4)
- Ensure that `groupby.indices` returns positional indices (e9c06f2)
- Fix setting of categorical values (0e36e22)
- Ensure `df.__getitem__` respects step attribute of slice (7e85c5d)
- Ensure data argument is delivered to the DataFrame in experimental cloud mode (2f7da1f)
- Fix assigning to a Series with a single item (0d9d14e)
- Fix the default to pandas in pd.DataFrame.sparse.from_spmatrix (ab2855b)
- Fix `apply` result type inference (ac17ca1)
- Exclude "scripts" from setup package (6224aba)
- Fix assigning a Categorical to a column (cb4e727)
- Ensure `df.to_csv` propagates metadata (e.g. index) (154697b)
- Update `pyarrow` requirement in environment files (b55b08d)
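
Several of the fixes above bring Modin in line with behavior that plain pandas already has. The sketch below shows that expected behavior for the slice step, `groupby.indices`, a callable `skiprows`, and Categorical column assignment. It runs against pandas itself so that it stays self-contained. The assumption is that switching the import to `import modin.pandas as pd` yields the same results in this release. The sample data and variable names are made up for illustration.

```python
import io

import pandas as pd  # assumption: `import modin.pandas as pd` is the drop-in swap

df = pd.DataFrame(
    {"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]},
    index=[10, 20, 30, 40],
)

# Slicing through __getitem__ honours the slice step: every second row.
every_other = df[::2]

# groupby.indices maps each key to positional row numbers,
# while groupby.groups maps each key to index labels.
positions = df.groupby("key").indices
labels = df.groupby("key").groups

# A callable skiprows receives the 0-based line number and returns True to skip it.
csv_text = "a,b\n1,2\n3,4\n5,6\n7,8\n"
kept = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: i in (1, 3))

# Assigning a Categorical to a column keeps the categorical dtype.
df["cat"] = pd.Categorical(["x", "y", "x", "y"])
```

Here `positions["a"]` holds the positions 0 and 2 even though the index labels are 10 and 30, which is the distinction the `groupby.indices` fix addresses.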
- Performance enhancements
- Benchmarking enhancements
- Update groupby benchmarks to be more representative (0582aa2)
- Refactor Codebase
- Pandas API implementations and improvements
- Add support for `storage_options` argument to `read_csv_glob` (7c33afe)
- Add support for `dropna` argument for `groupby.indices` and `groupby.groups` (144a613)
- Ensure relabeling Modin Frame does not lose partition shape (3c740db)
- Update `Series.values` to default to `to_numpy()` (67228ef)
- Add support for `modin.pandas.show_versions` and `python -m modin --versions` (efe717f)
- Upgrade pandas support to 1.4 (39fbc57)
- Add support for
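
As a small illustration of the `Series.values` change, the sketch below checks that attribute access and the explicit `to_numpy()` call produce the same NumPy array for a numeric Series. It uses plain pandas to stay self-contained. The assumption is that `import modin.pandas as pd` behaves the same way here. Extension dtypes such as categoricals may differ and are not covered.

```python
import numpy as np
import pandas as pd  # assumption: swap in `import modin.pandas as pd` when Modin is installed

s = pd.Series([1.5, 2.5, 3.5], name="x")

values = s.values      # attribute access
array = s.to_numpy()   # explicit conversion

# For a plain float Series both paths yield an equal NumPy array.
same = np.array_equal(values, array)
```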
- OmniSci enhancements
- Update groupby benchmarks to be more representative (9396f23)
- Update documentation on Native + OmniSci (edc1608)
- Add support for `getArrowTable()` (6882ec2)
- Fix segfault during `init` when only OmniSci is present (8c8a6a3)
- Optimize `append` with default arguments (67013f9)
- Fix OmniSci engine enabling for IO functions (9d1a334)
- XGBoost enhancements
- Developer API enhancements
- Update testing suite
- Documentation improvements
- Improve documentation on pandas on Ray execution (b76dc57)
- Reformat documentation to match pandas documentation theme (cc96f5d)
- Improve documentation on pandas on Python execution (d590de0)
- Improve System view in architecture documentation (6d51921)
- Improve documentation on using pandas on Dask (003f338)
- Improve documentation on pandas on Dask execution (61bf043)
- Add documentation on using pandas on Python (195b668)
- Improve Modin Out of Core documentation (cf426c4)
- Improve documentation on OmniSci on native execution (689faee)
- Improve documentation on IO (ffa67c7)
- Add documentation on factories and parsers (6ca66db)
- Improve documentation for experimental pandas on Ray execution (20abddd)
- Improve documentation for `modin.core.dataframe.base` and `modin.core.dataframe.pandas` (cf1e541)
- Update troubleshooting documentation and add FAQs (cc95ae2)
- Improve README introduction and installation sections (a632d1f)
- Update copyright year (7da1dc8)
- Update a link to `pandas.read_json` (0315823)
- Improve documentation for Modin vs. Dask (34732cb)
- Fix links to the contributing page (81a06d6)
- Remove broken links from supported APIs (c04502d)
- Change docs copyright statement to 'Modin Developers' (ed2a7a4)
- Rename Developer page to Development in docs (406af7c)
- Improve "Getting Started" section (4a62bba)
- Update Modin tutorials (76707bf)
- Add back quickstart notebook (4dd97ab)
- Fix links in README and update README and FAQs (5d84042)
- Update Modin module layout in architecture docs (7fcafa7)
- Update documentation with new algebra operators and `ModinDataframe` (4b70725)
- Add usage guide to documentation (4511566)
- Build docs with Python 3.8 (01c1876)
- Dependencies
- Update PyArrow to 6.0 and OmniSci to 5.10.1 (018515f)
Contributors
@anmyachev, @prutskov, @Rubtsowa, @vnlitvinov, @dchigarev, @YarShev, @amyskov,
@mvashishtha, @dorisjlee, @devin-petersohn, @jeffreykennethli, @RehanSD,
@novichkovg, @Lozovskii-Aleksandr, @naren-ponder, @ahallermed, @fexolm,
@adityagp, @susmitpy, @ienkovich