github modin-project/modin 0.13.0
Modin 0.13.0

2 years ago

This release contains significant upgrades to Modin's documentation,
support for pandas 1.4, new algebra and partitioning layer APIs, and some bugfixes.

Key Features and Updates

  • Stability and bugfixes
    • Support for subscripting Resampler (1a1edfd)
    • Fix groupby with column name for by (a04d7b7)
    • Workaround for groupby with sort=False with categorical keys (c67a7c5)
    • Align default value of REDIS_PASSWORD with Ray's DEFAULT_REDIS_PASSWORD (f79cb85)
    • Fix groupby dictionary aggregation when by and columns to aggregate overlap (d42c070)
    • Fix read_csv when a callable is provided for the skiprows parameter (7c84758)
    • Ensure address is not passed to ray.init when running Ray in local mode (02a23d4)
    • Ensure that groupby.indices returns positional indices (e9c06f2)
    • Fix setting of categorical values (0e36e22)
    • Ensure df.__getitem__ respects step attribute of slice (7e85c5d)
    • Ensure data argument is delivered to the DataFrame in experimental cloud mode (2f7da1f)
    • Fix assigning to a Series with a single item (0d9d14e)
    • Fix the default to pandas in pd.DataFrame.sparse.from_spmatrix (ab2855b)
    • Fix apply result type inference (ac17ca1)
    • Exclude "scripts" from setup package (6224aba)
    • Fix assigning a Categorical to a column (cb4e727)
    • Ensure df.to_csv propagates metadata (e.g. index) (154697b)
    • Update pyarrow requirement in environment files (b55b08d)
  • Performance enhancements
    • Optimize __getitem__ flow for .loc/.iloc (0947ee8)
    • Delay instantiation of lazy dtypes on transpose (cd8db0c)
  • Benchmarking enhancements
    • Update groupby benchmarks to be more representative (0582aa2)
  • Refactor Codebase
    • Update CODEOWNERS to reflect repository after refactor (cde6390)
    • Remove duplicate import of FactoryDispatcher in Modin experimental pandas IO (2cfabaf)
    • Update Modin to incorporate dataframe algebra (58bbcc3)
  • Pandas API implementations and improvements
    • Add support for storage_options argument to read_csv_glob (7c33afe)
    • Add support for dropna argument for groupby.indices and groupby.groups (144a613)
    • Ensure relabeling Modin Frame does not lose partition shape (3c740db)
    • Update Series.values to default to to_numpy() (67228ef)
    • Add support for modin.pandas.show_versions and python -m modin --versions (efe717f)
    • Upgrade pandas support to 1.4 (39fbc57)
  • OmniSci enhancements
    • Update groupby benchmarks to be more representative (9396f23)
    • Update documentation on Native + OmniSci (edc1608)
    • Add support for getArrowTable() (6882ec2)
    • Fix segfault during init when only OmniSci is present (8c8a6a3)
    • Optimize append with default arguments (67013f9)
    • Fix OmniSci engine enabling for IO functions (9d1a334)
  • XGBoost enhancements
  • Developer API enhancements
    • Add parameter for minimum partition size (1be66d1)
    • Improve documentation for read_csv_glob and ensure a warning is raised if no wildcard is present in filepath_or_buffer (be10ba9)
    • Expand virtual partitioning utility (8d1004f)
  • Update testing suite
  • Documentation improvements
    • Improve documentation on pandas on Ray execution (b76dc57)
    • Reformat documentation to match pandas documentation theme (cc96f5d)
    • Improve documentation on pandas on Python execution (d590de0)
    • Improve System view in architecture documentation (6d51921)
    • Improve documentation on using pandas on Dask (003f338)
    • Improve documentation on pandas on Dask execution (61bf043)
    • Add documentation on using pandas on Python (195b668)
    • Improve Modin Out of Core documentation (cf426c4)
    • Improve documentation on OmniSci on native execution (689faee)
    • Improve documentation on IO (ffa67c7)
    • Add documentation on factories and parsers (6ca66db)
    • Improve documentation for experimental pandas on Ray execution (20abddd)
    • Improve documentation for modin.core.dataframe.base and modin.core.dataframe.pandas (cf1e541)
    • Update troubleshooting documentation and add FAQs (cc95ae2)
    • Improve README introduction and installation sections (a632d1f)
    • Update copyright year (7da1dc8)
    • Update a link to pandas.read_json (0315823)
    • Improve documentation for Modin vs. Dask (34732cb)
    • Fix links to the contributing page (81a06d6)
    • Remove broken links from supported apis (c04502d)
    • Change docs copyright statement to 'Modin Developers' (ed2a7a4)
    • Rename Developer page to Development in docs (406af7c)
    • Improve "Getting Started" section (4a62bba)
    • Update Modin tutorials (76707bf)
    • Add back quickstart notebook (4dd97ab)
    • Fix links in README and update README and FAQs (5d84042)
    • Update Modin module layout in architecture docs (7fcafa7)
    • Update documentation with new algebra operators and ModinDataframe (4b70725)
    • Add usage guide to documentation (4511566)
    • Build docs with Python 3.8 (01c1876)
  • Dependencies
    • Update PyArrow to 6.0 and OmniSci to 5.10.1 (018515f)
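
A few of the behaviors listed under "Stability and bugfixes" can be illustrated with a short sketch. This is only an illustration under stated assumptions, not code taken from the release: the CSV data and variable names are made up, and the fallback to plain pandas when Modin is not installed is added here only so the snippet runs anywhere. It relies on the documented pandas semantics that Modin aims to mirror.

```python
import io

# Assumption for illustration: fall back to plain pandas if Modin is absent.
try:
    import modin.pandas as pd
except ImportError:
    import pandas as pd

# Hypothetical sample data (not from the release notes).
csv_text = "a,b\n1,x\n2,y\n3,x\n4,y\n5,x\n"

# read_csv with a callable skiprows: the callable receives each line index
# (0 is the header line) and returns True for lines that should be skipped.
df = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: i == 2)

# df.__getitem__ with a slice should respect the slice's step.
every_other = df[::2]

# groupby.indices should return positional indices, not index labels,
# so relabeling the index must not change the result.
df.index = [10, 20, 30, 40]
positions = df.groupby("b").indices

print(df["a"].tolist())
print(every_other["a"].tolist())
print({key: list(value) for key, value in positions.items()})
```

With Modin installed, the same calls go through modin.pandas; the expectation is that the results match what pandas returns.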

Contributors

@anmyachev, @prutskov, @Rubtsowa, @vnlitvinov, @dchigarev, @YarShev, @amyskov,
@mvashishtha, @dorisjlee, @devin-petersohn, @jeffreykennethli, @RehanSD,
@novichkovg, @Lozovskii-Aleksandr, @naren-ponder, @ahallermed, @fexolm,
@adityagp, @susmitpy, @ienkovich
