Changes
💡 New Features
- allow inclusion in C programs @drewmiller (#4608)
- add param aliases from scikit-learn @StrikerRUS (#4637)
- [python] add placeholders to titles in plotting functions @StrikerRUS (#4614)
- [python-package] Support 2d collections as input for init_score in multiclass classification task @jmoralez (#4150)
- [python] add parameter object_hook to method dump_model @xadupre (#4533)
- [python] support Dataset.get_data for Sequence input. @cyfdecyf (#4472)
- [python] allow to pass some params as pathlib.Path objects @StrikerRUS (#4440)
- [python-package] Create Dataset from multiple data files @cyfdecyf (#4089)
- [dask] add support for eval sets and custom eval functions @ffineis (#4101)
- Add linear leaf models to json output (fixes #4186) @btrotta (#4329)
- [dask] run Dask tests on aarch64 architecture @StrikerRUS (#3996)
- [python] handle arbitrary length feature names in Python-package @StrikerRUS (#4293)
- Precise text file parsing @cyfdecyf (#4081)
- added aliases to params @StrikerRUS (#4205)
- [swig] add wrapper for LGBM_DatasetGetFeatureNames @shuttie (#4103)
🔨 Breaking
- [python] deprecate "auto" value of ylabel argument of plot_metric() function @StrikerRUS (#4624)
- [python] rename print_evaluation() into log_evaluation() @StrikerRUS (#4604)
- [RFC][python] deprecate advanced args of train() and cv() functions and sklearn wrapper @StrikerRUS (#4574)
- [RFC][python] deprecate silent and standalone verbose args. Prefer global verbose param @StrikerRUS (#4577)
- [python] add 'auto' value for importance_type param in plotting @StrikerRUS (#4570)
- [dask] Make output of feature contribution predictions for sparse matrices match those from sklearn estimators (fixes #3881) @jameslamb (#4378)
- [R-package] change default nrounds to 100 to match LightGBM core library default @david-cortes (#4197)
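
For readers unfamiliar with how a rename such as print_evaluation() to log_evaluation() is usually staged, the sketch below shows one common pattern: the old name stays as a thin wrapper that warns and forwards to the new one. Both functions here are hypothetical stand-ins that only return a string; this is an illustration of the general pattern, not LightGBM's actual code, and the warning text is an assumption.

```python
import warnings


def log_evaluation(period=1):
    """Hypothetical stand-in for the new name; just describes the setting."""
    return f"log every {period} iteration(s)"


def print_evaluation(period=1):
    """Hypothetical stand-in for the old name: warn, then forward the call."""
    warnings.warn(
        "print_evaluation() is deprecated; use log_evaluation() instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return log_evaluation(period)
```

Callers using the old name keep working but see a DeprecationWarning pointing at their own call site, which is what makes the migration discoverable.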
🚀 Efficiency Improvement
- simplify and speed up comparisons for splits with identical gains @jameslamb (#4542)
- factor out .size() checks in GetDataType() @jameslamb (#4541)
- consolidate duplicate conditions in TextReader @jameslamb (#4530)
- [python] replace numpy.zeros with numpy.empty for the speedup @StrikerRUS (#4410)
- [R-package] avoid unnecessary computation of std deviations in lgb.cv() @jameslamb (#4360)
- Replace division of exponential in Gamma loss @lorentzenchr (#4289)
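
The Gamma loss item above rests on the identity y / exp(s) = y * exp(-s). A minimal sketch of that identity follows, assuming a gamma objective with log link whose gradient has the form 1 - y / exp(s) and whose Hessian has the form y / exp(s); the function names are illustrative and are not taken from the LightGBM source.

```python
import math


def gamma_grad_hess_division(label, score):
    """Gradient and Hessian written with an explicit division."""
    e = math.exp(score)
    return 1.0 - label / e, label / e


def gamma_grad_hess_multiplication(label, score):
    """Same quantities via label * exp(-score), with no division."""
    t = label * math.exp(-score)
    return 1.0 - t, t
```

Both forms agree up to floating-point rounding; the second computes one shared term and reuses it for the gradient and the Hessian.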
🐛 Bug Fixes
- [R-package] fix segfaults caused by missing Booster and Dataset handles (fixes #4208) @jameslamb (#4586)
- move Network method implementations from network.h to network.cpp (fixes #4464) @jameslamb (#4496)
- [R-package] prevent memory leak if pointer fails to allocate @david-cortes (#4613)
- [R-package] Fix R memory leaks (fixes #4282, fixes #3462) @david-cortes (#4597)
- [python][sklearn] respect eval_at aliases in keyword arguments @StrikerRUS (#4599)
- [dask] Fixed Dask type annotation @StrikerRUS (#4558)
- [R-package] allow construction of Dataset from CSV without header (fixes #4553) @jameslamb (#4554)
- [R-package] fix OpenMP checking on macOS (fixes #4131) @jameslamb (#4507)
- [R-package] pass R-configured compiler flags to checks in configure @jameslamb (#4506)
- [R-package] use C++ compiler for pre-compile checks on Windows @jameslamb (#4504)
- [dask] find all needed ports in each host at once (fixes #4458) @jmoralez (#4498)
- Fix undefined behavior with NaN input in CategoricalDecision() @hcho3 (#4468)
- [dask] determine output shape of array in predict (fixes #4285) @jmoralez (#4351)
- [fix] fix Reservoir Sampling in Sample of random.h (fix #4371 and #4134) @shiyu1994 (#4450)
- [CUDA] fix CUDA memory error by reducing block number (#4315) @RobinDong (#4327)
- [R-package] fix protection stack imbalance and unprotected objects (fixes #4390) @fabsig (#4391)
- [dask] pass additional predict() parameters through when input is a Dask Array @jameslamb (#4399)
- fix param aliases @StrikerRUS (#4387)
- sync for init score of binary objective function @loveclj (#4332)
- Fix undefined behavior in ArrayArgs::Partition() when interval size is 1 (fixes #4272) @kruda (#4280)
- Log warning instead of fatal when parsing float get under/overflow. @cyfdecyf (#4336)
- [fix] fix Sample when sampling only one element (fix #4134) @shiyu1994 (#4324)
- [R-package] move more finalizer logic into C++ side to address memory leaks @jameslamb (#4353)
- [tests][python] fix f-string in test_dask.py @StrikerRUS (#4373)
- [fix] skip empty bins when calculating cnt_in_bin in BinMapper::FindBin (fix #4301) @shiyu1994 (#4325)
- [fix] fix GatherInfoForThresholdNumerical boundary (fix #4286) @shiyu1994 (#4322)
- fix calculation of weighted gamma loss (fixes #4174) @mayer79 (#4283)
- [R-package] prevent symbol lookup conflicts (fixes #4045) @jameslamb (#4155)
- [R-package] avoid misleading warnings when using interaction constraints (fixes #4108) @jameslamb (#4232)
- [fix] Fix bug in data distributed learning with local empty leaf @shiyu1994 (#4185)
- fix: Dataset::CreateValid init fields which saves to binary. @cyfdecyf (#4177)
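
Two of the fixes above concern reservoir sampling, including the edge case of sampling a single element. As background, here is a minimal sketch of the textbook technique (Algorithm R) in Python. It is a generic illustration, not a port of the Sample routine in random.h, and the function name and signature are assumptions made for this example.

```python
import random


def reservoir_sample(n, k, rng=None):
    """Return k distinct indices drawn uniformly from range(n) (Algorithm R)."""
    rng = rng or random.Random()
    if n <= 0 or k <= 0:
        return []
    k = min(k, n)
    # Start with the first k indices in the reservoir.
    reservoir = list(range(k))
    for i in range(k, n):
        # Pick a slot in [0, i]; index i replaces a reservoir entry
        # with probability k / (i + 1).
        j = rng.randint(0, i)
        if j < k:
            reservoir[j] = i
    return reservoir
```

Edge cases such as k == 1 and k == n are handled by the same loop: with k == 1 each later index replaces the single slot with probability 1 / (i + 1), and with k == n the loop body never runs.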
📖 Documentation
- [docs] add Mars to docs @StrikerRUS (#4616)
- [docs] update link to MinGW-w64 site @StrikerRUS (#4606)
- [docs] add lightgbm_ray to docs @jameslamb (#4584)
- [docs][python] Refer to functions as callable in docstrings @StrikerRUS (#4575)
- [R-package] fix warnings in demos @jameslamb (#4569)
- [R-package] fix warnings in examples @jameslamb (#4568)
- [python][docs] Refer to string type as str in docstrings @StrikerRUS (#4565)
- [docs] add José Morales to repo maintainers @StrikerRUS (#4563)
- [docs] update links to SynapseML (former MMLSpark) @StrikerRUS (#4564)
- [python][docs] Refer to string type as str and add commas in list of ... types @StrikerRUS (#4557)
- [docs][python] Improve description of eval_result argument in record_evaluation() @StrikerRUS (#4559)
- [doc] Add link to Neptune hyperparam tuning guide @Blaizzy (#4529)
- [docs] Update link to daal4py in README @StrikerRUS (#4532)
- [docs] Add notes in installation guide, including ones about OpenMP @StrikerRUS (#4520)
- [docs] [R-package] use CRAN-style builds when building pkgdown site @jameslamb (#4513)
- [docs] Update link to mlr3-compliant interface in README @StrikerRUS (#4509)
- [docs] document CLI behavior when label_column is omitted @jameslamb (#4485)
- [docs] clarify description of prediction early stopping @StrikerRUS (#4411)
- [docs][python] add versionadded to Sequence class in Python wrapper @StrikerRUS (#4441)
- [docs] add lleaves to README @StrikerRUS (#4431)
- [docs] Add shapash to the list of related projects @StrikerRUS (#4408)
- [docs] update link to LightGBM example in MMLSpark repo @StrikerRUS (#4401)
- [docs][R-package] add authors in R-package description @StrikerRUS (#4395)
- fix: typo in python class _InnerPredictor docstring @cyfdecyf (#4389)
- [dask] Dask Vector types for group, init_score, sample_weights (fixes #4375) @ffineis (#4380)
- [docs] document sanitizers @StrikerRUS (#4365)
- [docs][python] enhance keep_training_booster param description @StrikerRUS (#4364)
- [docs] add anchor for nightly builds in docs @StrikerRUS (#4366)
- [docs] document how to pass multi-value params from Python and R (fixes #4345) @jameslamb (#4346)
- [docs] make building of C++ tests section collapsable @StrikerRUS (#4340)
- [docs] replace broken mmlspark notebook link in docs @jameslamb (#4303)
- [docs] clarify docs for LGBM_BoosterGetEvalNames and LGBM_BoosterGetEvalCounts (fixes #4264) @jameslamb (#4270)
- [docs][R-package] update docs on C++ interface @jameslamb (#4257)
- [docs][python] update some docs related to custom objective @StrikerRUS (#4245)
- [docs][python][scikit-learn] added note for LGBMRanker @StrikerRUS (#4243)
- [docs] fix broken MS MPI link in Installation Guide @jameslamb (#4224)
- [R-package] clarify parameter documentation (fixes #4193) @jameslamb (#4202)
- [docs][R-package] Update the explanation of num_threads (fixes #4192) @issactoast (#4199)
- [docs] add working dir to R package docker run examples @jameslamb (#4190)
- [docs] fix markdown in docs @StrikerRUS (#4191)
- [docs] Add changes to gcc-tips @akshitadixit (#4187)
- [docs] bring back macOS installation method with Homebrew formula in docs @StrikerRUS (#4182)
🧰 Maintenance
- v3.3.0 release (fixes #4310) @jameslamb (#4633)
- fix possible precision loss in xentropy and fair loss objectives @jameslamb (#4651)
- [tests][python-package] refactor list_to_1d_numpy test to run without pandas installed @jmoralez (#4639)
- [python] add type hints to _safe_call @strobelTha (#4641)
- remove unused DCGCalculator::CalDCGAtK() @jameslamb (#4650)
- [python][sklearn] add __sklearn_is_fitted__() method to be better compatible with scikit-learn API @StrikerRUS (#4636)
- [ci] Use the latest gcc version in macOS CI jobs @StrikerRUS (#4640)
- remove duplicated debug printing in CMakeLists.txt for MPI @StrikerRUS (#4644)
- remove unused BinMapper::SizeForSpecificBin() @jameslamb (#4643)
- [ci] ignore certificates for kitware apt channel in CUDA jobs (fixes #4646) @jameslamb (#4648)
- [ci] bump CUDA version from 11.4.0 to 11.4.2 at CI @StrikerRUS (#4628)
- [R-package] introduce Dataset methods set_field() and get_field() @jameslamb (#4571)
- [ci] Recover running CUDA tests at CI (fixed #4611) @shiyu1994 (#4621)
- [ci] Run cmakelint at CI and fix some errors @StrikerRUS (#4617)
- [python] initialize installation options with boolean values in setup.py @StrikerRUS (#4620)
- [python] fix mypy error in dask.py @StrikerRUS (#4615)
- [ci] Stop running CUDA tests at CI @StrikerRUS (#4611)
- [R-package] avoid unnecessary computation and add tests for Dataset set_reference() method @jameslamb (#4587)
- [ci] fix link to LightGBM public e-mail @StrikerRUS (#4603)
- [tests][dask] Use workers hostname in tests (fixes #4594) @jmoralez (#4595)
- prefer spaces to tabs in CMakeLists.txt @jameslamb (#4593)
- [ci] skip Dask tests on QEMU builds @jameslamb (#4600)
- [ci] simplify docker info parsing in QEMU builds @StrikerRUS (#4592)
- [ci] explicitly set --platform when running aarch64 image in QEMU builds @jameslamb (#4579)
- [R-package] fix inaccurate error message in Dataset get_colnames() method @jameslamb (#4588)
- [R-package] preserve uses of '...' in Dataset slice() method @jameslamb (#4581)
- [R-package] fix inaccurate comments, remove unnecessary comments @jameslamb (#4582)
- [R-package] deprecate the use of 'info' in Dataset @jameslamb (#4573)
- [R-package] deprecate uses of '...' in Dataset slice() method @jameslamb (#4572)
- [R-package] use {testthat} SummaryReporter in tests @jameslamb (#4567)
- [python] Use double type for init_score array when set by predictor @StrikerRUS (#4510)
- [ci] upgrade R to 4.1.1 @jameslamb (#4560)
- [python] add type hints on train() in engine.py @jameslamb (#4544)
- [R-package] add deprecation warnings on uses of '...' in predict() and reset_parameter() @jameslamb (#4548)
- [docs] Clarify the fact that predict() on a file does not support saved Datasets (fixes #4034) @jameslamb (#4545)
- [ci] Check for MM_PREFETCH and MM_MALLOC not only in CRAN builds @StrikerRUS (#4540)
- [ci] Add checks that OpenMP is used in R-package builds @StrikerRUS (#4538)
- [ci] Add checks that MM_PREFETCH and MM_MALLOC are used in CRAN builds @StrikerRUS (#4536)
- [python] add type hints to logging functions in basic.py @jameslamb (#4527)
- [python] add type hints in docs/conf.py @jameslamb (#4526)
- [R-package] remove unused '...' in Booster constructor @jameslamb (#4523)
- [R-package] add deprecation warnings about some uses of '...' @jameslamb (#4522)
- [ci] use flag '--allow-releaseinfo-change' in some 'apt-get update' calls @jameslamb (#4524)
- [ci] replace uses of backticks in test.sh with $() @jameslamb (#4519)
- [ci] move Solaris and valgrind test steps into scripts @jameslamb (#4503)
- [tests][dask] reduce number of collisions tests @jmoralez (#4501)
- [R-package] remove unused variable R_SCRIPT in configure.win @jameslamb (#4505)
- Update c_api LGBM_SampleIndices() comment. @cyfdecyf (#4490)
- [R-package] quote path variables in build-cran-package.sh @jameslamb (#4499)
- [python][tests] refactor tests with Sequence input @StrikerRUS (#4495)
- [R-package] limit exported symbols in DLL @jameslamb (#4494)
- [docs][ci] bump versions of R-package dependencies at RTD @StrikerRUS (#4488)
- remove examples/.gitignore @jameslamb (#4486)
- [python] Add type hints to helpers/parameter_generator.py @sagnik1511 (#4474)
- [refactor] Use CreateSampleIndices() in c_api.cpp @cyfdecyf (#4478)
- [python] parallelize MinGW make similarly to Unix make command @StrikerRUS (#4462)
- [ci] remove preinstalled possibly conflicting software from PATH in CI jobs @StrikerRUS (#4463)
- [ci] Add CI job running rchk on the R package (fixes #4400) @jameslamb (#4449)
- [python] migrate to pathlib in setup.py and use absolute() on paths first @StrikerRUS (#4444)
- [ci] add support for 8.0 and 8.6 CUDA archs @StrikerRUS (#4454)
- [tests][python] added tests for early stop in prediction in ranking task @StrikerRUS (#4457)
- [ci] bump CUDA version from 11.2.2 to 11.4.0 at CI @StrikerRUS (#4453)
- [tests] clarify RuntimeError in distributed tests @StrikerRUS (#4452)
- [python-package] use toarray() instead of todense() in tests and examples @jameslamb (#4446)
- [python] migrate to pathlib in distributed tests @StrikerRUS (#4443)
- [python] minor refactoring of Python code @StrikerRUS (#4442)
- [tests][python] refactor file loading routine in C API test @StrikerRUS (#4437)
- [tests] fix deprecation numpy warning @StrikerRUS (#4439)
- [python-package] convert string concatenation to f-strings in test_engine.py (fixes #4136) @jameslamb (#4436)
- [python] migrate to pathlib in python examples @StrikerRUS (#4428)
- [python] migrate to pathlib in helper scripts @StrikerRUS (#4434)
- [tests][cli] distributed training @jmoralez (#4254)
- [python] migrate to pathlib in python tests @StrikerRUS (#4435)
- [python] migrate to f-strings in interactive_plot_example.ipynb @StrikerRUS (#4430)
- [ci] ensure interactive_plot_example notebook is run in interactive mode at CI @StrikerRUS (#4432)
- [ci] add h5 files into .gitignore @StrikerRUS (#4429)
- [python] migrate to pathlib in conf.py @StrikerRUS (#4427)
- [python-package] f-string format updated in plot_example.py @amanjha8100 (#4421)
- [python] migrate to pathlib in create_nuget.py @StrikerRUS (#4422)
- [python-package] Add type hints to init for LGBMModel @seanytak (#4420)
- [SWIG] fix compiler warning about unused variable in SWIG @StrikerRUS (#4419)
- [tests] fix compiler warning about types conversion in cpp tests @StrikerRUS (#4418)
- [dask] fix typehint on _pad_eval_names() @jameslamb (#4413)
- [python] Add type hints to python-package/lightgbm/plotting.py @WestonKing-Leatham (#4367)
- [tests][dask] add missing compute() in Dask test @jameslamb (#4412)
- [tests][ci] run cpp tests with sanitizers on Linux and macOS @StrikerRUS (#4330)
- [ci] [R-package] increase timeout on valgrind job @jameslamb (#4404)
- [python] Improving the syntax of the fstrings in the file: .\examples\python-guide\advanced_example.py @sayantan1410 (#4386)
- [python] Improving the syntax of prints in simple_example.py and sklearn_example.py @StrikerRUS (#4396)
- [R-package] remove unnecessary comments @jameslamb (#4383)
- [ci] Increase timeout value for QEMU builds @StrikerRUS (#4385)
- [R-package] consolidate duplicate lists of Dataset info keys @jameslamb (#4381)
- [tests] replace pytest.parametrize @StrikerRUS (#4377)
- [ci] [R-package] add unit tests on monotone constraints @jameslamb (#4352)
- [python] add type hints to check_dynamic_dependencies.py @greyhere (#4382)
- [python] add type hints to python-package/setup.py @greyhere (#4376)
- [R-package] remove defaults in internal functions @jameslamb (#4361)
- [python] improving the syntax of the fstring in the file : tests/python_package_test/test_dask.py @sayantan1410 (#4358)
- Updated tests/python_package_test/test_plotting.py to use f-strings @WestonKing-Leatham (#4359)
- [R-package] remove unnecessary library() calls in tests @jameslamb (#4354)
- [python-package] use f-strings for concatenation in examples/python-guide/logistic_regression.py @sagnik1511 (#4356)
- [python-package] updated test_consistency.py to use f-strings @sayantan1410 (#4348)
- [R-package] resolve test warning about is.na() and handles @jameslamb (#4341)
- [R-package] factor out lgb.check.r6.class() @jameslamb (#4343)
- [R-package] remove lgb.last_error() and LGBM_GetLastError_R() @jameslamb (#4344)
- [R-package] remove unused argument in early stopping callback @jameslamb (#4342)
- [R-package] remove uses of ... in Predictor constructor @jameslamb (#4338)
- [R-package] remove unused code in lgb.params2str() @jameslamb (#4337)
- [ci] upgrade R to 4.1.0 in CI @StrikerRUS (#4328)
- [ci] cmake: remove linking to sanitizer library @cyfdecyf (#4176)
- [ci] Increase timeout value for QEMU builds @StrikerRUS (#4326)
- [python] improving the syntax of the fstring in the file : tests/python_package_test/test_basic.py @sayantan1410 (#4312)
- [docs][python] fix LGBMRanker docstring @StrikerRUS (#4306)
- [python] improve error message for required packages @StrikerRUS (#4304)
- [tests][python] Handle data types more accurate in C API test @StrikerRUS (#4297)
- [python-package] Improve Graphviz import error message (fixes #4299) @AngelikaAntsmae (#4302)
- [python] Handle integer types more accurate in Python-to-C interface @StrikerRUS (#4292)
- [python] Improving the syntax of the f-strings in the file: tests/c_api_test/test.py @sayantan1410 (#4294)
- [CUDA] Add CUDA_ARCHITECTURES to fix CMake warnings (#3754) @RobinDong (#4268)
- [R-package] Handle integer types more accurate in R-to-C interface @StrikerRUS (#4291)
- [R-package] suppress Wcast-function-type warning in CMake-based gcc and MinGW builds (fixes #4273) @jameslamb (#4274)
- [python] added f-string to python-package/lightgbm/basic.py @NovusEdge (#4143)
- [python] added f-strings to python-package/lightgbm/dask.py @NovusEdge (#4144)
- [ci] pin dask and distributed in CI jobs @jameslamb (#4288)
- Migrate to f-strings in python-package\lightgbm\plotting.py (#4136) @akshitadixit (#4279)
- [python] added f-strings to helpers/parameter_generator.py @NovusEdge (#4146)
- [python] added f-string to python-package/lightgbm/callback.py @NovusEdge (#4142)
- [R-package] manage Dataset and Booster handles as R external pointers (fixes #3016) @jameslamb (#4265)
- [ci][docs] Unpin Sphinx version @StrikerRUS (#4277)
- [docs] remove extra spaces in comments and docs @jameslamb (#4269)
- [R-package] move creation of character vectors in some methods to C++ side @jameslamb (#4256)
- [ci][docs] Restrict Sphinx version @StrikerRUS (#4267)
- [python] added f-strings to python-package/lightgbm/engine.py @kantajitshaw (#4258)
- fix param name @StrikerRUS (#4253)
- [R-package] Use R standard routines to access character data in C++ @jameslamb (#4252)
- [ci] Delete lock.yml @StrikerRUS (#4251)
- Correct spelling @az0 (#4250)
- [R-package] Use R standard routines to access numeric and integer array data in C++ @jameslamb (#4247)
- [R-package] use R standard routine to access read-only ints passed to C++ @jameslamb (#4246)
- [R-package] move Rinternals.h closer to where it is used @jameslamb (#4248)
- [R-package] Convert LGBM_GetLastError_R to use R built-in types @jameslamb (#4242)
- [R-package] remove pre-allocated call_state in C++ calls @jameslamb (#4244)
- [ci] Install graphviz system-widely @StrikerRUS (#4238)
- show specific error message in TCP accept/send/receive logs @jameslamb (#4128)
- [ci] [python-package] remove unused import in tests @jameslamb (#4233)
- Fix typo in binary file already exists error message. @cyfdecyf (#4231)
- [R-package] fix warnings in unit tests @jameslamb (#4225)
- [python][scikit-learn] change MRO @StrikerRUS (#3192)
- [ci][docs] Unpin Breathe version in requirements.txt @StrikerRUS (#4222)
- [R-package] Move error handling into C++ side @jameslamb (#4163)
- [R-package] fix grammar in comments @david-cortes (#4215)
- [dask] Fix typo mentioned in 4101 @ffineis (#4214)
- [ci] parallelize R package installs in CI jobs @jameslamb (#4198)
- [python] Migrate to f-strings in python-package/lightgbm/sklearn.py @akshitadixit (#4188)
- [R-package] Make returned feature importances from lgb.importance() visible by default @david-cortes (#4194)
- [ci] run cpp tests at CI @StrikerRUS (#4166)
- [ci] unpin CMake version for CUDA + Clang toolchain @StrikerRUS (#4183)
- [ci] Restore CUDA jobs at CI @StrikerRUS (#4172)
- [ci] Bump version for development @StrikerRUS (#4171)