Added
- Enable reads and writes of dataframes from/to external file systems.
  It supports HTTP(s) URLs and AWS S3 locations.

  This feature introduces the FSS abstraction, which will also be present in newer
  versions of Kino. This will make the integration of Livebook files with Explorer
  much easier.

  The implementation differs depending on the file format and on whether the operation
  is a read or a write. All writes to AWS S3 are done on the Rust side, using an
  abstraction called `CloudWriter`, and most of the readers are implemented in Elixir,
  by downloading the file and then loading the dataframe from it. The only exception
  is the reading of Parquet files, which is done in Rust, using Polars' `scan_parquet`
  with streaming.

  We want to give special thanks to Qqwy / Marten for the `CloudWriter` implementation!

- Add ADBC: Arrow Database Connectivity.

  Continuing with improvements in the IO area, we added support for reading dataframes from
  databases using ADBC, which is similar in idea to ODBC, but integrates much better with
  Apache Arrow, which is the backbone of Polars, our backend today.

  The function `Explorer.DataFrame.from_query/1` is the entrypoint for this feature, and it
  allows querying databases like PostgreSQL, SQLite and Snowflake.

  Check the Elixir ADBC bindings docs for more information.

  For this feature, we had a fundamental contribution from Cocoa in the ADBC bindings,
  so we want to say a special thanks to her!

  We want to thank the people who joined José in his live streams on Twitch
  and helped to build this feature!

- Add the following functions to `Explorer.Series`:

- Add duration dtypes. This adds the following dtypes:

  - `{:duration, :nanosecond}`
  - `{:duration, :microsecond}`
  - `{:duration, :millisecond}`

  This feature was a great contribution from Billy Lanchantin,
  and we want to thank him for this!
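As an illustration of the external file system support listed above, here is a minimal sketch of reading over HTTP(s) and round-tripping a Parquet file through AWS S3. It is not taken from the release itself: the module name `RemoteIOSketch`, the environment variable names, the bucket and the URLs are hypothetical, and it assumes the `:config` option of the IO functions accepts a keyword list with `:access_key_id`, `:secret_access_key` and `:region`. Check the IO section of the Explorer docs for the options supported by your version.

```elixir
Mix.install([{:explorer, "~> 0.7.0"}])

defmodule RemoteIOSketch do
  alias Explorer.DataFrame, as: DF

  # Builds an S3 config from environment variables (hypothetical names;
  # the accepted keys are an assumption, see the lead-in above).
  def s3_config_from_env do
    [
      access_key_id: System.fetch_env!("AWS_ACCESS_KEY_ID"),
      secret_access_key: System.fetch_env!("AWS_SECRET_ACCESS_KEY"),
      region: System.get_env("AWS_REGION", "us-east-1")
    ]
  end

  # Reads a CSV file from an HTTP(s) URL.
  # Returns {:ok, dataframe} or {:error, exception}.
  def read_csv(url), do: DF.from_csv(url)

  # Writes the dataframe to S3 as Parquet, then reads it back.
  # Returns {:ok, dataframe} or the first {:error, exception} found.
  def roundtrip_parquet(df, s3_url, config) do
    with :ok <- DF.to_parquet(df, s3_url, config: config) do
      DF.from_parquet(s3_url, config: config)
    end
  end
end

# Usage (requires network access and valid credentials):
#
#   config = RemoteIOSketch.s3_config_from_env()
#   df = Explorer.DataFrame.new(id: [1, 2, 3])
#   {:ok, df2} = RemoteIOSketch.roundtrip_parquet(df, "s3://my-bucket/example.parquet", config)
#   {:ok, df3} = RemoteIOSketch.read_csv("https://example.com/data.csv")
```

Nothing touches the network until one of the functions is called, so the module can be loaded safely in a Livebook or script before credentials are in place.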
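For the ADBC item above, a small script along these lines shows the general shape of querying a database into a dataframe. It is a sketch under assumptions: it uses an in-memory SQLite database, downloads the driver at runtime with `Adbc.download_driver!/1`, and the process names `Sketch.DB` and `Sketch.Conn` are hypothetical. The call to `from_query` passes a connection, a SQL string and a list of parameters, following the Explorer documentation.

```elixir
Mix.install([{:explorer, "~> 0.7.0"}, {:adbc, "~> 0.1"}])

# Download the SQLite driver (needs network access the first time).
Adbc.download_driver!(:sqlite)

# Start a database process and a connection process under a supervisor.
children = [
  {Adbc.Database, driver: :sqlite, process_options: [name: Sketch.DB]},
  {Adbc.Connection, database: Sketch.DB, process_options: [name: Sketch.Conn]}
]

{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)

# Run a query and load the result as a dataframe.
{:ok, df} = Explorer.DataFrame.from_query(Sketch.Conn, "SELECT 123 AS num", [])

Explorer.DataFrame.print(df)
```

In an application, the two child specs would normally live in the application's supervision tree instead of being started from a script.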
Changed
- Return exception structs instead of strings for all IO operation errors, and for anything
  that returns an error from the NIF integration.

  This change makes it easier to define which type of error we want to raise.

- Update Polars to v0.32.

  With that we made some minor API changes, like changing some options for the `cut/qcut`
  operations in the `Explorer.Series` module.

- Use `nil_values` instead of `null_character` for IO operations.

- Never expect `nil` for CSV IO dtypes.

- Rename `Explorer.DataFrame.table/2` to `Explorer.DataFrame.print/2`.

- Change `:datetime` dtype to be `{:datetime, time_unit}`, where the time unit can be
  one of the following:

  - `:millisecond`
  - `:microsecond`
  - `:nanosecond`

- Rename the following `Series` functions:

  - `trim/1` to `strip/2`
  - `trim_leading/1` to `lstrip/2`
  - `trim_trailing/1` to `rstrip/2`

  These functions now support a string argument.
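Several of the changes listed above can be seen together in a short script. This is a sketch, not an excerpt from the library docs: the CSV contents and variable names are made up for illustration, and it assumes that `load_csv!/2` accepts the same `nil_values` option as the other CSV readers.

```elixir
Mix.install([{:explorer, "~> 0.7.0"}])

alias Explorer.DataFrame, as: DF
alias Explorer.Series

# `nil_values` replaces `null_character`: each listed string is read as nil.
csv = "name,score\nana,10\nbob,NA\n"
df = DF.load_csv!(csv, nil_values: ["NA"])

# `print/2` is the new name of `table/2`.
DF.print(df)

# Datetime dtypes now carry a time unit, and can be cast between units.
dt = Series.from_list([~N[2023-09-01 12:00:00]])
IO.inspect(Series.dtype(dt), label: "dtype from list")

ms = Series.cast(dt, {:datetime, :millisecond})
IO.inspect(Series.dtype(ms), label: "dtype after cast")

# The renamed strip functions accept the string to remove as a second argument.
prices = Series.from_list(["£123", "1.00£", "£1.00£"])
both = Series.strip(prices, "£")
left = Series.lstrip(prices, "£")
right = Series.rstrip(prices, "£")

IO.inspect(Series.to_list(both), label: "strip")
IO.inspect(Series.to_list(left), label: "lstrip")
IO.inspect(Series.to_list(right), label: "rstrip")
```

Code that still passes `null_character:` or calls `trim/1` and `table/2` needs to be updated when upgrading from v0.6.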
Fixed
- Fix warnings for the upcoming Elixir v1.16.
- Fix `Explorer.Series.abs/1` type specs.
- Allow comparison of strings with categories.
- Fix `Explorer.Series.is_nan/1` inside the context of `Explorer.Query`.
  The NIF function was not being exported.
Pull requests
- Starting FSS abstraction by @philss in #645
- ADBC support in from_query by @josevalim in #648
- Add FSS abstraction to remaining `from_*` IO functions by @philss in #649
- Fix citation for UCI datasets (wine and iris) by @firefly-cpp in #650
- Bump arrow2 version by @sasikumar87 in #654
- Return exception instead of string by @Jhonatannunessilva in #651
- Update polars to v0.30 by @sasikumar87 in #656
- Update Polars to v0.31 by @philss in #659
- Read parquet files from AWS S3 by @philss in #652
- Add necessary cargo features by @philss in #661
- Normalise IO write functions with FSS entry by @philss in #662
- implement trim/2 by @DeemoONeill in #664
- Example to dump a dataframe in CSV format to a S3-compatible object store by @Qqwy in #653
- Write parquet from eager dataframe to S3 by @philss in #665
- Add Series.window_median by @spatchkaa in #670
- Use `nil_values` instead of `null_character` in IO operations by @cnpryer in #667
- Add remaining write operations to AWS S3 by @philss in #671
- Never expect `nil` for CSV IO `dtypes` by @cnpryer in #672
- implement slice_string/3 by @DeemoONeill in #669
- Rename trim functions by @DeemoONeill in #674
- Change :datetime dtype to be {:datetime, time_unit} by @sasikumar87 in #675
- Correct typo in Series by @pgeraghty in #676
- Make download of CSV files from S3 work for "DF.from_csv/2" by @philss in #677
- Read parquet from AWS S3 - more formats by @philss in #678
- Read files from HTTP by @philss in #679
- Use FSS as a dependency by @philss in #681
- Bump rustls-webpki from 0.101.1 to 0.101.4 in /native/explorer by @dependabot in #684
- Return errors as exceptions by @josevalim in #686
- Fix FSS.S3 support when bucket is not provided by @philss in #682
- Return errors as exception structs for FSS integration by @philss in #688
- Add duration datatypes by @billylanchantin in #683
- Convert numeric values to series early by @josevalim in #685
- Mention boolean series and how they can be used by @kellyfelkins in #692
- Fixes a documentation typo by @kellyfelkins in #693
- Prepare for the v0.7 release by @philss in #695
New Contributors
- @firefly-cpp made their first contribution in #650
- @DeemoONeill made their first contribution in #664
- @Qqwy made their first contribution in #653
- @spatchkaa made their first contribution in #670
- @cnpryer made their first contribution in #667
- @dependabot made their first contribution in #684
- @billylanchantin made their first contribution in #683
- @kellyfelkins made their first contribution in #692
Full Diff: v0.6.1...v0.7.0
Changelog: https://github.com/elixir-explorer/explorer/blob/main/CHANGELOG.md
SHA256 checksums
a4629f950187fd20f4b0efa0164e8e9e20b5799312688e4ce7d82c46e28dfbaa explorer-v0.7.0-nif-2.15-x86_64-pc-windows-gnu.dll.tar.gz
9028f61dcde0e3d95ca886463d78cdaf0d749a319f0c721a11321947f86017d7 explorer-v0.7.0-nif-2.15-x86_64-pc-windows-msvc.dll.tar.gz
5a244343ad99310267531c7848c9f261944064770e90748c22c0dd17fa12867a libexplorer-v0.7.0-nif-2.15-aarch64-apple-darwin.so.tar.gz
a72dae3b58b11d73a47f3a92137b53494e52b1454c413e9ec3877d4eb7d9e406 libexplorer-v0.7.0-nif-2.15-aarch64-unknown-linux-gnu.so.tar.gz
c87661de40d2447d90c7962dd61da98481f9cf9fa75a923e8699a216d0786a18 libexplorer-v0.7.0-nif-2.15-aarch64-unknown-linux-musl.so.tar.gz
0c5e602fb5e2680b916c07cb6615812e15e6f9fe28b361866341f8b4fd5cffe7 libexplorer-v0.7.0-nif-2.15-riscv64gc-unknown-linux-gnu.so.tar.gz
02f1638a7309133a72f8079560b06223ffd2d266dece3ddfe25ec0646ddfa23d libexplorer-v0.7.0-nif-2.15-x86_64-apple-darwin.so.tar.gz
64bfae13b65e18b29a891820930ebbfac56abba01cf5b515b1fe11cf8019820b libexplorer-v0.7.0-nif-2.15-x86_64-unknown-freebsd.so.tar.gz
f0296c139c68fa818f61bbc37c5cf293a4fc09bbd4ccf1d942e8566e89146c51 libexplorer-v0.7.0-nif-2.15-x86_64-unknown-linux-gnu.so.tar.gz
d2105e54fa5ab677b9b2e4313371108a9a2df355af659a73c486a55111fd29b4 libexplorer-v0.7.0-nif-2.15-x86_64-unknown-linux-musl.so.tar.gz