Caveats
⚠️ For platforms without PyArrow 6 support (e.g. MWAA, EMR, Glue PySpark Job):
➡️ pip install pyarrow==2 awswrangler
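To confirm which PyArrow version a platform actually ended up with after running the pinned install above, a minimal sketch using only the standard library might look like this. The helper name `installed_major` is hypothetical and is not part of awswrangler; the sketch assumes Python 3.8+ and a conventional `major.minor.patch` version string.

```python
from importlib import metadata
from typing import Optional


def installed_major(package: str) -> Optional[int]:
    """Return the major version of an installed package, or None if it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # Assumes a conventional "major.minor.patch" version string.
    return int(version.split(".")[0])


major = installed_major("pyarrow")
if major is None:
    print("pyarrow is not installed")
else:
    print(f"pyarrow major version: {major}")
```

If the reported major version differs from the one you pinned, another dependency may have upgraded PyArrow during installation.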
Breaking changes
- Fix sanitize methods to align with Glue/Hive naming conventions #579
New Functionalities
- AWS Lake Formation Governed Tables 🚀 #570
- Support for Python 3.10 🔥 #973
- Add partitioning to JSON datasets #962
- Add ability to use unbuffered cursor for large MySQL datasets #928
Enhancements
- Add awswrangler.s3.list_buckets #997
- Add partitions_parameters to catalog partitions methods #1035
- Refactor pagination config in list objects #955
- Add error message to EmptyDataframe exception #991
Documentation
- Clarify docs & add tutorial on schema evolution for CSV datasets #964
Bug Fixes
- Fix exception in catalog.add_column() when column_comment is not provided #1017
- Fix catalog.create_parquet_table failing when a dictionary key does not exist #998
- Fix Catalog StorageDescriptor get #969
Thanks
We thank the following contributors/users for their work on this release:
@csabz09, @Falydoor, @moritzkoerber, @maxispeicher, @kukushking, @jaidisido
P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload them and run, or use them from our public S3 bucket!