github aws/aws-sdk-pandas 2.14.0
AWS Data Wrangler 2.14.0

2 years ago

Caveats

⚠️ For platforms without PyArrow 6 support (e.g. MWAA, EMR, Glue PySpark Job):

➡️ pip install pyarrow==2 awswrangler

New Functionalities

  • Support Athena Unload 🚀 #1038
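
The release note does not show how the feature is called, so the following is only a minimal sketch of the Athena `UNLOAD` SQL statement that this kind of support builds on. The helper name `build_unload_statement` and its parameters are hypothetical and are not part of awswrangler; see #1038 for the actual API.

```python
def build_unload_statement(sql: str, s3_path: str, file_format: str = "PARQUET") -> str:
    """Wrap a SELECT query in an Athena UNLOAD statement (illustrative helper only)."""
    # Athena expects a bare query inside the parentheses, so drop a trailing semicolon.
    query = sql.strip().rstrip(";").strip()
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must be an S3 URI such as 's3://bucket/prefix/'")
    # UNLOAD writes the query results to the given S3 location in the chosen format.
    return f"UNLOAD ({query}) TO '{s3_path}' WITH (format = '{file_format.upper()}')"


statement = build_unload_statement("SELECT * FROM my_table;", "s3://my-bucket/unload/")
print(statement)
```

Unlike a CTAS-based read, `UNLOAD` writes result files without creating a table in the Glue catalog, which is presumably why it is offered as an alternative way to fetch query results.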

Enhancements

  • Add the ExcludeColumnSchema=True argument to the glue.get_partitions call to reduce response size #1094
  • Add PyArrow flavor argument to write_parquet via pyarrow_additional_kwargs #1057
  • Add rename_duplicate_columns and handle_duplicate_columns flags to sanitize_dataframe_columns_names method #1124
  • Add timestamp_as_object argument to all database read_sql_table methods #1130
  • Add ignore_null to read_parquet_metadata method #1125
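
To illustrate what handling duplicate column names can mean in practice, here is a small pure-Python sketch of two common strategies: renaming repeats with a numeric suffix, or dropping them. The function names and the `_1`, `_2` suffix scheme are assumptions made for illustration; they are not taken from awswrangler, whose actual behaviour is defined in #1124.

```python
from typing import Dict, List


def rename_duplicates(names: List[str]) -> List[str]:
    """Keep the first occurrence of each name and suffix later repeats with _1, _2, ..."""
    seen: Dict[str, int] = {}
    result: List[str] = []
    for name in names:
        count = seen.get(name, 0)
        # First occurrence stays unchanged; repeats get a numeric suffix.
        result.append(name if count == 0 else f"{name}_{count}")
        seen[name] = count + 1
    return result


def drop_duplicates(names: List[str]) -> List[str]:
    """Keep only the first occurrence of each name, preserving order."""
    # dict preserves insertion order, so this removes repeats without reordering.
    return list(dict.fromkeys(names))


columns = ["id", "value", "id", "id"]
print(rename_duplicates(columns))
print(drop_duplicates(columns))
```

Note that a naive suffix scheme like this one can itself collide with an existing column (for example when `id_1` is already present), so a production implementation would need an extra uniqueness check.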

Documentation

  • Improve documentation on installing SAR Lambda layers with the CDK #1097
  • Fix broken link to tutorial in to_parquet method #1058

Bug Fixes

  • Ensure that partition locations retrieved from AWS Glue always end in a "/" #1094
  • Fix bucketing overflow issue in Athena #1086
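
The partition-location fix can be pictured with a short sketch. The helpers below are hypothetical and are not the library's implementation; they only show the normalization described above. One plausible motivation, stated here as an assumption, is that S3 treats locations as plain key prefixes, so a location without a trailing "/" such as `.../year=202` would also match objects under `.../year=2021/`.

```python
from typing import Dict, List


def ensure_trailing_slash(location: str) -> str:
    """Return the location unchanged if it already ends in '/', else append one."""
    return location if location.endswith("/") else f"{location}/"


def normalize_partitions(partitions: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Normalize a mapping of partition location -> partition values (illustrative shape)."""
    return {ensure_trailing_slash(loc): values for loc, values in partitions.items()}


raw = {
    "s3://my-bucket/table/year=2021": ["2021"],
    "s3://my-bucket/table/year=2022/": ["2022"],
}
print(normalize_partitions(raw))
```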

Thanks

We thank the following contributors/users for their work on this release:

@dennyau, @kailukowiak, @lucasmo, @moykeen, @RigoIce, @vlieven, @kepler, @mdavis-xyz, @ConstantinoSchillebeeckx, @kukushking, @jaidisido


P.S. The AWS Lambda Layer file (.zip) and the AWS Glue file (.whl) are available below. Just upload them and run, or use them from our public S3 bucket!
