What's Changed
- Bump to Java 11 by @Fokko in #3314
- Add comparator for UnknownLogicalType by @Fokko in #3292
- Allow reading dictionary encoded boolean by @Fokko in #3370
- GH-2815: Allow bytestreamsplit available via Hadoop Configuration by @ArnavBalyan in #3340
- GH-2836: Support reading pure parquet files with cat by @ArnavBalyan in #3332
- GH-2891: Include actual values in validation error messages and improve logging by @ArnavBalyan in #3319
- GH-2961: Cycle detection in AvroSchemaConverter to prevent infinite recursion by @ArnavBalyan in #3272
- GH-2967: Support unified config options for convert parquet-cli by @ArnavBalyan in #3283
- GH-2972: Fix incomplete avro metadata on INT96 schema converter by @ArnavBalyan in #3311
- GH-3149: Enable ParquetAvroReader to handle decimal types for int32/64 by @ArnavBalyan in #3306
- GH-3175: support protobuf library version 4 by @uwemaurer in #3352
- GH-3213: Add the configuration for ByteStreamSplit encoding by @joeyutong in #3214
- GH-3224: Make ParquetProperties.valuesWriterFactory thread safe by @ArnavBalyan in #3308
- GH-3267: Add comprehensive assertions to TestMemPageStore by @ArnavBalyan in #3268
- GH-3273: Add scoped chunk level statistics to avoid unbounded output by @ArnavBalyan in #3274
- GH-3286: Add support for Parquet-Protobuf in Parquet-cli by @ArnavBalyan in #3287
- GH-3290: Restore Snapshot versions for vector/benchmark modules by @ArnavBalyan in #3288
- GH-3294: Include optional profiles for release process by @ArnavBalyan in #3297
- GH-3298: Support unified file based configurations for CLI by @ArnavBalyan in #3304
- GH-3300: add ParquetWriter and ParquetReader builders constructor without params by @jerolba in #3301
- GH-3310: Clean up JIRA references and move to GH issues by @ArnavBalyan in #3309
- GH-3312: Support uuid read converter for parquet thrift by @ArnavBalyan in #3313
- GH-3315: Variant binary read does not take length into account by @jerolba in #3333
- GH-3316: Fix representation type for VariantBuilder decimal by @ArnavBalyan in #3335
- GH-3317: Fix bytes written by VariantBuilder.appendFloat by @ArnavBalyan in #3334
- GH-3320: Ensure parquet reader does not fail due to incorrect statistics by @ArnavBalyan in #3325
- GH-3321: Exclude package-info.class from shaded fastutil dependency by @jerolba in #3322
- GH-3327: Bug fix incorrect compressed size reported by DataPageV1 by @ArnavBalyan in #3326
- GH-3331: Track Column index page skip statistics during file read by @ArnavBalyan in #3330
- GH-3338: Support encrypted files for Parquet CLI commands by @ArnavBalyan in #3339
- GH-3350: Avoid flushing data to cloud when exception is thrown by @Jiayi-Wang-db in #3351
- GH-3358: Add Configurable Thrift Max Message Size for Parquet Metadata Reading by @cravani in #3359
- MINOR: Bump avro.version from 1.11.4 to 1.11.5 by @gszadovszky in #3348
- MINOR: Bump version to 1.17.0-SNAPSHOT by @ArnavBalyan in #3293
- MINOR: Post release of 1.16.0 by @wgtmac in #3305
- MINOR: Remove unused parquet-thrift dependencies by @dossett in #3323
- MINOR: [parquet-column] remove unused test dependencies by @dossett in #3324
- MINOR: parquet-avro tests should not debug to stderr by @dossett in #3329
- docs: Replace JIRA with GitHub Issues by @Fokko in #3303
- Bump com.google.guava:guava from 33.4.0-jre to 33.5.0-jre by @dependabot[bot] in #3366
- Bump commons-io:commons-io from 2.18.0 to 2.21.0 by @dependabot[bot] in #3369
- Bump easymock 5.6.0 to support Java 25 by @pan3793 in #3363
- Bump protobuf.version from 3.25.6 to 4.30.2 by @dependabot[bot] in #3182
- Bump protobuf.version from 4.33.1 to 4.33.2 by @dependabot[bot] in #3373
New Contributors
- @ArnavBalyan made their first contribution in #3288
- @Jiayi-Wang-db made their first contribution in #3351
- @joeyutong made their first contribution in #3214
- @uwemaurer made their first contribution in #3352
Full Changelog: apache-parquet-1.16.0-rc0...apache-parquet-1.17.0-rc0