What's Changed
- GH-2943: Remove hadoop-2 support by @steveloughran in #3061
- MINOR: Use `exec-maven-plugin.version` property by @Fokko in #3047
- MINOR: Add shading for JDK22 specific classes by @Fokko in #3081
- MINOR: Revert `buildnumber-maven-plugin` to 3.2.0 by @Fokko in #3082
- GH-3086: Allow for empty beans by @Fokko in #3087
- GH-3089: Add missing license header to pom.xml by @raulcd in #3090
- GH-3078: Use Hadoop FileSystem.openFile() to open files by @steveloughran in #3079
- MINOR: Bump version to 1.16.0-SNAPSHOT by @wgtmac in #3097
- Bump org.codehaus.mojo:exec-maven-plugin from 3.3.0 to 3.5.0 by @dependabot[bot] in #3092
- Bump commons-logging:commons-logging from 1.3.3 to 1.3.4 by @dependabot[bot] in #3094
- Bump net.openhft:zero-allocation-hashing from 0.26ea0 to 0.27ea0 by @dependabot[bot] in #3093
- Bump com.google.api.grpc:proto-google-common-protos from 2.41.0 to 2.50.0 by @dependabot[bot] in #3109
- Bump jackson.version from 2.18.1 to 2.18.2 by @dependabot[bot] in #3108
- MINOR: Remove `scala` properties from `pom.xml` by @Fokko in #3104
- GH-3114: Fix LogicalType conversions for nested records on Avro <= 1.8 by @clairemcginty in #3111
- Bump com.google.truth.extensions:truth-proto-extension from 1.4.3 to 1.4.4 by @dependabot[bot] in #3107
- Bump org.cyclonedx:cyclonedx-maven-plugin from 2.8.0 to 2.9.1 by @dependabot[bot] in #3120
- Bump org.apache.commons:commons-text from 1.12.0 to 1.13.0 by @dependabot[bot] in #3119
- MINOR: Remove Joda as a direct dependency by @Fokko in #3132
- Bump org.easymock:easymock from 5.4.0 to 5.5.0 by @dependabot[bot] in #3131
- GH-3099 add libthrift to parquet-cli shaded jar by @Arnaud-Nauwynck in #3100
- GH-3127: Enabled `parquet.hadoop.vectored.io.enabled` by default by @dongjoon-hyun in #3128
- GH-3123: Omit level histogram for some max levels by @wgtmac in #3124
- GH-3133: Fix SizeStatistics to handle omitted histogram by @wgtmac in #3134
- GH-3125: Add CLI for SizeStatistics by @wgtmac in #3126
- GH-3115-Fix int96 read issue in complex type by @pratyush-sharma-2025 in #3118
- MINOR: Remove `parquet-tools` from `NOTICE` by @Fokko in #3140
- Bump com.google.guava:guava from 33.2.1-jre to 33.4.0-jre by @dependabot[bot] in #3137
- Bump protobuf.version from 3.25.5 to 3.25.6 by @dependabot[bot] in #3138
- MINOR: Improve exception message in InternalFileDecryptor by @zhongyujiang in #3143
- Bump com.google.api.grpc:proto-google-common-protos from 2.50.0 to 2.51.0 by @dependabot[bot] in #3151
- MINOR: Remove release script by @Fokko in #3144
- Deprecate Apache Pig integration by @Fokko in #3153
- Bump com.h2database:h2 from 2.3.230 to 2.3.232 by @dependabot[bot] in #3158
- Bump commons-logging:commons-logging from 1.3.4 to 1.3.5 by @dependabot[bot] in #3159
- Add logical type annotation for `UnknownType` by @Fokko in #3154
- GH-3156: Enable vectored IO by default. by @ahmarsuhail in #3155
- Bump it.unimi.dsi:fastutil from 8.5.13 to 8.5.15 by @dependabot[bot] in #3162
- GH-3122: Correct V2 page header compression fields for zero-size data pages by @ConeyLiu in #3148
- GH-3163: Reduce memory and time overhead of ParquetRewriterTests by @rahulketch in #3164
- MINOR: Reader fails fast when footer size is larger than INT_MAX by @ConeyLiu in #3136
- GH-3168: Restrict trusted packages in the parquet-avro module by @wgtmac in #3169
- GH-3172: Do not drop blocks with some null values if `DictionaryFilter` is applied for `UserDefinedPredicate` which keeps null values by @ebartkus in #3173
- Bump jackson.version from 2.18.2 to 2.18.3 by @dependabot[bot] in #3170
- MINOR: update latest version to 1.15.1 by @wgtmac in #3179
- Bump com.google.api.grpc:proto-google-common-protos from 2.51.0 to 2.54.1 by @dependabot[bot] in #3177
- Bump Parquet Format to 2.11 by @Fokko in #3181
- [MINOR] Enable jitpack.io repo only when brotli is required by @pan3793 in #3180
- Minor: Use logicaltypes constants in ParquetMetadataConverter by @aihuaxu in #3186
- GH-3188: Set the global configured column stats enable flag to default by @huaxiangsun in #3189
- GH-3070: Add Variant logical type annotation to parquet-java by @aihuaxu in #3072
- GH-3116: Implement the decoding of Variant values by @gene-db in #3197
- GH-3198: Allow specifying trusted classes by class name by @gszadovszky in #3199
- PARQUET-2417: Add `geometry` and `geography` logical type annotations by @zhangfengcdt in #3200
- MINOR: Fix display of logicalTypeAnnotation for parquet cli by @pan3793 in #3184
- GH-3203: HadoopPositionOutputStream.close() to call FSDataOutputStream.flush() by @steveloughran in #3204
- GH-3201: Implement a Variant builder to create Variant values by @gene-db in #3202
- GH-3207: ParquetFileReader supports detachFileInputStream by @pan3793 in #3208
- GH-3205: Make HadoopPositionOutputStream.close() safe to call even if closed by @dominicso in #3206
- Bump com.github.luben:zstd-jni from 1.5.6-6 to 1.5.7-3 by @dependabot[bot] in #3209
- PARQUET-2417: Add statistics support to geometry logical type by @zhangfengcdt in #2971
- Bump com.github.siom79.japicmp:japicmp-maven-plugin from 0.21.0 to 0.23.1 by @dependabot[bot] in #3218
- GH-3211: Implement Variant parquet reader by @cashmand in #3212
- MINOR: Update BoundingBox for Empty and Antimeridian Handling by @zhangfengcdt in #3222
- Bump com.fasterxml.jackson.core:jackson-databind from 2.18.3 to 2.19.0 by @dependabot[bot] in #3225
- GH-3233: Parquet CLI supports version command by @pan3793 in #3234
- GH-3235: Row count limit for each row group by @pan3793 in #3236
- Bump org.apache.commons:commons-text from 1.13.0 to 1.13.1 by @dependabot[bot] in #3240
- MINOR: replace avro test configuration by @zheguang in #3244
- GH-3223: Implement Variant parquet writer by @cashmand in #3221
- GH-3249: Fix incorrect Bloom filter data when reading from ByteArrayInputStream by using readFully() by @wangyum in #3250
- GH-3239: Improve ByteBufferReadable detection in HadoopStream by @wangyum in #3259
- Bump com.google.api.grpc:proto-google-common-protos from 2.54.1 to 2.59.2 by @dependabot[bot] in #3256
- GH-3263: Add DictionaryPage.decode to allow dictionary reuse in the ColumnReaderBase ctor by @pyckle in #3264
- GH-3253: Apply ServicesResourceTransformer to parquet-jackson by @pan3793 in #3260
- minor: Cleanup some small bits in a test by @Fokko in #3265
- GH-3141: Add constructor to `ParquetFileReader` to allow passing in parquet footer and expose setRequestedSchema that accepts `List<ColumnDescriptor>` by @pan3793 in #3262
- Bump actions/setup-java from 4 to 5 by @dependabot[bot] in #3276
- Bump jackson.version from 2.19.0 to 2.19.2 by @dependabot[bot] in #3266
- MINOR: Bump thrift to 0.22.0 by @vinooganesh in #3229
- MINOR: Bump parquet-format to 2.12.0 by @wgtmac in #3285
New Contributors
- @raulcd made their first contribution in #3090
- @Arnaud-Nauwynck made their first contribution in #3100
- @pratyush-sharma-2025 made their first contribution in #3118
- @ahmarsuhail made their first contribution in #3155
- @rahulketch made their first contribution in #3164
- @ebartkus made their first contribution in #3173
- @aihuaxu made their first contribution in #3186
- @huaxiangsun made their first contribution in #3189
- @gene-db made their first contribution in #3197
- @zhangfengcdt made their first contribution in #3200
- @dominicso made their first contribution in #3206
- @cashmand made their first contribution in #3212
- @zheguang made their first contribution in #3244
- @pyckle made their first contribution in #3264
Full Changelog: apache-parquet-1.15.2...apache-parquet-1.16.0-rc0