What's Changed
- log warning when filter conversion/bind expression fail by @huaxingao in #5254
- Build: Upgrade test dependencies to latest version by @XN137 in #5210
- Update License Header by @nastra in #5265
- Flink: port #4943 to flink 1.13 and 1.14 by @chenjunjiedada in #5263
- Build: unify github action versions by @XN137 in #5211
- Build: Exclude unnecessary git properties from iceberg-build.properties by @singhpk234 in #5277
- only log error message(not the stack trace) if filter conversion/bind expression fail by @huaxingao in #5274
- [SPARK] Update Spark 2.4 JMH Benchmark Instructions to Updated Module Name iceberg-spark-2.4 by @kbendick in #5189
- Dell: Fix bugs during documenting by @wang-x-xia in #5059
- typos depracated to deprecated by @20100507 in #5285
- Build: Fix Scala 2.13 builds in stage-binaries.sh by @rdblue in #5270
- Build: Add iceberg-build.properties to RAT excludes by @rdblue in #5262
- AWS: Dynamo Catalog: Pass CommitFailedException up the stack without wrapping by @waifairer in #5299
- Build: Use Google Java Format for spotless by @nastra in #5266
- AWS: handle s3 and glue exceptions more gracefully as user errors by @xingfanx in #5304
- AWS: Reduce TestS3FileIO prefix test scale factors. Replace with S3FileIO integration tests. by @amogh-jahagirdar in #5289
- Prevent usage of @Test(expected = ...) and change existing tests by @nastra in #5221
- API: CloseableIterable.concat evaluates first item twice by @nastra in #5306
- API: Introduce DefaultMetricsContext and Timer interface by @nastra in #5286
- Docs: Fix Flink Connector docs with custom catalog by @nastra in #5045
- Spark: Correct SparkCatalog javadoc for supplying custom catalog by @hrishisd in #5288
- StreamingDelete constructor can be called by subclasses by @gustavoatt in #5271
- #5308 - Avoid 2 reads on manifest by @palaniappa in #5309
- Build: Upgrade slf4j to 1.7.36 by @nastra in #5320
- Java: Catch NPE if the type isn't set by @Fokko in #5291
- Build: Upgrade Guava to 31.1-jre by @nastra in #5322
- Core: Add MetadataLog metadata table by @singhpk234 in #5063
- Core: Implement BaseMetastoreCatalog.registerTable() by @Mehul2500 in #5037
- Spec: Add sequence-number and parent-snapshot-id by @Fokko in #5196
- Build: Let revapi compare API compatibility against apache-iceberg-0.14.0 by @nastra in #5336
- Build: Use 'apache-iceberg-' tag prefix to figure out SNAPSHOT version by @nastra in #5341
- Ignore case for partition transform by @southernriver in #5335
- AWS: Fix #2796 - avoid S3 error of "Resetting to invalid mark" by re-creating input stream on retries by @jfz in #5282
- Bump zstandard from 0.17.0 to 0.18.0 in /python by @dependabot in #5342
- Core: Print DateTime strings always with +00:00 Zone offset by @nastra in #5337
- API: Deprecate Counter#count() / Add Counter#value() by @nastra in #5328
- AWS: Add LakeFormation Integration tests by @xiaoxuandev in #4423
- More Python Expressions by @CircArgs in #5258
- Core: Support _deleted metadata column in vectorized read by @flyrain in #4888
- Make glue endpoint configurable #5095 by @naushadh in #5330
- typo mulitple -> multiple by @20100507 in #5354
- Add table spec changes for statistics information in table snapshot by @findepi in #4945
- Core: Support building custom tasks in ManifestGroup by @aokolnychyi in #5301
- Substitute the hard code string for a constants by @zhaomin1423 in #5347
- Improve bloom filter test by @huaxingao in #5329
- Spark: Adopt the new Scan Task APIs in Spark Readers by @flyrain in #5248
- unit test should always verify mock invocation by @abmo-x in #5317
- AWS - Fix ErrorProne warning of malformed Javadoc by @kbendick in #5359
- ICEBERG-4346: Better handling of Orphan files by @karuppayya in #4652
- Spark 3.2: Support different task types in readers by @flyrain in #5363
- Build: Quicker project evaluation (memoize project version) by @snazy in #5051
- Nessie: Do not delete default branch in tests by @snazy in #5193
- Core: Add base implementations for changelog tasks by @aokolnychyi in #5300
- Build: Enforce spotless & spotlessApply by @nastra in #5312
- Update spec.md to fix broken link to parquet-format/blob/master/LogicalTypes by @skadyan in #5352
- Flink: bridge the gap btw FlinkSource and IcebergSource (FLIP-27) and… by @stevenzwu in #5318
- Flink: port PR #5318 to 1.14 by @stevenzwu in #5344
- Parquet: Add option to set page row count limit by @bryanck in #5345
- AWS: Fix S3FileIO#prefixList integration test by @amogh-jahagirdar in #5383
- Spark 3.3: Add prefix mismatch mode for deleting orphan files by @karuppayya in #5385
- Core: Remove TestEnvironmentUtil#testEnvironmentSubstitution() as it is bui… by @stevenzwu in #5353
- API: Track name/unit in Counters/Timers by @nastra in #5386
- Flink: avoid converting Iceberg MetricContext to Flink metrics in FLI… by @stevenzwu in #5393
- Bump fastavro from 1.5.3 to 1.5.4 in /python by @dependabot in #5396
- Github: Add issue form by @Fokko in #4867
- Infra - Add a GH Action to Mark and Close Stale Issues by @kbendick in #4949
- Flink: Support write options in the in-line insert SQL comments by @hililiwei in #5050
- AWS: Call abortUpload only once when any of the completable future fails by @singhpk234 in #5366
- Spark - Add Spark FunctionCatalog by @kbendick in #5377
- API: Assign the right field ids when merging schema, #5394 by @karuppayya in #5395
- [Spark] - Backport FunctionCatalog to Spark 3.2 by @kbendick in #5411
- Nessie: Bump to 0.40.3 by @snazy in #5406
- S3OutputStream - failure to close should persist on subsequent close calls by @abmo-x in #5311
- [Core | Docs]: [FOLLOWUP] Add metadata_log_entries metadata table by @singhpk234 in #5367
- AWS: S3FileIo Integration test include UUID prefix in Prefix integration tests by @amogh-jahagirdar in #5413
- Core: Implement default value parsing and unparsing by @rzhang10 in #4871
- Infra - Upgrade Stale GH Action to Latest 5.1.1 by @kbendick in #5420
- API/Core: Initial Table Scan Reporting support by @nastra in #5268
- Core: Simplify scan planning & reporting tests by @nastra in #5428
- Hive: Fix concurrent transactions overwriting commits by adding hive lock heartbeats. by @SinghAsDev in #5036
- Hive: Hadoop Path fails on s3 endpoint by @Fokko in #5405
- API: changes to honour schema field name's case by @karuppayya in #5440
- Spark: Spark changes to honour schema field name's case by @karuppayya in #5441
- Core: Implement IncrementalChangelogScan without deletes by @aokolnychyi in #5382
- Core, API: Add getting refs and snapshot by ref to the Table API by @amogh-jahagirdar in #4428
- Flink: missed IcebergSourceReader group in PR #5393 for FLIP-27 source reader metrics by @stevenzwu in #5401
- API: Avoid unnecessary wrapping of CloseableIterable.iterator() by @nastra in #5446
- Doc: update Flink doc for using the new experimental FLIP-27 source by @stevenzwu in #5423
- Spark 3.3: Delete file counts while deleting reachable files by @aokolnychyi in #5451
- Bump coverage from 6.4.2 to 6.4.3 in /python by @dependabot in #5454
- Move base.py for table to init by @samredai in #5458
- CI: Enable dependabot for Github Actions by @Fokko in #5429
- API: Improve/align error messaging in CloseableIterable/CloseableIterator by @nastra in #5433
- Bump actions/setup-python from 3 to 4 by @dependabot in #5460
- CI: Enable dependabot for gradle by @nastra in #5464
- AWS: Cleanup warning about Lambda should be method reference by @amogh-jahagirdar in #5476
- Bump nebula.dependency-recommender from 9.0.2 to 11.0.0 by @dependabot in #5474
- Bump pyarrow from 8.0.0 to 9.0.0 in /python by @dependabot in #5455
- Core: Partition filter pushdown for entries table by @szehon-ho in #5443
- AWS: Fix/Suppress ErrorProne warnings by @nastra in #5368
- Bump hiveVersion from 3.1.2 to 3.1.3 by @dependabot in #5470
- Doc: Update web page of Flink unit test (#5480) by @lvyanquan in #5484
- Spark 3.3: Use typed beans in BaseSparkAction by @aokolnychyi in #5469
- Core: Prevent potential NPEs when retrieving JSON fields by @nastra in #5438
- Spark 3.2: Count delete files in DeleteReachableFiles by @aokolnychyi in #5491
- Spark 3.2: Use typed beans in BaseSparkAction by @aokolnychyi in #5494
- AWS: Use executor service by default when performing batch deletion of files by @amogh-jahagirdar in #5379
- Core, API: Performing operations on a snapshot branch ref by @namrathamyske in #4926
- Spark: Support truncate in FunctionCatalog by @kbendick in #5431
- Spark 3.1: Port #3721 to Spark 3.1 by @hililiwei in #5497
- Spark 3.1: Port #3287 #4381 #3535 #4419 to Spark 3.1 by @hililiwei in #5498
- Spark 3.1: Port #4198 to Spark 3.1 by @hililiwei in #5499
- Spark 3.1: Port #3373 to Spark 3.1 by @hililiwei in #5500
- Spark 3.1: Port #3491 to Spark 3.1 by @hililiwei in #5502
- Spark 3.1: Port #3456 to Spark 3.1 by @hililiwei in #5501
- Spark 3.2: Support truncate in FunctionCatalog by @kbendick in #5514
- Build: Bump spotless-plugin-gradle from 6.8.0 to 6.9.1 by @dependabot in #5521
- Build: Bump pydantic from 1.9.1 to 1.9.2 in /python by @dependabot in #5522
- SetSnapshotOperation should commit empty operations too by @szlta in #5536
- AWS: Support preload S3 client mode for S3FileIO by @xiaoxuandev in #5508
- Build: Resolve unchecked Map type cast in TestAvroNameMapping by @JonasJ-ap in #5541
- Fix linter and test failures by @samredai in #5542
- Flink 1.13&1.14: Port #5050 to Flink 1.13&1.14 by @hililiwei in #5531
- API: Deprecate generic Counter and replace with simpler Counter API by @nastra in #5505
- API/Core: Scan reporting result wrappers and parsers by @nastra in #5427
- Build: Bump gradle-git-version from 0.12.3 to 0.15.0 by @nastra in #5532
- Core: Put property names at the end in JsonUtil error messages by @nastra in #5434
- Replace deprecated Counter with new Counter API by @nastra in #5506
- Spark 3.1: Port #3505 to Spark 3.1 by @hililiwei in #5503
- Flink - Suppress Nanosecond Warning for TimestampTz ORC writer by @kbendick in #5552
- Core: Add some tests for JsonUtil & reduce duplicated code by @nastra in #5526
- Add s3.acceleration-enabled flag to AwsProperties by @price-qian in #5555
- Spark 3.3: Reduce serialization in DeleteOrphanFilesSparkAction by @aokolnychyi in #5495
- API: Remove counter name by @nastra in #5559
- Spark 3.3 - Support bucket in FunctionCatalog by @kbendick in #5513
- Spark 3.2: Support bucket in FunctionCatalog by @kbendick in #5571
- Core: Don't clear snapshot log when intermediate snapshots are detected by @nastra in #5568
- Flink: fix the bug where metrics are registered in split reader. Also updated reader metric group to be more consistent with Flink metrics style. by @stevenzwu in #5554
- Spark 3.3: Align formatting in bucket and truncate functions by @aokolnychyi in #5573
- Spark 3.2: Reduce serialization in DeleteOrphanFilesSparkAction by @aokolnychyi in #5572
- ORC: Upgrade to 1.7.6 by @williamhyun in #5580
- Spark 3.2: Delete deprecated action classes by @aokolnychyi in #5575
- Build: Bump gradle-processors from 3.3.0 to 3.7.0 by @dependabot in #5582
- Spark 3.2: Align formatting in bucket and truncate functions by @kbendick in #5574
- Core: Make a shorthand for the rest catalog by @Fokko in #5570
- Flink: add monitor metrics for Flink sink by @stevenzwu in #5410
- API: add Histogram metric type by @stevenzwu in #5348
- Flink: port PR #5410 to 1.14 for sink monitoring metrics by @stevenzwu in #5589
- Build: Bump jackson-annotations from 2.6.5 to 2.13.3 by @dependabot in #5596
- Build: Bump coverage from 6.4.3 to 6.4.4 in /python by @dependabot in #5599
- Add Fokko as a collaborator by @Fokko in #5600
- Build: enforce LambdaMethodReference check at compile-time by @XN137 in #5529
- API: Extend FileIO in optional interfaces by @aokolnychyi in #5576
- Flink - Fix Malformed Inline Tag in ContinuousSplitPlannerImpl JavaDoc by @kbendick in #5551
- Core: Add expression JSON parser by @rdblue in #5602
- Deps: Bump AWS SDK by @Fokko in #5612
- Build: Bump tezVersion from 0.10.1 to 0.10.2 by @dependabot in #5520
- Docs: Flink Streaming upsert write by @hililiwei in #5380
- Fix message pattern in checkArgument invocation by @findepi in #5621
- Core: Add snapshot references metadata table by @rajarshisarkar in #4807
- Add table metadata changes for statistics information in table metadata by @findepi in #5450
- Docs: Added missing doc for REPLACE PARTITION FIELD by @dotjdk in #5624
- Core: Transform parquet bloom filter props when updating schema. by @zhongyujiang in #5426
- AWS: Deprecate AwsClientFactories.s3Configuration() by @price-qian in #5592
- Add SparkV2Filters by @huaxingao in #5302
- API: Deprecate old incremental append scans by @aokolnychyi in #5577
- Remove deprecations for Rollback and Overwrite Files by @danielcweeks in #5639
- Core: Use Bulk Delete when dropping table data and metadata by @amogh-jahagirdar in #5459
- Deprecations for 1.0 release: MR properties by @danielcweeks in #5657
- Deprecations for 1.0 release: Aliyun OSS by @danielcweeks in #5654
- Build: Bump spotless-plugin-gradle from 6.9.1 to 6.10.0 by @dependabot in #5650
- Docs: Switch post- and pre- around by @Fokko in #5633
- Deprecations for 1.0 release: remove dynamo lock manager and props by @danielcweeks in #5655
- AWS: Add s3.dualstack-enabled flag to AwsProperties by @JonasJ-ap in #5644
- Add API changes for statistics information in table metadata by @findepi in #5021
- AWS: fix the wrong flag used for s3UseArnRegionEnabled by @JonasJ-ap in #5680
- Spark: Add Changelog reader for copy-on-write by @flyrain in #5578
- Spark 3.2: Add row-based changelog reader by @flyrain in #5682
- [Python] FsspecFileIO, a FileIO that wraps any fsspec compliant filesystem by @samredai in #5332
- Core: Fix exception handling in BaseTaskWriter by @rdblue in #5683
- Support delete corrupted Iceberg table by @yabola in #5510
- [Core | Spark | Integrations] : Fix kryo serialization failure for FileIO by @singhpk234 in #5437
- Parquet: close zstd input stream early to avoid memory pressure by @bryanck in #5681
- Spark: Fix stats in rewrite metadata action by @rdblue in #5691
- Docs: Update docs to reflect AWS SDK version presently being used by @singhpk234 in #5661
- Doc: Update doc to display the results of the table partitions query by @lvyanquan in #5662
- Core: Add CommitStateUnknownException handling to REST by @rdblue in #5694
- API: Remove source type from Transform by @rdblue in #5601
- Spark: Add custom metric for number of deletes applied by a SparkScan by @wypoon in #4588
- Flink: fix missing generic types for some IcebergSource$Builder methods by @stevenzwu in #5697
- API/Core: Include Expression filter in ScanReport by @nastra in #5705
- Bump avro from 1.9.2/1.10.2 to 1.11.1 by @nastra in #5483
- Core: Avoid useless metadata retries. by @rdblue in #5696
- Build: Bump pytest from 7.1.2 to 7.1.3 in /python by @dependabot in #5703
- Build: Bump jackson-annotations from 2.13.3 to 2.13.4 by @dependabot in #5702
- Build: Bump jmh-gradle-plugin from 0.6.6 to 0.6.7 by @dependabot in #5700
- Update ORC to 1.8.0 by @williamhyun in #5699
- Build: Enforce logging conventions with errorprone by @XN137 in #5528
- Build: Upgrade to Gradle 7.5.1 by @XN137 in #5278
- Flink: Fixed an issue where Flink batch entry was not accurate by @xuzhiwen1255 in #5642
- Dell: Add document. by @wang-x-xia in #4993
- Nessie: Prevent accidental deletion of files which are still referenced by other branches/tags by @ajantha-bhat in #5718
- Flink: Fixed an issue where Flink1.14 batch entry was not accurate by @xuzhiwen1255 in #5716
- API: Add rowsCount to ScanTask by @aokolnychyi in #5720
- Docs: Add snapshot references metadata table by @rajarshisarkar in #5725
- Flink: Fixed an issue where Flink1.13 batch entry was not accurate by @xuzhiwen1255 in #5731
- Build: Bump fastavro from 1.6.0 to 1.6.1 in /python by @dependabot in #5745
- Build: Bump pydantic from 1.10.1 to 1.10.2 in /python by @dependabot in #5744
- API: Use hashCode instead of hash by @Fokko in #5751
- AWS: Preload S3 client in GlueCatalog For LakeFormation enabled tables by @xiaoxuandev in #5756
- Build - Remove unused global flink dependency from versions.props by @kbendick in #5758
- Build - Move global Spark 2.4 dependency in version.props to Spark 2.4 subproject by @kbendick in #5759
- CI: Fix names and jobs by @Fokko in #5749
- JdbcCatalog don't override namespace location if set by @danielcweeks in #5737
- Spark: Fix runtime jars packaging scala library files by @ajantha-bhat in #5754
- Build: relocate httpclient5 dependency for runtime jars by @ajantha-bhat in #5761
- AWS: Refactor util methods for applying AWS clients configurations by @JonasJ-ap in #5684
- Bump actions/stale from 5.1.1 to 5.2.0 by @dependabot in #5785
- Bump spotless-plugin-gradle from 6.10.0 to 6.11.0 by @dependabot in #5786
- PyArrow support for S3/S3A with properties by @joshuarobinson in #5747
- REST: implement handling of OAuth error responses by @bryanck in #5698
- AWS: Allow users to set the assume role session name by @JonasJ-ap in #5765
- Revert "REST: implement handling of OAuth error responses (#5698)" by @danielcweeks in #5810
- Flink 1.14&1.15 backport: Set custom Hadoop configuration by @lvyanquan in #5775
- API/Core: Make ScanReport and its related classes Immutable by @nastra in #5780
- API: Remove unneeded class variable by @Fokko in #5805
- Core: Serialize statistics files in TableMetadata by @findepi in #5799
- Core: Reduce duplicated code in JSON Parsers by @nastra in #5802
- API,Core: Add scan planning metrics for skipped data/delete files by @nastra in #5788
- Github: Update issue template with latest release by @Fokko in #5818
- Core: Use JsonUtil.generate in ErrorResponseParser by @nastra in #5816
- Build: Fix CI paths by @Fokko in #5821
- Build: Add the path to the Action yaml by @Fokko in #5828
- Build: Apply spotless on integration modules as well by @nastra in #5827
- Don't check row filter when deciding whether to copy data file with stats by @manuzhang in #5815
- Add a BoundBooleanExpressionVisitor for visiting bound expressions by @samredai in #5303
- REST: implement handling of OAuth error responses followup by @bryanck in #5820
- Add REST Servlet/Server Implementations by @danielcweeks in #5781
- AWS: update AWS Integration Test to fix false positives by @JonasJ-ap in #5784
- API/Core: Remove deprecated methods from Snapshot API by @nastra in #5734
- Build: Bump Rat to 0.15 by @Fokko in #5839
- AWS: Add socket connection timeout for Apache Http Builder by @JonasJ-ap in #5787
- Core: Add strict-mode property to JDBC Catalog by @nastra in #5830
- core: Provide mechanism to cache manifest file content by @rizaon in #4518
- Support setting table statistics by @findepi in #5794
- Core: Ignore TestManifestCaching#testWeakFileIOReferenceCleanUp until it's fixed by @nastra in #5865
- Spark: Fix MERGE INTO Query failure on tables with non-nullable columns by @singhpk234 in #5679
- Docs: Make it clear metadata tables support time travel in Spark by @liuml07 in #4709
- Doc: Update output of expire_snapshots procedure by @lvyanquan in #5866
- Ensure the default value of hive.in.test to avoid overwriting by @viirya in #5844
- API: Extended some deprecation comments in API folder by @gaborkaszab in #5726
- Core: Deprecate functions in TableMetadata and DataWriter by @gaborkaszab in #5772
- Core: Deprecate functions in DeleteWriters by @gaborkaszab in #5771
- AWS: Add table and namespace S3 tags by @rajarshisarkar in #4402
- Core: Avoid extra getFileStatus call in HadoopInputFile by @singhpk234 in #5864
- Orc: Closes #5777 - Obtain ORC stripe offsets from writer by @pavibhai in #5778
- Flink: add defensive check in IcebergFilesCommitter for restoring state by @stevenzwu in #5873
- Spark 3.x: Backport snapshot references metadata table test by @rajarshisarkar in #5806
- Build: Fix & Run spark integration tests on CI by @nastra in #5819
- API,Core: Add scan planning metrics for scanned/skipped delete manifests by @nastra in #5792
- Doc: Update the default value of table property read.parquet.vectorization.enabled by @Kontinuation in #5776
- Bump Nessie to 0.43.0 by @snazy in #5807
- Doc: Update default values of Lock catalog properties to avoid wrong way of filling. by @lvyanquan in #5708
- Build: Bump hadoop-client from 3.1.0 to 3.3.4 by @dependabot in #5519
- Spark: Bump Spark version for vulnerability by @deadwind4 in #5292
- Expose table statistics in Table API by @findepi in #4741
- [Python][Docs] Very small formatting fix by @samredai in #5868
- Build: workflows cache gradle wrapper by @XN137 in #4165
- API,Core: Add scan planning metrics for indexed/eq/pos delete files by @nastra in #5809
- Build: Bump gradle-baseline-java from 4.0.0 to 4.42.0 by @nastra in #5530
- Docs: Add table and namespace S3 tags doc by @rajarshisarkar in #5894
- Retain table statistics during orphan files removal by @findepi in #5795
- [Docs] Update drop table behavior in spark-ddl docs by @sumeetgajjar in #5645
- Spark 3.3: Fix failing jmh benchmarks under org.apache.iceberg.spark.data.parquet package by @sumeetgajjar in #5635
- Core: Only validate the current partition specs by @Fokko in #5707
- Core: Add RESTScanReporter to send scan report to REST endpoint by @nastra in #5407
- Build: Bump jmh-gradle-plugin from 0.6.7 to 0.6.8 by @dependabot in #5850
- Build: Bump actions/stale from 5.2.0 to 6.0.0 by @dependabot in #5851
- Build: Bump jinja2 from 3.0.3 to 3.1.2 in /python by @dependabot in #5849
- Build: Bump coverage from 6.4.4 to 6.5.0 in /python by @dependabot in #5904
- Build: Bump rich from 12.5.1 to 12.6.0 in /python by @dependabot in #5905
- Build: Bump pytest-checkdocs from 2.7.1 to 2.8.1 in /python by @dependabot in #5903
- Core: Rename misleading local variable in planFiles() by @gaborkaszab in #5889
- INFRA: Avoid running engine tests on ISSUE_TEMPLATE update by @singhpk234 in #5859
- Core, API: Support scanning from refs by @amogh-jahagirdar in #5364
- Spark: Set the version explicitly by @Fokko in #5907
- API: Make COUNT default unit when creating a Counter by @nastra in #5912
- Core: Reuse PositionDelete by @nastra in #5896
- Spark 3.3: Fix nullability in merge-on-read projections by @aokolnychyi in #5880
- Spark 3.2: Fix nullability in merge-on-read projections by @aokolnychyi in #5917
- Replace & Ban ExpectedException usage by @nastra in #5921
- API: Handle negative/zero during num-digits calculation by @nastra in #5928
- Core: Provide better error message on invalid enums by @nastra in #5910
- Reduce 'Scanning table' log verbosity for long IN list by @findepi in #5908
- Core: Deprecate write.manifest-lists.enabled flag by @nastra in #5773
- Spark 3.3: Add SparkChangelogTable by @aokolnychyi in #5740
- Core: Add dataSequenceNumber to ManifestEntry by @aokolnychyi in #5913
- AWS: Add socket connection timeout for UrlConnectionHttpClient by @JonasJ-ap in #5900
- AWS: Add additional configurations for ApacheHttpClientBuilder by @JonasJ-ap in #5899
- Docs: Add doc for HTTP client configurations by @JonasJ-ap in #5902
- Build: Bump actions/stale from 6.0.0 to 6.0.1 by @dependabot in #5940
- Build: Bump pytest-checkdocs from 2.8.1 to 2.9.0 in /python by @dependabot in #5941
- Core: Deflake TestManifestCaching.testWeakFileIOReferenceCleanUp by @rizaon in #5862
- AWS: Fix NotSerializableException when using AssumeRoleAwsClientFactory in Spark by @JonasJ-ap in #5939
- API: Provide better error message for invalid FileFormat enum by @nastra in #5918
- Api: Optimize the code by @linfey90 in #5733
- Docs: the table name should be the same as sql create table name by @mggger in #5962
- Core: Make testEnvironmentSubstitution effective when USER is not set by @dimas-b in #5770
- API: Fix estimated row count in ContentScanTask by @wypoon in #5755
- Core: Clear queue and future task when close ParallelIterable by @Heltman in #5887
- Core: Expire Snapshots reachability analysis by @amogh-jahagirdar in #5669
- Spark 3.3: Split SparkScan and SparkBatch by @aokolnychyi in #5934
- Core/Spark: Fix kryo deserialization of SerializableTable by @Kontinuation in #5975
- Flink: revise unit test of FlinkUpsert so the table is partitioned by date by @lvyanquan in #5486
- Spark: Improve performance of expire snapshot by not double-scanning retained Snapshots by @szehon-ho in #3457
- Docs: Fix incorrect glue catalog class name for Hive by @singhpk234 in #5973
- Core: Fix confusing log from RemoveSnapshots by @ajantha-bhat in #5478
- API: Add BatchScan to Table by @aokolnychyi in #5922
- Docs: Typo in loading table from DataFrameReader by @szehon-ho in #5978
- Api: Fix transforms.day() returns a format document and javadoc by @xuzhiwen1255 in #5980
- AwsProperties prints format specifier in IllegalArgumentException message by @szlta in #5995
- Spark: Fix DATE_ADD expression in IcebergSourceFlatParquetDataWriteBenchmark by @dramaticlly in #5991
- Support performing merge appends and delete files on branches by @amogh-jahagirdar in #5618
- Bump Nessie from 0.43.0 to 0.44.0 by @snazy in #6008
- Doc: Fix typos related to date transforms by @fb913bf0de288ba84fe98f7a23d35edfdb22381 in #5992
- Spark: Remove backup table after a successful migrate action. by @sririshindra in #5622
- Core: Fix NPE for parent snapshot does not exist by @hililiwei in #6005
- Flink: Fix NoClassDefFound with Flink runtime jar / Add integration test by @nastra in #6001
- Spark 3.2: Use ScanTaskGroup methods when computing stats by @aokolnychyi in #6011
- Spark 3.2: Add SparkChangelogTable by @aokolnychyi in #6013
- Spark 3.2: Remove redundant imports in SparkScan by @aokolnychyi in #6016
- Core: Fix TestSnapshotUtil time random disorder by @hililiwei in #6015
- Spark 3.2: Split SparkScan and SparkBatch by @aokolnychyi in #6014
- Core: Parallelize the determining of reachable manifests during file cleanup by @amogh-jahagirdar in #5981
- Orc: Support row group bloom filters by @deadwind4 in #5313
- Core,Spark: Refactor to move "copy-on-write" and "merge-on-read" literals to constants by @gaborkaszab in #6006
- [python_legacy] BOTO_STS_CLIENT lazy initialization by @puchengy in #5930
- Core: Don't fail scan planning if REST metric reporting fails by @nastra in #6023
- Nessie: no longer push whole metadata JSON to Nessie by @snazy in #5999
- Core: Deprecate HTTPClientFactory / Allow configuring ObjectMapper for HTTPClient by @nastra in #5998
- Closes #5988 - Allow configuration of Hive MetastoreClient using Catalog properties by @pavibhai in #5989
- docs: Add an example of CTAS with PARTITIONED BY (rebased, fix #3854) by @samredai in #6020
- Hive: Set the Table owner on table creation by @gaborkaszab in #5763
- Replace Assert.fail usage with AssertJ fluent testing by @nastra in #6029
- Replace and ban hamcrest usage by @nastra in #6030
- API: Update expression sanitization for relative dates and times by @rdblue in #5944
- Core: Rename TableTestBase.Assertions to not conflict with AssertJ Assertions by @nastra in #6022
- Add section on semantic versioning and deprecations by @danielcweeks in #6032
- Core: Increase inferred column metrics limit to 100 by @rdblue in #5916
- Build: Bump mkdocs from 1.3.1 to 1.4.1 in /python by @dependabot in #6033
- API,Core: Move ScanReport to core module / extract TimerResult/CounterResult/ScanMetricsResult into own classes by @nastra in #6037
- Spark 3.3: Ensure rowStartPosInBatch in ColumnarBatchReader is set correctly by @wypoon in #6026
- Spark 3.2: Ensure rowStartPosInBatch in ColumnarBatchReader is set correctly by @wypoon in #6041
- add Aggregate Expressions by @huaxingao in #5961
- Flink: Add Sink options to override the compression properties of the Table by @pvary in #6049
- Core: Add file seq number to ManifestEntry by @aokolnychyi in #6002
- Spark 3.1: Ensure rowStartPosInBatch in ColumnarBatchReader is set correctly by @wypoon in #6046
- Core: Replace projected Schema with schemaId/fieldIds/fieldNames in ScanReport by @nastra in #6047
- Spark 3.2: #6041 follow-up/cleanup by @wypoon in #6063
- Build: Update Spark to 3.3.1 by @wangyum in #5783
- Build: Bump pytest from 7.1.3 to 7.2.0 in /python by @dependabot in #6080
- Build: Bump pyarrow from 9.0.0 to 10.0.0 in /python by @dependabot in #6081
- Build: Bump zstandard from 0.18.0 to 0.19.0 in /python by @dependabot in #6082
- PyArrow should convert timestamps to microseconds. by @joshuarobinson in #6070
- Spark 3.3: Use separate scan during file filtering in copy-on-write operations by @aokolnychyi in #6077
- Spark: Remove redundant check for max_concurrent_deletes in spark actions by @ajantha-bhat in #6083
- Infra: Publish nightly build for Spark-3.3_2.13 by @ajantha-bhat in #6054
- Infra: Update slack invite link by @ajantha-bhat in #6052
- Docs: Fix link in the Java Custom Catalog page by @Jonathan-Rosenberg in #6068
- Infra: Add 1.0.0 in issue template dropdown by @ajantha-bhat in #6057
- Flink: Remove Flink 1.13 by @hililiwei in #6103
- Core,Spark: Fix raw generics usage of ManifestWriter by @nastra in #6059
- Spark 3.2: Use separate scan during file filtering in copy-on-write ops by @aokolnychyi in #6095
- Spark 3.3: Relocate all Netty dependencies by @aokolnychyi in #6107
- Spark 3.2: Relocate all Netty classes by @aokolnychyi in #6109
- Spark: Optimize Preconditions.checkArgument in procedures by @ajantha-bhat in #6096
- Docs: Update spotless apply command for non-default versions by @ajantha-bhat in #6101
- Core: Improve collection handling in JsonUtil by @nastra in #6051
- Build: Add gaborkaszab as a collaborator by @gaborkaszab in #6036
- Flink: Add support for Flink 1.16 by @hililiwei in #6092
- Core: Avoid reading ManifestFile when create ManifestReader by @ConeyLiu in #5632
- Struct fields should be provided to Schema constructor by @ddrinka in #6115
- Remove Fokko from the list of collaborators by @Fokko in #6119
- Use Java collections in AwsProperties to fix Kryo serialization. by @jfz in #5812
- [Docs] Update migrate behaviour with respect to drop_table in spark-procedures docs. by @sririshindra in #6025
- [Core | Spark] Strip trailing slash from custom metadatalocation by @singhpk234 in #6121
- Build: Bump mkdocs from 1.4.1 to 1.4.2 in /python by @dependabot in #6130
- API: Hash floats -0.0 and 0.0 to the same bucket by @fb913bf0de288ba84fe98f7a23d35edfdb22381 in #6110
- Spark-3.0: Remove/update spark-3.0 mention from Docs and Builds by @ajantha-bhat in #6093
- Support 2-level list and maps type in RemoveIds. by @SinghAsDev in #6064
- Fix TestAggregateBinding by @huaxingao in #6065
- SparkBatchQueryScan logs too much - #6106 by @Omega359 in #6108
- Fix typo in _ManifestEvalVisitor.visit_equal by @ddrinka in #6117
- Flink: Optimize test code of TestSourceUtil by @lvyanquan in #6143
- Spark-3.0: Remove spark/v3.0 folder by @ajantha-bhat in #6094
- Fixes read metadata table failed due to illegal character by @ConeyLiu in #4577
- Core: Pass purgeRequested flag to REST server by @nastra in #6073
- Build: Let revapi compare API compatibility against apache-iceberg-1.0.0 by @ajantha-bhat in #6053
- Core: Rename HMS_TABLE_OWNER to follow naming convention by @gaborkaszab in #6154
- Docs: Update spotless apply command by @lvyanquan in #6157
- Nessie: Use unique path for different table with same name by @ajantha-bhat in #4826
- Spark Integration to read from Snapshot ref by @namrathamyske in #5150
- Cache dropStats result for ManifestReader iterator by @manuzhang in #5836
- Core: Reduce code duplication around writing JSON collections by @nastra in #6113
- Core: Sync client/server properties in REST catalog by @rdblue in #6150
- Flink: Port #6049 to Flink 1.14 to add Sink options of compression properties by @lvyanquan in #6166
- Build: Bump jackson-annotations from 2.13.4 to 2.14.0 by @dependabot in #6129
- Build: Add -DallVersions property that exposes all component versions by @nastra in #6167
- Core,Spark: Add metadata to Scan Report by @nastra in #6058
- Fix typo in unused python iceberg parameter by @alec-heif in #6173
- AWS: Fix catalog names in LakeFormationTestBase by @aajisaka in #5767
- Spark: Backport setting the EnvironmentContext for Spark by @nastra in #6183
- Flink: Add engine name/version to EnvironmentContext by @nastra in #6184
- Core: Add Iceberg version to EnvironmentContext by @nastra in #6185
- Core: Add a util method to combine tasks by partition by @sunchao in #2276
- Spark: Fix QueryFailure when running RewriteManifestProcedure on Date partitioned table by @singhpk234 in #5860
- Build: Enable revapi on core/parquet/orc/common/data modules & fix API breaks by @nastra in #6146
- Spark 3.3: Preserve file seq numbers while rewriting manifests by @aokolnychyi in #6176
- Docs: fix link of Write options in Flink by @lvyanquan in #6191
- Core: Remove unused toTaskGroupStream from TableScanUtil by @sunchao in #6189
- Spark 3.2: Preserve file seq numbers while rewriting manifests by @aokolnychyi in #6192
- Spark 3.1: Preserve file seq numbers while rewriting manifests by @aokolnychyi in #6193
- Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog by @lvyanquan in #6111
- Core: Method for building grouping key type by @aokolnychyi in #6163
- Revert "Hive: Forward catalog-specific Hive configuration properties … by @pavibhai in #6187
- Core: Add time zone info to LocalDate in ExpressionUtil tests by @nastra in #6200
- REST: Assign metadata UUID on create transaction by @bryanck in #6201
- API, Core: Move micros and days conversions to DateTimeUtil by @aokolnychyi in #6199
- Core: Remove redundant initialization by @krvikash in #6178
- Extract Flink package version programmatically for EnvironmentContext… by @stevenzwu in #6206
- Flink: Add unit test for FlinkPackage util class by @stevenzwu in #6213
- Parquet: Fixes get null values for the nested field partition column by @ConeyLiu in #4627
- API: Make the PartitionSpec less lazy by @Fokko in #6220
- Spark: Add missing override by @Fokko in #6227
- API: Ignore case when comparing truncate by @Fokko in #6226
- Release: Fix the version template by @Fokko in #6195
- Replace ImmutableMap.Builder.build() with buildOrThrow() by @krvikash in #6212
- Allow dropping a column used by old SortOrders but not current SortOrder by @islamismailov in #6211
- Nessie: Refactor NessieTableOperations#doCommit by @ajantha-bhat in #6240
- API: Restore the type of the identity transform by @Fokko in #6242
New Contributors
- @waifairer made their first contribution in #5299
- @hrishisd made their first contribution in #5288
- @palaniappa made their first contribution in #5309
- @Mehul2500 made their first contribution in #5037
- @naushadh made their first contribution in #5330
- @abmo-x made their first contribution in #5317
- @skadyan made their first contribution in #5352
- @lvyanquan made their first contribution in #5484
- @namrathamyske made their first contribution in #4926
- @price-qian made their first contribution in #5555
- @dotjdk made their first contribution in #5624
- @yabola made their first contribution in #5510
- @xuzhiwen1255 made their first contribution in #5642
- @joshuarobinson made their first contribution in #5717
- @rizaon made their first contribution in #4518
- @viirya made their first contribution in #5844
- @gaborkaszab made their first contribution in #5726
- @pavibhai made their first contribution in #5778
- @Kontinuation made their first contribution in #5776
- @linfey90 made their first contribution in #5733
- @mggger made their first contribution in #5962
- @Heltman made their first contribution in #5887
- @fb913bf0de288ba84fe98f7a23d35edfdb22381 made their first contribution in #5992
- @wangyum made their first contribution in #5783
- @Jonathan-Rosenberg made their first contribution in #6068
- @ddrinka made their first contribution in #6115
- @Omega359 made their first contribution in #6108
- @hendrikmakait made their first contribution in #6135
- @foarsitter made their first contribution in #6158
- @alec-heif made their first contribution in #6173
- @aajisaka made their first contribution in #5767
- @krvikash made their first contribution in #6178
- @LuigiCerone made their first contribution in #6159
- @islamismailov made their first contribution in #6211
Full Changelog: apache-iceberg-0.14.0...apache-iceberg-1.1.0