What's Changed
🎉 Exciting New Features
Incremental data integration in batch pipelines
🥳 Data integrations in batch pipelines now support incremental replication! Read more here to get started.
by @tommydangerous in #4068
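As a rough illustration of what incremental replication means, here is a stdlib-only sketch (hypothetical names and record shape, not Mage's actual API): each sync pulls only the rows whose cursor column is newer than the last saved bookmark, then advances the bookmark.

```python
# Hypothetical sketch of bookmark-based incremental replication.
# Not Mage's API: only rows newer than the saved bookmark are replicated,
# and the bookmark advances to the newest cursor value seen.

def incremental_sync(rows, bookmark, cursor_key="updated_at"):
    """Return rows newer than the bookmark, plus the updated bookmark."""
    new_rows = [r for r in rows if r[cursor_key] > bookmark]
    new_bookmark = max((r[cursor_key] for r in new_rows), default=bookmark)
    return new_rows, new_bookmark

source = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
    {"id": 3, "updated_at": 30},
]

# First sync from bookmark 0 replicates everything; a second sync
# from the advanced bookmark replicates nothing new.
batch, bookmark = incremental_sync(source, bookmark=0)
batch2, bookmark2 = incremental_sync(source, bookmark=bookmark)
```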
[Streaming] RabbitMQ Destination
Another community PR from @shrutimantri adds support for RabbitMQ as a streaming data sink. 🔥
Check it out today with your favorite streaming sources! You can find the configuration reference here.
by @shrutimantri in #4041
Chroma integration
Mage now has a ChromaDB IO class, meaning you can use data loaders and exporters in your batch pipelines to read from and write to Chroma. You can read more about configuration here, or visit Chroma's site to learn more about their vector database.
by @matrixstone in #4017
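For readers new to vector databases: the core operation Chroma performs is a nearest-neighbor lookup over embeddings. The stdlib-only sketch below illustrates that idea with cosine similarity over toy vectors; it is not Chroma's implementation or client API (see Chroma's own docs for those).

```python
import math

# Stdlib-only illustration of the nearest-neighbor lookup a vector
# database like Chroma performs. Toy data; not Chroma's actual API.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# A tiny "collection" mapping document ids to embeddings.
store = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.0],
}

def query(embedding, n_results=1):
    """Return the ids of the n_results most similar documents."""
    ranked = sorted(store, key=lambda k: cosine(store[k], embedding),
                    reverse=True)
    return ranked[:n_results]
```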
Bookmark overrides
🎊 If you're creating a trigger on a data integration, you can now override bookmarks with your own custom values!
by @tommydangerous in #4073
SQL Block environment variable interpolation
For our fans of SQL blocks, you can now interpolate environment variables directly in your queries!
```sql
SELECT
  '{{ env_var("ENV") }}' AS test
  , '{{ variables("test") }}' AS test2
  , {{ test }} AS test3
```
This should allow for much greater flexibility in pipelines with SQL!
by @tommydangerous in #4076
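To make the interpolation step concrete, here is a minimal stdlib sketch of how `{{ env_var("...") }}` and `{{ variables("...") }}` placeholders could be resolved before a query runs. Mage uses a Jinja-style templating engine; this regex substitution only illustrates the idea and the `render` helper is hypothetical.

```python
import os
import re

# Hypothetical sketch of resolving env_var/variables placeholders in a
# SQL string before execution. Mage's real templating is Jinja-style;
# this regex version only demonstrates the substitution step.

os.environ["ENV"] = "staging"  # example value for the demo

def render(sql, variables):
    """Replace env_var("X") and variables("X") placeholders in sql."""
    sql = re.sub(r'{{\s*env_var\("(\w+)"\)\s*}}',
                 lambda m: os.environ[m.group(1)], sql)
    sql = re.sub(r'{{\s*variables\("(\w+)"\)\s*}}',
                 lambda m: str(variables[m.group(1)]), sql)
    return sql

rendered = render("SELECT '{{ env_var(\"ENV\") }}' AS test", {"test": 1})
```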
Additional upstream dependencies for dynamic children
Love dynamic blocks? 🤔 Their dynamic children can now have additional upstream dependencies!
by @tommydangerous in #4104
Support caching block output in memory
Previously, pipelines with large Spark DataFrames could run out of heap space when persisting block outputs to disk. This PR lets you disable persisting output. For now, the feature is only supported in standard batch pipelines (without dynamic blocks):
```yaml
cache_block_output_in_memory: true
run_pipeline_in_one_process: true
```
by @wangxiaoyou1993 in #4127
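For context, a sketch of where these keys might live; the top-level placement in the pipeline's metadata.yaml and the surrounding keys are assumptions on my part, not stated in these release notes.

```yaml
# Hypothetical pipeline metadata.yaml (placement is an assumption):
# keep block outputs in memory instead of persisting them to disk.
name: example_pipeline
type: python
cache_block_output_in_memory: true
run_pipeline_in_one_process: true
```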
🐛 Bug Fixes
- Backend API for getting information about bookmarks by @tommydangerous in #4070
- Support different operators when comparing bookmark properties by @tommydangerous in #4075
- Update backfill statuses by @johnson-mage in #3994
- Reduce block at any level UI by @tommydangerous in #4067
- Catch exception of empty integration streams in pipeline scheduler by @wangxiaoyou1993 in #4083
- Backfill's date-picker date value mismatch by @edmondwinston in #3972
- Gracefully access dictionaries in the Oauth Policy by @tommydangerous in #4086
- Pass tolerations to job pod by @wangxiaoyou1993 in #4089
- Fix load sample data for integration pipelines by @dy46 in #4034
- Default to using environment variables for git and workspace settings by @dy46 in #4088
- Fixed Google Ads Source by @Luishfs in #4099
- Fix chromadb dependency by @wangxiaoyou1993 in #4107
- Fix chromadb in all package by @wangxiaoyou1993 in #4108
- Update local timezone project setting from header by @johnson-mage in #4111
- Fix runtime variables not showing when creating new trigger by @tommydangerous in #4116
- Fix executing conditional blocks with pipeline executor by @wangxiaoyou1993 in #4120
- Remove itertools groupby by @dy46 in #4103
- Updates/nats add stream fixes by @mfreeman451 in #4113
- Update `opentelemetry-exporter-prometheus` package version by @dy46 in #4101
- Fix postgres streaming sink when there are no messages by @shrutimantri in #4074
💅 Enhancements & Polish
- Add any runtime variables by @tommydangerous in #4071
- Include `message_events_json` in Postmark messages_outbound stream by @wangxiaoyou1993 in #4085
- Hide "Unique" and "Key" columns for certain data integration destination blocks by @johnson-mage in #4096
- Add top padding to file code editor by @johnson-mage in #4098
- Display error in UI when variables directories configured incorrectly by @johnson-mage in #4091
- Improve the interface for Chroma class by @wangxiaoyou1993 in #4110
- Add `column_header_format` option by @dy46 in #4118
- Allow configuring Amplitude host by @wangxiaoyou1993 in #4060
New Contributors
- @andrewgetzdata made their first contribution in #4078
- @suvhotta made their first contribution in #4097
Full Changelog: 0.9.46...0.9.48