Workspace management
Mage now supports multiple workspaces in the cloud. It has a built-in workspace manager that can be enabled in production. This feature is similar to multi-development environments, but certain settings can be shared across workspaces. For example, the project owner can set workspace-level permissions for users. The additional features currently supported are:
- Workspace-level permissions
- Workspace-level Git settings
Upcoming features:
- common workspace metadata file
- customizable permissions and roles
- pipeline level permissions
Doc: https://docs.mage.ai/developing-in-the-cloud/workspaces/overview
Pipeline monitoring dashboard
Add an "Overview" page to the dashboard, providing a summary of pipeline run metrics and failures.
Version control application
Support all Git operations through the UI. Authenticate with GitHub, then pull from a remote repository, push local changes to a remote repository, and create pull requests for a remote repository.
Doc: https://docs.mage.ai/production/data-sync/github
New Relic monitoring
- Set the `ENABLE_NEW_RELIC` environment variable to enable or disable New Relic monitoring.
- Users need to follow the New Relic guide to create a configuration file with a `license_key` and app name.
Doc: https://docs.mage.ai/production/observability/newrelic
Authentication
Active Directory OAuth
Enable signing in with a Microsoft Active Directory account in Mage.
Doc: https://docs.mage.ai/production/authentication/microsoft
LDAP
https://docs.mage.ai/production/authentication/overview#ldap
- Update default LDAP user access from editor to no access. Add an environment variable `LDAP_DEFAULT_ACCESS` so that the default access can be customized.
Add option to sync from Git on server start
There are two ways to configure Mage to sync from Git on server start:
- Toggle the `Sync on server start up` option in the Git settings UI.
- Set the `GIT_SYNC_ON_START` environment variable (options: 0 or 1).
Doc: https://docs.mage.ai/production/data-sync/git#git-settings-as-environment-variables
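As a minimal sketch of the environment-variable route, the variable just needs to be set before the Mage server process starts (setting it from Python here is illustrative; in practice it would typically be set in the deployment environment):

```python
import os

# GIT_SYNC_ON_START is the variable named above; "1" enables the sync,
# "0" disables it. It must be set before the server process launches.
os.environ["GIT_SYNC_ON_START"] = "1"

print(os.environ["GIT_SYNC_ON_START"])
```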
Data integration pipeline
Mode Analytics Source
Shout out to Mohamad Balouza for his contribution of adding the Mode Analytics source to the Mage data integration pipeline.
OracleDB Destination
MinIO support for S3 in data integration pipelines
Support using the S3 source to connect to MinIO by configuring `aws_endpoint` in the config.
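As a hedged sketch, an S3 source config pointed at a local MinIO server might look like the following; `aws_endpoint` is the key named above, while the other keys and values are illustrative assumptions:

```yaml
aws_access_key_id: minio_access_key      # assumption: your MinIO access key
aws_secret_access_key: minio_secret_key  # assumption: your MinIO secret key
aws_endpoint: http://localhost:9000      # points the S3 source at MinIO
bucket: my_bucket
```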
Bug fixes and improvements
- Snowflake: Use `TIMESTAMP_TZ` as the column type for Snowflake datetime columns.
- BigQuery: No longer require a key file for the BigQuery source and destination. When Mage is deployed on GCP, it can use the service account to authenticate.
- Google Cloud Storage: Allow authenticating with Google Cloud Storage using a service account.
- MySQL
- Fix inserting DOUBLE columns into MySQL destination
- Fix comparing datetime bookmark column in MySQL source
- Use backticks to wrap column name in MySQL
- MongoDB source: Add `authSource` and `authMechanism` options for the MongoDB source.
- Salesforce source: Fix loading sample data for the Salesforce source.
- Improve visibility into non-functioning "Test connection" and "Load sample data" features for integration pipelines:
  - Show an unsupported error if "Test connection" is not implemented for an integration source.
  - Update the error messaging for "Load sample data" to let users know that it may not be supported for the currently selected integration source.
- Interpolate pipeline name and UUID in data integration pipelines. Doc: https://docs.mage.ai/data-integrations/configuration#variable-names
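For the interpolation item above, a data integration config might embed the pipeline name and UUID like this; the exact variable names should be checked against the linked doc, so treat these as assumptions:

```yaml
table: "exports_{{ pipeline.name }}"           # assumption: pipeline name variable
object_key_path: "output/{{ pipeline.uuid }}"  # assumption: pipeline UUID variable
```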
SQL block
OracleDB Loader Block
Added OracleDB Data Loader block
Bug fixes
- MSSQL: Fix the MSSQL SQL block schema. The schema was not properly set when checking table existence. Use `dbo` as the default schema if no schema is set.
- Trino: Fix inserting datetime columns into Trino.
- BigQuery: Throw an exception in the BigQuery SQL block.
- ClickHouse: Support automatic table creation for ClickHouse data exporter
DBT block
DBT ClickHouse
Shout out to Daesgar for his contribution of adding support for running ClickHouse DBT models in Mage.
Add DBT generic command block
Add a DBT block that can run any generic DBT command.
Bug fixes and improvements
- Fix bug: Running DBT block preview would sometimes not use sample limit amount.
- Fix bug: Existing upstream block would get overwritten when adding a dbt block with a ref to that existing upstream block.
- Fix bug: Duplicate upstream block added when new block contains upstream block ref and upstream block already exists.
- Use UTF-8 encoding when logging output from DBT blocks.
Notebook improvements
- Turn on output to logs when running a single block in the notebook.
- When running a block in the notebook, provide an option to only run the upstream blocks that haven't been executed successfully.
- Change the color of a custom block from the UI.
- Show what pipelines are using a particular block:
  - Show block settings in the sidekick when selecting a block.
  - Show which pipelines a block is used in.
  - Create a block cache class that stores block-to-pipeline mappings.
- Enhanced pipeline settings page and block settings page:
  - Edit pipeline and block executor type and interpolate
  - Edit pipeline and block retry config from the UI.
  - Edit block name and color from block settings.
- Enhance the dependency tree node to show callbacks, conditionals, and extensions.
- Save trigger from UI to code.
Cloud deployment
- Allow setting a service account name for the k8s executor. Example k8s executor config:
  ```yaml
  k8s_executor_config:
    resource_limits:
      cpu: 1000m
      memory: 2048Mi
    resource_requests:
      cpu: 500m
      memory: 1024Mi
    service_account_name: custom_service_account_name
  ```
- Support customizing the timeout seconds in the GCP Cloud Run config. Example config:
  ```yaml
  gcp_cloud_run_config:
    path_to_credentials_json_file: "/path/to/credentials_json_file"
    project_id: project_id
    timeout_seconds: 600
  ```
- Check ECS task status after running the task.
Streaming pipeline
- Fix copy output in streaming pipelines. Catch the deepcopy error (`TypeError: cannot pickle '_thread.lock' object` raised in the deepcopy from `handle_batch_events_recursively`) and fall back to the copy method.
Spark pipeline
- Fix an issue with setting custom Spark pipeline config.
- Fix testing Spark DataFrame. Pass the correct Spark DataFrame to the test method.
Other bug fixes & polish
- Add `json_value` macro. Example usage: `"{{ json_value(aws_secret_var('test_secret_key_value'), 'k1') }}"`
- Allow slashes in `block_uuid` when downloading block output. The regex for the block output download endpoint would not capture block UUIDs containing slashes, so this fixes that.
- Fix renaming blocks.
- Fix user auth when disable notebook edits is enabled.
- Allow `JWT_SECRET` to be modified via an environment variable. The `JWT_SECRET` for encoding and decoding access tokens was previously hardcoded.
- Hide duplicate shortcut items in the editor context menu.
- When changing the name of a block or creating a new block, auto-create non-existent folders if the block name uses nested block names.
- Fix trigger count in the pipeline dashboard.
- Fix copy text for secrets.
- Fix Git sync `asyncio` issue.
- Fix circular import when importing the `get_secret_value` method.
- Shorten the branch name in the header. If the branch name is longer than 21 characters, show an ellipsis.
- Replace the hard-to-read dark blue font in code block output with a much more legible yellow font.
- Show an error popup if an error occurs when updating pipeline settings.
- Update the tree node when a block's status changes.
- Prevent sending notifications multiple times for multiple block failures.
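The `json_value` macro mentioned earlier in this section isn't detailed here; as a rough Python sketch of its likely behavior (the function body is an assumption, not Mage's implementation), it extracts one key from a JSON string such as a secret payload:

```python
import json

def json_value(payload: str, key: str):
    """Hypothetical sketch: parse a JSON string and return the value stored at `key`."""
    return json.loads(payload)[key]

# e.g. a secret stored as a JSON object
secret = '{"k1": "v1", "k2": "v2"}'
print(json_value(secret, "k1"))
```

Combined with `aws_secret_var`, this lets a single stored secret hold several related values, with each template reference pulling out just the key it needs.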