EMQX v5.6.0

15 days ago

Enhancements

  • #12251 Optimized the performance of the RocksDB-based persistent sessions, achieving a reduction in RAM usage and database request frequency. Key improvements include:

    • Introduced dirty session state to avoid frequent mria transactions.
    • Introduced an intermediate buffer for the persistent messages.
    • Used separate tracks of PacketIds for QoS1 and QoS2 messages.
    • Limited the number of continuous ranges of inflight messages to 1 per stream.
  • #12326 Enhanced session tracking with registration history. EMQX now has the capability to monitor the history of session registrations, including those that have expired. By configuring broker.session_history_retain, EMQX retains records of expired sessions for a specified duration.

    • Session count API: Use the API GET /api/v5/sessions_count?since=1705682238 to obtain a count of sessions across the cluster that remained active since the given UNIX epoch timestamp (with seconds precision). This enhancement aids in analyzing session activity over time.

    • Metrics expansion with cluster sessions gauge: A new gauge metric, cluster_sessions, is added to better track the number of sessions within the cluster. This metric is also integrated into Prometheus for easy monitoring:

      # TYPE emqx_cluster_sessions_count gauge
      emqx_cluster_sessions_count 1234
      

      NOTE: Please consider this metric as an approximate estimation. Due to the asynchronous nature of data collection and calculation, exact precision may vary.
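
As a rough illustration of the session count API, the sketch below builds the request URL from a UNIX timestamp in seconds. The base URL `http://localhost:18083` and the helper name `sessions_count_url` are assumptions made for this example, and authentication is left out.

```python
import time
from urllib.parse import urlencode


def sessions_count_url(base_url: str, since_epoch_seconds: int) -> str:
    """Build the session count URL for a UNIX timestamp with seconds precision."""
    query = urlencode({"since": int(since_epoch_seconds)})
    return f"{base_url.rstrip('/')}/api/v5/sessions_count?{query}"


# The timestamp used in the release note.
url = sessions_count_url("http://localhost:18083", 1705682238)

# Sessions that remained active since 24 hours ago (assumed base URL).
day_ago_url = sessions_count_url("http://localhost:18083", int(time.time()) - 24 * 3600)
```

The resulting URL can be passed to any HTTP client together with an API key.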

  • #12338 Introduced a time-based garbage collection mechanism to the RocksDB-based persistent session backend. This feature ensures more efficient management of stored messages, optimizing storage utilization and system performance by automatically purging outdated messages.

  • #12398 Exposed the swagger_support option in the Dashboard configuration, allowing for the enabling or disabling of the Swagger API documentation.

  • #12467 Started supporting cluster discovery using AAAA DNS record type.

  • #12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID.

    For backward compatibility, tnxid is kept, but it is considered deprecated and will be removed in 5.7.

  • #12499 Enhanced client banning capabilities with extended rules, including:

    • Matching clientid against a specified regular expression.
    • Matching client's username against a specified regular expression.
    • Matching client's peer address against a CIDR range.

    Important Notice: Implementing a large number of broad matching rules (not specific to an individual clientid, username, or host) may affect system performance. It's advised to use these extended ban rules judiciously to maintain optimal system efficiency.
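
The three kinds of extended rules can be pictured with the following minimal sketch. It is not EMQX's implementation: the rule kinds (`clientid_re`, `username_re`, `peer_cidr`), the tuple representation, and the use of unanchored regex search are all assumptions chosen for illustration. It also shows why broad rules cost more: they cannot be looked up by key, so each connecting client is checked against every rule.

```python
import ipaddress
import re
from typing import Iterable, Optional, Tuple

# Hypothetical rule representation: (kind, pattern). The kind names are
# illustrative only and are not taken from the EMQX API.
Rule = Tuple[str, str]


def rule_matches(rule: Rule, clientid: str, username: Optional[str], peer_ip: str) -> bool:
    kind, pattern = rule
    if kind == "clientid_re":
        return re.search(pattern, clientid) is not None
    if kind == "username_re":
        return username is not None and re.search(pattern, username) is not None
    if kind == "peer_cidr":
        return ipaddress.ip_address(peer_ip) in ipaddress.ip_network(pattern)
    raise ValueError(f"unknown rule kind: {kind}")


def is_banned(rules: Iterable[Rule], clientid: str, username: Optional[str], peer_ip: str) -> bool:
    # Broad rules are scanned linearly: the cost grows with the number of rules.
    return any(rule_matches(rule, clientid, username, peer_ip) for rule in rules)


rules = [
    ("clientid_re", r"^test-\d+$"),
    ("username_re", r"^guest"),
    ("peer_cidr", "10.0.0.0/8"),
]
```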

  • #12509 Implemented API to re-order all authenticators / authorization sources.

  • #12517 Configuration files have been upgraded to accommodate multi-line string values, preserving indentation for enhanced readability and maintainability. This improvement utilizes """~ and ~""" markers to quote indented lines, offering a structured and clear way to define complex configurations. For example:

    rule_xlu4 {
      sql = """~
        SELECT
          *
        FROM
          "t/#"
      ~"""
    }
    

    See HOCON 0.42.0 release notes for details.

  • #12520 Implemented log throttling. The feature reduces the volume of logged events that could potentially flood the system by dropping all but the first occurrence of an event within a configured time window.
    Log throttling is applied to the following log events that are critical yet prone to repetition:

    • authentication_failure
    • authorization_permission_denied
    • cannot_publish_to_topic_due_to_not_authorized
    • cannot_publish_to_topic_due_to_quota_exceeded
    • connection_rejected_due_to_license_limit_reached
    • dropped_msg_due_to_mqueue_is_full
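
The "first occurrence per time window" idea can be sketched as below. This is a simplified model rather than EMQX's code: the class name `LogThrottle`, the per-event window start, and the injectable clock are assumptions made so that the example is testable.

```python
import time
from typing import Callable, Dict


class LogThrottle:
    """Allow the first occurrence of an event per window, drop the rest."""

    def __init__(self, window_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self._window_start: Dict[str, float] = {}  # event name -> start of its current window

    def allow(self, event: str) -> bool:
        now = self.clock()
        start = self._window_start.get(event)
        if start is None or now - start >= self.window_seconds:
            # First occurrence in a new window: log it and open the window.
            self._window_start[event] = now
            return True
        return False  # Repeated occurrence inside the window: drop it.


# Drive the throttle with a fake clock and a 60-second window.
fake_now = [0.0]
throttle = LogThrottle(60, clock=lambda: fake_now[0])
decisions = []
for t in (0, 1, 59, 60, 61):
    fake_now[0] = float(t)
    decisions.append(throttle.allow("authentication_failure"))
```

Each event name is tracked separately, so throttling one event does not suppress another.
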
  • #12561 Implemented HTTP APIs to list a client's in-flight messages and message queue (mqueue) messages. These APIs give detailed insight into message queues and in-flight messaging, which helps with monitoring and troubleshooting message handling.

    To get the first chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100
    • GET /clients/{clientid}/inflight_messages?limit=100

    Alternatively, for the first chunks without specifying a start position:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position=none
    • GET /clients/{clientid}/inflight_messages?limit=100&position=none

    To get the next chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
    • GET /clients/{clientid}/inflight_messages?limit=100&position={position}

    Where {position} is a value (opaque string token) of meta.position field from the previous response.

    Ordering and Prioritization:

    • Mqueue Messages: These are prioritized and sequenced based on their queue order (FIFO), from higher to lower priority. By default, mqueue messages carry a uniform priority level of 0.
    • In-Flight Messages: Sequenced by the timestamp of their insertion into the in-flight storage, from oldest to newest.
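
The chunked retrieval described above can be sketched as a small pagination loop. Several details are assumptions, not facts from the release note: the messages are assumed to arrive under a `data` key, the loop stops on an empty chunk or a missing `meta.position`, and `fetch_page` is a caller-supplied function that performs the actual HTTP request.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import quote, urlencode


def mqueue_messages_path(clientid: str, limit: int = 100, position: Optional[str] = None) -> str:
    """Build the request path; omit `position` to get the first chunk."""
    query = {"limit": limit}
    if position is not None:
        query["position"] = position
    return f"/clients/{quote(clientid, safe='')}/mqueue_messages?{urlencode(query)}"


def iter_messages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield messages chunk by chunk, following the opaque meta.position token."""
    position = None
    while True:
        page = fetch_page(position)
        messages = page.get("data", [])  # assumed response key
        yield from messages
        next_position = page.get("meta", {}).get("position")
        # Assumed stop conditions: empty chunk, no token, or a token that did not advance.
        if not messages or next_position is None or next_position == position:
            return
        position = next_position


# Fake responses standing in for the HTTP API.
pages = {
    None: {"data": [{"msgid": "a"}, {"msgid": "b"}], "meta": {"position": "tok1"}},
    "tok1": {"data": [{"msgid": "c"}], "meta": {"position": "tok2"}},
    "tok2": {"data": [], "meta": {}},
}
collected = [m["msgid"] for m in iter_messages(lambda pos: pages[pos])]
```

The position token is treated as opaque and passed back unchanged, as the note requires.
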
  • #12590 Removed mfa meta data from log messages to improve clarity.

  • #12641 Improved text log formatter fields order. The new fields order is as follows:

    tag > clientid > msg > peername > username > topic > [other fields]
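
A minimal sketch of this ordering is shown below. The function name `order_fields` is hypothetical, and keeping the remaining fields in their original order is an assumption; the release note only fixes the order of the leading fields.

```python
PRIORITY = ["tag", "clientid", "msg", "peername", "username", "topic"]


def order_fields(fields: dict) -> list:
    """Return (key, value) pairs with the priority fields first."""
    leading = [(key, fields[key]) for key in PRIORITY if key in fields]
    # Assumption: other fields keep their original relative order.
    rest = [(key, value) for key, value in fields.items() if key not in PRIORITY]
    return leading + rest


ordered = order_fields({"topic": "t/1", "reason": "x", "msg": "m", "clientid": "c1"})
```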

  • #12670 Added the shared_subscriptions field to the /monitor_current and /monitor_current/nodes/:node endpoints.

  • #12679 Upgraded docker image base from Debian 11 to Debian 12.

  • #12700 Added support for the "b" and "B" units in bytesize HOCON fields. For example, all three fields below will have the value of 1024 bytes:

    bytesize_field = "1024b"
    bytesize_field2 = "1024B"
    bytesize_field3 = 1024
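
The equivalence of the three forms can be modelled with a small parser sketch. It covers only the plain-integer and "b"/"B" forms shown here; other bytesize units are deliberately out of scope, and the function name `parse_bytes` is hypothetical.

```python
import re
from typing import Union


def parse_bytes(value: Union[int, str]) -> int:
    """Parse a plain integer or a string with a 'b'/'B' suffix into a byte count."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    match = re.fullmatch(r"(\d+)[bB]", str(value).strip())
    if match is None:
        raise ValueError(f"unsupported bytesize value in this sketch: {value!r}")
    return int(match.group(1))


sizes = [parse_bytes("1024b"), parse_bytes("1024B"), parse_bytes(1024)]
```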
    
  • #12719 The /clients API has been upgraded to accommodate queries for multiple clientids and usernames simultaneously, offering a more flexible and powerful tool for monitoring client connections. Additionally, this update introduces the capability to customize which client information fields are included in the API response, optimizing for specific monitoring needs.

    Examples of Multi-Client/Username Queries:

    • To query multiple clients by ID: /clients?clientid=client1&clientid=client2
    • To query multiple users: /clients?username=user1&username=user2
    • To combine multiple client IDs and usernames in one query: /clients?clientid=client1&clientid=client2&username=user1&username=user2

    Examples of Selecting Fields for the Response:

    • To include all fields in the response: /clients?fields=all (Note: Omitting the fields parameter defaults to returning all fields.)
    • To specify only certain fields: /clients?fields=clientid,username
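
The query strings above can be generated with the standard library, as in the sketch below. The helper name `clients_query` is hypothetical; repeated `clientid` and `username` parameters and the comma-separated `fields` value follow the examples in this note.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def clients_query(
    clientids: Iterable[str] = (),
    usernames: Iterable[str] = (),
    fields: Optional[Iterable[str]] = None,
) -> str:
    """Build a /clients path with repeated clientid/username parameters."""
    params = [("clientid", c) for c in clientids]
    params += [("username", u) for u in usernames]
    if fields:
        params.append(("fields", ",".join(fields)))
    # Keep commas unescaped so the fields list reads as in the examples.
    query = urlencode(params, safe=",")
    return "/clients" + (f"?{query}" if query else "")


by_ids = clients_query(clientids=["client1", "client2"])
combined = clients_query(clientids=["client1", "client2"], usernames=["user1", "user2"])
selected = clients_query(fields=["clientid", "username"])
```
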
  • #12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().

    For more information on the functions and their usage, refer to the Built-in SQL Functions documentation.

  • #12336 Performance enhancement. Created a dedicated async task handler pool to handle client session cleanup tasks.

  • #12725 Implemented REST API to list the available source types.

  • #12746 Added the username log field. If an MQTT client connects with a non-empty username, the logs and traces will include the username field.

  • #12785 Added timestamp_format configuration option to log handlers. This new option allows for the following settings:

    • auto: Automatically determines the timestamp format based on the log formatter being used.
      Utilizes rfc3339 format for text formatters, and epoch format for JSON formatters.

    • epoch: Represents timestamps in microseconds precision Unix epoch format.

    • rfc3339: Uses RFC3339 compliant format for date-time strings. For example, 2024-03-26T11:52:19.777087+00:00.
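
To make the two formats concrete, the sketch below renders one instant both ways and models the `auto` rule described above. It only illustrates the formats and is not EMQX code; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_us(dt: datetime) -> int:
    """Microseconds since the Unix epoch, computed without float rounding."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def to_rfc3339(dt: datetime) -> str:
    """RFC 3339 date-time string with microseconds and a numeric offset."""
    return dt.isoformat()


def resolve_auto(formatter: str) -> str:
    """Model of `auto`: rfc3339 for text formatters, epoch for JSON formatters."""
    if formatter == "text":
        return "rfc3339"
    if formatter == "json":
        return "epoch"
    raise ValueError(f"unknown formatter: {formatter}")


# The instant used in the rfc3339 example above.
instant = datetime(2024, 3, 26, 11, 52, 19, 777087, tzinfo=timezone.utc)
as_text = to_rfc3339(instant)
as_epoch = to_epoch_us(instant)
```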

Bug Fixes

  • #11868 Fixed a bug where will messages were not published after session takeover.

  • #12347 Implemented an update to ensure that messages processed by the Rule SQL for the MQTT egress data bridge are always rendered as valid, even in scenarios where the data is incomplete or lacks certain placeholders defined in the bridge configuration. This adjustment prevents messages from being incorrectly deemed invalid and subsequently discarded by the MQTT egress data bridge, as was the case previously.

    When variables in payload and topic templates are undefined, they are now rendered as empty strings instead of the literal undefined string.
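
The new rendering behavior can be approximated as follows. This is a simplified model: the `${...}` placeholder pattern and the dotted-path lookup are assumptions for the example, and the function name `render` is hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def render(template: str, data: dict) -> str:
    """Substitute ${path.to.var} placeholders; undefined variables become ''."""

    def lookup(match: re.Match) -> str:
        value = data
        for key in match.group(1).split("."):
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                return ""  # undefined variable renders as an empty string
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


topic = render("devices/${clientid}/${payload.kind}", {"clientid": "c1"})
payload = render("temp=${payload.temp}", {"payload": {"temp": 21}})
```

Here the missing `payload.kind` leaves an empty path segment instead of the literal text `undefined`.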

  • #12472 Fixed an issue where certain read operations on /api/v5/actions/ and /api/v5/sources/ endpoints might result in a 500 error code during the process of rolling upgrades.

  • #12492 EMQX now returns the Receive-Maximum property in the CONNACK message for MQTT v5 clients, aligning with protocol expectations. This implementation considers the minimum value of the client's Receive-Maximum setting and the server's max_inflight configuration as the limit for the number of inflight (unacknowledged) messages permitted. Previously, the determined value was not sent back to the client in the CONNACK message.

  • #12500 The GET /clients and GET /clients/:clientid HTTP APIs have been updated to include disconnected persistent sessions in their responses.

    NOTE: A current known issue with these enhanced API responses is that the total client count provided may exceed the actual number of clients due to the inclusion of disconnected sessions.

  • #12513 Changed the level of several flooding log events from warning to info.

  • #12530 Improved the error reporting for frame_too_large events and malformed CONNECT packet parsing failures. These updates now provide additional information, aiding in the troubleshooting process.

  • #12541 Introduced a new configuration validation step for autocluster by DNS records to ensure compatibility between node.name and cluster.discovery_strategy. Specifically, when utilizing the dns strategy with either a or aaaa record types, it is mandatory for all nodes to use a (static) IP address as the host name.

  • #12562 Added a new configuration root: durable_storage. This configuration tree contains the settings related to the new persistent session feature.

  • #12566 Enhanced the bootstrap file for REST API keys:

    • Empty lines within the file are now skipped, eliminating the previous behavior of generating an error.

    • API keys specified in the bootstrap file are assigned the highest precedence. In cases where a new key from the bootstrap file conflicts with an existing key, the older key will be automatically removed to ensure that the bootstrap keys take effect without issue.

  • #12646 Fixed an issue with the rule engine's date-time string parser. Previously, time zone adjustments were only effective for date-time strings specified with second-level precision.

  • #12652 Fixed a discrepancy where the subbits functions with 4 and 5 parameters, despite being documented, were missing from the actual implementation. These functions have now been added.

  • #12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics, accessible via the Prometheus endpoint /prometheus/stats, were inaccurately reflecting the average CPU usage since the operating system boot. This fix ensures that these metrics now accurately represent the current CPU usage and idle, providing more relevant and timely data for monitoring purposes.

  • #12668 Refactored the SQL function date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1.
    This change also added validation for the input date format.

  • #12672 Changed the process for generating the node boot configuration by incorporating the loading of {data_dir}/configs/cluster.hocon. Previously, changes to logging configurations made via the Dashboard and saved in {data_dir}/configs/cluster.hocon were only applied after the initial boot configuration was generated using etc/emqx.conf, leading to potential loss of some log segment files due to late reconfiguration.

    Now, both {data_dir}/configs/cluster.hocon and etc/emqx.conf are loaded concurrently, with settings from emqx.conf taking precedence, to create the boot configuration.
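
The precedence rule can be pictured as a deep merge in which values from emqx.conf override those from cluster.hocon. The sketch below models both files as plain dictionaries; the keys shown are illustrative, and the merge function is an assumption about how precedence could be applied, not EMQX's loader.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Merge nested dicts; values from `override` win on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative contents of the two files (not real defaults).
cluster_hocon = {"log": {"console": {"level": "debug"}, "file": {"enable": True}}}
emqx_conf = {"log": {"console": {"level": "warning"}}}

# emqx.conf takes precedence, so it is applied on top of cluster.hocon.
boot_config = deep_merge(cluster_hocon, emqx_conf)
```

Settings present only in cluster.hocon survive the merge, which is what lets logging changes made via the Dashboard take effect at boot.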

  • #12696 Fixed an issue where attempting to reconnect an action or source could lead to wrong error messages being returned in the HTTP API.

  • #12714 Fixed inaccuracies in several metrics reported by the /prometheus/stats endpoint of the Prometheus API. The correction applies to the following metrics:

    • emqx_cluster_sessions_count
    • emqx_cluster_sessions_max
    • emqx_cluster_nodes_running
    • emqx_cluster_nodes_stopped
    • emqx_subscriptions_shared_count
    • emqx_subscriptions_shared_max

    Additionally, this fix rectified an issue within the /stats endpoint concerning the subscriptions.shared.count and subscriptions.shared.max fields. Previously, these values failed to update promptly following a client's disconnection or unsubscription from a Shared-Subscription.

  • #12715 Fixed a crash that could occur during configuration updates if the connector for the ingress data integration source had active channels.

  • #12740 Fixed an issue where durable sessions could not be kicked out.

  • #12768 Addressed a startup failure issue in EMQX version 5.4.0 and later, particularly noted during rolling upgrades from versions before 5.4.0. The issue was related to the initialization of the routing schema when both v1 and v2 routing tables were empty.

    The node now attempts to retrieve the routing schema version in use across the cluster instead of using the v2 routing table by default when local routing tables are found empty at startup. This approach mitigates potential conflicts and reduces the chances of diverging routing storage schemas among cluster nodes, especially in a mixed-version cluster scenario.

    If a conflict is detected in a running cluster, EMQX writes instructions on how to manually resolve it in the log as part of the error message with critical severity. The same error message and instructions will also be written to standard error to make sure this message will not get lost even if no log handler is configured.

  • #12786 Added a strict check that prevents replicant nodes from connecting to core nodes running a different version of the EMQX application.
    This check ensures that during rolling upgrades, the replicant nodes can only work when at least one core node is running the same EMQX release version.

Breaking changes

  • #12576 Starting from 5.6, the "Configuration Manual" document will no longer include the bridges config root.

    A bridge is now either action + connector for egress data integration, or source + connector for ingress data integration.
    Please note that the bridges config (in cluster.hocon) and the REST API path api/v5/bridges still work, but are considered deprecated.

  • #12634 Triple-quote string values in HOCON config files no longer support escape sequences.

    The detailed information can be found in this pull request.
    Here is a summary of the impact on EMQX users:

    • EMQX 5.6 is the first version to generate triple-quote strings in cluster.hocon,
      meaning for generated configs, there is no compatibility issue.
    • For user hand-crafted configs (such as emqx.conf) a thorough review is needed
      to inspect if escape sequences are used (such as \n, \r, \t and \\), if yes,
      such strings should be changed to regular quotes (one pair of ") instead of triple-quotes.
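
A hand-crafted config can be reviewed with a quick scan such as the sketch below. It is a rough heuristic, not a HOCON parser: it finds triple-quoted strings with a regular expression and flags those containing one of the escape sequences listed above. The function name is hypothetical.

```python
import re

TRIPLE_QUOTED = re.compile(r'"""(.*?)"""', re.DOTALL)
ESCAPE_SEQUENCE = re.compile(r"\\[nrt\\]")  # \n, \r, \t and \\


def find_escaped_triple_quotes(config_text: str) -> list:
    """Return the bodies of triple-quoted strings that contain escape sequences."""
    return [
        match.group(1)
        for match in TRIPLE_QUOTED.finditer(config_text)
        if ESCAPE_SEQUENCE.search(match.group(1))
    ]


# Raw strings keep the backslashes literal, as they would appear in a config file.
sample = "\n".join([
    r'a = """line1\nline2"""',
    r'b = """plain text"""',
    r'c = "tab\there"',
])
suspects = find_escaped_triple_quotes(sample)
```

Only the first value is flagged; the regular-quoted string on the last line is left alone because escape sequences remain valid there.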
