github StarRocks/starrocks 3.0.0-rc01

pre-release, 3 years ago

New Features

Architecture

  • Decoupled storage and compute. StarRocks now supports persisting data in S3-compatible object storage, which improves resource isolation, reduces storage costs, and makes compute resources more scalable. Local disks serve as a hot data cache to boost query performance. When the local cache is hit, the query performance of the new shared-data architecture is comparable to that of the classic shared-nothing architecture.

Storage engine and data ingestion

  • The AUTO_INCREMENT attribute is supported to provide globally unique IDs, which simplifies data management.
  • Automatic partitioning and partitioning expressions are supported, which makes partition creation easier to use and more flexible.
  • Primary Key tables support more complete UPDATE and DELETE syntax, including the use of CTEs and references to multiple tables.
  • Added Load Profile for Broker Load and INSERT INTO jobs. You can view the details of a load job by querying its load profile. The usage is the same as for analyzing a query profile.

Data Lake Analytics

  • [Preview] Supports a Presto/Trino-compatible dialect. Presto/Trino SQL can be automatically rewritten into StarRocks SQL. For more information, see the system variable sql_dialect.
  • [Preview] Supports JDBC catalogs.
  • Supports global UDFs.
  • Supports using SET CATALOG to manually switch between catalogs in the current session.

Privileges and Security

  • Provides a new privilege system with full RBAC functionalities, supporting role inheritance and default roles.
  • Provides more privilege management objects and more fine-grained privileges.
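To illustrate what role inheritance and default roles mean in an RBAC model, here is a minimal Python sketch. It is not StarRocks code: the role names, privilege strings, and the `effective_privileges` helper are hypothetical, and the sketch assumes that default roles are the roles activated automatically when a user logs in.

```python
# Hypothetical RBAC model: roles hold privileges and may inherit from other roles.
ROLE_PRIVILEGES = {
    "reader": {"SELECT"},
    "writer": {"INSERT", "UPDATE"},
    "admin": {"GRANT"},
}
ROLE_PARENTS = {
    "writer": ["reader"],  # writer inherits everything reader has
    "admin": ["writer"],   # admin inherits from writer, and transitively from reader
}


def effective_privileges(active_roles):
    """Collect privileges from the given roles and every role they inherit from."""
    seen, privileges = set(), set()
    stack = list(active_roles)
    while stack:
        role = stack.pop()
        if role in seen:
            continue  # guard against cycles and repeated parents
        seen.add(role)
        privileges |= ROLE_PRIVILEGES.get(role, set())
        stack.extend(ROLE_PARENTS.get(role, []))
    return privileges


# Assumed: a user's default roles are activated at login without an explicit SET ROLE.
default_roles = ["writer"]
print(sorted(effective_privileges(default_roles)))
```

With `writer` as the only default role, the user ends up with the privileges of both `writer` and the inherited `reader` role, but not those of `admin`.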

Query engine

  • [Preview] Supports operator spilling for large queries, which can use disk space to ensure stable running of queries in case of insufficient memory.
  • Allows more queries on joined tables to benefit from the query cache. For example, the query cache now supports aggregate queries on multiple tables that are joined by using bucket shuffle joins and broadcast joins.
  • Dynamic adaptive parallelism: StarRocks can automatically adjust the pipeline_dop parameter for query concurrency.

Functions for semi-structured data analysis

  • Added functions map_apply(), map_filter(), transform_keys(), and transform_values().
  • array_agg() supports ORDER BY.
  • Added the string function replace().
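The semantics of the new map functions and of ordered array_agg() can be sketched in plain Python. This is an illustration only, under the assumption that each lambda receives a (key, value) pair. The helper names mirror the SQL function names but are ordinary Python functions, not a StarRocks API, and `array_agg_ordered` is a hypothetical name.

```python
def map_filter(pred, m):
    """Keep only the entries for which pred(key, value) is true."""
    return {k: v for k, v in m.items() if pred(k, v)}


def transform_keys(fn, m):
    """Replace each key with fn(key, value); values are unchanged."""
    return {fn(k, v): v for k, v in m.items()}


def transform_values(fn, m):
    """Replace each value with fn(key, value); keys are unchanged."""
    return {k: fn(k, v) for k, v in m.items()}


def map_apply(fn, m):
    """Build a new map from the (new_key, new_value) pair fn returns per entry."""
    return dict(fn(k, v) for k, v in m.items())


def array_agg_ordered(rows, value, order_by):
    """Collect value(row) into a list, sorted by order_by(row)."""
    return [value(r) for r in sorted(rows, key=order_by)]


scores = {"a": 1, "b": 5, "c": 9}
print(map_filter(lambda k, v: v > 3, scores))
print(transform_values(lambda k, v: v * 10, scores))
print(transform_keys(lambda k, v: k.upper(), scores))
print(map_apply(lambda k, v: (k.upper(), v + 1), scores))

rows = [("x", 3), ("y", 1), ("z", 2)]
print(array_agg_ordered(rows, value=lambda r: r[0], order_by=lambda r: r[1]))
```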

Improvements

Storage engine and data ingestion

  • Supports more CSV parameters for data ingestion, including SKIP_HEADER, TRIM_SPACE, ENCLOSE, and ESCAPE. See STREAM LOAD, BROKER LOAD and ROUTINE LOAD.
  • The primary key and sort key are decoupled in Primary Key tables. The sort key can be separately specified in ORDER BY.
  • Optimized the memory usage of data ingestion into Primary Key tables in scenarios such as large-volume ingestion, partial updates, and persistent primary indexes.
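As a rough illustration of what the four CSV parameters control, the following Python sketch reproduces their effect with the standard `csv` module. It is not the StarRocks loader: `parse_csv` and its keyword arguments are hypothetical names chosen to mirror the parameters, and the trimming step is a simplification that also strips spaces inside enclosed values.

```python
import csv
import io


def parse_csv(text, skip_header=0, trim_space=False, enclose=None, escape=None):
    """Parse comma-separated text, mimicking the four load parameters."""
    reader = csv.reader(
        io.StringIO(text),
        quotechar=enclose,                 # ENCLOSE: character wrapping a field
        escapechar=escape,                 # ESCAPE: character that escapes the next one
        quoting=csv.QUOTE_MINIMAL if enclose else csv.QUOTE_NONE,
        doublequote=False,
        skipinitialspace=trim_space,       # lets ' "x"' still be seen as enclosed
    )
    rows = list(reader)[skip_header:]      # SKIP_HEADER: drop the first N rows
    if trim_space:
        # TRIM_SPACE: remove spaces around field values (simplified).
        rows = [[field.strip() for field in row] for row in rows]
    return rows


text = 'id,name,comment\n1, alice ,"hello, world"\n2,bob,"say \\"hi\\""\n'
for row in parse_csv(text, skip_header=1, trim_space=True, enclose='"', escape="\\"):
    print(row)
```

The header row is dropped, the comma inside the enclosed field does not split the column, and the escaped quotes are kept as literal characters.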

Materialized View

  • Optimized the rewriting capabilities of materialized views, including:
    • Supports rewrite of View Delta Join, Outer Join, and Cross Join.
    • Optimized SQL rewrite of Union with partition.
  • Improved materialized view building capabilities: supporting CTE, select *, and Union.
  • Optimized the information returned by SHOW MATERIALIZED VIEWS.

Query engine

  • All operators are supported in the pipeline engine. Non-pipeline code will be removed in later versions.
  • Made big queries easier to locate and added a big query log. SHOW PROCESSLIST supports viewing CPU and memory information.
  • Optimized Outer Join Reorder.
  • Optimized error messages in the SQL parsing stage, providing more accurate error positioning and clearer error messages.

Data Lake Analytics

  • Optimized metadata statistics collection.
  • Supports using SHOW CREATE TABLE to query the schema information of an external table and using SHOW CREATE CATALOG to query the creation statement of an external catalog.

Functions

  • Window functions lead() and lag() support IGNORE NULLS.
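The effect of IGNORE NULLS can be sketched in Python over a single ordered partition, with `None` standing in for NULL. This is an illustration of the general semantics, not StarRocks code. The function signatures are hypothetical, and the sketch assumes `offset` is at least 1.

```python
def lag(values, offset=1, default=None, ignore_nulls=False):
    """For each row, return the value `offset` rows back, or `default` if none."""
    result = []
    for i in range(len(values)):
        if ignore_nulls:
            # Only non-NULL predecessors count toward the offset.
            previous = [v for v in values[:i] if v is not None]
        else:
            previous = values[:i]
        result.append(previous[-offset] if len(previous) >= offset else default)
    return result


def lead(values, offset=1, default=None, ignore_nulls=False):
    """lead() is lag() applied to the reversed sequence."""
    return lag(values[::-1], offset, default, ignore_nulls)[::-1]


values = [10, None, None, 20, None]
print(lag(values))                      # NULLs are returned like any other value
print(lag(values, ignore_nulls=True))   # NULLs are skipped when looking back
print(lead(values, ignore_nulls=True))  # NULLs are skipped when looking ahead
```

Without IGNORE NULLS, a NULL in the preceding row is returned as is. With it, the lookup continues past NULL rows until a non-NULL value is found or the partition boundary is reached.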

Bug Fixes

  • Some URLs in the license headers of StarRocks' source files cannot be accessed. # 2224
  • An unknown error is returned during SELECT queries. # 19731
  • Supports SHOW/SET CHARACTER. # 17480
  • When the loaded data exceeds the field length supported by StarRocks, the error message returned is not correct. # 14
  • Supports show full fields from 'table'. # 17233
  • Partition pruning causes MV rewrites to fail. # 14641
  • MV rewrite fails when the CREATE MATERIALIZED VIEW statement contains count(distinct) and count(distinct) is applied to the DISTRIBUTED BY column. # 16558

Behavior Change

The new role-based access control (RBAC) system is backward compatible with the previous privilege system. However, the syntax of related statements such as GRANT and REVOKE has changed. For more information, see SQL Reference > User Account Management.
