pypi dbt 0.4.0
dbt version 0.4.0

latest releases: 1.0.0.34.2, 1.0.0.34.1, 1.0.0.34.0...
7 years ago

dbt v0.4.0

dbt v0.4.0 provides new ways to materialize models in your database.

0. tl;dr

  • new types of materializations: incremental and ephemeral
  • if upgrading, change materialized: true|false to materialized: table|view|incremental|ephemeral
  • optionally specify model configs within the SQL file

1. Feature: {{this}} template variable #81

The {{this}} template variable expands to the name of the model being compiled. For example:

-- my_model.sql
select 'the fully qualified name of this model is {{ this }}'
-- compiles to
select 'the fully qualified name of this model is "the_schema"."my_model"'

2. Feature: materialized: incremental #90

After initially creating a table, incremental models will insert new records into the table on subsequent runs. This drastically speeds up execution time for large, append-only datasets.

Each execution of dbt run will:

  • create the model table if it doesn't exist
  • insert new records into the table

New records are identified by a sql_where model configuration option. In practice, this looks like:

sessions:
    materialized: incremental
    sql_where: "session_start_time > (select max(session_start_time) from {{this}})"

There are a couple of new things here. Previously, materialized could either be set to true or false. Now, the valid options include view, table, incremental, and ephemeral (more on this last one below). Also note that incremental models generally require use of the {{this}} template variable to identify new records.

The sql_where field is supplied as a where condition on a subquery containing the model definition. This resultset is then inserted into the target model. This looks something like:

insert into schema.model (
    select * from (
        -- compiled model definition
    ) where {{sql_where}}
)

3. Feature: materialized: ephemeral #78

Ephemeral models are injected as CTEs (with statements) into any model that references them. Ephemeral models are part of the dependency graph and generally function like any other model, except ephemeral models are not compiled to their own files or directly created in the database. This is useful for intermediary models which are shared by other downstream models, but shouldn't be queried directly from outside of dbt.

To make a model ephemeral:

employees:
    materialized: ephemeral

Suppose you wanted to exclude employees from your users table, but you don't want to clutter your analytics schema with an employees table.

-- employees.sql
select * from public.employees where is_deleted = false

-- users.sql
select *
from {{ref('users')}}
where email not in (select email from {{ref('employees')}})

The compiled SQL would look something like:

with __dbt__CTE__employees as (
  select * from public.employees where is_deleted = false
)
select *
from users
where email not in (select email from __dbt__CTE__employees)

Ephemeral models play nice with other ephemeral models, incremental models, and regular table/view models. Feel free to mix and match different materialization options to optimize for performance and simplicity.

4. Feature: In-model configs #88

Configurations can now be specified directly inside of models. These in-model configs work exactly the same as configs inside of the dbt_project.yml file.

An in-model-config looks like this:

-- users.sql

-- python function syntax
{{ config(materialized="incremental", sql_where="id > (select max(id) from {{this}})") }}
-- OR json syntax
{{ 
    config({"materialized:" "incremental", "sql_where" : "id > (select max(id) from {{this}})"})
}}

select * from public.users

The config resolution order is:

  1. dbt_project.yml model-defaults
  2. in-model config
  3. dbt_project.yml models config

5. Fix: dbt seed null values #102

Previously, dbt seed would insert empty CSV cells as "None", whereas they should have been NULL. Not anymore!

Don't miss a new dbt release

NewReleases is sending notifications on new releases.