Trying out sources, a new feature in dbt 0.13.0

dbt 0.13 adds a new feature, sources, which can be used to do the following:

  • Select from source tables in your base models
  • Test assumptions about the source data
  • Calculate the freshness of the source data

Working with sources

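  • Template for defining a source

    Note: for databases such as PostgreSQL, where tables live in a schema, you may need to set additional parameters (for example schema), or rely on the convention that the source name matches the schema name.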

# This example defines two sources: `source_1`, containing the tables
# `table_1` and `table_2`, and `source_2`, containing one table called
# `table_1`. This is a minimal example of a source definition.
version: 2
sources:
  - name: source_1
    tables:
      - name: table_1
      - name: table_2
  - name: source_2
    tables:
      - name: table_1
 
 
  • Source definition with database and schema configured
# This source entry describes the table:
# "raw"."public"."Orders_"
#
# It can be referenced with:
# {{ source('ecommerce', 'orders') }}
version: 2
sources:
  - name: ecommerce
    database: raw # Tell dbt to look for the source in the "raw" database
    schema: public # You wouldn't put your source data in public, would you?
    tables:
      - name: orders
        identifier: Orders_ # To alias table names to account for strange casing or naming of tables
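
The testing and freshness uses listed at the top of this post are declared in the same YAML file. The following is a minimal sketch only, extending the ecommerce source; it assumes the orders table has an id column and a created_at timestamp column, and both column names are hypothetical rather than taken from the original project.

```yaml
version: 2
sources:
  - name: ecommerce
    database: raw
    schema: public
    # Assumption: created_at is a timestamp recording when each row was loaded
    loaded_at_field: created_at
    freshness:
      warn_after: {count: 12, period: hour}   # warn if the newest row is older than 12 hours
      error_after: {count: 24, period: hour}  # fail if the newest row is older than 24 hours
    tables:
      - name: orders
        identifier: Orders_
        columns:
          # Assumption: id is the primary key of the table
          - name: id
            tests:
              - unique
              - not_null
```

With a definition like this, dbt test runs the column tests against the source table, and in dbt 0.13 the freshness check is run with dbt source snapshot-freshness.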
 
 

A simple example

I put the source definitions directly in the models folder; see https://github.com/rongfengliang/dbt-source-demo for reference.
The table structures can also be found in that project.

  • Environment setup (managed with Python venv)
python3 -m venv venv 
source venv/bin/activate
pip install dbt
  • Test database setup (using docker-compose)
version: '3.6'
services:
  postgres:
    image: postgres:9.6.11
    ports: 
    - "5432:5432"
    environment:
    - "POSTGRES_PASSWORD=dalong"
  graphql-engine:
    image: hasura/graphql-engine:v1.0.0-beta.2
    ports:
    - "8080:8080"
    depends_on:
    - "postgres"
    environment:
    - "HASURA_GRAPHQL_DATABASE_URL=postgres://postgres:dalong@postgres:5432/postgres"
    - "HASURA_GRAPHQL_ENABLE_CONSOLE=true"
    - "HASURA_GRAPHQL_ENABLE_ALLOWLIST=true"
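
dbt also needs a connection profile pointing at this database, which the post does not show. The following is a sketch of what ~/.dbt/profiles.yml might look like, assuming the default postgres user and database of the container defined above. The profile name dbt_source_demo is hypothetical and would have to match the profile set in dbt_project.yml; the dev target, public schema, and 3 threads are inferred from the run output later in this post.

```yaml
# ~/.dbt/profiles.yml
dbt_source_demo:        # hypothetical profile name; must match dbt_project.yml
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost   # port 5432 is published by the compose file
      port: 5432
      user: postgres    # assumption: default superuser of the postgres image
      pass: dalong      # password used in the compose file
      dbname: postgres  # assumption: default database of the postgres image
      schema: public
      threads: 3
```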
  • Model and source layout
models
├── apps
│   ├── app_summary.sql
│   └── sources.yml
└── users
    ├── sources.yml
    ├── user_summary.sql
    └── user_summary2.sql
  • Source contents

    The content is simple: it only declares the table.

version: 2
sources:
  - name: apps
    schema: public
    tables:
      - name: apps
  • Running it
dbt run

Output

Running with dbt=0.13.1
Found 3 models, 0 tests, 0 archives, 0 analyses, 94 macros, 0 operations, 0 seed files, 2 sources
17:43:42 | Concurrency: 3 threads (target='dev')
17:43:42 | 
17:43:42 | 1 of 3 START view model public.app_summary........................... [RUN]
17:43:42 | 2 of 3 START view model public.user_summary.......................... [RUN]
17:43:42 | 3 of 3 START table model public.user_summary2........................ [RUN]
17:43:44 | 2 of 3 OK created view model public.user_summary..................... [CREATE VIEW in 0.26s]
17:43:45 | 1 of 3 OK created view model public.app_summary...................... [CREATE VIEW in 0.27s]
17:43:46 | 3 of 3 OK created table model public.user_summary2................... [SELECT 2 in 0.27s]
17:43:46 | 
17:43:46 | Finished running 2 view models, 1 table models in 4.46s.
Completed successfully
Done. PASS=3 ERROR=0 SKIP=0 TOTAL=3

References

https://github.com/rongfengliang/dbt-source-demo
