About
This how-to will show you how to:
- define the generation of data via a data definition file
Steps
Prerequisites
You should have Tabulify installed on your computer: Learning Tabulify - Step 1 - Installation
Create the Data Generation file
The below file is a data definition file that defines the structure of the data and the data generators
Comment: An example of date dimension generator based on the `date_dim` table of TPCDS
Columns:
- name: d_date_sk
comment: A surrogate key
Type: integer
DataGenerator:
type: sequence
- name: d_date
comment: A business key in date format
Type: date
DataGenerator:
type: sequence
- name: d_date_id
comment: A business key in string
Type: varchar
DataGenerator:
type: expression
ColumnParents: d_date
expression: "x.toISOString().substring(0,10)"
- name: d_month_seq
comment: An ascendant sequence for the month
Type: integer
DataGenerator:
type: expression
ColumnParents: d_date
expression: "function pad(number) {if (number < 10) { return '0' + number; } return number; }; x.getFullYear()+''+(pad(x.getMonth()+1))"
- name: d_day_name
comment: The name of the day
Type: varchar
DataGenerator:
type: expression
ColumnParents: d_date
expression: "var days = ['Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday']; days[x.getDay()]"
- name: d_moy
comment: the month number in year
Type: Integer
DataGenerator:
type: expression
ColumnParents: d_date
expression: "x.getMonth()+1"
- name: d_year
comment: The year number
Type: Varchar
Precision: 4
DataGenerator:
type: expression
ColumnParents: d_date
expression: "x.getFullYear()"
Fill the table
The tabli data fill command takes the data definition as input and can load it into a table.
tabli data fill --generator-selector date_dim--datagen.yml@howto date_dim@sqlite
Source Target Latency (ms) Row Count Error Message
--------------- --------------- ------------ --------- ----- -------
date_dim@memgen date_dim@sqlite 6435 1000
Verify
The table date_dim was filled with generated data
tabli data head date_dim@sqlite
The first 10 rows of the data resource (date_dim@sqlite):
d_date_sk d_date_id d_date d_month_seq d_week_seq d_quarter_seq d_year d_dow d_moy d_dom d_qoy d_fy_year d_fy_quarter_seq d_fy_week_seq d_day_name d_quarter_name d_holiday d_weekend d_following_holiday d_first_dom d_last_dom d_same_day_ly d_same_day_lq d_current_day d_current_week d_current_month d_current_quarter d_current_year
--------- ---------- ---------- ----------- ---------- ------------- ------ ----- ----- ----- ----- --------- ---------------- ------------- ---------- -------------- --------- --------- ------------------- ----------- ---------- ------------- ------------- ------------- -------------- --------------- ----------------- --------------
1 2020-12-11 2020-12-11 202012 0 8 2020 6 12 4 10 10 9 0 Friday t p v z 1 7 0 9 c q h b s
2 2020-12-10 2020-12-10 202012 6 6 2020 0 12 8 9 5 7 5 Thursday o z s o 7 0 2 4 n n x e v
3 2020-12-09 2020-12-09 202012 0 4 2020 1 12 2 9 0 9 3 Wednesday e f l f 10 4 1 0 s y h q p
4 2020-12-08 2020-12-08 202012 3 3 2020 3 12 9 4 5 1 2 Tuesday w j o b 4 3 9 1 j d v g c
5 2020-12-07 2020-12-07 202012 9 1 2020 5 12 1 1 6 4 5 Monday x w y j 8 4 10 7 e u k t p
6 2020-12-06 2020-12-06 202012 0 8 2020 10 12 5 5 1 9 10 Sunday p r o b 2 4 6 8 u w h s h
7 2020-12-05 2020-12-05 202012 9 6 2020 10 12 0 1 8 3 8 Saturday e r p e 8 0 10 4 n n o d f
8 2020-12-04 2020-12-04 202012 2 1 2020 8 12 10 8 8 9 9 Friday q r p q 8 8 7 3 s s d s m
9 2020-12-03 2020-12-03 202012 9 0 2020 5 12 10 5 7 5 10 Thursday n k e h 3 5 7 1 w z h k q
10 2020-12-02 2020-12-02 202012 5 7 2020 7 12 4 9 7 8 9 Wednesday a b m f 10 5 3 6 s k p s g