Learning Tabulify - Step 9 - How to fill a data resource with generated data ?

Data Generation

Tabulify integrates natively a data generator.

You can generate realistic production data and start working on your project right away.

Because the data is fake but realistic, you don't need to:

  • anonymize production data in your development environment because of the privacy laws
  • or create any acceptance environment.

The fill command

The data fill operation is an operation that will select target data resource and fill them with data.

Tabulify supports two mode:

  • auto - the data generated is automatically chosen
  • generator - the data generated is defined in a file called the generator

The fill operation is supported by the data fill command.

Auto Fill

Let's first delete all data with the data truncate command to get a clean schema.

tabli data truncate *@sqlite

The below fill command will fill all tables with auto-generated data

The data fill command loads 1000 records for each table because this is the default value of the max-record-count option ( This option defines the number of records generated ).

Query 11

By running the query 11 (of the query lesson), we don't get any data back.

tabli data print (sqlite/[email protected]_query)@sqlite
customer_id   customer_first_name   customer_last_name   customer_email_address
-----------   -------------------   ------------------   ----------------------


Why ? Because the query 11 is based on time data of the year 2001 and unfortunately the auto-generated data does not contain 2001 in the d_year column.

tabli data head --limit 10 [email protected]
The first 10 rows of the data resource ([email protected]):
d_date_sk   d_date_id   d_date       d_month_seq   d_week_seq   d_quarter_seq   d_year   d_dow   d_moy   d_dom   d_qoy   d_fy_year   d_fy_quarter_seq   d_fy_week_seq   d_day_name   d_quarter_name   d_holiday   d_weekend   d_following_holiday   d_first_dom   d_last_dom   d_same_day_ly   d_same_day_lq   d_current_day   d_current_week   d_current_month   d_current_quarter   d_current_year
---------   ---------   ----------   -----------   ----------   -------------   ------   -----   -----   -----   -----   ---------   ----------------   -------------   ----------   --------------   ---------   ---------   -------------------   -----------   ----------   -------------   -------------   -------------   --------------   ---------------   -----------------   --------------
        1   a           2020-12-06             8            7               8        3       5       8       2       2           1                  3               2   j            e                r           w           q                               8            4               9               0   x               g                b                 k                   i
        2   b           2020-12-04             0            6               7        0       1       5       6       5           4                  4               4   t            n                h           t           l                               3            8               4               5   b               j                x                 l                   a
        3   c           2020-12-12             5            3               1        7       9       0       0       7           9                  5               9   m            c                c           b           g                               1            0               7               1   o               h                h                 f                   a
        4   d           2020-12-12             9            4               9        9       0       4       9       2           8                 10               9   b            l                b           d           o                               8            2              10               1   d               h                k                 n                   n
        5   e           2020-12-02             4            8               3        6       2       2       3       8           8                  8               8   i            e                e           g           k                               3            1               8               0   b               l                w                 a                   p
        6   f           2020-12-09             8            2               0       10       3       6       2       0           5                  2               6   i            u                c           i           o                               4            1               3               5   e               q                p                 t                   v
        7   g           2020-12-10             5           10              10        8       6       2       1       3          10                  7               2   e            o                h           h           a                               9            5               8               8   q               p                k                 n                   f
        8   h           2020-12-06             1            6              10        7       7       5       5       2           1                 10               1   t            w                f           o           k                              10           10               4               4   y               j                m                 j                   h
        9   i           2020-12-06             6            2               8        1       5       1       5       3           6                  9              10   s            q                l           v           d                               6            5               0               6   o               f                v                 n                   x
       10   j           2020-12-03             6            8               6        8       6       0       3       7           9                  6               2   v            s                w           o           a                               1           10               1               6   w               n                v                 x                   q

To update the column dyear with data from the year 2001, we will use a generator in the next section.

Generator

A generator is a file that contains the data generation definition.

For each column, a column data generator is defined that control the data generated.

The below generator generates one year of data with two columns:

LogicalName: date_dim
Columns:
  - name: d_date
    type: date
    comment: A column with a sequence generator that generates a date sequence from 2001-01-01 and after
    DataGenerator:
      type: sequence
      start: 2001-01-01
      step: 1
  - name: d_year
    precision: 4
    comment: A column with a expression generator that extract the year from the date column
    DataGenerator:
      type: expression
      ColumnParents: d_date
      expression: "x.getFullYear()"


This generator is also a content resource and therefore you can use it as any tabular resource and take a look at the data generated

tabli data head [email protected]
The first 10 rows of the data resource ([email protected]):
d_date       d_year
----------   ------
2001-01-01   2001
2001-01-02   2001
2001-01-03   2001
2001-01-04   2001
2001-01-05   2001
2001-01-06   2001
2001-01-07   2001
2001-01-08   2001
2001-01-09   2001
2001-01-10   2001

Fill with generators

After having created a generator for the date_dim table, we can pass it to the data fill command with the –generator-selector option to make the data generation more controled.

tabli data fill --generator-selector [email protected]   *@sqlite

As the option generator-selector is a resource selector, you can create a generator for each table where you want to customize the generated data and select them with the glob pattern.

Output:

Source                          Target                          Latency (ms)   Row Count   Error   Message
-----------------------------   -----------------------------   ------------   ---------   -----   -------
[email protected]              [email protected]              131            1000
[email protected]             [email protected]             49             1000
[email protected]            [email protected]            130            1000
[email protected]                 [email protected]                 71             1000
custom[email protected]         [email protected]         64             1000
[email protected]    [email protected]    84             1000
[email protected]                 [email protected]                 1794           1000
[email protected]   [email protected]   32             1000
[email protected]              [email protected]              30             1000
[email protected]                     [email protected]                     145            1000
[email protected]                [email protected]                77             1000
[email protected]                [email protected]                42             1000
[email protected]                    [email protected]                    108            1000
[email protected]              [email protected]              94             1000
[email protected]                 [email protected]                 53             1000
[email protected]                [email protected]                79             1000
[email protected]                 [email protected]                 66             1000
[email protected]                [email protected]                131            1000
[email protected]                 [email protected]                 122            1000

And the query 11 is now giving back a result. The generated data is minimal and should be further defined.

tabli data print (sqlite/[email protected]_query)@sqlite
customer_id   customer_first_name   customer_last_name   customer_email_address
-----------   -------------------   ------------------   ----------------------
bbi           g                     k                    e
bbx           j                     a                    w
bhj           h                     r                    i
cw            z                     w                    p
ga            q                     e                    d
kd            w                     f                    h
sl            a                     x                    q

Next

Learn how to compare data resource.

Data Resource Comparison




Related Pages
Learning Tabulify - Step 8 - How to modify the content of a table

Tabulify learning guidetransfer step The transfer page should have made you familiar with the transfer concept where we have introduced the copy operation. The copy and move operation does not modify...

Task Runner