Learning Tabulify - Step 4 - How to select Data Resources

Concepts

To select data resources such as files or database tables, Tabulify uses the concepts of a data selector and a glob pattern.

This page goes through these concepts with explanations and examples.

Data Selector

A data selector is composed of two parts: a glob pattern and a connection name, separated by the @ character.

A data selector looks like this:

globPattern@connection

The glob pattern defines the name or the path of the data resources located in the system of the connection.
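
The two-part structure can be illustrated with a short Python sketch. It is not Tabulify's parser: the function name split_selector is hypothetical, and splitting on the last @ is an assumption made for illustration. When the selector has no connection part, the sketch returns None in its place.

```python
from typing import Optional, Tuple


def split_selector(selector: str) -> Tuple[str, Optional[str]]:
    """Split 'globPattern@connection' into its two parts.

    Hypothetical helper: splitting on the last '@' is an assumption,
    not Tabulify's documented parsing rule.
    """
    pattern, separator, connection = selector.rpartition("@")
    if not separator:
        # No '@' found: the whole selector is the glob pattern.
        return selector, None
    return pattern, connection


print(split_selector("*sales@tpcds"))  # ('*sales', 'tpcds')
print(split_selector("*"))             # ('*', None)
```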

Normal Selection

For instance, with the internal TPC-DS data store, the list command below selects all tables whose names end with sales, because the * character matches any sequence of characters.

tabul data list *sales@tpcds

where *sales is the glob pattern and tpcds is the name of the connection.

Output:

path            media_type
-------------   ------------
catalog_sales   sql/relation
store_sales     sql/relation
web_sales       sql/relation
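
To see why only these three tables are selected, the matching can be reproduced with Python's standard fnmatch module. This is only a sketch of glob matching in general: Tabulify's own glob implementation may support different wildcards, and the select function is a hypothetical name. The table names are a subset of the tpcds tables listed on this page.

```python
from fnmatch import fnmatchcase

# A few table names taken from the tpcds listings on this page.
tables = [
    "catalog_page",
    "catalog_sales",
    "date_dim",
    "store",
    "store_sales",
    "web_sales",
    "web_site",
]


def select(pattern, names):
    """Return the names matched by a glob pattern, in their original order."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(select("*sales", tables))
# ['catalog_sales', 'store_sales', 'web_sales']
```

A name such as store does not match because the pattern requires the name to end with sales, while * may match an empty or arbitrarily long prefix.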

To get more practice with glob patterns, see the page Tabulify - How to select data resources with a Glob Pattern.

Selection with dependencies

When moving data, foreign-key constraints mean that you need to move the data resources together with their dependencies.

That's why Tabulify offers the --with-dependencies flag, which also selects the resources that the selected data resources depend on.

Example: all tables in the tpcds connection whose names end with sales, together with the tables they depend on.

tabul data list --with-dependencies *sales@tpcds

Output:

path                     media_type
----------------------   ------------
call_center              sql/relation
catalog_page             sql/relation
catalog_sales            sql/relation
customer                 sql/relation
customer_address         sql/relation
customer_demographics    sql/relation
date_dim                 sql/relation
household_demographics   sql/relation
income_band              sql/relation
item                     sql/relation
promotion                sql/relation
ship_mode                sql/relation
store                    sql/relation
store_sales              sql/relation
time_dim                 sql/relation
warehouse                sql/relation
web_page                 sql/relation
web_sales                sql/relation
web_site                 sql/relation
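
The idea of selecting a resource together with its dependencies can be sketched as a graph traversal. The Python below is an illustration only, not Tabulify's implementation: the references map is a hand-written, simplified subset of foreign keys assumed for the example, and with_dependencies is a hypothetical function name.

```python
# Illustrative foreign-key map: table -> tables it references.
# Hand-written, simplified subset assumed for this example.
references = {
    "store_sales": ["store", "item", "household_demographics"],
    "household_demographics": ["income_band"],
    "store": [],
    "item": [],
    "income_band": [],
}


def with_dependencies(selected, references):
    """Return the selected tables plus everything they reference, transitively."""
    result = set()
    stack = list(selected)
    while stack:
        table = stack.pop()
        if table in result:
            continue  # already visited, avoids loops on cyclic references
        result.add(table)
        stack.extend(references.get(table, []))
    return sorted(result)


print(with_dependencies(["store_sales"], references))
# ['household_demographics', 'income_band', 'item', 'store', 'store_sales']
```

In this sketch income_band is included even though store_sales does not reference it directly, because the traversal follows references of references.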

Local File System

The connection part of a data selector is optional: the default connection is the local file system.

Therefore, running the list command with a data selector that has no connection lists the files in your current directory.

tabul data list *

Output:

path
-----------------------
README.md
characters.csv
date_dim--generator.yml
sequence--generator.yml

This is the equivalent of the ls command.
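
For comparison, the same kind of listing can be written in a few lines of Python. This is a sketch of the concept rather than what Tabulify does internally; list_local is a hypothetical helper, and it matches file names with the standard fnmatch module.

```python
import os
from fnmatch import fnmatchcase


def list_local(pattern, directory="."):
    """List the entries of a directory whose names match a glob pattern."""
    return sorted(
        name for name in os.listdir(directory) if fnmatchcase(name, pattern)
    )


# Print every entry of the current directory, one per line.
for name in list_local("*"):
    print(name)
```

Note that os.listdir also returns entries whose names start with a dot, which ls hides by default.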

Next

Now that we know how to select data resources, the next page will show you how to print their content.

How to print Data Resources