To select data resources such as files or database tables, Tabulify uses the concept of a data selector.
This page goes through this concept with explanations and examples.
A data selector is composed of two parts, a glob pattern and a connection name, separated by the @ character.
A data selector looks like this:
globPattern@connection
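To make the two parts concrete, here is a minimal Python sketch that splits a selector string into its glob pattern and its connection. It is not Tabulify's implementation: the names `DataSelector` and `parse_selector` are hypothetical, and splitting on the last `@` is an assumption made for illustration.

```python
from typing import NamedTuple, Optional


class DataSelector(NamedTuple):
    """Hypothetical container for the two parts of a data selector."""
    glob_pattern: str
    connection: Optional[str]  # None means no connection was given


def parse_selector(selector: str) -> DataSelector:
    # Assumption: the connection is whatever follows the last '@'.
    glob_pattern, separator, connection = selector.rpartition("@")
    if not separator:
        # No '@' in the string: the whole string is the glob pattern.
        return DataSelector(selector, None)
    return DataSelector(glob_pattern, connection)


print(parse_selector("*sales@tpcds"))
# DataSelector(glob_pattern='*sales', connection='tpcds')
print(parse_selector("*"))
# DataSelector(glob_pattern='*', connection=None)
```

Returning `None` for a missing connection leaves it to the caller to decide which connection to fall back to.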
A glob pattern defines the name or the path of the data resources located in the system of its connection.
For instance, with the internal TPC-DS data store, the list command below selects all tables whose name ends with the term sales, because the * character matches any sequence of characters.
tabul data list *sales@tpcds
where *sales is the glob pattern and tpcds is the connection name.
Output:
path          media_type
------------- ------------
catalog_sales sql/relation
store_sales   sql/relation
web_sales     sql/relation
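The matching behaviour of `*sales` can be reproduced with a short Python sketch. It uses the standard `fnmatch` module as a stand-in; Tabulify's own glob dialect may differ in its details, and the `select` helper is a hypothetical name. The table names are a handful taken from the outputs on this page.

```python
from fnmatch import fnmatchcase

# A few table names taken from the outputs on this page.
tables = [
    "catalog_page",
    "catalog_sales",
    "store",
    "store_sales",
    "web_sales",
    "web_site",
]


def select(names, glob_pattern):
    """Keep the names that match the glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, glob_pattern)]


print(select(tables, "*sales"))
# ['catalog_sales', 'store_sales', 'web_sales']
```

Because `*` can also match an empty sequence, a table named just `sales` would be selected too.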
To get more practice with glob patterns, see the page Tabulify - How to select data resources with a Glob Pattern.
When moving data, foreign-key constraints require you to move the data resources together with their dependencies.
That's why Tabulify offers the --with-dependencies flag, which also selects the dependencies of the selected data resources.
Example: all tables in the tpcds system whose name ends with sales, together with the tables they depend on
tabul data list --with-dependencies *sales@tpcds
path                   media_type
---------------------- ------------
call_center            sql/relation
catalog_page           sql/relation
catalog_sales          sql/relation
customer               sql/relation
customer_address       sql/relation
customer_demographics  sql/relation
date_dim               sql/relation
household_demographics sql/relation
income_band            sql/relation
item                   sql/relation
promotion              sql/relation
ship_mode              sql/relation
store                  sql/relation
store_sales            sql/relation
time_dim               sql/relation
warehouse              sql/relation
web_page               sql/relation
web_sales              sql/relation
web_site               sql/relation
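Conceptually, selecting with dependencies amounts to following foreign keys from the matched tables until no new table is reached. The Python sketch below illustrates that idea only; it is not how Tabulify is implemented. The `FOREIGN_KEYS` map is a small, hand-written and deliberately partial example (not the full TPC-DS schema), and `select_with_dependencies` is a hypothetical name.

```python
from fnmatch import fnmatchcase

# Hypothetical, partial foreign-key map: table -> tables it references.
FOREIGN_KEYS = {
    "store_sales": ["store", "item", "household_demographics"],
    "household_demographics": ["income_band"],
    "store": [],
    "item": [],
    "income_band": [],
    "reason": [],  # referenced by nothing selected, so it stays out
}


def select_with_dependencies(glob_pattern, foreign_keys):
    """Return the matching tables plus everything they depend on, transitively."""
    selected = {table for table in foreign_keys if fnmatchcase(table, glob_pattern)}
    stack = list(selected)
    while stack:
        table = stack.pop()
        for parent in foreign_keys.get(table, []):
            if parent not in selected:
                selected.add(parent)
                stack.append(parent)
    return sorted(selected)


print(select_with_dependencies("*sales", FOREIGN_KEYS))
# ['household_demographics', 'income_band', 'item', 'store', 'store_sales']
```

Note that `income_band` is picked up even though `store_sales` does not reference it directly: it is a dependency of a dependency.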
The connection part of a data selector is not mandatory because the default connection is the local file system.
Therefore, running the list command with a data selector that has no connection gives you a list of the files in your current directory.
tabul data list *
path
-----------------------
README.md
characters.csv
date_dim--generator.yml
sequence--generator.yml
This is the equivalent of the ls command.
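The same idea can be sketched in Python for a local directory: list the entries and keep those that match the glob pattern. This is an illustration only, with a hypothetical `list_local` helper and a temporary directory holding two of the file names shown above. One difference worth noting is that `fnmatch` lets `*` match names starting with a dot, whereas a plain `ls` hides them.

```python
import os
import tempfile
from fnmatch import fnmatchcase


def list_local(glob_pattern, directory="."):
    """List the entries of a directory whose name matches the glob pattern."""
    return sorted(
        name for name in os.listdir(directory) if fnmatchcase(name, glob_pattern)
    )


# Build a throwaway directory so the example does not depend on your files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("README.md", "characters.csv"):
        open(os.path.join(tmp, name), "w").close()

    print(list_local("*", tmp))      # ['README.md', 'characters.csv']
    print(list_local("*.csv", tmp))  # ['characters.csv']
```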
Now that we know how to select data resources, the next page will show you how to print their content.