To select data resources such as a file or a database table, Tabulify uses the concepts of:
- data selector
- and dependency (do we also select the dependent data resources?)
This page goes through these concepts with explanations and examples.
A data selector is composed of two parts:
- a glob pattern
- and a connection
- separated by the @ (at) sign.
A data selector looks like this: pattern@connection
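The two-part structure of a data selector can be sketched in a few lines of Python. This is only an illustration of the syntax, not Tabulify's own parser: the names `DataSelector` and `parse_selector` are hypothetical, and splitting at the last @ sign is an assumption.

```python
from typing import NamedTuple, Optional


class DataSelector(NamedTuple):
    """Hypothetical holder for the two parts of a data selector."""
    pattern: str
    connection: Optional[str]  # None when the selector has no @ part


def parse_selector(selector: str) -> DataSelector:
    # Assumption: the connection is whatever follows the LAST '@' sign.
    pattern, separator, connection = selector.rpartition("@")
    if not separator:
        # No '@' found: the whole string is the glob pattern.
        return DataSelector(selector, None)
    return DataSelector(pattern, connection)


print(parse_selector("*sales@tpcds"))
# DataSelector(pattern='*sales', connection='tpcds')
```

A selector without a connection part, such as `*`, comes back with `connection` set to `None` in this sketch.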
A glob pattern defines the name or the path of the data resource located in the system of its connection.
For instance, with the internal TPC-DS data store, the list command below selects all tables whose names end with sales, because the * character matches any sequence of characters.
tabli data list *sales@tpcds
- tabli is the main command line utility
- data is a module (i.e. the data module)
- list is a command
- *sales@tpcds is a data selector that selects data resources.
- tpcds defines the connection
- *sales defines the tables to look for with a glob pattern. In our case, all tables whose names finish with the word sales, because * is the globbing star and matches any sequence of characters.
path
-------------
catalog_sales
store_sales
web_sales
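To see why `*sales` keeps only these three tables, the same match can be reproduced with Python's standard `fnmatch` module. This is a sketch of glob matching in general, assuming that Tabulify's globbing star behaves like the one in `fnmatch`; the `select` helper is hypothetical, and the table names are the ones listed on this page.

```python
from fnmatch import fnmatchcase

# Table names of the tpcds connection, as listed on this page.
TPCDS_TABLES = [
    "call_center", "catalog_page", "catalog_sales", "customer",
    "customer_address", "customer_demographics", "date_dim",
    "household_demographics", "income_band", "item", "promotion",
    "ship_mode", "store", "store_sales", "time_dim", "warehouse",
    "web_page", "web_sales", "web_site",
]


def select(pattern, names):
    """Return the names that match the glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(select("*sales", TPCDS_TABLES))
# ['catalog_sales', 'store_sales', 'web_sales']
```

With the same helper, a pattern such as `web_*` would keep `web_page`, `web_sales` and `web_site`.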
For more practice with glob patterns, see the page How to select data resources with a Glob Pattern.
Selection with dependencies
When moving data, foreign-key constraints mean that you need to move the data resources together with their dependencies.
That's why Tabulify offers the --with-dependencies flag, which also selects the dependent resources of the selected data resources.
Example: all tables in the tpcds system whose names end with sales, together with their dependent tables
tabli data list --with-dependencies *sales@tpcds
path
----------------------
call_center
catalog_page
catalog_sales
customer
customer_address
customer_demographics
date_dim
household_demographics
income_band
item
promotion
ship_mode
store
store_sales
time_dim
warehouse
web_page
web_sales
web_site
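Selecting with dependencies amounts to following foreign keys from the selected tables until no new table is found. The Python sketch below illustrates that idea only; it is not Tabulify's implementation. The `with_dependencies` helper is hypothetical, and `FOREIGN_KEYS` is a small, abridged map written for illustration, not the full TPC-DS schema.

```python
def with_dependencies(selected, foreign_keys):
    """Return the selected tables plus every table they depend on,
    directly or transitively, in alphabetical order."""
    result = set()
    stack = list(selected)
    while stack:
        table = stack.pop()
        if table in result:
            continue  # already visited, also protects against cycles
        result.add(table)
        # Follow the foreign keys of this table, if it has any.
        stack.extend(foreign_keys.get(table, ()))
    return sorted(result)


# Abridged, illustrative map: table -> tables it references.
FOREIGN_KEYS = {
    "store_sales": ["customer", "household_demographics", "item", "store"],
    "customer": ["customer_address"],
    "household_demographics": ["income_band"],
}

print(with_dependencies(["store_sales"], FOREIGN_KEYS))
# ['customer', 'customer_address', 'household_demographics',
#  'income_band', 'item', 'store', 'store_sales']
```

Note that `income_band` and `customer_address` are picked up even though `store_sales` does not reference them directly in this map, which is why the list returned with dependencies can be much longer than the initial selection.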
Local File System
The connection part of a data selector is not mandatory as the default connection is the local file system.
Therefore, running the list command with a data selector that has no connection gives you a list of the files in your current directory.
tabli data list *
path
-----------------------
README.md
characters.csv
date_dim--datagen.yml
sequence--datagen.yml
This is the equivalent of the ls command.
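For comparison, a glob pattern applied to a local directory can be sketched with Python's standard `pathlib` module. The `list_local` helper is hypothetical and only mirrors the idea of a selector without a connection. Its matching rules are those of `pathlib`, which may differ from Tabulify's in details such as hidden files.

```python
from pathlib import Path
from typing import List


def list_local(pattern: str, directory: str = ".") -> List[str]:
    """Return the sorted names of the entries in `directory`
    that match the glob pattern."""
    return sorted(path.name for path in Path(directory).glob(pattern))


# Every entry of the current directory, like the selector `*` above.
print(list_local("*"))
```

A narrower pattern such as `*.csv` would keep only the CSV files of the directory.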
Now that we know how to select data resources, the next page will show you how to print their content.