Tabulify - How to select data resources with a Glob Pattern
About
On this page, you will learn what a glob pattern is and how to use it to select data resources.
A glob pattern, or glob expression, is a string that is matched against another string. If the glob pattern matches a data resource name, the data resource is selected; otherwise it is not.
To express the pattern, special characters called wildcards are used. The following paragraphs go through each of these wildcards and show you how to use them.
For the hackers, this is like a regular expression, but simplified.
Steps
Prerequisites
You should have Tabulify installed on your computer.
See Learning Tabulify - Step 1 - Installation.
Star
The star character *, also known as the asterisk, matches any number of characters (including none).
If we want to select all data resources whose name ends with the term sales, we use the following glob pattern:
*sales
where:
- * matches all characters before sales
- sales matches itself.
Example:
tabli data list *sales@tpcds
PATH
-------------
catalog_sales
store_sales
web_sales
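The same selection can be sketched outside Tabulify. This snippet assumes that Python's fnmatch module behaves closely enough to Tabulify's matcher for this pattern; it is a stand-in for illustration, not Tabulify's implementation, and the list of names is a hand-picked subset of the tpcds resources.

```python
from fnmatch import fnmatchcase

# Illustrative subset of resource names (assumption: not the full tpcds list)
names = ["catalog_sales", "store_sales", "web_sales", "store_returns", "item"]

# "*" matches any run of characters, so "*sales" keeps the names ending with "sales"
selected = [name for name in names if fnmatchcase(name, "*sales")]
print(selected)
```

Names such as store_returns and item are dropped because nothing in the pattern can match their endings.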
Question mark
A question mark, ?, matches exactly one character.
Example:
- ???? - Matches all data resources whose name has exactly four characters
tabli data list ????@tpcds
PATH
----
item
- w?*sales - Matches any string beginning with w, followed by at least one character, and ending with sales
tabli data list w?*sales@tpcds
PATH
---------
web_sales
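The two patterns above can be checked with a small sketch. It again assumes that Python's fnmatch is an acceptable approximation of Tabulify's matcher; the name wsales is hypothetical and only there to show a non-match.

```python
from fnmatch import fnmatchcase

# Illustrative names; "wsales" is a made-up name used as a counter-example
names = ["item", "store", "web_sales", "web_site", "wsales"]

# "????" requires exactly four characters
four_chars = [n for n in names if fnmatchcase(n, "????")]

# "w?*sales" requires at least one character between "w" and "sales"
w_sales = [n for n in names if fnmatchcase(n, "w?*sales")]

print(four_chars)
print(w_sales)
```

The name wsales is rejected by w?*sales because the question mark must consume one character, while the star may consume none.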
Braces
Braces {} specify a collection of subpatterns.
For example:
- {item,store} matches item or store
tabli data list {item,store}@tpcds
PATH
-----
item
store
- {web*,store*} matches all data resource names that begin with web or store.
tabli data list {web*,store*}@tpcds
PATH
-------------
store
store_returns
store_sales
web_page
web_returns
web_sales
web_site
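Braces can be understood as a shorthand for several patterns tried one after the other. The sketch below illustrates that idea under stated assumptions: expand_braces and brace_match are hypothetical helpers written for this page, Python's fnmatch stands in for Tabulify's matcher, and the names are an illustrative subset.

```python
import re
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand {a,b,...} groups into the list of plain patterns they stand for."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        # Recurse so that a pattern with several groups is fully expanded
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def brace_match(name, pattern):
    """A name is selected when at least one expanded subpattern matches it."""
    return any(fnmatchcase(name, p) for p in expand_braces(pattern))


names = ["item", "store", "store_sales", "web_page", "web_site", "inventory"]
print([n for n in names if brace_match(n, "{item,store}")])
print([n for n in names if brace_match(n, "{web*,store*}")])
```

Expanding first and matching afterwards keeps the helper short; a real matcher may well handle braces in a single pass.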
Square brackets
Square brackets [] match a single character taken from:
- a set of single characters
- or a range of characters when the hyphen character (-) is used
Within the square brackets, the wildcards *, ?, and \ match themselves.
Example:
- [aeiou] matches any lowercase vowel.
- [0-9] matches any digit.
- [A-Z] matches any uppercase letter.
- [a-zA-Z] matches any uppercase or lowercase letter.
- [!abc] matches any single character except a, b, or c (exclusion).
- *[0-9]* matches all strings containing at least one digit.
Example:
- all data resources that have a w or a y in their name
tabli data list *[wy]*@tpcds
PATH
--------------------
inventory
s_inventory
s_warehouse
s_web_order
s_web_order_lineitem
s_web_page
s_web_returns
s_web_site
warehouse
web_page
web_returns
web_sales
web_site
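The pattern *[wy]* selects names that contain a w or a y somewhere. The sketch below reproduces this on an illustrative subset of names, assuming Python's fnmatch as a stand-in for Tabulify's matcher. It also shows a common pitfall: *[!wy]* does not mean "no w and no y in the name", it only asks for at least one character that is neither.

```python
from fnmatch import fnmatchcase

# Illustrative subset of resource names
names = ["inventory", "warehouse", "web_page", "item", "store", "reason"]

# Names with at least one "w" or "y"
with_w_or_y = [n for n in names if fnmatchcase(n, "*[wy]*")]

# Names with no "w" and no "y": negate the match rather than the bracket set
without_w_or_y = [n for n in names if not fnmatchcase(n, "*[wy]*")]

# Pitfall: "warehouse" still matches because it contains other characters
pitfall = fnmatchcase("warehouse", "*[!wy]*")

print(with_w_or_y)
print(without_w_or_y)
print(pitfall)
```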
Double Star (Recursive Search)
You can also search a container, such as a directory or a schema, recursively by adding the double star (double asterisk) to your glob pattern.
For instance, the pattern below searches for the files whose name starts with RE in the directory of the howto connection and in all its sub-directories:
tabli data list **/RE*@howto
Two README.md files match the pattern:
PATH
-------------------
README.md
recursive\README.md
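A comparable recursive search can be sketched with Python's pathlib, whose glob method also understands the double star. This is only an analogy under stated assumptions: the temporary directory and the file notes.txt are made up for the example, and pathlib's behavior is not guaranteed to be identical to Tabulify's.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build a small tree similar to the one listed above
    (root / "recursive").mkdir()
    (root / "README.md").write_text("top level")
    (root / "recursive" / "README.md").write_text("nested")
    (root / "notes.txt").write_text("not selected")

    # "**/" descends into every sub-directory, "RE*" filters on the file name
    found = sorted(
        p.relative_to(root).as_posix()
        for p in root.glob("**/RE*")
        if p.is_file()
    )

print(found)
```

The is_file filter keeps directories out of the result, which matters on file systems where name matching is case-insensitive.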
Escape
The escape character is the backslash \. It turns a wildcard into a literal character.
Syntax
\wildcard
For example:
- \\ will match a single backslash \
- \? matches the question mark ?
- \* matches the star *
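To make the escaping rule concrete, here is a minimal sketch of a glob-to-regex translation that supports only *, ?, and the backslash escape. The function glob_to_regex is a hypothetical helper written for this page, not Tabulify's implementation, and it deliberately ignores braces and square brackets.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob restricted to *, ? and backslash escapes into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # An escaped character always matches itself
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")   # any run of characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))


print(bool(glob_to_regex(r"what\?").fullmatch("what?")))
print(bool(glob_to_regex(r"what\?").fullmatch("whats")))
print(bool(glob_to_regex(r"a\*b").fullmatch("a*b")))
print(bool(glob_to_regex(r"a\\b").fullmatch("a\\b")))
```

Without the backslash, what? would also match whats, because the question mark would act as a wildcard again.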