Tabli - Data Compare Command (Diff)

Undraw Windows

About

data compare is a tabli command that supports the data compare operations (diff) against one or several data resources.

You can compare:

Syntax

tabli data compare -h
Tabli data compare
==================

Performs a comparison (diff) between data resources.

Prerequisites:
  * The data resources to compare should have been sorted in a ascendant order on the unique columns
  * The data resources to compare should have the same number of columns

Unique Columns:
It's highly recommended to set the `--unique-column` option.
The `unique columns` are the unique identifier for a record and drives the gathering of the record to compare.
Without unique columns, the comparison can not tell if a record is absent from one set or another.
By default, this is
  * the record id (ie row id, line id) for a data comparison
  * the column name for a data structure comparison

With the `--report-level` option, you can control the output:
  * `resource` will return a data comparison reported resource by resource (default),
  * `record` will return a data comparison reported record by record (A diff),
  * `all` will return them both

With the `--data-source` option, you can control the compared data:
  * `content` will perform a comparison on the content of the data resource,
  * `content` will perform a comparison on the structure of the data resource (by attribute name)
  * `attribute` will perform a comparison on the attributes of the data resource (by attribute key)

Exit:
If there is a non-equality (ie a diff), the process will exist with an error status.

Data Comparison Row by Row Report:
The row by row report adds the following columns:
  * `comp_id`: a row id of the comparison report
  * `comp_origin `: the origin of the record (the source, the target or both)
  * `comp_comment`: a high level comment that explains the difference
  * `comp_diff_id`: the id of a difference - not null if a difference has been seen. Two records with the same diff id have been compared
  * `comp_origin_id`: the record id if the comparison can not determine an unique column



Examples
--------

 1 - Data comparison between two different queries:


    tabli data compare (queryFile1.sql)@sqlite (queryFile2.sql)@sqlite


 2 - Data comparison between a query and a table on two different systems


    tabli data compare (queryFile.sql)@sqlite table@postgres




Syntax
------


    tabli data compare [options|flags] <source-selector> <target-data-uri>


where:


  Arguments:

    <source-selector>                                       A data selector that select the data resources to compare

    <target-data-uri>                                       A target data URI that defines the target to compare - if not defined, an empty target


  Options:

    -ds,--data-source <content|structure>                   Set the origin of data to use for the comparison

    -rl,--report-level <resource|record|all>                Set the report level returned

    --unique-column <columnName>                            The unique column name of the data resource (Default to record id for data comparison and to `name` for a structure comparison)


  Data Definition Options:

    -sa,--source-attribute <attributeName=value>            Set a source data resource attribute

    -ta,--target-attribute <attributeName=value>            Set a target data resource attribute


  Selection Options:

    -wd,--with-dependencies                                 If set, the dependencies will be also selected


  Global Options:

    -cf,--conf <path>                                       The path to a configuration file

    -cv,--connection-vault <path>                           The path where the connection vault is located

    -h,--help                                               Print this help

    -l,--log-level <error|warning|tip|info|fine>            Set the log level

    -ns,--not-strict                                        A minor error will not stop the process.

    -odu,--output-data-uri <outputDataUri>                  defines the output data uri for the feedback data (default: console)

    -oo,--output-operation <dataOperation>                  defines the data operations (replace, truncate) on an existing output resource before transfer.

    -oop,--output-transfer-operation <transferOperation>    defines the output transfer operation (insert, update, merge, copy). Default to `copy` for a file system and `insert` for a database.

    -pp,--passphrase <passphrase>                           A passphrase (master password) to decrypt the encrypted values




Related HowTo
Undraw Windows
Learning Tabulify - Step 10 - How to compare data resources ?

Data comparison is the corner stone of every development because it validates the data processing. Tabulify ships with a Data comparison operation that allows you to compare: the data content and...

Task Runner