Data comparison is the cornerstone of every development effort because it validates the data processing.
Tabulify ships with a data comparison operation that allows you to compare:
The comparison operation is implemented with the data compare command.
For the sake of simplicity, we will compare two CSV data resources, but you may compare any content data resource, such as a SQL table or a SQL query.
A missing record is easy to spot, but there is also a more subtle difference.
All comparisons are executed with the data compare command, which takes the source and target data resources as arguments.
The default report is a high-level summary that lists the source and target resources and the result of the comparison:
A total of 4 records were compared and 3 were different. That doesn't seem quite right, so let's ask for a comparison report at the record level to see the differences in detail.
To get a comparison at the record level, set the report-level option to record.
The report shows 3 columns that start with the prefix comp:
All the columns that follow are columns from the source.
In this report, we can see that for record_id 3, the target record Kahneman was compared to the source record Harbison Carnagey.
This is not right because the source also has a Kahneman record, and the two should have been compared with each other.
This is because the comparison was executed by record id and not by last name. The next step will show you how to define the unique column used to drive the comparison.
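To make the misalignment concrete, here is a minimal Python sketch of a comparison driven by record position. It is not Tabulify's implementation: the `compare_by_position` function, the `last_name` and `score` columns, and the rows are all hypothetical and chosen only for illustration.

```python
from itertools import zip_longest

# Hypothetical rows, not the CSV files used in this guide.
source = [
    {"last_name": "Adams", "score": 1},
    {"last_name": "Carnagey", "score": 2},
    {"last_name": "Kahneman", "score": 3},
    {"last_name": "Taleb", "score": 4},
]
# The target lacks the Carnagey row, so every later row shifts up by one.
target = [
    {"last_name": "Adams", "score": 1},
    {"last_name": "Kahneman", "score": 3},
    {"last_name": "Taleb", "score": 4},
]


def compare_by_position(source, target):
    """Pair rows by position and return (position, source_row, target_row, status)."""
    report = []
    pairs = zip_longest(source, target)  # pads the shorter side with None
    for position, (src, tgt) in enumerate(pairs, start=1):
        status = "equal" if src == tgt else "different"
        report.append((position, src, tgt, status))
    return report


report = compare_by_position(source, target)
different = sum(1 for row in report if row[3] == "different")
print(f"{len(report)} records compared, {different} different")
```

In this sketch a single missing row makes every record after it look different, even though the Kahneman and Taleb rows are identical in both inputs.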
To improve our comparison, we will define the unique column of our data resource with the --unique-column option.
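The idea behind a unique column can be sketched in Python as follows. This is only an illustration under stated assumptions, not how Tabulify implements the option: `compare_by_key`, the `last_name` and `score` columns, and the rows are hypothetical.

```python
# Hypothetical rows, not the CSV files used in this guide.
source = [
    {"last_name": "Adams", "score": 1},
    {"last_name": "Carnagey", "score": 2},
    {"last_name": "Kahneman", "score": 3},
    {"last_name": "Taleb", "score": 4},
]
target = [
    {"last_name": "Adams", "score": 1},
    {"last_name": "Kahneman", "score": 3},
    {"last_name": "Taleb", "score": 5},  # subtle difference in one value
]


def compare_by_key(source, target, key):
    """Pair rows that share the same value in `key` and return {key_value: status}."""
    src_by_key = {row[key]: row for row in source}
    tgt_by_key = {row[key]: row for row in target}
    report = {}
    for value in sorted(src_by_key.keys() | tgt_by_key.keys()):
        src = src_by_key.get(value)
        tgt = tgt_by_key.get(value)
        if src is None:
            report[value] = "missing in source"
        elif tgt is None:
            report[value] = "missing in target"
        elif src == tgt:
            report[value] = "equal"
        else:
            report[value] = "different"
    return report


report = compare_by_key(source, target, key="last_name")
for last_name, status in report.items():
    print(last_name, status)
```

Because rows are matched on the key rather than on their position, a missing record is reported once and does not disturb the records around it, which leaves only the genuine value differences to inspect. The sketch assumes the key values are unique within each input.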
This time the comparison has:
We can see that:
This page ends the learning guide of Tabulify at the command line with Tabli, where we have learned to perform data operations one command at a time.