---json
{
"canonical": "tpcds",
"description": "This page talks about the TPC-DS benchmark and how you can use it with Tabulify",
"images": [
{ "path": ":docs:system:tpc.ico" },
{ "path": ":docs:system:tpc.png" }
],
"name": "Tpc-ds",
"title": "How to load the TPC-DS benchmark data and execute its queries"
}
---
====== Tabulify - TPC-DS (Benchmark) ======
{{ :howto:database:tpc.svg?200|}}
===== About =====
''Tabulify'' supports the [[http://www.tpc.org/tpcds/|TPC-DS]] [[database|database]] benchmark in the following ways:
* the [[howto:database:tabli_create_table_with_dependencies|creation of schema]]
* the [[howto:getting_started:6_transfer_data_resource|transfer of data]]
* the execution of [[docs:connection:tpcds_query|TPC-DS queries]]
TPC-DS is a widely recognized benchmark for evaluating the performance of data warehouses and analytical databases. It involves a dataset spread across 24 tables.
The benchmark includes 99 complex queries designed to test various aspects of database performance, such as joins, aggregations, and subqueries.
The TPC-DS schema is a snowflake schema that models three retail sales channels:
* web,
* catalog,
* and store sales.
===== Size 1TB =====
At the 1TB scale, TPC-DS uses a dataset of approximately 1TB
containing around 6.35 billion records
spread across the 24 tables.
The 1TB scale is considered moderate for a data warehouse,
but it remains challenging because of the complexity of the queries
and the large number of records.
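As a back-of-the-envelope check, the two figures above give the average size of a record. The sketch below assumes that 1TB means 10^12 bytes (a decimal terabyte) and takes the record count from the text.

```python
# Figures taken from the text above.
# Assumption: 1 TB is a decimal terabyte (10**12 bytes).
dataset_bytes = 10**12
record_count = 6.35e9

# Average size of one record over the whole dataset
avg_bytes_per_record = dataset_bytes / record_count
print(f"{avg_bytes_per_record:.0f} bytes per record on average")
```

This is only an average over all tables: fact rows and dimension rows differ widely in width.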
===== Operations =====
==== Schema Management ====
This section shows you how to manage the sub-schemas of ''TPC-DS''.
=== All tables ===
The ''*@tpcds'' selector matches all ''TPC-DS'' tables.
* [[docs:tabli:data:list]]
tabli data list *@tpcds
* [[docs:tabli:data:create]]
tabli data create *@tpcds @targetConnection
* [[docs:tabli:data:fill]]
tabli data fill *@tpcds @targetConnection
=== Dwh ===
The data-warehouse tables: all tables except those whose name starts with an ''s'' (i.e. without the staging tables).
* [[docs:tabli:data:list]]
tabli data list [!s]*@tpcds
* [[docs:tabli:data:create]]
tabli data create [!s]*@tpcds @targetConnection
* [[docs:tabli:data:fill]]
tabli data fill [!s]*@tpcds @targetConnection
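The selectors above are glob patterns. As a rough illustration of how a pattern such as ''[!s]*'' filters names, here is a minimal Python sketch based on the standard ''fnmatch'' module. It assumes that the glob follows the usual shell semantics, and the table list is a small illustrative sample, not the full ''tpcds'' listing.

```python
from fnmatch import fnmatchcase

# Illustrative sample of table names (assumption: not the full tpcds listing)
tables = ["call_center", "date_dim", "item", "s_purchase", "s_web_order"]


def select(pattern, names):
    """Return the names matching a shell-style glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


all_tables = select("*", tables)      # every name
dwh_tables = select("[!s]*", tables)  # names whose first character is not 's'

print(dwh_tables)
```

Under these semantics, ''[!s]'' tests only the first character, so every name that begins with ''s'' is left out.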
=== Store Sales ===
The ''store-sales'' schema contains the ''store_sales'' and ''store_returns'' star schemas (a data-warehouse schema).
* [[docs:tabli:data:list]]
tabli data list --with-dependencies store*@tpcds
* [[docs:tabli:data:create]]: with the same selector, you can create the tables
tabli data create --with-dependencies store*@tpcds @targetConnection
* [[docs:tabli:data:copy]]
tabli data copy --with-dependencies store*@tpcds @targetConnection
This article explains the technique: [[howto:database:tabli_create_table_with_dependencies|how to select a star schema]]
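To make the idea behind ''--with-dependencies'' concrete, here is a minimal Python sketch: select the tables that match a glob, then follow the foreign keys to pull in every table they reference. The ''FOREIGN_KEYS'' map is a small hand-written subset for illustration, and ''with_dependencies'' is a hypothetical helper, not Tabulify's implementation.

```python
from fnmatch import fnmatchcase

# Hypothetical, partial foreign-key map: table -> tables it references.
FOREIGN_KEYS = {
    "store_sales": ["date_dim", "item", "customer", "store"],
    "store_returns": ["date_dim", "item", "customer", "store", "reason"],
    "store": ["date_dim"],
    "customer": ["customer_address"],
    "web_sales": ["date_dim", "item", "web_site"],
}


def with_dependencies(pattern, foreign_keys):
    """Return tables matching `pattern` plus everything they reference, transitively."""
    # Collect every table name known to the map (referencing or referenced)
    all_names = set(foreign_keys)
    for targets in foreign_keys.values():
        all_names.update(targets)

    # Start from the glob matches and walk the foreign keys
    pending = [name for name in all_names if fnmatchcase(name, pattern)]
    selected = set()
    while pending:
        table = pending.pop()
        if table in selected:
            continue
        selected.add(table)
        pending.extend(foreign_keys.get(table, []))
    return sorted(selected)


star = with_dependencies("store*", FOREIGN_KEYS)
print(star)
```

In this sketch, ''web_sales'' and ''web_site'' stay out of the selection because no table matching ''store*'' references them.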
==== Note on the schema ====
The ''TPC-DS'' benchmark does not define the ''B'' columns (''business keys'') as unique keys. Our implementation makes them all unique (except on the ''item'' table, where the column is unique only in combination with the start and end dates).
Why? Because when ''TPC-DS'' is used as a sample schema, the [[docs:generator:generator|data generator]] then creates data that is consistent with the [[docs:connection:tpcds_query|queries]].
For TPC-DS, a business key is neither a primary key nor a foreign key in the context of the data warehouse schema. It is only used to distinguish new data from updated data in the source tables during the data maintenance operations.
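The uniqueness rule described above can be sketched with SQLite from Python. The column names follow the TPC-DS specification, but the tables are reduced to a few columns and the DDL is a simplified assumption for illustration, not the DDL that Tabulify generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified demo tables (assumption): surrogate key, business key, validity dates.
conn.executescript("""
CREATE TABLE customer_demo (
    c_customer_sk INTEGER PRIMARY KEY,
    c_customer_id TEXT NOT NULL UNIQUE      -- business key, unique on its own
);
CREATE TABLE item_demo (
    i_item_sk        INTEGER PRIMARY KEY,
    i_item_id        TEXT NOT NULL,         -- business key
    i_rec_start_date TEXT NOT NULL,
    i_rec_end_date   TEXT NOT NULL,
    UNIQUE (i_item_id, i_rec_start_date, i_rec_end_date)
);
""")

# Two versions of the same item: same business key, different validity period
conn.execute("INSERT INTO item_demo VALUES (1, 'AAAA', '2000-01-01', '2000-12-31')")
conn.execute("INSERT INTO item_demo VALUES (2, 'AAAA', '2001-01-01', '2001-12-31')")

# The same business key twice in customer_demo violates the unique constraint
conn.execute("INSERT INTO customer_demo VALUES (1, 'CUST1')")
try:
    conn.execute("INSERT INTO customer_demo VALUES (2, 'CUST1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

item_versions = conn.execute(
    "SELECT COUNT(*) FROM item_demo WHERE i_item_id = 'AAAA'"
).fetchone()[0]
print(duplicate_rejected, item_versions)
```

The composite constraint on ''item_demo'' lets the same business key appear once per validity period, while ''customer_demo'' rejects any repeated key.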