---json
{
"page_id": "bpny66tzanc0lha87nksk"
}
---
====== Histogram Generator ======
===== About =====
A ''histogram generator'' is a [[data-supplier|column data generator]] that [[generator|generates]] a ''value'' according to its ''chance factor''.
Use this generator to produce data that follows a [[howto:generator:histogram_normal_distribution|probability distribution]], where each factor gives the relative probability of its value.
===== Example =====
==== Distribution over week day (Varchar) ====
For example, to simulate that people work on a week-end day ten times less often than on a week day, define the following buckets:
<code yaml>
Columns:
  - name: bucket_map
    type: varchar
    data-supplier:
      type: histogram
      arguments:
        Buckets:
          Monday: 10
          Tuesday: 10
          Wednesday: 10
          Thursday: 10
          Friday: 10
          Saturday: 1
          Sunday: 1
</code>
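To see what these factors mean as probabilities, the sketch below divides each factor by the sum of all factors. It assumes the generator treats factors as relative weights, as the example values suggest. The helper ''to_probabilities'' is a hypothetical name used for illustration and is not part of the tool.

```python
from fractions import Fraction

# Bucket factors copied from the week day example.
buckets = {
    "Monday": 10,
    "Tuesday": 10,
    "Wednesday": 10,
    "Thursday": 10,
    "Friday": 10,
    "Saturday": 1,
    "Sunday": 1,
}


def to_probabilities(buckets):
    """Divide each chance factor by the sum of all factors (assumed behavior)."""
    total = sum(buckets.values())
    return {value: Fraction(factor, total) for value, factor in buckets.items()}


probabilities = to_probabilities(buckets)
for day, probability in probabilities.items():
    print(f"{day}: {probability} ({float(probability):.1%})")
```

Under that assumption, each week day gets a probability of 10/52 and each week-end day 1/52.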
==== Normal Distribution over Arrival Time ====
Example of a normal distribution over arrival times:
<code yaml>
kind: generator
spec:
  MaxRecordCount: 30
  Columns:
    - name: id
      type: integer
      comment: An id column to easily see the number of generated values
      data-supplier:
        type: sequence
    - name: bucket_map
      type: time
      comment: A column with a histogram generator that generates a distribution of arrival times
      data-supplier:
        type: histogram
        arguments:
          Buckets:
            "8:45": 0.05
            "8:50": 0.5
            "8:55": 0.22
            "9:00": 0.4
            "9:05": 0.22
            "9:10": 0.5
            "9:15": 0.05
</code>
===== Arguments =====
==== Buckets ====
This [[docs:generator:data-supplier|data-supplier]] has a single argument, ''Buckets'', which defines the ''histogram''.
A ''bucket'' is a ''value'' paired with its ''chance factor''.
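As a rough illustration of how a value can be drawn according to its chance factor, here is a minimal sketch of weighted sampling with Python's standard ''random.choices''. It is not the generator's actual implementation. The function ''generate'' and its ''seed'' parameter are hypothetical, and the sketch assumes the factors are relative weights that do not need to sum to 1.

```python
import random
from collections import Counter


def generate(buckets, count, seed=None):
    """Draw `count` values, each picked in proportion to its chance factor."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    values = list(buckets)
    factors = list(buckets.values())
    return rng.choices(values, weights=factors, k=count)


# Buckets copied from the arrival time example.
buckets = {
    "8:45": 0.05,
    "8:50": 0.5,
    "8:55": 0.22,
    "9:00": 0.4,
    "9:05": 0.22,
    "9:10": 0.5,
    "9:15": 0.05,
}

sample = generate(buckets, count=30, seed=42)
# Count how often each value was drawn.
print(Counter(sample).most_common())
```

Values with a larger factor tend to appear more often in the sample, and a value with a factor of 0 is never drawn.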
===== Data Type =====
The following data types are supported:
^ Name ^ YAML Format ^ Example ^
| Integer | `d` | `8` |
| Double | `d.dd` | `8.00` |
| [[docs:data_type:date_time|Date]] | `YYYY-MM-DD` | `1970-01-01` |
| [[docs:data_type:date_time|Timestamp]] | `YYYY-MM-DD HH:MM:SS` | `1970-01-01 00:00:00`|
| [[docs:data_type:date_time|Time]] | `"HH:MM"`, `"HH:MM:SS"`, `"HH:MM:SS.SSS"` | `"08:00"` (the quotes are mandatory) |
| Varchar | `".*"` | `"a text"` |
Why must the time be quoted? YAML does not [[https://yaml.org/spec/1.2.2/|support time as a type]], and some YAML 1.1 parsers read an unquoted ''8:45'' as a base-60 integer. A time value must therefore be written as a quoted string.
===== How to define Buckets in a data set =====
Note that the [[docs:generator:data-set|data set]] and the [[entity|entity]] generators create histograms from [[:docs:resource:resource|resources]]:
  * the data is defined by the ''column'' attribute
  * the chance factor is defined by the `probability`, `weight` or `factor` column
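The two rules above can be sketched as follows. This is only an illustration of the idea, not the actual code of the data set or entity generator. The function ''buckets_from_rows'', the sample CSV content and the ''day'' column name are all hypothetical.

```python
import csv
import io

# Column names that may hold the chance factor, as listed above.
FACTOR_COLUMNS = ("probability", "weight", "factor")


def buckets_from_rows(rows, column):
    """Build a {value: chance factor} mapping from tabular rows."""
    rows = list(rows)
    if not rows:
        return {}
    # Pick the first factor column present in the data.
    factor_column = next(
        (name for name in FACTOR_COLUMNS if name in rows[0]), None
    )
    if factor_column is None:
        raise ValueError(f"expected one of the columns {FACTOR_COLUMNS}")
    return {row[column]: float(row[factor_column]) for row in rows}


# Hypothetical data set resource with a `weight` column.
resource = io.StringIO("day,weight\nMonday,10\nSaturday,1\n")
buckets = buckets_from_rows(csv.DictReader(resource), column="day")
print(buckets)
```

The resulting mapping has the same shape as the ''Buckets'' argument of the histogram data supplier.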