How to get data from a list of values at random

About

This how-to shows you how to generate data at random from a list of values with the histogram column generator.

Tip: The random column generator can also generate all primary data types (number, date, string) at random. See How to generate random data.

Steps

Creation of the generator file

To generate data, you need to create a generator file that will describe the data to be generated.

The data resource generator below:

  • has the name histogram_random--datagen.yml
  • has the simple name histogram_random
  • will generate 10 values (MaxRecordCount)
  • has a column named id with a sequence data generator that:
    • starts by default at the value 1
    • increments by default by 1
  • has a column named bucket_map with a histogram generator where the Buckets property defines a map where:
    • the key is the value to generate
    • the value is the chance factor of generation (the higher the factor, the greater the chance that the value is generated)
  • has a column named bucket_list with a histogram generator where the Buckets property defines:
    • a list of values (each chance factor has a default value of 1)

The two bucket columns (bucket_map and bucket_list) are equivalent.

They both define the buckets as:

  • a list of values
  • each with a chance factor of 1.

MaxRecordCount: 10
Columns:
  - name: id
    type: integer
    comment: An id column to easily see the number of values generated
    DataGenerator:
      type: sequence
  - name: bucket_map
    type: varchar
    comment: A column with a random color generator and a map of values with the chance factor
    DataGenerator:
      type: histogram
      Buckets:
        blue: 1
        red: 1
        green: 1
  - name: bucket_list
    type: varchar
    comment: A column with a random color generator and a list of values
    DataGenerator:
      type: histogram
      Buckets:
        - blue
        - red
        - green
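
Both columns above use a chance factor of 1 for every value. As a minimal sketch of unequal factors, the variant below reuses only the properties shown in the generator file. The column name bucket_weighted and the factor 3 are hypothetical, chosen for illustration. The source states only that a higher factor gives a value more chance of being generated, so the exact proportion (blue about three times as likely as red or green) is an assumption.

```yaml
MaxRecordCount: 10
Columns:
  - name: id
    type: integer
    comment: An id column to easily see the number of values generated
    DataGenerator:
      type: sequence
  # Hypothetical column: same map syntax as bucket_map, with unequal factors
  - name: bucket_weighted
    type: varchar
    comment: A column where blue has a higher chance factor than red and green
    DataGenerator:
      type: histogram
      Buckets:
        blue: 3   # assumed to make blue more likely than red or green
        red: 1
        green: 1
```

With only 10 records, the generated counts can differ noticeably from the configured factors. A larger MaxRecordCount should make the difference between values easier to see.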




Printing the data

With the data print command, we can print the 10 generated values.

tabli data print histogram_random--datagen.yml@howto

howto is the connection that contains the files used in the how-tos.

id   bucket_map   bucket_list
--   ----------   -----------
 1   green        blue
 2   red          blue
 3   red          green
 4   green        red
 5   blue         red
 6   red          red
 7   green        blue
 8   red          blue
 9   green        blue
10   green        green

Next

Because a generator is just a data resource, you can use it in any data operation.

How to use a generator in a data operation



