Dummy Data Generator CLI

This CLI tool allows you to efficiently generate a large amount of dummy data in a database. It supports both PostgreSQL and MySQL and provides a flexible configuration file to specify which tables and columns to populate.

Installation

To install the CLI tool, run the following command:

go install github.com/ponyo877/dummy_data_generator@latest

Features

Generate a substantial amount of dummy data in a database.
Supports both PostgreSQL and MySQL.
Customize data generation through a configuration file.
Track progress with a visual progress bar.

Configuration

Field	Description
tablename	Name of the table where the data will be generated.
recordcount	Total number of records to be generated.
buffer	Buffer size for generating records (useful for optimizing performance).
columns	List of columns with their respective configurations.
columns[].name	Name of the column.
columns[].type	Data type of the column (e.g., number, varchar, timestamp).
columns[].rule	Generation rule for the column.
columns[].rule.type	Rule type (one of unique, const, pattern, random).
columns[].rule.format	[type: unique only] Format of the generated value (e.g., UUID (varchar), ULID (varchar), NOW (timestamp)).
columns[].rule.value	[type: const only] Constant value to assign.
columns[].rule.min	Start of the sequential value.
columns[].rule.max	[type: pattern only] End of the sequential value.
columns[].rule.min_time	[type: random (timestamp) only] Minimum value for the random timestamp.
columns[].rule.max_time	[type: random (timestamp) only] Maximum value for the random timestamp.
columns[].rule.patterns[].value	[type: pattern only] Value to repeat.
columns[].rule.patterns[].times	[type: pattern only] Number of times to repeat the value.

Example Rules:

type: unique: Generates unique values. Sequential numbers (the default), the current timestamp (format: NOW), UUID, and ULID are supported.
type: const: Assigns a constant value.
type: pattern: Generates values based on the specified patterns. If you specify [{value: A, times: 2}, {value: B, times: 1}], it produces the repeating sequence [A,A,B,A,A,B,...]. If you specify {min: 1, max: 5}, it produces the repeating sequence [1,2,3,4,5,1,2,3,...].
type: random: Generates random values between min_time and max_time. Only the timestamp data type is supported for now. If you specify {min_time: '2024-01-01 00:00:00', max_time: '2024-03-31 23:59:59'}, it yields random timestamps within that range, such as '2024-02-01 01:23:45', but never '2023-12-31 23:59:59' or '2024-04-01 00:00:00'.
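
The pattern and random rules above can be illustrated with a minimal Go sketch. It is not the tool's actual implementation: the names Pattern, patternValue, sequenceValue, mustParse, and randomTime are hypothetical, and the sketch assumes that both range ends are inclusive and that timestamps have one-second precision.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Pattern is a hypothetical mirror of one `patterns` entry:
// a value and the number of times it is repeated.
type Pattern struct {
	Value string
	Times int
}

// patternValue returns the value for the i-th record (0-based) by cycling
// through the patterns, so [{A 2} {B 1}] gives A, A, B, A, A, B, ...
func patternValue(patterns []Pattern, i int) string {
	total := 0
	for _, p := range patterns {
		total += p.Times
	}
	if total <= 0 {
		return ""
	}
	pos := i % total
	for _, p := range patterns {
		if pos < p.Times {
			return p.Value
		}
		pos -= p.Times
	}
	return ""
}

// sequenceValue returns the i-th value (0-based) of the repeating
// sequence lo, lo+1, ..., hi, lo, ... It assumes hi >= lo.
func sequenceValue(lo, hi, i int) int {
	return lo + i%(hi-lo+1)
}

const layout = "2006-01-02 15:04:05"

// mustParse parses a timestamp written like the config examples (as UTC).
func mustParse(s string) time.Time {
	t, err := time.Parse(layout, s)
	if err != nil {
		panic(err)
	}
	return t
}

// randomTime picks a timestamp uniformly from [minTime, maxTime] at
// one-second precision (assumption: both ends are inclusive).
func randomTime(minTime, maxTime time.Time) time.Time {
	span := maxTime.Unix() - minTime.Unix() + 1
	return time.Unix(minTime.Unix()+rand.Int63n(span), 0).UTC()
}

func main() {
	patterns := []Pattern{{"A", 2}, {"B", 1}}
	for i := 0; i < 6; i++ {
		fmt.Print(patternValue(patterns, i), " ")
	}
	fmt.Println()
	for i := 0; i < 8; i++ {
		fmt.Print(sequenceValue(1, 5, i), " ")
	}
	fmt.Println()
	ts := randomTime(mustParse("2024-01-01 00:00:00"), mustParse("2024-03-31 23:59:59"))
	fmt.Println(ts.Format(layout))
}
```

Computing each value from the record index keeps the generators stateless, which would make it easy to produce records in independent batches.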

Example

tablename: sample_tbl
recordcount: 1000000
buffer: 1000
columns:
# The 'id' column is a string in ULID format, ensuring all values are unique.
- name: id
  type: varchar
  rule:
    type: unique
    format: ULID
# The 'sex' column will contain the strings "male", "female", and "NA" in a 3:2:1 ratio.
- name: sex
  type: varchar
  rule:
    type: pattern
    patterns:
    - value: male
      times: 3
    - value: female
      times: 2
    - value: NA
      times: 1
# The 'created_at' column will have the fixed value "2024-01-01 00:00:00".
- name: created_at
  type: timestamp
  rule:
    type: const
    value: '2024-01-01 00:00:00'
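
To give a feel for how recordcount and buffer might interact, here is a small Go sketch. It rests on an assumption that this README does not confirm: that buffer is the number of rows generated and written per batch. The function name batchSizes is hypothetical.

```go
package main

import "fmt"

// batchSizes splits recordCount rows into batches of at most buffer rows.
// Assumption: `buffer` is the per-batch row count; the tool may differ.
func batchSizes(recordCount, buffer int) []int {
	if recordCount <= 0 || buffer <= 0 {
		return nil
	}
	sizes := make([]int, 0, (recordCount+buffer-1)/buffer)
	for remaining := recordCount; remaining > 0; remaining -= buffer {
		n := buffer
		if remaining < buffer {
			n = remaining
		}
		sizes = append(sizes, n)
	}
	return sizes
}

func main() {
	// Values taken from the example configuration above.
	sizes := batchSizes(1000000, 1000)
	fmt.Println(len(sizes), "batches of up to", sizes[0], "rows")

	// A count that is not a multiple of the buffer leaves a smaller last batch.
	fmt.Println(batchSizes(2500, 1000))
}
```

Under that assumption, the example configuration would be processed as 1000 batches of 1000 rows each; a larger buffer would mean fewer, bigger batches at the cost of more memory per batch.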

Sub Command

Sub Command	Description
dummy_data_generator cnt	Show the current number of records in each target table.
dummy_data_generator gen	Generate dummy data.

Option

Option	Description	Default Value
-c, --config	configuration file for dummy data. You can provide multiple configuration files using wildcards (e.g., `-c "cfg_*.yaml"`) or by comma-separating them (e.g., `-c cfg_1.yaml,cfg_2.yaml`).	`config.yaml`
-d, --database	name of the database to use.	`mydb`
-u, --dbuser	database user name.	`root`
-e, --engine	database engine to use. Supports `postgres` and `mysql`.	`postgres`
-h, --host	database server host or socket directory.	`127.0.0.1`
-p, --password	database password to use when connecting to the server.	`password`
-P, --port	database server port.	`5432`
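
The two ways of passing multiple configuration files to -c (wildcards and comma-separated lists) can be resolved roughly as in the following Go sketch. The function resolveConfigs is hypothetical and only illustrates the idea with the standard library; the tool's own handling may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveConfigs expands a -c style argument into a list of file paths.
// Comma-separated parts are handled one by one; parts containing glob
// metacharacters are expanded with filepath.Glob.
func resolveConfigs(arg string) ([]string, error) {
	var files []string
	for _, part := range strings.Split(arg, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		if strings.ContainsAny(part, "*?[") {
			matches, err := filepath.Glob(part)
			if err != nil {
				return nil, fmt.Errorf("bad pattern %q: %w", part, err)
			}
			files = append(files, matches...)
			continue
		}
		files = append(files, part)
	}
	return files, nil
}

func main() {
	files, err := resolveConfigs("cfg_1.yaml,cfg_2.yaml")
	if err != nil {
		panic(err)
	}
	fmt.Println(files)

	// Wildcards match files in the current directory, so the result
	// depends on what is present when this runs.
	matched, err := resolveConfigs("sample_*.yaml")
	if err != nil {
		panic(err)
	}
	fmt.Println(matched)
}
```

Quoting the wildcard on the command line, as in `-c "cfg_*.yaml"`, stops the shell from expanding it first, so the program receives the pattern itself.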

Usage Examples

Example 1: Check current number of records. (MySQL)

$ dummy_data_generator cnt -e mysql -h 127.0.0.1 -u root -P 5432 -p password -c sample_1.yaml,sample_2.yaml
+--------+-------+
| TABLE  | COUNT |
+--------+-------+
| table1 |     0 |
| table2 |     0 |
+--------+-------+

Example 2: Generate dummy data for the tables designated by the config files. (PostgreSQL, all options left at their default values except the config file)

$ dummy_data_generator gen -c "sample_*.yaml"
table1: 534000 / 1000000 in progress  [=====================>-------------------]  53 %
table2: 533000 / 1000000 in progress  [=====================>-------------------]  53 %
table3:    10000 / 10000 done!       [=========================================]
