Diff Gnome Documentation
CLI-first cross-database verification and repair. MySQL, PostgreSQL, SQL Server.
Diff Gnome is a CLI tool for verifying that two datasets actually match. It's built for real cross-database verification and repair workflows — compare two tables, dry-run or apply syncs, preflight risky jobs with doctor, clean duplicate rows with dedupe, restore backup exports, and batch everything through config files.
How It Works
Diff Gnome uses chunk-hashing to efficiently compare large tables without scanning every row individually.
compare
Compare two tables across databases. Reports schema differences, row count mismatches, and changed/missing/extra rows.
--left-driver mysql \
--left-dsn 'user:pass@tcp(localhost:3306)/appdb' \
--right-driver postgres \
--right-dsn 'postgres://user:pass@localhost:5432/warehouse?sslmode=disable' \
--table customer_orders \
--key id
Fail on any difference
Use --fail-on-diff to exit with a non-zero status when any schema, count, or row differences are found — useful in CI pipelines.
--left-driver mysql --left-dsn '...' \
--right-driver postgres --right-dsn '...' \
--table customer_orders --key id \
--fail-on-diff
Export mismatches to a file
Use --mismatch-file to export every mismatch as a JSON object per line (JSONL) while keeping the main report compact.
--left-driver mysql --left-dsn '...' \
--right-driver sqlserver --right-dsn '...' \
--table customer_orders --key id \
--mismatch-file reports/customer-orders-mismatches.jsonl
Filter rows with --where
--left-driver mysql --left-dsn '...' \
--right-driver postgres --right-dsn '...' \
--table customer_orders --key id \
--where "created_at >= '2026-01-01'"
Different table names on each side
--left-driver mysql --left-dsn '...' \
--right-driver sqlserver --right-dsn '...' \
--left-table customer_orders_archive \
--right-table customer_orders \
--key id
Composite keys
--table customer_order_lines \
--key order_id,line_number \
--left-driver mysql --left-dsn '...' \
--right-driver sqlserver --right-dsn '...'
Keyless compare
Omit --key to match rows on full row contents instead of a primary key.
--left-driver mysql --left-dsn 'user:pass@tcp(left-host:3306)/appdb' \
--right-driver mysql --right-dsn 'user:pass@tcp(right-host:3306)/appdb' \
--table event_journal
JSON output
sync
Sync differences from left to right. Always dry-run first — add --apply only when you're confident in the output.
--apply first to preview exactly what will be inserted, updated, or deleted. Diff Gnome will show you the counts before touching anything.$ ./diff-gnome sync \
--left-driver mysql --left-dsn '...' \
--right-driver postgres --right-dsn '...' \
--table customer_orders --key id \
--where "created_at >= '2026-01-01'" \
--max-writes 500
# Apply when ready
$ ./diff-gnome sync ... --apply
Insert-only (keyless)
--insert-only can run without a key — matches rows on full row contents and inserts rows that exist only on the right.
Safety flags
| Flag | Description |
|---|---|
| --max-writes N | Abort if the sync would exceed N total inserts + updates |
| --max-deletes N | Abort if the sync would exceed N deletes |
| --require-where | Refuse to run without a --where filter |
| --delete-missing | Also delete rows that exist on right but not left |
| --insert-only | Only insert missing rows, never update or delete |
| --write-batch-size N | Commit writes in batches of N rows |
doctor
A read-only preflight command for two-sided runs. Run doctor before any sync that touches production data.
Doctor checks:
--where columns are backed by leading index columnssync --apply or dedupe --apply--apply, --max-writes, --max-deletes, --require-where--left-driver sqlserver --left-dsn 'sqlserver://user:pass@sqlhost:1433?database=appdb&encrypt=disable' \
--right-driver mysql --right-dsn 'user:pass@tcp(target:3306)/appdb' \
--table candidate_records \
--key record_id \
--where "updated_at >= '2026-01-01'" \
--apply --max-writes 5000
dedupe
Finds and removes duplicate rows within a single datasource. Runs a backup export before deleting anything when --backup-file is set.
Use --ignore-column to exclude noise columns (like updated_at) from duplicate matching. --apply supports SQL Server, MySQL, and PostgreSQL.
--driver sqlserver \
--dsn 'sqlserver://user:pass@sqlhost:1433?database=appdb&encrypt=disable' \
--table monthly_statement_items \
--key account_id,statement_month,recipient_id,invoice_number \
--ignore-column updated \
--backup-file reports/monthly-statement-items-dedupe-backup.sql \
--backup-format sql \
--apply
restore
Replays a backup export (SQL or JSONL) back into a datasource. Use SQL backups for direct replay, or JSONL when you want Diff Gnome to translate structured backup records into row-level inserts, updates, and deletes.
Use --verify-only for a stronger read-only check — for JSONL backups it validates the referenced tables and columns against the target without replaying anything.
--driver sqlserver \
--dsn 'sqlserver://user:pass@sqlhost:1433?database=targetdb&encrypt=disable' \
--input-file reports/monthly-statement-items-dedupe-backup.sql \
--input-format sql \
--max-statements 500 \
--transaction \
--apply
run (batch)
Run multiple compare, sync, dedupe, restore, or doctor jobs from a single YAML or JSON config file, with bounded parallelism.
# Override parallelism from the CLI
$ ./diff-gnome run --config examples/run-config.yaml --parallel 4

diff-gnome run --config over hidden shell state. Named connections keep your DSNs out of shell history and make your jobs reproducible.Credentials & Env Vars
Keep DSNs out of your shell history by using environment variables instead of inline credentials.
$ export DIFFGNOME_RIGHT_DSN='postgres://user:pass@localhost:5432/targetdb?sslmode=disable'
$ ./diff-gnome compare \
--left-driver mysql \
--right-driver postgres \
--table customer_orders --key id
Custom env var names
Point at any environment variable name using --left-dsn-env and --left-driver-env.
$ export SOURCE_DB_DRIVER='mysql'
$ ./diff-gnome compare \
--left-driver-env SOURCE_DB_DRIVER \
--left-dsn-env SOURCE_DB_DSN \
--right-driver-env TARGET_DB_DRIVER \
--right-dsn-env TARGET_DB_DSN \
--table customer_orders --key id
Common Flags
Column handling
| Flag | Description |
|---|---|
| --ignore-column col | Exclude a column from schema comparison and row matching (useful for audit columns like changedon) |
| --date-only-column col | Normalize a column to calendar date only — handles cases where one side stores timestamps and the other stores dates |
| --equivalent-value '...' | Treat rewritten display values as equal. Format: col[,col]:value=value|value=value |
| --empty-string-as-null | Treat empty strings and NULLs as equivalent during matching |
Ignore timestamp precision differences
--left-driver mysql --left-dsn '...' \
--right-driver sqlserver --right-dsn '...' \
--table customer_orders --key id \
--ignore-column changedon
Normalize timestamps to date only
--left-table reporting.activity_log_archive \
--right-table activity_log \
--ignore-column changedon \
--date-only-column changedate
Treat empty strings as NULL
Output Formats
All commands support --output text (default), --output json, or --output markdown.
| Format | Use case |
|---|---|
| text | Human-readable terminal output |
| json | Machine-readable, pipe to jq or monitoring tools |
| markdown | Reports stored in repos or rendered in CI |
Type Matching
Type matching is loose by default. Loose mode handles common cross-database type differences automatically:
SMALLDATETIME, DATETIME, DATETIME2, TIMESTAMP, TIMESTAMPTZ are treated as compatibleVARCHAR vs DATE)INT, UNSIGNED INT, TINYINT are treated as compatibleVARCHAR and TEXT are treated as compatible1.16 and 1.160 compare as equalUse --strict-types or strict_types: true in config when you want exact type matches.
Config File Reference
Run config files (YAML or JSON) support the following top-level fields:
| Field | Description |
|---|---|
| parallel | How many jobs run at once |
| continue_on_error | Keep going after a job fails |
| animate | Show animated batch progress on stderr |
| summary_only | Suppress per-job output; show only live progress and final summary |
| output_dir | Directory for generated job reports and mismatch exports |
| summary_file | Write a consolidated run summary to output_dir/run-summary.txt |
| summary_format | text, json, or markdown |
| exception_file | Write a consolidated failure report to output_dir/run-exceptions.txt |
| exception_format | text, json, or markdown |
| connections | Named connection blocks — avoids repeating DSNs across jobs |
| defaults | Shared settings applied to all jobs unless overridden |
| jobs | List of jobs: mode: compare, sync, dedupe, restore, or doctor |
Job fields mirror CLI flags using snake_case. Two-sided jobs use left and right connection references. Single-datasource jobs (dedupe, restore) use connection.
Per-job file outputs
| Field | Description |
|---|---|
| output_file: true | Write a per-job report to output_dir/<job-name>.txt|.json|.md |
| event_file: true | Write a machine-readable JSONL event stream to output_dir/<job-name>-events.jsonl |
| mismatch_file: true | Write every mismatch as one JSON object per line to output_dir/<job-name>-mismatches.jsonl (compare jobs only) |
| backup_file: true | Auto-generate a backup file path in output_dir (sync and dedupe jobs only) |
| backup_format | jsonl or sql |
Full Config Example
continue_on_error: true
animate: true
summary_only: true
output_dir: reports
summary_file: true
summary_format: markdown
exception_file: true
exception_format: json
connections:
target_sqlserver:
driver: sqlserver
dsn_env: DIFFGNOME_LEFT_DSN
source_mysql:
driver: mysql
dsn_env: DIFFGNOME_RIGHT_DSN
defaults:
left: target_sqlserver
right: source_mysql
output: text
chunk_size: 10000
require_where: true
output_file: true
mismatch_file: true
backup_file: true
backup_format: sql
jobs:
- name: audit-trail-compare
mode: compare
left_table: reporting.activity_log_archive
right_table: activity_log
ignore_columns: changedon
date_only_columns: changedate
- name: candidate-doctor
mode: doctor
table: candidate_records
key: record_id
where: "updated_at >= '2026-01-01'"
apply: true
max_writes: 5000
max_deletes: 100
- name: candidate-sync
mode: sync
table: candidate_records
key: record_id
where: "updated_at >= '2026-01-01'"
apply: true
write_batch_size: 1000
output: json
Constraints
compare can run without a key by matching rows on full row contentssync --insert-only can run without a key by matching on full row contents--left-table / --right-table when source and target table names differ--where clause is trusted SQL and passed through to both databases unchangedTOP n ordered by the configured key columnsLicense
Diff Gnome is a commercial tool. A 14-day free trial is included with every download — no credit card required. Licenses are activated per machine.
$ ./diff-gnome license activate --key YOUR-LICENSE-KEY
# Check status and force a live refresh
$ ./diff-gnome license status --check
# Deactivate before swapping machines
$ ./diff-gnome license deactivate
# Open the purchase page
$ ./diff-gnome license purchase