Reconlify CLI

Semantic data reconciliation for the command line.

Validate structured datasets using declarative YAML rules and produce deterministic JSON reconciliation reports suitable for CI/CD pipelines.

Fully local. No data leaves your machine.

Typical use cases:

  • ETL validation
  • Data migration verification
  • Financial transaction reconciliation
  • CI pipeline dataset checks
  • Log comparison with normalization rules

Quick Example

source.csv

txn_id,amount
1,100
2,200

target.csv

txn_id,amount
1,100
2,210

config.yaml

type: tabular
source: source.csv
target: target.csv
keys:
  - txn_id

Then run:

reconlify run config.yaml

Exit code: 1 (differences found). Report highlights:

rows_with_mismatches: 1
missing_in_source:    0
missing_in_target:    0

How Reconlify Compares

Reconlify is not just another diff tool.

It performs semantic reconciliation for structured data files.

| Capability | diff | csvdiff | Excel Compare | Beyond Compare | Datafold | Reconlify |
| --- | --- | --- | --- | --- | --- | --- |
| Understands tabular datasets | No | Yes | Yes | Yes | Yes | Yes |
| Key-based row matching | No | Yes | Manual | Yes | Yes | Yes |
| Detects missing rows | No | Yes | Manual | Partial | Yes | Yes |
| Column-level mismatch detection | No | Yes | Manual | Partial | Yes | Yes |
| Rule-based normalization | No | No | No | No | No | Yes |
| Regex transformations | No | No | No | No | No | Yes |
| Numeric tolerance | No | No | No | Yes | Yes | Yes |
| Noise filtering | No | No | Manual | Manual | Partial | Yes |
| Deterministic JSON reconciliation report | No | No | No | No | Partial | Yes |
| Works with exported files | Yes | Yes | Yes | Yes | No | Yes |
| Database integration | No | No | No | No | Yes | Planned |
| CI/CD automation ready | Yes | Partial | No | No | Yes | Yes |
| Local-first execution | Yes | Yes | Yes | Yes | No | Yes |

Reconlify focuses on semantic reconciliation for structured files such as CSV exports, logs, and tabular datasets.

While tools like Datafold specialize in comparing database tables inside data warehouses, Reconlify focuses on validating exported datasets produced by pipelines, migrations, or financial systems.

This makes it particularly useful for:

  • QA validation of data migrations
  • regression testing for ETL pipelines
  • reconciliation of exported reports
  • financial and operational data audits

Features

  • Key-based dataset reconciliation (single or composite keys)
  • Automatic missing-row detection (both directions)
  • Column-level mismatch detection with include/exclude control
  • Numeric tolerance support (per-column absolute tolerance)
  • Normalization rules (trim, case-insensitive, null handling, regex, virtual columns)
  • Row filters and exclusions
  • Deterministic JSON reconciliation reports
  • Machine-readable exit codes (0 / 1 / 2)
  • Two engines: tabular (CSV/TSV) and text (line-by-line / unordered)
  • CI/CD pipeline friendly
  • Fully local execution — no network calls

Performance

Reconlify uses DuckDB-backed tabular processing, streaming text comparison, and a local-first architecture. It processes large datasets locally without requiring a database server or external service.

| Dataset | Rows / Lines | Mode | Time |
| --- | --- | --- | --- |
| CSV reconciliation (exact match) | 200k rows | tabular | ~2 s |
| CSV reconciliation (high mismatch) | 200k rows | tabular | ~12 s |
| Log comparison (positional diffs) | 500k lines | line_by_line | ~3 s |
| Log comparison (unordered) | 250k lines | unordered_lines | < 1 s |

Benchmarks were executed on a MacBook (Apple Silicon / Python 3.11) with default fixture settings.

Performance depends on dataset structure, rule complexity, and system hardware.

Full benchmark methodology and results: PERF_TESTING.md

Installation

Requires Python 3.11+.

pip install reconlify-cli

Or with pipx for isolated installs:

pipx install reconlify-cli

Package name on PyPI: reconlify-cli

For development:

git clone https://github.com/testuteab/reconlify-cli.git && cd reconlify-cli
make install          # runs: poetry install
reconlify --help

CLI Usage

reconlify run <config.yaml>                # default output: report.json
reconlify run <config.yaml> --out out.json # custom output path

Options

| Option | Default | Description |
| --- | --- | --- |
| --out PATH | report.json | Output path for the JSON report |
| --include-line-numbers / --no-include-line-numbers | on | Include original line numbers in text report samples |
| --max-line-numbers N | 0 (unlimited) | Max line numbers per distinct line in unordered mode |
| --debug-report | off | Include processed line numbers in text report samples |

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | No differences found |
| 1 | Differences found |
| 2 | Error (config validation, file not found, runtime failure) |
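
For scripts that wrap the CLI, the three exit codes can be handled explicitly. The following is a minimal Python sketch, not part of Reconlify itself: the helper names `describe_exit` and `run_reconlify` are hypothetical, and only the `reconlify run <config> --out <path>` invocation and the code meanings come from this README.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "no differences found",
    1: "differences found",
    2: "error (config validation, file not found, runtime failure)",
}


def describe_exit(code: int) -> str:
    """Return the documented meaning of a Reconlify exit code."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_reconlify(config_path: str, out_path: str = "report.json") -> int:
    """Run `reconlify run` and return its exit code without raising."""
    result = subprocess.run(
        ["reconlify", "run", config_path, "--out", out_path],
        check=False,  # a non-zero code is a result here, not an exception
    )
    return result.returncode


# Example (requires reconlify on PATH):
#   code = run_reconlify("recon.yaml")
#   print(describe_exit(code))
```

Passing `check=False` keeps exit code 1 from being raised as an exception, so the caller can decide whether differences are a warning or a failure.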

Version

reconlify --version
# reconlify 0.1.0

Reconlify follows Semantic Versioning: MAJOR.MINOR.PATCH.

Real-World Example: Accounting Transaction Reconciliation

A finance team exports transactions from two systems (ERP and bank ledger) and needs to reconcile them nightly. Here's how Reconlify handles it.

1. Create sample data

cat <<'EOF' > source.csv
txn_id,booking_date,account,amount,currency,counterparty,status,memo
TXN-001,2026-01-15,4100,1500.00,USD,Acme Corp,BOOKED,Invoice 9201
TXN-002,2026-01-16,4100,320.50,EUR,Globex Inc,BOOKED,Wire transfer
TXN-003,2026-01-17,4200,75.00,USD,Jane Doe,BOOKED,Expense report
TXN-004,2026-01-18,4100,10000.00,USD,Initech,CANCELLED,Reversed
TXN-005,2026-01-19,4300,249.99,USD,Umbrella Ltd,BOOKED,Subscription
TXN-006,2026-01-20,4100,,USD,Soylent Corp,BOOKED,Pending allocation
EOF

cat <<'EOF' > target.csv
txn_id,booking_date,account,amount,currency,counterparty,status,memo
TXN-001,2026-01-15,4100,1500.00,USD,  acme corp  ,BOOKED,Invoice 9201
TXN-002,2026-01-16,4100,320.75,EUR,Globex Inc,BOOKED,Wire transfer
TXN-003,2026-01-17,4200,75.00,USD,Jane Doe,BOOKED,Expense report
TXN-005,2026-01-19,4300,249.99,USD,Umbrella Ltd,BOOKED,Subscription
TXN-006,2026-01-20,4100,NULL,USD,Soylent Corp,BOOKED,Pending allocation
TXN-007,2026-01-21,4100,430.00,USD,Wayne Ent,BOOKED,New deposit
EOF

What's different between the two files:

| Scenario | Row | Detail |
| --- | --- | --- |
| Matches after normalization | TXN-001 | counterparty has extra spaces and lowercase in target; should match with trim + case rules |
| Value mismatch | TXN-002 | amount is 320.50 in source vs 320.75 in target; no tolerance is configured, so this counts as a mismatch |
| Exact match | TXN-003 | Identical |
| Filtered out | TXN-004 | Status CANCELLED; excluded by row filter |
| Exact match | TXN-005 | Identical |
| NULL normalization | TXN-006 | Source has blank amount, target has the string NULL; should match |
| Missing in source | TXN-007 | Exists only in target |

2. Create the reconciliation config

cat <<'EOF' > recon.yaml
type: tabular
source: source.csv
target: target.csv

keys:
  - txn_id

csv:
  delimiter: ","
  header: true
  encoding: utf-8

compare:
  trim_whitespace: true
  case_insensitive: true
  normalize_nulls: ["", "NULL", "null"]
  exclude_columns:
    - memo

filters:
  row_filters:
    both:
      - column: status
        op: equals
        value: CANCELLED
EOF

Key config choices:

  • keys: [txn_id] — match rows by transaction ID
  • trim_whitespace + case_insensitive — ignore formatting noise
  • normalize_nulls — treat blank, NULL, and null as equivalent
  • exclude_columns: [memo] — skip free-text fields
  • row_filters — drop CANCELLED transactions before comparison
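
To make the effect of these choices concrete, here is a rough Python sketch that applies the same ideas (trim, case-insensitive comparison, null normalization, excluded column, row filter, key matching) to a reduced copy of the sample rows. It is an illustration only and says nothing about how Reconlify implements them internally; the function names `normalize`, `load`, and `reconcile` are made up for this example.

```python
import csv
import io
from typing import Optional

# Reduced copies of the sample rows above (fewer columns, same scenarios).
SOURCE = """txn_id,amount,counterparty,status,memo
TXN-001,1500.00,Acme Corp,BOOKED,Invoice 9201
TXN-002,320.50,Globex Inc,BOOKED,Wire transfer
TXN-004,10000.00,Initech,CANCELLED,Reversed
TXN-006,,Soylent Corp,BOOKED,Pending allocation
"""

TARGET = """txn_id,amount,counterparty,status,memo
TXN-001,1500.00,  acme corp  ,BOOKED,Invoice 9201
TXN-002,320.75,Globex Inc,BOOKED,Wire transfer
TXN-006,NULL,Soylent Corp,BOOKED,Pending allocation
TXN-007,430.00,Wayne Ent,BOOKED,New deposit
"""

NULL_TOKENS = {"", "null"}  # compared after lower-casing, so "NULL" is covered


def normalize(value: str) -> Optional[str]:
    """Trim, lower-case, and map null-like tokens to None."""
    cleaned = value.strip().lower()
    return None if cleaned in NULL_TOKENS else cleaned


def load(text: str) -> dict:
    """Index rows by txn_id, dropping CANCELLED rows and the memo column."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        if row["status"] == "CANCELLED":
            continue
        rows[row["txn_id"]] = {
            col: normalize(val)
            for col, val in row.items()
            if col not in ("txn_id", "memo")
        }
    return rows


def reconcile(source: dict, target: dict) -> dict:
    """Return the keys that are missing on either side or differ in value."""
    shared = source.keys() & target.keys()
    return {
        "missing_in_target": sorted(source.keys() - target.keys()),
        "missing_in_source": sorted(target.keys() - source.keys()),
        "rows_with_mismatches": sorted(k for k in shared if source[k] != target[k]),
    }


result = reconcile(load(SOURCE), load(TARGET))
print(result)
```

In this sketch TXN-007 is reported as missing in source and TXN-002 as the only mismatch, while TXN-001 and TXN-006 compare equal after normalization.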

3. Run the reconciliation

reconlify run recon.yaml --out report.json
echo $?

Expected exit code: 1 (differences found).

The report will contain:

  • missing_in_target: 0 — TXN-004 was filtered out, so not counted
  • missing_in_source: 1 — TXN-007 exists only in target
  • rows_with_mismatches: 1 — TXN-002 has an amount mismatch (320.50 vs 320.75)

TXN-001 and TXN-006 match cleanly thanks to normalization rules.

The report is written to report.json. It is deterministic — identical inputs and config always produce identical output (except the generated_at timestamp).
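
Because only the generated_at timestamp varies between runs, two reports can be compared for regression purposes by dropping that field first. The sketch below is one possible way to do it in Python. It assumes generated_at may appear at any nesting level (its exact location in the report is not stated here), and the helper names are hypothetical.

```python
import json
from typing import Any


def strip_timestamp(node: Any, key: str = "generated_at") -> Any:
    """Return a copy of a parsed JSON value with every `key` entry removed."""
    if isinstance(node, dict):
        return {k: strip_timestamp(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [strip_timestamp(item, key) for item in node]
    return node


def reports_equal(first: dict, second: dict) -> bool:
    """Compare two parsed reports while ignoring the generation timestamp."""
    return strip_timestamp(first) == strip_timestamp(second)


def load_report(path: str) -> dict:
    """Read a report file written by `reconlify run --out <path>`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Example: reports_equal(load_report("run1.json"), load_report("run2.json"))
```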

Text Engine

Reconlify also compares text files line-by-line or as unordered line sets:

type: text
source: expected.log
target: actual.log
mode: unordered_lines
normalize:
  trim_lines: true
  ignore_blank_lines: true

Then run:

reconlify run text_recon.yaml

In unordered_lines mode, line order is ignored — Reconlify compares occurrence counts of each distinct line. In line_by_line mode (default), lines are compared positionally.
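
The idea of comparing occurrence counts can be sketched in a few lines of Python with `collections.Counter`. This illustrates the multiset comparison concept only, not Reconlify's actual code; `unordered_diff` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def unordered_diff(
    source_lines: Iterable[str], target_lines: Iterable[str]
) -> Dict[str, Tuple[int, int]]:
    """Map each line whose count differs to (source_count, target_count)."""
    src = Counter(source_lines)
    tgt = Counter(target_lines)
    return {
        line: (src[line], tgt[line])
        for line in sorted(src.keys() | tgt.keys())
        if src[line] != tgt[line]
    }


# Same lines in a different order: no differences.
print(unordered_diff(["a", "b", "a"], ["b", "a", "a"]))
# "a" appears once less in the target, "c" appears only in the target.
print(unordered_diff(["a", "a", "b"], ["a", "b", "c"]))
```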

See the examples/ directory for config samples.

Engines

Tabular Engine

  • Key-based reconciliation — Single or composite keys. Detects missing rows on either side and cell-level value mismatches.
  • Column control — Include or exclude specific columns from comparison.
  • Numeric tolerance — Absolute tolerance per column (e.g. amount: 0.01).
  • String rules — Per-column normalization: trim, case_insensitive, contains, regex_extract.
  • Source-side normalization — Virtual columns via ops: map, concat, substr, add, sub, mul, div, coalesce, date_format, upper, lower, trim, round.
  • Row filters — Exclude specific key values or filter rows by column rules.
  • TSV support — Set csv.delimiter: "\t".
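
As an illustration of what an absolute numeric tolerance means, the following Python sketch treats two values as equal when their absolute difference does not exceed the tolerance. Whether Reconlify treats the boundary as inclusive is an assumption here, and `within_tolerance` is a hypothetical helper. `Decimal` is used so that values such as 0.01 are compared exactly.

```python
from decimal import Decimal


def within_tolerance(source: str, target: str, tolerance: str) -> bool:
    """True when |source - target| <= tolerance (inclusive bound assumed)."""
    return abs(Decimal(source) - Decimal(target)) <= Decimal(tolerance)


# With a tolerance of 0.01, a one-cent difference passes and 0.25 does not.
print(within_tolerance("100.00", "100.01", "0.01"))
print(within_tolerance("320.50", "320.75", "0.01"))
```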

Text Engine

  • Two comparison modes: line_by_line (positional) and unordered_lines (multiset).
  • Normalization: trim_lines, collapse_whitespace, case_insensitive, ignore_blank_lines, normalize_newlines.
  • Regex rules: drop_lines_regex to remove lines, replace_regex to transform lines before comparison.
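
The effect of dropping and rewriting lines before comparison can be sketched with Python's `re` module. This is a conceptual example, not Reconlify's implementation: the function name `preprocess`, the order of operations (drop first, then replace), and the sample patterns are all assumptions.

```python
import re
from typing import Iterable, List, Tuple


def preprocess(
    lines: Iterable[str],
    drop_patterns: Iterable[str],
    replacements: Iterable[Tuple[str, str]],
) -> List[str]:
    """Drop lines matching any drop pattern, then apply each replacement."""
    drops = [re.compile(p) for p in drop_patterns]
    subs = [(re.compile(p), repl) for p, repl in replacements]
    kept = []
    for line in lines:
        if any(d.search(line) for d in drops):
            continue  # the line is removed before comparison
        for pattern, repl in subs:
            line = pattern.sub(repl, line)
        kept.append(line)
    return kept


log = [
    "2026-01-15 10:00:01 INFO start",
    "DEBUG noisy detail",
    "2026-01-15 10:00:02 INFO done",
]
cleaned = preprocess(
    log,
    drop_patterns=[r"^DEBUG"],
    replacements=[(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", "<ts>")],
)
print(cleaned)
```

Replacing timestamps with a fixed token is a common way to make otherwise identical log lines compare equal.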

Report Format

Every run produces a JSON report with a consistent structure:

| Section | Description |
| --- | --- |
| summary | Aggregate counts (rows, mismatches, missing). Zero differences = exit code 0 |
| details | Metadata: keys used, columns compared, filters applied, per-column mismatch stats |
| samples | Concrete examples of differences (tabular: missing_in_target, missing_in_source, value_mismatches, excluded; text: flat list or samples_agg) |
| error | Present only on exit code 2. Machine-readable code, human-readable message, and details |
| warnings | Optional list of warning strings (e.g. large line-number arrays in unordered mode) |
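
A downstream script might condense a report into a single status line. The Python sketch below shows one way to do that. The exact JSON field names are assumptions based on this README: the summary keys are taken from the report highlights shown earlier, and `code` and `message` are guessed from the description of the error section. Check a real report before relying on them.

```python
from typing import Any, Dict

# Assumed key names inside the "summary" section.
SUMMARY_KEYS = ("rows_with_mismatches", "missing_in_source", "missing_in_target")


def summarize(report: Dict[str, Any]) -> str:
    """Build a one-line summary from a parsed report dictionary."""
    if "error" in report:
        err = report["error"]
        return f"error {err.get('code')}: {err.get('message')}"
    summary = report.get("summary", {})
    counts = ", ".join(f"{k}={summary.get(k, 'n/a')}" for k in SUMMARY_KEYS)
    warnings = report.get("warnings") or []
    suffix = f" ({len(warnings)} warning(s))" if warnings else ""
    return counts + suffix


# Example with a hand-written dictionary standing in for a parsed report.json.
example = {
    "summary": {
        "rows_with_mismatches": 1,
        "missing_in_source": 1,
        "missing_in_target": 0,
    }
}
print(summarize(example))
```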

CI Usage

rc=0
# Capture the exit code without aborting, so this also works under `set -e`
reconlify run recon.yaml --out report.json || rc=$?

if [ $rc -eq 2 ]; then
  echo "ERROR: config or runtime failure" >&2
  exit 1
elif [ $rc -eq 1 ]; then
  echo "WARN: differences found — see report.json" >&2
fi

Exit code 1 means differences were found — your pipeline decides whether that's a warning or a failure. Exit code 2 is always an error.

GitHub Actions

- name: Reconcile data
  run: |
    exit_code=0
    # Actions runs bash with -e, so capture the exit code without aborting the step
    reconlify run recon.yaml --out report.json || exit_code=$?
    if [ $exit_code -eq 2 ]; then
      echo "::error::Reconciliation failed with error"
      exit 1
    elif [ $exit_code -eq 1 ]; then
      echo "::warning::Differences found — see report.json"
    fi

- name: Upload report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: recon-report
    path: report.json

Documentation

Reconlify Desktop

Reconlify Desktop is a graphical interface for Reconlify CLI. It allows users to:

  • Visually build YAML reconciliation configs
  • Run reconciliations without using the terminal
  • Inspect reconciliation reports interactively

Reconlify CLI remains the core reconciliation engine.

Development

make install       # install dependencies
make test          # unit + integration tests (excludes e2e and perf)
make e2e           # end-to-end CLI tests
make lint          # ruff linter
make format        # auto-fix lint + format
make clean         # remove build artifacts and caches

Performance Testing

make perf          # generate fixtures + run full benchmark suite
make perf-smoke    # lightweight perf smoke tests
make perf-clean    # remove generated fixtures

See PERF_TESTING.md for details and baseline results.

Changelog

See CHANGELOG.md for release history.


Reconlify sits between simple file diff tools and heavy enterprise reconciliation systems, providing a deterministic, developer-friendly workflow for validating structured data locally.

License

Released under the MIT License. Copyright (c) 2026 Testute AB.