Collaborator

@tim-band tim-band commented Dec 10, 2025

DuckDB does not work without this change: it uses the PostgreSQL dialect with minor changes, but it really needs a couple more.
This change adds DuckDB as a SQLAlchemy plugin and hooks into the SQL compilation process to remove the PostgreSQL-specific SQL that DuckDB does not understand.
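
For readers unfamiliar with this kind of hook, here is a minimal sketch of how SQLAlchemy's compiler extension can rewrite SQL for one dialect only. It is not the code from this PR: `SketchDialect`, the dialect name `duckdb_sketch`, the `people` table, and the choice of `SERIAL` as the PostgreSQL-specific fragment to strip are all assumptions made for illustration.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.schema import CreateColumn, CreateTable


class SketchDialect(postgresql.dialect):
    """Hypothetical PostgreSQL-derived dialect; NOT the plugin added by this PR."""

    name = "duckdb_sketch"  # assumed name, used only to target the hook below
    supports_statement_cache = True


@compiles(CreateColumn, "duckdb_sketch")
def _create_column_without_serial(element, compiler, **kw):
    # Let the inherited PostgreSQL DDL compiler render the column first.
    text = compiler.visit_create_column(element, **kw)
    # Then swap out the PostgreSQL-only keyword (illustrative choice).
    return text.replace(" SERIAL", " INTEGER") if text else text


# Hypothetical table: an autoincrementing integer primary key is rendered
# as SERIAL by the stock PostgreSQL dialect.
metadata = MetaData()
people = Table(
    "people",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)

ddl_sketch = str(CreateTable(people).compile(dialect=SketchDialect()))
ddl_pg = str(CreateTable(people).compile(dialect=postgresql.dialect()))
```

Because the hook is registered against the dialect name, the stock PostgreSQL dialect keeps its normal output and only the derived dialect sees the rewritten DDL.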

dump-data has also been updated: it can now dump all non-ignored, non-vocabulary tables in one call, and it can also dump the data as Parquet.

So, with dump-data for the destination and DuckDB's in-memory database for the source, it is now possible to do Parquet-to-Parquet data faking without interacting directly with DuckDB at all! See duckdb.rst for details.

@tim-band
Collaborator Author

Finally, this all works! It's actually fairly easy to fake Parquet files now; see the duckdb.rst file for details.

@stefpiatek

Ahh nice, I'll hopefully pencil in some time next week to review. Appreciate it, Tim.
