GraphSense Library

A comprehensive Python library for the GraphSense crypto-analytics platform. It provides database access, data ingestion, maintenance tools, and analysis capabilities for cryptocurrency transactions and networks.

Note: This library uses optional dependencies. Use graphsense-lib[all] to install all features.

Quick Start

Installation

# Install with all features
# (quotes keep shells such as zsh from treating the brackets as a glob pattern)
uv add "graphsense-lib[all]"

# Install from source
git clone https://github.com/graphsense/graphsense-lib.git
cd graphsense-lib
make install

Serving the REST API locally

The web API requires two backend connections: a Cassandra cluster (blockchain data) and a TagStore (PostgreSQL). You can configure them via environment variables or a YAML config file.

Option A: Environment variables only

GS_CASSANDRA_ASYNC_NODES='["<cassandra-host>"]' \
GRAPHSENSE_TAGSTORE_READ_URL='postgresql+asyncpg://<user>:<password>@<host>:<port>/tagstore' \
uv run --extra web uvicorn graphsenselib.web.app:create_app --factory --host localhost --port 9000 --reload
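
If the TagStore password contains characters such as @ or /, it has to be percent-encoded before it goes into the URL. The following is a minimal sketch, assuming the URL shape shown above; build_tagstore_url and its parameters are hypothetical helpers for illustration and are not part of graphsense-lib.

```python
from urllib.parse import quote


def build_tagstore_url(user: str, password: str, host: str, port: int,
                       database: str = "tagstore") -> str:
    """Assemble a postgresql+asyncpg URL, percent-encoding the credentials.

    Hypothetical helper, not part of graphsense-lib.
    """
    # safe="" makes quote() also encode "/" and "@", which would otherwise
    # be misread as URL delimiters.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql+asyncpg://{user_enc}:{password_enc}@{host}:{port}/{database}"


url = build_tagstore_url("gs_user", "p@ss/word", "localhost", 5432)
print(url)
```

The resulting string can be exported as GRAPHSENSE_TAGSTORE_READ_URL.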

Option B: YAML config file

Point CONFIG_FILE to a REST-specific config (see instance/config.yaml for a full example):

CONFIG_FILE=./instance/config.yaml make serve-web

Or without Make:

CONFIG_FILE=./instance/config.yaml \
uv run --extra web uvicorn graphsenselib.web.app:create_app --factory --host localhost --port 9000 --reload

Option C: .graphsense.yaml with a web key

If you already have a .graphsense.yaml (or ~/.graphsense.yaml) for the CLI, you can add a web key containing the REST config. The app will pick it up automatically without setting CONFIG_FILE:

# .graphsense.yaml
environments:
  # ... your existing CLI config ...

web:
  database:
    nodes: ["<cassandra-host>"]
    currencies:
      btc:
      eth:
  gs-tagstore:
    url: "postgresql+asyncpg://<user>:<password>@<host>:<port>/tagstore"

Then start the server:

make serve-web

Config resolution order: explicit config_file param > CONFIG_FILE env var > ./instance/config.yaml > .graphsense.yaml web key > env vars only.
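
The resolution order above can be read as a first-match lookup. The sketch below only illustrates that order and is not the library's actual loader: resolve_config_source and its return labels are hypothetical, and step 4 checks only that a .graphsense.yaml file exists, not that it contains a web key (the standard library has no YAML parser).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_source(config_file: Optional[str] = None,
                          env: Optional[Mapping[str, str]] = None,
                          cwd: Path = Path("."),
                          home: Optional[Path] = None) -> str:
    """Return a label for the first config source that applies (illustrative)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    if config_file:                                   # 1. explicit config_file param
        return f"param:{config_file}"
    if env.get("CONFIG_FILE"):                        # 2. CONFIG_FILE env var
        return f"env:{env['CONFIG_FILE']}"
    instance_cfg = cwd / "instance" / "config.yaml"
    if instance_cfg.is_file():                        # 3. ./instance/config.yaml
        return f"file:{instance_cfg}"
    for candidate in (cwd / ".graphsense.yaml", home / ".graphsense.yaml"):
        if candidate.is_file():                       # 4. .graphsense.yaml (web key not checked here)
            return f"graphsense:{candidate}"
    return "env-only"                                 # 5. environment variables only


print(resolve_config_source(config_file="./instance/config.yaml"))
```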

Optional REST settings (env vars)

| Variable | Default | Description |
|---|---|---|
| GSREST_DISABLE_AUTH | false | Disable API key authentication |
| GSREST_ALLOWED_ORIGINS | * | CORS allowed origins |
| GSREST_LOGGING_LEVEL | | Logging level (DEBUG, INFO, …) |
| GS_CASSANDRA_ASYNC_PORT | 9042 | Cassandra port |
| GS_CASSANDRA_ASYNC_USERNAME | | Cassandra username |
| GS_CASSANDRA_ASYNC_PASSWORD | | Cassandra password |
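
To make the defaults concrete, here is a minimal sketch of reading these variables with the standard library. It is not how graphsense-lib itself loads its settings; RestEnvSettings and load_rest_env are hypothetical names, and the empty-list fallback for GS_CASSANDRA_ASYNC_NODES is an assumption made for the example.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class RestEnvSettings:
    """Hypothetical container for the documented environment variables."""
    disable_auth: bool
    allowed_origins: str
    logging_level: Optional[str]
    cassandra_nodes: List[str]
    cassandra_port: int


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_rest_env(env: Mapping[str, str] = os.environ) -> RestEnvSettings:
    return RestEnvSettings(
        disable_auth=_as_bool(env.get("GSREST_DISABLE_AUTH", "false")),
        allowed_origins=env.get("GSREST_ALLOWED_ORIGINS", "*"),
        logging_level=env.get("GSREST_LOGGING_LEVEL"),  # no documented default
        # The nodes variable holds a JSON list, e.g. '["10.0.0.1"]'.
        # Falling back to an empty list is an assumption of this sketch.
        cassandra_nodes=json.loads(env.get("GS_CASSANDRA_ASYNC_NODES", "[]")),
        cassandra_port=int(env.get("GS_CASSANDRA_ASYNC_PORT", "9042")),
    )


settings = load_rest_env({"GS_CASSANDRA_ASYNC_NODES": '["10.0.0.1"]'})
print(settings)
```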

Basic Usage

Database Access with Configuration File

from graphsenselib.db import DbFactory

# Using GraphSense config file (default: ~/.graphsense.yaml)
with DbFactory().from_config("development", "btc") as db:
    highest_block = db.transformed.get_highest_block()
    print(f"Highest BTC block: {highest_block}")

    # Get block details
    block = db.transformed.get_block(100000)
    print(f"Block 100000: {block.block_hash}")

Direct Database Connection

from graphsenselib.db import DbFactory

# Direct connection without config file
with DbFactory().from_name(
    raw_keyspace_name="eth_raw",
    transformed_keyspace_name="eth_transformed",
    schema_type="account",
    cassandra_nodes=["localhost"],
    currency="eth"
) as db:
    print(f"Highest block: {db.transformed.get_highest_block()}")

Async Database Services

The async services are used internally by the REST API and can also be used standalone. AddressesService depends on several other services:

from graphsenselib.db.asynchronous.services import (
    BlocksService, AddressesService, TagsService,
    EntitiesService, RatesService,
)

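# Services are initialized with their dependencies.
# db, config, logger, rates_service, tags_service, and entities_service
# are assumed to have been created beforehand (not shown here).
blocks_service = BlocksService(db, rates_service, config, logger)
addresses_service = AddressesService(
    db, tags_service, entities_service, blocks_service, rates_service, logger
)

# The awaits below must run inside a coroutine (for example one started with asyncio.run).
address_info = await addresses_service.get_address("btc", "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
txs = await addresses_service.list_address_txs("btc", "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")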

Command Line Interface

GraphSense-lib exposes a comprehensive CLI tool: graphsense-cli

Basic Commands

# Show help and available commands
graphsense-cli --help

# Check version
graphsense-cli version

# Show current configuration
graphsense-cli config show

# Generate config template
graphsense-cli config template > ~/.graphsense.yaml

# Show config file path
graphsense-cli config path

Modules

Database Management

Query and manage the GraphSense database state.

# Show database management options
graphsense-cli db --help

# Check database state/summary
graphsense-cli db state -e development

# Get block information
graphsense-cli db block info -e development -c btc --height 100000

# Query logs (for Ethereum-based chains)
graphsense-cli db logs -e development -c eth --from-block 1000000 --to-block 1000100

Schema Operations

Create and validate database schemas.

# Show schema options
graphsense-cli schema --help

# Create database schema for a currency
graphsense-cli schema create -e dev -c btc

# Validate existing schema
graphsense-cli schema validate -e dev -c btc

# Show expected schema for currency
graphsense-cli schema show-by-currency btc

# Show schema by type (utxo/account)
graphsense-cli schema show-by-schema-type utxo

Data Ingestion

Ingest raw cryptocurrency data from nodes.

# Show ingestion options
graphsense-cli ingest --help

# Ingest blocks from cryptocurrency node
graphsense-cli ingest from-node \
    -e dev \
    -c btc \
    --start-block 0 \
    --end-block 1000 \
    --create-schema

# Ingest with custom batch size
graphsense-cli ingest from-node \
    -e dev \
    -c eth \
    --start-block 1000000 \
    --end-block 1001000 \
    --batch-size 100
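
As a rough illustration of what a batch size means for a block range, the sketch below splits a range into consecutive batches. It is not the ingestion code of graphsense-lib: block_batches is a hypothetical helper, and treating --end-block as inclusive is an assumption of this example.

```python
from typing import Iterator, Tuple


def block_batches(start_block: int, end_block: int,
                  batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) ranges of at most batch_size blocks."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    first = start_block
    while first <= end_block:
        last = min(first + batch_size - 1, end_block)
        yield first, last
        first = last + 1


batches = list(block_batches(1_000_000, 1_001_000, 100))
print(len(batches), batches[0], batches[-1])
```

With an inclusive end block, the range above covers 1001 blocks, so the last batch holds a single block.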

Delta Updates

Update transformed keyspace from raw keyspace.

# Show delta update options
graphsense-cli delta-update --help

# Check update status
graphsense-cli delta-update status -e dev -c btc

# Perform delta update
graphsense-cli delta-update update -e dev -c btc

# Validate delta update consistency
graphsense-cli delta-update validate -e dev -c btc

# Patch exchange rates for specific blocks
graphsense-cli delta-update patch-exchange-rates \
    -e dev \
    -c btc \
    --start-block 100000 \
    --end-block 200000

Exchange Rates

Fetch and ingest exchange rates from various sources.

# Show exchange rate options
graphsense-cli exchange-rates --help

# Fetch from CoinDesk
graphsense-cli exchange-rates coindesk -e dev -c btc

# Fetch from CoinMarketCap (requires API key in config)
graphsense-cli exchange-rates coinmarketcap -e dev -c btc

Monitoring

Monitor GraphSense infrastructure health and state.

# Show monitoring options
graphsense-cli monitoring --help

# Get database summary
graphsense-cli monitoring get-summary -e dev

# Get summary for specific currency
graphsense-cli monitoring get-summary -e dev -c btc

# Send notifications to configured handlers
graphsense-cli monitoring notify \
    --topic "database-update" \
    --message "BTC ingestion completed"
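
For orientation, a Slack incoming webhook accepts a JSON body with a text field sent via HTTP POST. The sketch below builds (but does not send) such requests for the hooks of a topic, using the slack_topics layout from the configuration example. How graphsense-lib dispatches notifications internally is not shown here; build_slack_requests is a hypothetical helper.

```python
import json
from typing import Dict, List
from urllib.request import Request

# Mirrors the slack_topics section of the example configuration.
slack_topics: Dict[str, Dict[str, List[str]]] = {
    "database-update": {
        "hooks": ["https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"],
    },
}


def build_slack_requests(topic: str, message: str,
                         topics: Dict[str, Dict[str, List[str]]]) -> List[Request]:
    """Create one POST request per webhook configured for the topic (nothing is sent)."""
    hooks = topics.get(topic, {}).get("hooks", [])
    body = json.dumps({"text": message}).encode("utf-8")
    return [
        Request(hook, data=body,
                headers={"Content-Type": "application/json"}, method="POST")
        for hook in hooks
    ]


requests_to_send = build_slack_requests(
    "database-update", "BTC ingestion completed", slack_topics
)
print(len(requests_to_send), requests_to_send[0].full_url)
```

Passing each request to urllib.request.urlopen would deliver the message; that step is left out so the example has no network side effects.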

Event Watching (Alpha)

Watch for cryptocurrency events and generate notifications.

# Show watch options
graphsense-cli watch --help

# Watch for money flows on specific addresses
graphsense-cli watch money-flows \
    -e dev \
    -c btc \
    --address 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa \
    --threshold 1000000  # satoshis
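
The threshold above is given in satoshis, where 1 BTC equals 100,000,000 satoshis, so 1000000 satoshis correspond to 0.01 BTC. A small conversion sketch (the helper names are illustrative and not part of the CLI):

```python
from decimal import Decimal

SATOSHIS_PER_BTC = 100_000_000


def btc_to_satoshis(amount_btc: str) -> int:
    """Convert a BTC amount given as a decimal string to whole satoshis."""
    sats = Decimal(amount_btc) * SATOSHIS_PER_BTC
    if sats != sats.to_integral_value():
        raise ValueError("amount has more than 8 decimal places")
    return int(sats)


def satoshis_to_btc(sats: int) -> Decimal:
    """Convert whole satoshis to a BTC amount."""
    return Decimal(sats) / SATOSHIS_PER_BTC


print(btc_to_satoshis("0.01"))
print(satoshis_to_btc(1_000_000))
```

Decimal is used instead of float so that amounts such as 0.1 BTC convert without rounding error.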

File Conversion Tools

Convert between different file formats.

# Show conversion options
graphsense-cli convert --help

Configuration

GraphSense-lib uses a YAML configuration file that defines database connections and environment settings. Default locations: ./.graphsense.yaml, ~/.graphsense.yaml.

Generate Configuration Template

graphsense-cli config template > ~/.graphsense.yaml

Example Configuration Structure

# Optional: default environment to use
default_environment: dev

environments:
  dev:
    # Cassandra cluster configuration
    cassandra_nodes: ["localhost"]
    port: 9042
    # Optional authentication
    # username: "cassandra"
    # password: "cassandra"

    # Currency/keyspace configurations
    keyspaces:
      btc:
        raw_keyspace_name: "btc_raw"
        transformed_keyspace_name: "btc_transformed"
        schema_type: "utxo"

        # Node connection for ingestion
        ingest_config:
          node_reference: "http://localhost:8332"
          # Optional authentication for node
          # username: "rpcuser"
          # password: "rpcpassword"

        # Keyspace setup for schema creation
        keyspace_setup_config:
          raw:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"
          transformed:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"

      eth:
        raw_keyspace_name: "eth_raw"
        transformed_keyspace_name: "eth_transformed"
        schema_type: "account"

        ingest_config:
          node_reference: "http://localhost:8545"

        keyspace_setup_config:
          raw:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"
          transformed:
            replication_config: "{'class': 'SimpleStrategy', 'replication_factor': 1}"

  prod:
    cassandra_nodes: ["cassandra1.prod", "cassandra2.prod", "cassandra3.prod"]
    username: "gs_user"
    password: "secure_password"

    keyspaces:
      btc:
        raw_keyspace_name: "btc_raw"
        transformed_keyspace_name: "btc_transformed"
        schema_type: "utxo"

        ingest_config:
          node_reference: "http://bitcoin-node.internal:8332"

        keyspace_setup_config:
          raw:
            replication_config: "{'class': 'NetworkTopologyStrategy', 'datacenter1': 3}"
          transformed:
            replication_config: "{'class': 'NetworkTopologyStrategy', 'datacenter1': 3}"

# Optional: Slack notification configuration
slack_topics:
  database-update:
    hooks: ["https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"]

  payment_flow_notifications:
    hooks: ["https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"]

# Optional: API keys for external services
coingecko_api_key: ""
coinmarketcap_api_key: "YOUR_CMC_API_KEY"

# Optional: cache directory for temporary files
cache_directory: "~/.graphsense/cache"
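
The replication_config values are strings in CQL map syntax. As a sketch of how such a string relates to keyspace creation, the example below validates the value and embeds it in a CREATE KEYSPACE statement. Whether graphsense-lib builds its statements this way is an assumption; create_keyspace_cql is a hypothetical helper and keyspace_cfg simply mirrors part of the btc entry above.

```python
import ast

# Mirrors part of the "btc" keyspace entry from the example configuration.
keyspace_cfg = {
    "raw_keyspace_name": "btc_raw",
    "keyspace_setup_config": {
        "raw": {
            "replication_config": "{'class': 'SimpleStrategy', 'replication_factor': 1}",
        },
    },
}


def create_keyspace_cql(name: str, replication_config: str) -> str:
    """Build a CREATE KEYSPACE statement from a replication_config string."""
    # The CQL map literal happens to be a valid Python dict literal, so
    # ast.literal_eval can check its shape without executing any code.
    replication = ast.literal_eval(replication_config)
    if not isinstance(replication, dict) or "class" not in replication:
        raise ValueError("replication_config must be a map with a 'class' entry")
    return f"CREATE KEYSPACE IF NOT EXISTS {name} WITH replication = {replication_config};"


cql = create_keyspace_cql(
    keyspace_cfg["raw_keyspace_name"],
    keyspace_cfg["keyspace_setup_config"]["raw"]["replication_config"],
)
print(cql)
```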

Advanced Features

Tagpack Management

GraphSense-lib includes comprehensive tagpack management tools (formerly standalone tagpack-tool). For detailed documentation, see Tagpack README.

# Validate tagpacks
graphsense-cli tagpack-tool tagpack validate /path/to/tagpack

# Insert tagpack into tagstore
graphsense-cli tagpack-tool insert \
    --url "postgresql://user:pass@localhost/tagstore" \
    /path/to/tagpack

# Show quality measures
graphsense-cli tagpack-tool quality show-measures \
    --url "postgresql://user:pass@localhost/tagstore"

Tagstore Operations

# Initialize tagstore database
graphsense-cli tagstore init

# Initialize with custom database URL
graphsense-cli tagstore init --db-url "postgresql://user:pass@localhost/tagstore"

# Get DDL SQL for manual setup
graphsense-cli tagstore get-create-sql

Cross-chain Analysis

# Using an initialized AddressesService (see above for setup)
related = await addresses_service.get_cross_chain_pubkey_related_addresses(
    "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
)

for addr in related:
    print(f"Network: {addr.network}, Address: {addr.address}")

Function Call Parsing

from graphsenselib.utils.function_call_parser import parse_function_call

# Parse Ethereum function calls
function_signatures = {
    "0xa9059cbb": [{
        "name": "transfer",
        "inputs": [
            {"name": "to", "type": "address"},
            {"name": "value", "type": "uint256"}
        ]
    }]
}

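# tx_input_bytes is assumed to hold the raw input data of a transaction
# (it is not defined in this snippet).
parsed = parse_function_call(tx_input_bytes, function_signatures)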
if parsed:
    print(f"Function: {parsed['name']}")
    print(f"Parameters: {parsed['parameters']}")
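
For background, the key 0xa9059cbb is the 4-byte selector of transfer(address,uint256), and the call data that follows it consists of 32-byte ABI words. The standard-library sketch below decodes such a call by hand to show the layout. It is independent of parse_function_call and makes no claim about how the library's parser works; decode_transfer_call is a hypothetical helper.

```python
def decode_transfer_call(calldata: bytes) -> dict:
    """Decode ERC-20 transfer(address,uint256) call data by hand."""
    if len(calldata) != 4 + 2 * 32:
        raise ValueError("unexpected length for transfer(address,uint256)")
    selector = "0x" + calldata[:4].hex()
    if selector != "0xa9059cbb":
        raise ValueError(f"not a transfer call: {selector}")
    to_word = calldata[4:36]      # address, left-padded to 32 bytes
    value_word = calldata[36:68]  # uint256, big-endian
    return {
        "selector": selector,
        "to": "0x" + to_word[12:].hex(),  # last 20 bytes are the address
        "value": int.from_bytes(value_word, "big"),
    }


# Synthetic example input: recipient 0x1111...11, value 1000.
example = bytes.fromhex("a9059cbb" + "00" * 12 + "11" * 20 + f"{1000:064x}")
decoded = decode_transfer_call(example)
print(decoded)
```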

Development

Important: Requires Python >=3.10, <3.13.
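
A quick way to check an interpreter against this range is sketched below; is_supported_python is an illustrative helper and not part of the package.

```python
import sys
from typing import Optional, Sequence


def is_supported_python(version: Optional[Sequence[int]] = None) -> bool:
    """Return True if the version satisfies >=3.10, <3.13."""
    current = sys.version_info if version is None else version
    major_minor = tuple(current[:2])
    return (3, 10) <= major_minor < (3, 13)


print(f"Python {sys.version_info.major}.{sys.version_info.minor} supported:",
      is_supported_python())
```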

Setup Development Environment

# Initialize development environment (installs deps + pre-commit hooks)
make dev

# Or install dev dependencies only
make install-dev

Code Quality and Testing

Before committing, please format, lint, and test your code:

# Format code
make format

# Lint code
make lint

# Run fast tests
make test

# Or run all steps at once
make pre-commit

For comprehensive testing:

# Run complete test suite (including slow tests)
make test

Release Process

This repository uses two source-of-truth versions in the root Makefile:

  • Library version: RELEASESEM (released with vX.Y.Z, vX.Y.Z-rc.N, or vX.Y.Z-dev.N tags)
  • OpenAPI/API version: WEBAPISEM (written to src/graphsenselib/web/version.py)

The Python client package version is derived from the API version and should match it.

Library package versioning is dynamic via setuptools_scm (pyproject.toml):

  • Git tag v2.9.8 -> package version 2.9.8
  • Git tag v2.9.8-rc.1 -> package version 2.9.8rc1
  • Git tag v2.9.8-dev.1 -> package version 2.9.8.dev1
  • Commits after a tag append local metadata, for example 2.9.8.dev1+g<sha>.d<date>
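
The tag-to-version mapping for exactly tagged commits can be sketched as a small function. This only mirrors the three cases listed above and does not call setuptools_scm; tag_to_package_version is a hypothetical helper, and the local-metadata suffix for commits after a tag is not modelled.

```python
import re

# vX.Y.Z with an optional -rc.N or -dev.N suffix, as described above.
_TAG_RE = re.compile(r"^v(\d+\.\d+\.\d+)(?:-(rc|dev)\.(\d+))?$")


def tag_to_package_version(tag: str) -> str:
    """Map a library release tag to the package version it produces."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a library release tag: {tag!r}")
    base, kind, number = match.groups()
    if kind is None:
        return base                    # v2.9.8       -> 2.9.8
    if kind == "rc":
        return f"{base}rc{number}"     # v2.9.8-rc.1  -> 2.9.8rc1
    return f"{base}.dev{number}"       # v2.9.8-dev.1 -> 2.9.8.dev1


for tag in ("v2.9.8", "v2.9.8-rc.1", "v2.9.8-dev.1"):
    print(tag, "->", tag_to_package_version(tag))
```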

Use the root Makefile helpers:

# Show all current versions
make show-versions

# Update and validate OpenAPI contract version
make update-api-version WEBAPISEM=v2.10.0
make check-api-version WEBAPISEM=v2.10.0

# Sync client version from API version and validate
make sync-client-version WEBAPISEM=v2.10.0
make check-client-version WEBAPISEM=v2.10.0

# Generate Python client (package version = OpenAPI info.version)
make generate-python-client

# Create both release tags from Makefile versions
make tag-version

Tagging behavior:

  • Library release tag: vX.Y.Z, vX.Y.Z-rc.N, or vX.Y.Z-dev.N (from RELEASESEM)
  • Client release tag: webapi-vA.B.C (from WEBAPISEM)

Recommended library versioning routine:

  1. For development prereleases, set RELEASESEM to vX.Y.Z-dev.N (for example v2.10.0-dev.1)
  2. For release candidates, set RELEASESEM to vX.Y.Z-rc.N
  3. For stable releases, set RELEASESEM to vX.Y.Z
  4. Create tags with make tag-version
  5. Push tags with git push origin --tags

CI trigger background:

  • Stable library tags (vX.Y.Z) trigger:
    • GitHub Release creation
    • Python library package build/publish (graphsense-lib)
    • Docker image build/publish
  • Client tags (webapi-vA.B.C) trigger Python client package build/publish (clients/python)
  • Other library tags (vX.Y.Z-rc.N, vX.Y.Z-dev.N) do not trigger GitHub Release or Python package publish; they only trigger Docker image build/publish
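
The trigger rules above amount to a classification of tag names. The sketch below encodes them for illustration; the action labels and ci_actions_for_tag are hypothetical names and do not correspond to actual workflow or job identifiers in the repository.

```python
import re
from typing import List

STABLE_TAG = re.compile(r"^v\d+\.\d+\.\d+$")
PRERELEASE_TAG = re.compile(r"^v\d+\.\d+\.\d+-(rc|dev)\.\d+$")
CLIENT_TAG = re.compile(r"^webapi-v\d+\.\d+\.\d+$")


def ci_actions_for_tag(tag: str) -> List[str]:
    """Return illustrative labels for the CI actions a tag triggers."""
    if STABLE_TAG.match(tag):
        # Stable library tag: release, library package, Docker image.
        return ["github-release", "publish-library", "publish-docker"]
    if PRERELEASE_TAG.match(tag):
        # rc/dev tags only build and publish the Docker image.
        return ["publish-docker"]
    if CLIENT_TAG.match(tag):
        # Client tags publish the Python client package.
        return ["publish-client"]
    return []


for tag in ("v2.10.0", "v2.10.0-rc.1", "webapi-v2.10.0"):
    print(tag, ci_actions_for_tag(tag))
```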

Release checklist:

  1. Update CHANGELOG.md with new features and fixes
  2. Update relevant versions (library/API/client) based on what changed
  3. Sync API/client versions if needed (make update-api-version + make sync-client-version)
  4. Create and push tags:

     make tag-version
     git push origin --tags

Troubleshooting

OpenSSL Errors

Some components use OpenSSL hash functions that aren't available by default in OpenSSL 3.0+ (e.g., ripemd160). This can cause test suite failures. To fix this, enable legacy providers in your OpenSSL configuration. See the "fix openssl legacy mode" step in .github/workflows/run_tests.yaml for an example.
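
To find out whether the local Python/OpenSSL combination is affected, one can simply try to construct the hash. A minimal sketch using only hashlib; ripemd160_available is an illustrative helper:

```python
import hashlib


def ripemd160_available() -> bool:
    """Return True if hashlib can create a ripemd160 hash object."""
    try:
        hashlib.new("ripemd160", b"")
    except ValueError:
        # Raised when the underlying OpenSSL build does not provide the digest.
        return False
    return True


print("ripemd160 available:", ripemd160_available())
```

If this prints False, enabling the legacy provider as described above is the likely fix.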

Common Issues

  1. Connection Refused: Verify Cassandra is running and accessible
  2. Schema Validation Errors: Ensure database schema matches expected version
  3. Import Errors: Install with [all] option for complete feature set
  4. Python Version: Requires Python >=3.10, <3.13

License

See LICENSE file for licensing details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run make pre-commit to ensure code quality
  5. Submit a pull request

GraphSense - Open Source Crypto Analytics Platform

Website: https://graphsense.github.io/
