Skip to content

Merge saas branch: add SaaS infrastructure#143

Closed
Miyamura80 wants to merge 214 commits intomainfrom
feat/merge-saas-infrastructure
Closed

Merge saas branch: add SaaS infrastructure#143
Miyamura80 wants to merge 214 commits intomainfrom
feat/merge-saas-infrastructure

Conversation

@Miyamura80
Copy link
Owner

Summary

  • Intelligently merges the saas branch into main, preserving main's maintained infrastructure while bringing in SaaS-specific features
  • Adds database layer (SQLAlchemy models, Alembic migrations with RLS support), Stripe payments, WorkOS auth, agent chat with streaming, Telegram integration, and Railway deployment
  • All SaaS-specific config fields and env vars are optional, so non-SaaS usage remains unaffected

Merge Strategy

  • Kept main's version for: cursor rules, GitHub workflows, AGENTS.md, TODO.md, README.md, test files — main is more maintained
  • Kept saas's version for: all DB/migration infrastructure (alembic/, src/db/, src/api/, src/stripe/), new SaaS-specific files
  • Manual merge for 7 key files:
    • config_models.py — combined both sets of Pydantic models (main's FeaturesConfig/RedactionConfig + saas's SaaS models)
    • global_config.py — main's split YAML loading + relative imports + saas's api_base()/DB URI resolution/SaaS config fields (all optional)
    • global_config.yaml — main's model names + redaction + saas's timeout/agent_chat/subscription/stripe/telegram/server
    • Makefile — main's import_lint/ci targets + saas's DB section
    • pyproject.toml — main's newer deps + saas's SaaS deps + merged vulture excludes
    • dspy_inference.py — main's fallback/retry/feature flags + saas's lazy init/streaming/timeout
    • dspy_langfuse.py — main's langfuse v3 API + saas's explicit trace context/call_id matching

Test plan

  • Verify uv lock succeeds after pyproject.toml changes
  • Run make test_fast to ensure existing tests pass
  • Verify SaaS config fields load correctly from YAML
  • Verify non-SaaS usage still works (SaaS fields default to None)

🤖 Generated with Claude Code

google-labs-jules bot and others added 26 commits December 26, 2025 13:23
- Add referral columns to Profiles model (code, referrer, count)
- Create ReferralService for managing referrals
- Add API endpoints for applying codes and viewing stats
- Centralize profile creation logic with ensure_profile_exists
- Generate migration script (unapplied) with data backfill support
- Make referral_code nullable and remove default generation
- Implement lazy generation in ReferralService and API
- Use db_transaction in ensure_profile_exists
- Update migration script to reflect nullable columns
- Handle feedback from PR review
- Make referral_code nullable and remove default generation
- Implement lazy generation in ReferralService and API
- Use db_transaction in ensure_profile_exists
- Update migration script to reflect nullable columns
- Handle feedback from PR review
- Fix linting and type errors
- Add unit tests for referral system
- Updated `alert_admin` tool to include the user's email in the success log message.
- Removed the Telegram message ID from the log message as requested.
- Verified that `user_profile` is correctly retrieved and used.
- Passed all existing tests for `alert_admin`.
- Make referral_code nullable and remove default generation
- Implement lazy generation in ReferralService and API
- Use db_transaction in ensure_profile_exists
- Update migration script to reflect nullable columns
- Handle feedback from PR review
- Fix linting and type errors
- Add unit tests for referral system
- Make referral_code nullable and remove default generation
- Implement lazy generation in ReferralService and API
- Use db_transaction in ensure_profile_exists
- Update migration script to reflect nullable columns
- Handle feedback from PR review
- Fix linting and type errors
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Make referral_code nullable and remove default generation
- Implement lazy generation in ReferralService and API
- Use db_transaction in ensure_profile_exists
- Update migration script to reflect nullable columns
- Handle feedback from PR review
- Fix linting and type errors
- Add unit tests for referral system
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Remove manual collision check in ReferralService
- Handle IntegrityError instead
- Update get_or_create_referral_code logic
- Wrap database updates in db_transaction to ensure atomicity
- Prevents partial state if validation logic fails after commit (though logically validated before)
- Aligns with codebase patterns
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Make referral_code nullable and remove default generation
- Implement lazy generation in ReferralService and API
- Use db_transaction in ensure_profile_exists
- Update migration script to reflect nullable columns
- Handle feedback from PR review
- Fix linting and type errors
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Implement atomic increment for referral count to prevent race conditions
- Clean up comments in ReferralService
- Verify CI passes
…2664991858159

Referral System Implementation
Introduces a new GET endpoint to retrieve daily rate limit usage, providing visibility into current consumption, remaining quota, and reset times.

- Adds `AgentLimitResponse` Pydantic model for standardized responses.
- Implements `get_agent_limits` route using `ensure_daily_limit`.
- Adds E2E tests for authenticated and unauthenticated access.
Introduces a new GET endpoint to retrieve daily rate limit usage, providing visibility into current consumption, remaining quota, and reset times.

- Adds `AgentLimitResponse` Pydantic model for standardized responses.
- Implements `get_agent_limits` route using `ensure_daily_limit` with `enforce=False`.
- Adds E2E tests for authenticated and unauthenticated access.
- Validated via `make ci` and `make test`.
…25189082607

Add agent limits status endpoint
- Fixed a critical security vulnerability in `workos_auth.py` where authentication could be bypassed in production if the script name contained "test". The check is now gated by `DEV_ENV != "prod"`.
- Moved hardcoded CORS origins from `src/server.py` to `global_config.yaml` to support configurable environments.
- Enabled access logs in `uvicorn` for better production observability.
- Updated `common/config_models.py` to include `ServerConfig`.
…ability-11261091240424952918

🔒 Fix critical auth vulnerability and improve observability
…eaming)

Intelligently merges the saas branch into main, keeping main's maintained
infrastructure while bringing in SaaS-specific features including database
models, alembic migrations, Stripe payments, WorkOS auth, agent chat with
streaming, Telegram integration, and Railway deployment support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
"""
Return a deterministic SHA-256 hash for an API key.
"""
return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

Check failure

Code scanning / CodeQL

Use of a broken or weak cryptographic hashing algorithm on sensitive data High

Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.

Copilot Autofix

AI about 13 hours ago

General approach: Replace the direct use of hashlib.sha256 with a password‑hashing or key‑derivation function that is computationally expensive and parameterizable. Because we only need deterministic, equality‑checkable hashes in the database (not reversible encryption), PBKDF2 from the standard library (hashlib.pbkdf2_hmac) is a good choice that avoids adding non‑stdlib dependencies. We will introduce a module‑level, fixed salt and iteration count for API key hashing, and use PBKDF2‑HMAC‑SHA256 to derive a hash string. This keeps behavior (store hash, compare hash) intact while strengthening resistance to offline brute‑force attacks.

Concrete changes in this file:

  1. Add constants for an API‑key hashing salt, iteration count, and output length near the other module‑level constants.
  2. Rewrite hash_api_key to:
    • Use hashlib.pbkdf2_hmac("sha256", ...) with the defined salt and iteration count.
    • Hex‑encode the resulting bytes (.hex()), preserving the string nature of key_hash.
  3. Leave the surrounding logic in create_api_key and validate_api_key unchanged, since they already rely solely on the deterministic hash string.

This change is fully local to src/api/auth/api_key_auth.py and keeps the DB schema (key_hash as a string) and external behavior intact. The only imports needed come from hashlib, which is already imported.


Suggested changeset 1
src/api/auth/api_key_auth.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/api/auth/api_key_auth.py b/src/api/auth/api_key_auth.py
--- a/src/api/auth/api_key_auth.py
+++ b/src/api/auth/api_key_auth.py
@@ -20,16 +20,28 @@
 API_KEY_PREFIX = "sk_"
 KEY_PREFIX_LENGTH = 8
 
+# Parameters for API key hashing. PBKDF2 adds computational cost to resist brute-force attacks.
+API_KEY_HASH_SALT = b"api-key-hash-salt-v1"
+API_KEY_HASH_ITERATIONS = 100_000
+API_KEY_HASH_LENGTH = 32  # 32 bytes = 256-bit derived key
 
+
 def _utcnow() -> datetime:
     return datetime.now(timezone.utc)
 
 
 def hash_api_key(api_key: str) -> str:
     """
-    Return a deterministic SHA-256 hash for an API key.
+    Return a deterministic, computationally expensive hash for an API key.
     """
-    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()
+    dk = hashlib.pbkdf2_hmac(
+        "sha256",
+        api_key.encode("utf-8"),
+        API_KEY_HASH_SALT,
+        API_KEY_HASH_ITERATIONS,
+        dklen=API_KEY_HASH_LENGTH,
+    )
+    return dk.hex()
 
 
 def generate_api_key_value() -> str:
EOF
@@ -20,16 +20,28 @@
API_KEY_PREFIX = "sk_"
KEY_PREFIX_LENGTH = 8

# Parameters for API key hashing. PBKDF2 adds computational cost to resist brute-force attacks.
API_KEY_HASH_SALT = b"api-key-hash-salt-v1"
API_KEY_HASH_ITERATIONS = 100_000
API_KEY_HASH_LENGTH = 32 # 32 bytes = 256-bit derived key


def _utcnow() -> datetime:
return datetime.now(timezone.utc)


def hash_api_key(api_key: str) -> str:
"""
Return a deterministic SHA-256 hash for an API key.
Return a deterministic, computationally expensive hash for an API key.
"""
return hashlib.sha256(api_key.encode("utf-8")).hexdigest()
dk = hashlib.pbkdf2_hmac(
"sha256",
api_key.encode("utf-8"),
API_KEY_HASH_SALT,
API_KEY_HASH_ITERATIONS,
dklen=API_KEY_HASH_LENGTH,
)
return dk.hex()


def generate_api_key_value() -> str:
Copilot is powered by AI and may make mistakes. Always verify output.
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 14, 2026

Greptile Summary

This PR merges the saas branch into main, adding a full SaaS infrastructure layer: database models with Alembic migrations, Stripe payments, WorkOS authentication, agent chat with streaming, Telegram integration, and Railway deployment support. The merge strategy is well-considered — main's maintained files (CI, docs, tests) are preserved while SaaS-specific additions are brought in. All SaaS config fields are optional, so non-SaaS usage remains unaffected.

  • Webhook handlers have crash risk: datetime.fromtimestamp() is called on values from .get() which can return None, causing TypeError at runtime across multiple webhook handlers
  • Payment failure handling is too aggressive: Users are immediately downgraded to free tier on the first invoice.payment_failed event, before Stripe's retry cycle completes
  • Auth bypass heuristics are broad: Test mode detection ORs together three checks including "pytest" in sys.modules, which has no production guard and could be triggered by transitive test dependencies
  • Webhook secret fallback weakens rotation: The dual-secret verification approach means old secrets remain valid after rotation
  • Inconsistent enum usage in webhooks: String literals ("plus_tier", "free") used instead of SubscriptionTier enum, diverging from checkout.py which uses the enum properly

Confidence Score: 2/5

  • The PR introduces solid SaaS infrastructure but the payment webhook handlers have runtime crash risks and overly aggressive failure handling that should be fixed before merging.
  • Score of 2 reflects that while the overall architecture and merge strategy are sound, the webhook handlers contain a datetime.fromtimestamp(None) crash path that will cause 500 errors in production, the immediate payment failure downgrade will incorrectly revoke user access during Stripe's retry cycle, and the auth bypass detection is broader than necessary.
  • Pay close attention to src/api/routes/payments/webhooks.py (crash risk from None timestamps, aggressive downgrade, string literals vs enums) and src/api/auth/workos_auth.py (test mode bypass heuristics).

Important Files Changed

Filename Overview
src/api/routes/payments/webhooks.py Multiple issues: datetime.fromtimestamp(None) crash risk from .get() calls, immediate downgrade on first payment failure, string literals instead of enums, exception details leaked in response, dual webhook secret weakens rotation security.
src/api/auth/workos_auth.py Test mode auth bypass uses multiple heuristics (pytest in sys.modules, DEV_ENV, sys.argv) ORed together — the pytest check has no production guard. Otherwise well-structured JWT verification with JWKS.
src/api/routes/payments/checkout.py Syncs local DB state when user already has active subscription before returning 400 (intentional). Uses unvalidated Origin header for redirect URLs. Overall logic is sound.
src/api/routes/payments/metering.py Metered billing implementation with usage reporting to Stripe. UsageAction.SET could allow usage manipulation but likely restricted to server-side calls only.
src/api/routes/payments/subscription.py Subscription status endpoint with Stripe API integration and DB fallback. Complex but functional. Potential staleness between Stripe and DB state.
src/api/routes/agent/agent.py Large agent chat endpoint with streaming support. Broad exception handling returns 200 with error message instead of proper HTTP status. Manual DB session closure before streaming is a notable pattern.
common/global_config.py Configuration loader merging main and SaaS config. All SaaS fields optional, non-SaaS usage unaffected. Clean integration of both configuration systems.
src/db/database.py Database session management with proper generator-based dependency injection for FastAPI. Handles rollback on exceptions correctly.
utils/llm/dspy_inference.py Extended inference handler with streaming support and lazy initialization. The async generator for streaming is correctly structured.
utils/llm/dspy_langfuse.py LangFuse callback with explicit trace context support. ContextVar usage is correct (.get(None) provides default). Well-integrated tool call ID matching.
src/server.py FastAPI server setup with session middleware and router inclusion. CORS uses wildcard methods which is permissive but acceptable for an API server.
src/db/utils/db_transaction.py Transaction context manager with timeout support via SIGALRM (Unix-only, main-thread-only). Timeouts silently disabled in other contexts.
pyproject.toml Merged dependency lists from main and saas branches. SaaS-specific deps (stripe, workos, etc.) added alongside existing ones.
Makefile Added DB migration targets alongside existing make targets. Clean integration.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 96-101

Comment:
**`datetime.fromtimestamp(None)` will crash**

`invoice.get("period_start")` and `invoice.get("period_end")` can return `None` if the keys are missing from the Stripe invoice payload. Passing `None` to `datetime.fromtimestamp()` raises a `TypeError` at runtime. The same pattern repeats at lines 193-199 and 212-218 with `subscription_data.get("current_period_start")` / `subscription_data.get("current_period_end")`.

Consider using bracket access (`invoice["period_start"]`) to get a clear `KeyError` pointing at the root cause, or guard with an explicit `None` check before converting.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 247-265

Comment:
**Immediate downgrade on first payment failure**

On any `invoice.payment_failed` event, the user is immediately downgraded to the free tier. Stripe typically retries failed payments multiple times (configurable in Stripe dashboard), and each retry emits another `invoice.payment_failed` event. Downgrading on the first failure means users lose access before Stripe has finished its retry cycle.

Consider checking the invoice's `attempt_count` or `next_payment_attempt` field and only downgrading after all retries are exhausted (i.e., `next_payment_attempt` is `None`).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/api/auth/workos_auth.py
Line: 137-149

Comment:
**Test mode auth bypass relies on multiple heuristics**

The test mode detection combines three checks (`is_pytest`, `is_dev_env_test`, `is_script_test`) where any one being `True` disables JWT signature verification entirely. While the `sys.argv` check is gated behind `DEV_ENV != "prod"`, the `"pytest" in sys.modules` check has no such guard — if `pytest` is accidentally in `sys.modules` in a production deployment (e.g., a test dependency pulled transitively), authentication is bypassed.

Consider making the bypass strictly depend on a single explicit signal (e.g., only `DEV_ENV == "test"`) rather than ORing multiple heuristics together.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 191

Comment:
**String literal instead of `SubscriptionTier` enum**

This uses the string literal `"plus_tier"` while `checkout.py` uses `SubscriptionTier.PLUS.value`. Both currently resolve to the same string, but if the enum value changes, this file will silently diverge. The same issue occurs at lines 210, 241, and 265.

```suggestion
                        subscription.subscription_tier = SubscriptionTier.PLUS.value
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 22-55

Comment:
**Fallback to secondary webhook secret weakens secret rotation**

When verification with the primary secret fails, the code falls back to the secondary secret. This means that after a secret rotation, an attacker who obtained the old secret can still send valid webhook events until the old secret is fully removed from config. The comment says this "helps when env vars are swapped," but this could be addressed with proper deployment configuration rather than accepting two secrets at once.

If the dual-secret approach is intentional for zero-downtime rotations, consider at minimum logging a warning when the secondary secret is used so operators know the primary needs fixing.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 275-277

Comment:
**Exception detail exposed in webhook error response**

`str(e)` is passed directly as the HTTP response detail, which can leak internal implementation details (stack traces, DB connection strings, etc.) to whoever sends the webhook request.

```suggestion
        logger.error(f"Error processing subscription webhook: {str(e)}")
        raise HTTPException(status_code=500, detail="Internal server error")
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: fbc977a

Comment on lines +96 to +101
subscription.billing_period_start = datetime.fromtimestamp(
invoice.get("period_start"), tz=timezone.utc
)
subscription.billing_period_end = datetime.fromtimestamp(
invoice.get("period_end"), tz=timezone.utc
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datetime.fromtimestamp(None) will crash

invoice.get("period_start") and invoice.get("period_end") can return None if the keys are missing from the Stripe invoice payload. Passing None to datetime.fromtimestamp() raises a TypeError at runtime. The same pattern repeats at lines 193-199 and 212-218 with subscription_data.get("current_period_start") / subscription_data.get("current_period_end").

Consider using bracket access (invoice["period_start"]) to get a clear KeyError pointing at the root cause, or guard with an explicit None check before converting.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 96-101

Comment:
**`datetime.fromtimestamp(None)` will crash**

`invoice.get("period_start")` and `invoice.get("period_end")` can return `None` if the keys are missing from the Stripe invoice payload. Passing `None` to `datetime.fromtimestamp()` raises a `TypeError` at runtime. The same pattern repeats at lines 193-199 and 212-218 with `subscription_data.get("current_period_start")` / `subscription_data.get("current_period_end")`.

Consider using bracket access (`invoice["period_start"]`) to get a clear `KeyError` pointing at the root cause, or guard with an explicit `None` check before converting.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +247 to +265
elif event_type == "invoice.payment_failed":
# Handle payment failure -> auto-downgrade
invoice_obj = event["data"]["object"]
invoice_subscription_id = invoice_obj.get("subscription")

if invoice_subscription_id:
subscription = (
db.query(UserSubscriptions)
.filter(
UserSubscriptions.stripe_subscription_id
== invoice_subscription_id
)
.first()
)

if subscription:
with db_transaction(db):
subscription.is_active = False
subscription.subscription_tier = "free"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Immediate downgrade on first payment failure

On any invoice.payment_failed event, the user is immediately downgraded to the free tier. Stripe typically retries failed payments multiple times (configurable in Stripe dashboard), and each retry emits another invoice.payment_failed event. Downgrading on the first failure means users lose access before Stripe has finished its retry cycle.

Consider checking the invoice's attempt_count or next_payment_attempt field and only downgrading after all retries are exhausted (i.e., next_payment_attempt is None).

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 247-265

Comment:
**Immediate downgrade on first payment failure**

On any `invoice.payment_failed` event, the user is immediately downgraded to the free tier. Stripe typically retries failed payments multiple times (configurable in Stripe dashboard), and each retry emits another `invoice.payment_failed` event. Downgrading on the first failure means users lose access before Stripe has finished its retry cycle.

Consider checking the invoice's `attempt_count` or `next_payment_attempt` field and only downgrading after all retries are exhausted (i.e., `next_payment_attempt` is `None`).

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +137 to +149
# Check if we're in test mode (skip signature verification for tests)
# Detect test mode by checking if pytest is running or if DEV_ENV is explicitly set to "test"
# We also check for 'test' in sys.argv[0] ONLY if we are NOT in production, to avoid security risks
# where a script named "test_something.py" could bypass auth in prod.
is_pytest = "pytest" in sys.modules
is_dev_env_test = global_config.DEV_ENV.lower() == "test"

# Only check sys.argv if we are definitely not in prod
is_script_test = False
if global_config.DEV_ENV.lower() != "prod":
is_script_test = "test" in sys.argv[0].lower()

is_test_mode = is_pytest or is_dev_env_test or is_script_test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test mode auth bypass relies on multiple heuristics

The test mode detection combines three checks (is_pytest, is_dev_env_test, is_script_test) where any one being True disables JWT signature verification entirely. While the sys.argv check is gated behind DEV_ENV != "prod", the "pytest" in sys.modules check has no such guard — if pytest is accidentally in sys.modules in a production deployment (e.g., a test dependency pulled transitively), authentication is bypassed.

Consider making the bypass strictly depend on a single explicit signal (e.g., only DEV_ENV == "test") rather than ORing multiple heuristics together.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/api/auth/workos_auth.py
Line: 137-149

Comment:
**Test mode auth bypass relies on multiple heuristics**

The test mode detection combines three checks (`is_pytest`, `is_dev_env_test`, `is_script_test`) where any one being `True` disables JWT signature verification entirely. While the `sys.argv` check is gated behind `DEV_ENV != "prod"`, the `"pytest" in sys.modules` check has no such guard — if `pytest` is accidentally in `sys.modules` in a production deployment (e.g., a test dependency pulled transitively), authentication is bypassed.

Consider making the bypass strictly depend on a single explicit signal (e.g., only `DEV_ENV == "test"`) rather than ORing multiple heuristics together.

How can I resolve this? If you propose a fix, please make it concise.

subscription.stripe_subscription_id = subscription_id
subscription.stripe_subscription_item_id = subscription_item_id
subscription.is_active = True
subscription.subscription_tier = "plus_tier"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String literal instead of SubscriptionTier enum

This uses the string literal "plus_tier" while checkout.py uses SubscriptionTier.PLUS.value. Both currently resolve to the same string, but if the enum value changes, this file will silently diverge. The same issue occurs at lines 210, 241, and 265.

Suggested change
subscription.subscription_tier = "plus_tier"
subscription.subscription_tier = SubscriptionTier.PLUS.value
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 191

Comment:
**String literal instead of `SubscriptionTier` enum**

This uses the string literal `"plus_tier"` while `checkout.py` uses `SubscriptionTier.PLUS.value`. Both currently resolve to the same string, but if the enum value changes, this file will silently diverge. The same issue occurs at lines 210, 241, and 265.

```suggestion
                        subscription.subscription_tier = SubscriptionTier.PLUS.value
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +22 to +55
def _try_construct_event(payload: bytes, sig_header: str | None) -> dict:
"""
Verify and construct the Stripe event using available secrets.

Uses the environment-appropriate secret first, then falls back to the
alternate secret if verification fails (helps when env vars are swapped).
"""

def _secrets() -> Iterable[str]:
primary = (
global_config.STRIPE_WEBHOOK_SECRET
if global_config.DEV_ENV == "prod"
else global_config.STRIPE_TEST_WEBHOOK_SECRET
)
secondary = (
global_config.STRIPE_TEST_WEBHOOK_SECRET
if global_config.DEV_ENV == "prod"
else global_config.STRIPE_WEBHOOK_SECRET
)
if primary:
yield primary
if secondary and secondary != primary:
yield secondary

if not sig_header:
raise HTTPException(status_code=400, detail="Missing stripe-signature header")

last_error: Exception | None = None
for secret in _secrets():
try:
return stripe.Webhook.construct_event(payload, sig_header, secret)
except Exception as exc: # noqa: B902
last_error = exc
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fallback to secondary webhook secret weakens secret rotation

When verification with the primary secret fails, the code falls back to the secondary secret. This means that after a secret rotation, an attacker who obtained the old secret can still send valid webhook events until the old secret is fully removed from config. The comment says this "helps when env vars are swapped," but this could be addressed with proper deployment configuration rather than accepting two secrets at once.

If the dual-secret approach is intentional for zero-downtime rotations, consider at minimum logging a warning when the secondary secret is used so operators know the primary needs fixing.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 22-55

Comment:
**Fallback to secondary webhook secret weakens secret rotation**

When verification with the primary secret fails, the code falls back to the secondary secret. This means that after a secret rotation, an attacker who obtained the old secret can still send valid webhook events until the old secret is fully removed from config. The comment says this "helps when env vars are swapped," but this could be addressed with proper deployment configuration rather than accepting two secrets at once.

If the dual-secret approach is intentional for zero-downtime rotations, consider at minimum logging a warning when the secondary secret is used so operators know the primary needs fixing.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +275 to +277
except Exception as e:
logger.error(f"Error processing subscription webhook: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exception detail exposed in webhook error response

str(e) is passed directly as the HTTP response detail, which can leak internal implementation details (stack traces, DB connection strings, etc.) to whoever sends the webhook request.

Suggested change
except Exception as e:
logger.error(f"Error processing subscription webhook: {str(e)}")
raise HTTPException(status_code=500, detail=str(e))
logger.error(f"Error processing subscription webhook: {str(e)}")
raise HTTPException(status_code=500, detail="Internal server error")
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/api/routes/payments/webhooks.py
Line: 275-277

Comment:
**Exception detail exposed in webhook error response**

`str(e)` is passed directly as the HTTP response detail, which can leak internal implementation details (stack traces, DB connection strings, etc.) to whoever sends the webhook request.

```suggestion
        logger.error(f"Error processing subscription webhook: {str(e)}")
        raise HTTPException(status_code=500, detail="Internal server error")
```

How can I resolve this? If you propose a fix, please make it concise.

@Miyamura80 Miyamura80 closed this Mar 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants