@kamath kamath commented Jan 12, 2026

Summary

Eliminates the sharp native dependency by replacing it with browser-native alternatives for image processing. The screenshot tool now uses PNG header parsing for dimension extraction and OffscreenCanvas with createImageBitmap for GPU-accelerated resizing, working with raw binary data to bypass CSP restrictions.

Changes

  • PNG parsing: Extract image dimensions from PNG IHDR chunk (no dependency needed)
  • Canvas resize: Use OffscreenCanvas and createImageBitmap for GPU-accelerated resizing
  • CSP-safe: Binary data operations bypass img-src CSP restrictions
  • Dimensions in output: Screenshot response now includes final image dimensions
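As a rough illustration of the PNG parsing bullet above, the sketch below reads width and height from the first 24 bytes of a base64-encoded PNG: an 8-byte signature, a 4-byte chunk length, the 4-byte `IHDR` type, then two big-endian 32-bit integers. The name `parsePngDimensions` is borrowed from the sequence diagram in this thread; the body and the `PngDimensions` type are assumptions, not the code in `src/tools/screenshot.ts`.

```typescript
// Hypothetical result type for this sketch.
interface PngDimensions {
  width: number;
  height: number;
}

// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function parsePngDimensions(base64Png: string): PngDimensions {
  // 32 base64 characters decode to exactly 24 bytes, which is all we need.
  const header = atob(base64Png.slice(0, 32));
  if (header.length < 24) {
    throw new Error("Data too short to contain a PNG IHDR chunk");
  }

  // Each character of the atob() result holds one byte (0-255).
  const bytes = new Uint8Array(24);
  for (let i = 0; i < 24; i++) {
    bytes[i] = header.charCodeAt(i);
  }

  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (bytes[i] !== PNG_SIGNATURE[i]) {
      throw new Error("Not a PNG: signature mismatch");
    }
  }

  // Bytes 12..15 hold the chunk type, which must be "IHDR" for the first chunk.
  const chunkType = String.fromCharCode(bytes[12], bytes[13], bytes[14], bytes[15]);
  if (chunkType !== "IHDR") {
    throw new Error("Not a PNG: first chunk is not IHDR");
  }

  // Width and height are big-endian unsigned 32-bit integers at offsets 16 and 20.
  const view = new DataView(bytes.buffer);
  return { width: view.getUint32(16, false), height: view.getUint32(20, false) };
}
```

Decoding only the first 32 base64 characters keeps the cost constant regardless of screenshot size.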

Testing

  • Build succeeds with no TypeScript errors
  • Screenshot tool resizes images for Claude vision API constraints (1568px max edge, 1.15MP max)
  • Tested on Chromium (OffscreenCanvas supported since v69)
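The size constraints in the Testing list can be turned into a single shrink factor. The sketch below is one way to do that, assuming "1.15MP" means 1,150,000 pixels; `computeTargetSize` and the constant names are hypothetical and not taken from the PR.

```typescript
// Limits quoted in the PR description; the exact pixel count is an assumption.
const MAX_EDGE_PX = 1568;
const MAX_PIXELS = 1.15e6;

interface TargetSize {
  shrink: number; // 1 means no resize is needed
  width: number;
  height: number;
}

function computeTargetSize(width: number, height: number): TargetSize {
  // Factor needed to bring the longest edge under the edge limit.
  const edgeFactor = MAX_EDGE_PX / Math.max(width, height);
  // Factor needed to bring the total area under the pixel limit.
  // Area scales with the square of the factor, hence the square root.
  const areaFactor = Math.sqrt(MAX_PIXELS / (width * height));
  // Never upscale: cap the factor at 1.
  const shrink = Math.min(1, edgeFactor, areaFactor);

  return {
    shrink,
    // Round down so the result cannot exceed either limit; keep at least 1px.
    width: Math.max(1, Math.floor(width * shrink)),
    height: Math.max(1, Math.floor(height * shrink)),
  };
}
```

With these limits a 1920x1080 capture is bound by the pixel budget rather than the edge length, so both checks are needed.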

🤖 Generated with Claude Code

@kamath kamath marked this pull request as draft January 12, 2026 06:45

greptile-apps bot commented Jan 12, 2026

Greptile Overview

Greptile Summary

This PR removes the sharp native dependency and replaces it with browser-native image processing. The implementation adds custom PNG header parsing to extract dimensions and uses OffscreenCanvas with createImageBitmap for GPU-accelerated resizing.

Key changes:

  • PNG dimension extraction using Buffer operations on the IHDR chunk
  • Browser-based image resizing via page.evaluate() with OffscreenCanvas APIs
  • Binary data handling using atob/btoa for base64 conversions
  • Added dimension output in screenshot response text

Approach: The resizing now executes in the browser context rather than in Node.js, which aligns with the CSP-safe design goal. The PNG parsing correctly reads the first 24 bytes to extract width/height from the IHDR chunk structure.
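To make the browser-side approach concrete, here is a minimal sketch of a resize function built on `createImageBitmap` and `OffscreenCanvas`. It is meant to run in a browser context (for example as the body passed to `page.evaluate()`), not in Node.js. The name `resizePngInBrowser` and its parameters are hypothetical, and the real implementation in the PR may differ.

```typescript
// Hypothetical helper: takes a base64 PNG and returns a resized base64 PNG.
// Requires browser globals (Blob, createImageBitmap, OffscreenCanvas).
async function resizePngInBrowser(
  base64Png: string,
  targetWidth: number,
  targetHeight: number,
): Promise<string> {
  // Decode base64 into raw bytes; each character holds one byte (0-255).
  const binaryString = atob(base64Png);
  const bytes = new Uint8Array(binaryString.length);
  for (let i = 0; i < binaryString.length; i++) {
    bytes[i] = binaryString.charCodeAt(i);
  }

  // Decode from a Blob instead of loading an image URL into an <img> element.
  const bitmap = await createImageBitmap(new Blob([bytes], { type: "image/png" }));

  const canvas = new OffscreenCanvas(targetWidth, targetHeight);
  const ctx = canvas.getContext("2d");
  if (!ctx) {
    throw new Error("2D context unavailable on OffscreenCanvas");
  }
  // Drawing into a smaller canvas performs the actual downscale.
  ctx.drawImage(bitmap, 0, 0, targetWidth, targetHeight);
  bitmap.close();

  const blob = await canvas.convertToBlob({ type: "image/png" });
  const resultBytes = new Uint8Array(await blob.arrayBuffer());

  // Re-encode as base64, one byte per character.
  let binary = "";
  for (let i = 0; i < resultBytes.length; i++) {
    binary += String.fromCharCode(resultBytes[i]);
  }
  return btoa(binary);
}
```

Passing and returning base64 strings keeps the arguments serializable, which matters when the function is invoked across the Node.js/browser boundary.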

Confidence Score: 4/5

  • Safe to merge with minor performance considerations for large screenshots
  • The implementation correctly replaces sharp with browser-native alternatives. PNG parsing logic is accurate (24 bytes for signature + IHDR dimensions). Binary data handling with atob/btoa and charCodeAt/fromCharCode follows JavaScript standards. The main consideration is that string concatenation in loops (lines 57-59, 81-84) is O(n) per iteration for building strings, which could be slower for very large screenshots (MB+), but will work correctly. The approach is architecturally sound and removes the problematic native dependency.
  • src/tools/screenshot.ts - while functionally correct, the string building approach could be optimized for better performance with large images

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| package.json | 5/5 | Removed sharp dependency from dependencies list - straightforward removal |
| pnpm-lock.yaml | 5/5 | Lock file updated to reflect sharp removal and transitive dependency cleanup |
| src/tools/screenshot.ts | 2/5 | Replaced sharp with custom PNG parsing and browser-based resizing; contains critical logic bug in string concatenation for binary data |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Handler as handleScreenshot
    participant CDP as Chrome DevTools Protocol
    participant Parser as parsePngDimensions
    participant Browser as resizeImageInBrowser
    participant Canvas as OffscreenCanvas

    Client->>Handler: screenshot request
    Handler->>CDP: Page.captureScreenshot (PNG, base64)
    CDP-->>Handler: base64 screenshot data
    Handler->>Parser: parse PNG dimensions
    Parser->>Parser: decode first 24 bytes
    Parser->>Parser: validate PNG signature
    Parser->>Parser: extract width/height from IHDR
    Parser-->>Handler: {width, height}
    Handler->>Handler: calculate shrink factor
    alt shrink < 1 (needs resize)
        Handler->>Browser: page.evaluate with base64 data
        Browser->>Browser: atob to decode base64
        Browser->>Browser: create Blob from bytes
        Browser->>Canvas: createImageBitmap from Blob
        Canvas-->>Browser: ImageBitmap
        Browser->>Canvas: OffscreenCanvas.drawImage
        Browser->>Canvas: convertToBlob (PNG)
        Browser->>Browser: btoa to encode as base64
        Browser-->>Handler: resized base64 data
    end
    Handler->>Handler: register screenshot resource
    Handler->>Client: return screenshot with dimensions

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 2 comments

Comment on lines +81 to +90
let binary = "";
for (let i = 0; i < resultBytes.length; i++) {
  binary += String.fromCharCode(resultBytes[i]);
}

[P0] String concatenation for binary data will corrupt the image. In JavaScript, strings are UTF-16 encoded and string concatenation with String.fromCharCode() can mangle bytes > 127. Large images (hundreds of KB) will trigger this, resulting in corrupted/invalid PNG data returned to the caller.

Use TextDecoder and proper binary handling instead:

Suggested change
- let binary = "";
- for (let i = 0; i < resultBytes.length; i++) {
-   binary += String.fromCharCode(resultBytes[i]);
- }
+ const resultBytes = new Uint8Array(arrayBuffer);
+ return btoa(String.fromCharCode(...resultBytes));
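For anyone weighing this suggestion, a small sketch that checks the byte-to-base64 round trip may help. As far as the standard behaviour of `atob`/`btoa` goes, each character of a binary string carries exactly one byte (0-255), so `String.fromCharCode` on values from a `Uint8Array` and `charCodeAt` on an `atob` result should be lossless, including for bytes above 127. Separately, spreading a very large array into a single `String.fromCharCode(...)` call can exceed an engine's argument limit and throw a `RangeError`, so the sketch converts in chunks. The helper names and the chunk size are assumptions for illustration.

```typescript
// Hypothetical helpers; chunk size chosen to stay well below typical argument limits.
const CHUNK_SIZE = 0x8000;

function bytesToBase64(bytes: Uint8Array): string {
  const parts: string[] = [];
  for (let i = 0; i < bytes.length; i += CHUNK_SIZE) {
    // Spread only one bounded slice at a time instead of the whole array.
    parts.push(String.fromCharCode(...bytes.subarray(i, i + CHUNK_SIZE)));
  }
  // Every character is in the 0-255 range, which is what btoa accepts.
  return btoa(parts.join(""));
}

function base64ToBytes(base64: string): Uint8Array {
  const binaryString = atob(base64);
  const bytes = new Uint8Array(binaryString.length);
  for (let i = 0; i < binaryString.length; i++) {
    // charCodeAt returns the original byte value for each character.
    bytes[i] = binaryString.charCodeAt(i);
  }
  return bytes;
}
```

Running a buffer that contains every byte value through both helpers and comparing the result is a cheap way to confirm or rule out the corruption concern before changing the loop.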

Comment on lines +55 to +65
const binaryString = atob(data);
const bytes = new Uint8Array(binaryString.length);
for (let i = 0; i < binaryString.length; i++) {
  bytes[i] = binaryString.charCodeAt(i);
}

[P0] Same binary corruption issue. String concatenation in a loop will corrupt bytes > 127 when decoding base64. For screenshots that are several hundred KB or more, this loop builds corrupted binary data that will fail bitmap creation or produce incorrect images.

Fix with spread operator:

Suggested change
- const binaryString = atob(data);
- const bytes = new Uint8Array(binaryString.length);
- for (let i = 0; i < binaryString.length; i++) {
-   bytes[i] = binaryString.charCodeAt(i);
- }
+ const binaryString = atob(data);
+ const bytes = new Uint8Array(binaryString.length);
+ for (let i = 0; i < binaryString.length; i++) {
+   bytes[i] = binaryString.charCodeAt(i);
+ }

kamath and others added 7 commits January 16, 2026 17:10
Replaces the native sharp dependency with PNG header parsing and OffscreenCanvas-based image resizing. This approach:
- Uses PNG IHDR chunk parsing for dimension extraction (no dependencies)
- Leverages OffscreenCanvas and createImageBitmap for GPU-accelerated resizing
- Bypasses CSP restrictions by working with raw binary data instead of URLs
- Returns screenshot dimensions in the MCP response

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

- Replace Buffer.from/readUInt32BE with atob/Uint8Array/DataView in screenshot.ts
- Replace fs.readFileSync with ESM JSON import in program.ts
- Remove fs, path, and fileURLToPath imports

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

- Create src/local.ts that imports dotenv and re-exports from index
- Remove dotenv import from src/index.ts (Smithery entry point)
- Update src/program.ts to import from local.js for CLI usage

This keeps src/index.ts clean for Smithery's serverless environment
while preserving dotenv functionality for local development.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

- Remove dotenv from production dependencies
- Add @dotenvx/dotenvx as dev dependency for local development
- Delete src/local.ts (no longer needed)
- Update program.ts to import directly from index.js

This eliminates dotenv from the bundle entirely, making the codebase
fully V8-compatible for Smithery deployment.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@kamath kamath force-pushed the kamath/remove-sharp-dep branch from 85667aa to c1f6df8 Compare January 16, 2026 22:21