A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/Pallas/JAX).
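This repository's API isn't shown on this page, so as a generic reference only: any FlashAttention kernel must produce the same result as plain scaled dot-product attention, which in JAX looks like the sketch below (names here are illustrative, not the repo's entry points).

```python
import jax
import jax.numpy as jnp

def reference_attention(q, k, v):
    """Naive scaled dot-product attention. FlashAttention computes the
    same output without materializing the full (seq, seq) score matrix."""
    scale = 1.0 / jnp.sqrt(q.shape[-1])
    scores = jnp.einsum("...qd,...kd->...qk", q, k) * scale
    weights = jax.nn.softmax(scores, axis=-1)
    return jnp.einsum("...qk,...kd->...qd", weights, v)

q = k = v = jnp.ones((2, 4, 128, 64))  # (batch, heads, seq, head_dim)
print(reference_attention(q, k, v).shape)  # (2, 4, 128, 64)
```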
A FlashAttention backwards-over-backwards ⚡🔙🔙
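"Backwards-over-backwards" means differentiating through the backward pass itself, as needed for second-order methods. The repo's kernels aren't reproduced here; in plain JAX the pattern is simply nested jax.grad, sketched below with a toy dense attention in place of a fused kernel.

```python
import jax
import jax.numpy as jnp

def attn_loss(q, k, v):
    # Toy dense attention; a FlashAttention kernel avoids the full matrix.
    w = jax.nn.softmax(q @ k.T / jnp.sqrt(q.shape[-1]), axis=-1)
    return jnp.sum(w @ v)

q = jnp.ones((8, 16))
k = jnp.ones((8, 16))
v = jnp.ones((8, 16))

# First-order: the ordinary backward pass.
dq = jax.grad(attn_loss)(q, k, v)
# Backward-over-backward: differentiate a function of the gradient itself.
ddq = jax.grad(lambda q_: jnp.sum(jax.grad(attn_loss)(q_, k, v) ** 2))(q)
print(dq.shape, ddq.shape)
```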
Calculate the hash of any input for ZK-Friendly hashes (MiMC & Poseidon) over a variety of Elliptic Curves.
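MiMC's round function is just a field addition followed by a small power map, which is what makes it ZK-friendly. As an illustrative sketch only: the field below is the BN254 scalar field (a common MiMC choice; this repo targets other curves as well), and the round constants are omitted to keep the sketch short, which makes it insecure as written.

```python
# Illustrative MiMC-style permutation with exponent 7.
# BN254 scalar field modulus; swap in the field of your target curve.
P = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def mimc_hash(x: int, key: int, rounds: int = 91) -> int:
    # Real deployments add distinct per-round constants derived from a
    # public seed; they are omitted here (do NOT use this for security).
    for _ in range(rounds):
        x = pow((x + key) % P, 7, P)
    return (x + key) % P

print(hex(mimc_hash(1234, 5678)))
```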
Efficient transformer architecture with Packet-Switched Attention (PSA). Protocol C, "The Blink Protocol," achieves a 6-25x speedup via 2-bit semantic routing: models learn to ignore structural noise. Includes legacy 1-2 bit activation compression for long context on commodity GPUs/TPUs. Built for resource-constrained researchers. Apache 2.0.
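The PSA routing protocol itself isn't specified in this description. Purely as a generic sketch of the kind of 2-bit activation packing it mentions (function names and byte layout are hypothetical, not the repo's format):

```python
import numpy as np

def pack_2bit(codes: np.ndarray) -> np.ndarray:
    """Pack values in {0, 1, 2, 3} four-per-byte, lowest bits first."""
    codes = codes.astype(np.uint8).reshape(-1, 4)
    return codes[:, 0] | codes[:, 1] << 2 | codes[:, 2] << 4 | codes[:, 3] << 6

def unpack_2bit(packed: np.ndarray) -> np.ndarray:
    """Invert pack_2bit: recover four 2-bit codes from each byte."""
    return np.stack([(packed >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1).reshape(-1)

acts = np.array([3, 0, 1, 2, 2, 2, 0, 1], dtype=np.uint8)
assert np.array_equal(unpack_2bit(pack_2bit(acts)), acts)  # 4x smaller than uint8
```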
Repository holding the core components used to build a Pallas Systems website.
Benchmarking the JAX Pallas implementation of a custom RNN against alternatives
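For context on what a Pallas kernel looks like, here is a minimal element-wise example (a from-scratch sketch, not this repo's RNN kernel). Inside the kernel, inputs and outputs are refs indexed like arrays; `interpret=True` runs the kernel in interpreter mode, e.g. for CPU testing.

```python
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_kernel(x_ref, y_ref, o_ref):
    # Refs are read and written with array-style indexing.
    o_ref[...] = x_ref[...] + y_ref[...]

@jax.jit
def add(x, y):
    return pl.pallas_call(
        add_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
        interpret=True,  # drop this to compile for GPU/TPU
    )(x, y)

x = jnp.arange(8, dtype=jnp.float32)
print(add(x, x))
```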
SuperNova (Pasta) proof generator & verifier with CI and frozen fixtures