Individual wheel

Flash Attention for Google Colab

The most painful wheel to compile, now prebuilt and ready to install. Get both flash-attn 2.7.3 and 2.8.3 — guaranteed to work out of the box on A100 and L4 with Colab's current CUDA stack.

$5
One-time purchase
Both versions included
v2.7.3 + v2.8.3 · A100 · L4 · Instant token · No compiling
Pick the version you need.
Both flash-attn 2.7.3 and 2.8.3 are included with your purchase. Install whichever your project requires — or both.

flash-attn 2.8.3

Latest stable release. Full Flash Attention 2 with improved kernel dispatch, better memory efficiency, and support for GQA and MQA patterns. Ideal for newer models like Z-Image, SDXL, and Flux.

A100 (SM80) · L4 (SM89)

flash-attn 2.7.3

Widely-used stable release. Compatible with the broadest range of models and diffusers versions. Perfect for TRELLIS.2, Stable Diffusion, and workflows that pin to 2.7.x.

A100 (SM80) · L4 (SM89)
One line. Sixty seconds.
After purchase, you'll receive a personal token. Use it to install directly in your Colab notebook.

Install v2.8.3 (latest)

# Set your token
import os
os.environ['MISSING_LINK_TOKEN'] = "ml_YOUR_TOKEN"
TOKEN = os.environ['MISSING_LINK_TOKEN']

# Install flash-attn 2.8.3 (IPython substitutes {TOKEN} into the ! command)
!pip install --no-deps "https://{TOKEN}@missinglink.build/wheel/flash_attn-2.8.3-cp312-cp312-linux_x86_64.whl"

Install v2.7.3

# Set your token
import os
os.environ['MISSING_LINK_TOKEN'] = "ml_YOUR_TOKEN"
TOKEN = os.environ['MISSING_LINK_TOKEN']

# Install flash-attn 2.7.3 (IPython substitutes {TOKEN} into the ! command)
!pip install --no-deps "https://{TOKEN}@missinglink.build/wheel/flash_attn-2.7.3-cp312-cp312-linux_x86_64.whl"
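
If you would rather install from plain Python instead of a `!pip` cell (for example inside a setup script), the same step can be sketched roughly as below. The helper names `wheel_url` and `install_flash_attn` are hypothetical and not part of any MissingLink package; the host, path, and wheel filename pattern are copied from the install commands above.

```python
import subprocess
import sys

# Versions and URL layout taken from the install commands on this page.
SUPPORTED_VERSIONS = ("2.7.3", "2.8.3")
WHEEL_BASE = "missinglink.build/wheel"


def wheel_url(token: str, version: str = "2.8.3") -> str:
    """Build the tokenized wheel URL (hypothetical helper, for illustration)."""
    if not token:
        raise ValueError("token must not be empty")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version: {version}")
    filename = f"flash_attn-{version}-cp312-cp312-linux_x86_64.whl"
    return f"https://{token}@{WHEEL_BASE}/{filename}"


def install_flash_attn(token: str, version: str = "2.8.3") -> None:
    """Run the same pip command as the notebook cell, using this interpreter."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "--no-deps", wheel_url(token, version)]
    )
```

Calling `install_flash_attn(os.environ["MISSING_LINK_TOKEN"], "2.7.3")` would then be the equivalent of the v2.7.3 cell.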
Spec        Guaranteed
GPU         A100, L4
Platform    Google Colab (linux x86_64)
Python      3.12
CUDA        12.8
PyTorch     2.10
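
Before installing, it can help to confirm that the current runtime matches the guaranteed spec. The following is a minimal sketch, not an official checker: `find_mismatches` is a hypothetical helper, and the expected values are simply the ones listed in the table above (SM80 for A100, SM89 for L4).

```python
# Expected values copied from the spec table; adjust if the table changes.
EXPECTED_PYTHON = (3, 12)
EXPECTED_CUDA = "12.8"
EXPECTED_TORCH = "2.10"
SUPPORTED_CAPABILITIES = {(8, 0): "A100", (8, 9): "L4"}


def find_mismatches(python, cuda, torch_version, capability):
    """Return human-readable descriptions of every spec mismatch (empty if none)."""
    problems = []
    if tuple(python[:2]) != EXPECTED_PYTHON:
        problems.append(f"Python {python[0]}.{python[1]} (expected 3.12)")
    if cuda != EXPECTED_CUDA:
        problems.append(f"CUDA {cuda} (expected {EXPECTED_CUDA})")
    # Accept "2.10" and builds such as "2.10.0+cu128", but not "2.1.0".
    if not (torch_version == EXPECTED_TORCH or torch_version.startswith(EXPECTED_TORCH + ".")):
        problems.append(f"PyTorch {torch_version} (expected {EXPECTED_TORCH}.x)")
    if tuple(capability) not in SUPPORTED_CAPABILITIES:
        problems.append(f"GPU compute capability {capability} (expected SM80 or SM89)")
    return problems


def check_current_runtime():
    """Gather values from the live runtime; needs torch and a CUDA GPU."""
    import sys
    import torch

    return find_mismatches(
        sys.version_info,
        torch.version.cuda,
        torch.__version__,
        torch.cuda.get_device_capability(),
    )
```

In a Colab cell, `print(check_current_runtime() or "Runtime matches the spec")` gives a quick answer; a T4 runtime, for instance, would be reported as an unsupported compute capability.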
🖼️ Featured notebook

Try it with Z-Image Turbo

State-of-the-art text-to-image generation. Flash Attention 2.8.3 powers the efficient inference — generate 1024×1024 images in seconds on an L4.

🖼️ Text → Image
~5s on L4
🔬 Z-Image
Open Z-Image Turbo in Colab · Buy Flash Attention: $5
Why not just compile it yourself?
You can try. But flash-attn is notoriously one of the hardest CUDA packages to build from source on Colab.

Compiling from source

Each build burns 30–90 minutes of GPU time. It requires an exactly matching CUDA toolkit, torch version, and gcc, and it frequently fails with cryptic errors. Every new Colab session starts over.

MissingLink wheel

Installs in under 60 seconds. Prebuilt against Colab's exact CUDA 12.8 + PyTorch 2.10 stack. Works every time. One pip install, zero config.
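
To confirm that the wheel actually loads and runs on your GPU, a short smoke test along these lines may be useful. This is a sketch under assumptions: `smoke_test` is a hypothetical helper, the tensor sizes are arbitrary, and it relies on `flash_attn_func` from the flash-attn package taking half-precision tensors shaped (batch, seqlen, heads, head_dim).

```python
def smoke_test() -> str:
    """Run one tiny attention call and report a short status string."""
    try:
        import torch
    except ImportError:
        return "torch-missing"
    if not torch.cuda.is_available():
        return "no-gpu"
    try:
        from flash_attn import flash_attn_func
    except ImportError:
        return "flash-attn-missing"

    # Arbitrary small shapes: batch=1, seqlen=128, heads=8, head_dim=64.
    q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
    k = torch.randn_like(q)
    v = torch.randn_like(q)
    out = flash_attn_func(q, k, v, causal=True)
    return "ok" if out.shape == q.shape else "unexpected-shape"
```

Running `print(smoke_test())` right after the install cell should print `ok` on a supported runtime; any other status points at what is missing.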

Skip the build.
Ship faster.

Flash Attention 2.7.3 and 2.8.3, prebuilt for Colab's A100 and L4 GPUs. Five dollars. Zero compiling.

One-time payment via Stripe · No account required · Instant token delivery