TurboQuant Animated: Watch Vector Quantization Happen
Interactive 2D and 3D animations showing every step of TurboQuant: normalize, rotate, quantize, reconstruct. Add your own points and see the compression error in real time.
TurboQuant compresses vectors by rotating them into a predictable distribution, then snapping to a precomputed grid. Step through the animation below to see every stage. Hover any point for exact coordinates. On step 0, click to add or remove points.
Use the arrow keys or spacebar to step through the animation. Try different bit-widths to see how accuracy changes. In 3D, drag to orbit.
How TurboQuant Works
The goal: compress a vector (a list of numbers) using very few bits per number, then reconstruct it with minimal error. TurboQuant does this without ever looking at the data distribution.
Step 1: Normalize. Store the vector's length (its norm) as a single number, then divide the vector by that length so it sits on the unit circle (2D) or unit sphere (3D+). This separates "how long" from "which direction."
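The normalize step is a one-liner; a minimal sketch in NumPy (the function name is just for illustration):

```python
import numpy as np

def normalize(x):
    # Separate magnitude from direction: keep the norm as one number,
    # and quantize only the unit-length direction vector.
    norm = np.linalg.norm(x)
    return x / norm, norm

u, n = normalize(np.array([3.0, 4.0]))
# u = [0.6, 0.8] (unit length), n = 5.0
```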
Step 2: Rotate. Multiply by a fixed random rotation matrix (generated once from a seed). This scrambles the coordinates so that every vector, regardless of its original shape, ends up with the same predictable coordinate distribution. The rotation is shared between encoder and decoder, so it costs nothing to transmit.
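One common way to build such a seeded rotation (a sketch, not necessarily the paper's construction) is to orthogonalize a Gaussian matrix with a QR decomposition:

```python
import numpy as np

def random_rotation(d, seed=0):
    # QR-orthogonalize a seeded Gaussian matrix to get a random
    # rotation; the shared seed lets encoder and decoder agree for free.
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))  # sign fix makes Q deterministic

R = random_rotation(3, seed=42)
# R is orthogonal: R.T @ R is the identity, so the transpose
# undoes the rotation exactly (used later in reconstruction).
```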
Step 3: Quantize. Because the rotation gives every coordinate a known distribution, we can precompute the optimal grid placement for that distribution. At b bits, each coordinate is rounded to one of 2^b grid values and stored as a small index. This is the only lossy step.
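A sketch of the snapping step. Note the assumptions: after rotation, each coordinate of a unit vector is roughly N(0, 1/d), and a uniform grid over a few standard deviations is a simple stand-in for the paper's optimal placement; `make_grid` and `quantize` are illustrative names, not the paper's API:

```python
import numpy as np

def make_grid(b, d, clip=3.0):
    # Stand-in grid: 2**b evenly spaced levels spanning +/- clip standard
    # deviations of the coordinate distribution N(0, 1/d). (The paper
    # derives an optimal placement; uniform keeps the sketch simple.)
    sigma = 1.0 / np.sqrt(d)
    return np.linspace(-clip * sigma, clip * sigma, 2 ** b)

def quantize(v, grid):
    # Snap each coordinate to its nearest grid level; only these small
    # integer indices are stored.
    return np.abs(v[:, None] - grid[None, :]).argmin(axis=1)

grid = make_grid(b=3, d=2)                   # 8 levels for a 2D toy
idx = quantize(np.array([0.6, -0.8]), grid)  # two 3-bit indices
```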
Step 4: Store. One b-bit index per coordinate, plus one float for the norm. At 3 bits with 128 dimensions: 128 × 3 = 384 bits of indices + 16 bits for the norm = 400 bits = 50 bytes, down from 256 bytes for the original 16-bit floats, roughly a 5x compression.
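The bit accounting, worked out explicitly:

```python
# Bit accounting for the example above: 128 dimensions at 3 bits each.
d, b = 128, 3
index_bits = d * b                       # 384 bits of grid indices
norm_bits = 16                           # one float16 for the norm
compressed_bytes = (index_bits + norm_bits) / 8
original_bytes = d * 16 / 8              # the float16 input vector
ratio = original_bytes / compressed_bytes
print(compressed_bytes, original_bytes, ratio)  # 50.0 256.0 5.12
```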
Step 5: Dequantize. To reconstruct, look up the grid values from the stored indices, then multiply by the transpose of the rotation matrix; since the matrix is orthogonal, the transpose undoes the rotation exactly. The result is an approximation of the original unit vector.
Step 6: Rescale. Scale the unit-length approximation by the stored norm to restore the original magnitude. The result is the reconstructed vector. The only error comes from step 3, the grid snapping.
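Putting all the steps together, here is an end-to-end round trip. This is a simplified sketch under the same assumptions as above (QR-based rotation, uniform stand-in grid), not the paper's implementation:

```python
import numpy as np

def turboquant_roundtrip(x, b=3, seed=0):
    # normalize -> rotate -> snap to grid -> un-rotate -> rescale
    d = len(x)
    norm = np.linalg.norm(x)
    u = x / norm                             # unit-length direction
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    R = q * np.sign(np.diag(r))              # fixed seeded rotation
    v = R @ u                                # coords now ~ N(0, 1/d)
    grid = np.linspace(-3 / np.sqrt(d), 3 / np.sqrt(d), 2 ** b)
    idx = np.abs(v[:, None] - grid[None, :]).argmin(axis=1)
    v_hat = grid[idx]                        # the only lossy step
    x_hat = norm * (R.T @ v_hat)             # un-rotate, rescale
    return x_hat, idx, norm

rng = np.random.default_rng(1)
x = rng.standard_normal(128)
x_hat, idx, norm = turboquant_roundtrip(x, b=3)
rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
```

At 3 bits the relative reconstruction error of this simplified sketch lands in the range of a few tens of percent; the animation's error readout shows the same quantity per point.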
Intuition: The random rotation turns any input vector into one with a known, predictable coordinate distribution. This means a single precomputed grid (designed for that distribution) is near-optimal for every possible input. No training data needed, no codebook to learn. The grid is fixed and universal. The error is provably within 2.72x of the best any method could achieve, even one with perfect knowledge of the data.
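A quick way to check the "predictable distribution" claim numerically (a sketch assuming the QR-based rotation above): even the spikiest possible unit vector spreads into uniformly small coordinates after rotation.

```python
import numpy as np

# All of the vector's mass starts in one coordinate; after a random
# rotation every coordinate is small, with empirical std matching the
# predicted 1/sqrt(d). The precomputed grid is designed for exactly
# this distribution, which is why no training data is needed.
d = 1024
rng = np.random.default_rng(0)
q, r = np.linalg.qr(rng.standard_normal((d, d)))
R = q * np.sign(np.diag(r))
spike = np.zeros(d)
spike[0] = 1.0                  # maximally concentrated unit vector
v = R @ spike
print(v.std(), 1 / np.sqrt(d))  # both close to 0.031
```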
For the full technical explanation with equations, proofs, and PyTorch pseudocode, see the companion post: TurboQuant: Near-Optimal Vector Quantization Without Looking at Your Data.
Paper: TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate (Zandieh, Daliri, Hadian, Mirrokni, 2025).