Blog
Machine Learning and Recommender Systems
-
Training MoE Right: Making Every Expert Count
The techniques that prevent expert collapse in Mixture-of-Experts LLMs: load balancing losses, routing strategies, shared experts, auxiliary-loss-free methods, and fine-grained expert segmentation.
-
Feel the AGI: Supervised Fine-Tuning in Your Browser
Fine-tune a 14M parameter language model in your browser. Load Pythia-14M, train on instruction-completion pairs, and watch it learn.
-
Scaling Laws for LLMs: The 3 Knobs You Actually Have
Scaling is not "bigger model." It's a budget allocation problem across parameters, tokens, and compute. A first-principles guide to what scaling laws actually say.
-
WTF Is Happening Inside a Transformer (Linear Algebra Edition)
An intuition-first guide to what transformers actually compute. Q, K, V demystified, attention as a data-dependent mixing matrix, and the MLP's expand-activate-compress pattern.
-
DeepSeek's Technical Playbook: From MLA to Conditional Memory
A deep dive into DeepSeek's key innovations: Multi-head Latent Attention, sparse MoE, sparse attention, scalable RL, and the Engram conditional memory architecture.
-
Reward Modeling and DPO: Learning What "Good" Means
How reward models turn human preferences into training signal, and how DPO skips the reward model entirely. Bradley-Terry, preference data, and offline alignment explained.
-
Positional Encodings for LLMs: From Sinusoidal to RoPE
How transformers understand token order. An intuition-first guide covering sinusoidal positional encodings and Rotary Position Embeddings (RoPE).
-
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
How DeepSeek uses idle decode-side NICs to double KV-Cache loading throughput in prefill-decode disaggregated serving.
-
Reinforcement Learning for LLMs
An intuition-first guide to the RL concepts behind RLHF, PPO, and GRPO — the background you need before diving into alignment algorithms.
-
PPO & GRPO for LLM Alignment
A first-principles guide to PPO and GRPO for LLM alignment, for ML engineers with minimal RL background.
-
Hashing for Large-Scale Similarity
-
Implementing Matrix Factorisation Using TensorFlow
My Quora response
-
How exactly is machine learning used in recommendation engines?
My Quora response