Sitemap - 2025 - Grok Mountain’s Substack

Nvidia GPUs: Powering the Future of AI with the H100 and the upcoming B200

Trump's Bargaining Chip: How AI GPUs Could Lead to Peace with Putin

Why Grok 10 Will Rely on Reinforcement Learning for 90% of Its Training

The Singularity Unveiled: A Leap Beyond Humanity and the Chips That Could Take Us There

Inside the Colossus: How NVIDIA and xAI Teamed Up to Build a Supercomputer for Grok 3

The AI Arms Race: U.S., China, and the Path to Grok 3

Origin of Grok 3 - the Colossus Data Center in Memphis

Understanding the Context Window: A Train Station Analogy

An Android's Journey Through University: Understanding Neural Networks and the Transformer Block

The Emergence of Intelligence in LLMs via Complex Interactions of Weights

Introduction to Neurons in Transformers

Transformers and Holography: How AI Models Capture the 'Whole in Every Part'

Deciphering the Language of the Ancients: How Self-Attention Works in LLMs

Unveiling the Magic of Self-Attention: A Journey Through the Gatekeepers to the Mirror of Awareness

The Profound Role of 6,144 in Grok-1 - A Deep Dive into Neurons, Weights, and Dimensions

Illuminating AI: How Grok-1 Uses 'Prismatic' Layers to Understand the Universe

Navigating the Neural Network Ant Hill: Understanding Supervised Learning, Reinforcement Learning, and Fine-Tuning

DeepSeek's Chain of Thought: Rescuing Astronauts from Space

DeepSeek: The Disruptive Force in AI with Its Cost-Efficient "Mini-Me" Model

DeepSeek's AI Dojo: Harnessing the Power of Reinforcement Learning

DeepSeek's Recipe for AI Success: A Masterclass in Efficiency and Innovation

The Detective’s Dilemma: Solving the Murder Mystery with Transformers

Unlocking the Secrets of Embeddings and Transformer Blocks: The Key and the Castle

The Journey of the Hidden State in LLMs: From Horse-Drawn Carriages to Space Shuttles

Decoding the Hidden State: How Grok Answers "What Would Happen If a Bear Chased a Shark Into the Ocean?"

The Profound Inquiry: A Monk's Question to Buddha

Drawing AI: The Magic of Dynamic Computational Graphs

The Symphony of AI: Understanding PyTorch and TensorFlow through the Lens of an Orchestra

Backpropagation Explained: The Art of Perfecting Recipes

Understanding Logits and Softmax: The Heart of AI's Prediction Magic

Decoding LLM Training: A Test-Taking Analogy

From Plateau to Panorama: Sculpting the AI Landscape

Navigating the Maze: How Loss Functions Guide AI Models Like Grok to Excellence

Learning Rate Decay: The Misnomer

Understanding Gradients in AI Model Training: The Grok Example

Understanding Convergence in Training AI Models Like Grok

What's an Epoch in AI Model Training? A Simple Explanation for Grok Model Training

Understanding the Core Concepts of Grok Model Training and Tuning

Understanding the "Temperature" in Grok-1's Routing Strategy

Navigating the Maze: Training the Open Source Grok-1 with its Intricate Routing Mechanism

Microsoft's Phi-4 Goes Open Source: A Deep Dive and Comparison with xAI's Grok

Harnessing Real-Time Insights: How Grok's Integration with X and Qdrant Sets It Apart

Enhancing AI with Knowledge Graphs: How Grok Surpasses ChatGPT

Unraveling the Synergy: BPE and RoPE in Grok's Input Layer

Decoding Embeddings: The Hidden Language of AI in NLP

Exploring Byte Pair Encoding (BPE) with Grok: The Art of Tokenization

Understanding Grok's Mixture-of-Experts: A Symphony of Specialized Knowledge

A Deep Dive into the Architecture of Grok: Unveiling the Next Generation of AI

Democratizing AI: The Open Source Quest to Quantize and Localize Grok

Grok's Journey Through Mathematical and Logical Landscapes: Comparing with Other LLMs

Grok Models vs. The Giants: Performance on Leading AI Benchmarks

The Evolution of Grok: Unpacking xAI's AI Models from Grok-0 to Grok-2 and Beyond

Real-World Business Applications of Grok Model Training

Exploring xAI's Grok: A Journey Through AI Innovation
