Sitemap - 2025 - Grok Mountain’s Substack
Nvidia GPUs: Powering the Future of AI with the H100 and the upcoming B200
Trump's Bargaining Chip: How AI GPUs Could Lead to Peace with Putin
Why Grok 10 Will Rely on Reinforcement Learning for 90% of Its Training
The Singularity Unveiled: A Leap Beyond Humanity and the Chips That Could Take Us There
Inside the Colossus: How NVIDIA and xAI Teamed Up to Build a Supercomputer for Grok 3
The AI Arms Race: U.S., China, and the Path to Grok 3
Origin of Grok 3 - the Colossus Data Center in Memphis
Understanding the Context Window: A Train Station Analogy
An Android's Journey Through University: Understanding Neural Networks and the Transformer Block
The Emergence of Intelligence in LLMs via Complex Interactions of Weights
Introduction to Neurons in Transformers
Transformers and Holography: How AI Models Capture the 'Whole in Every Part'
Deciphering the Language of the Ancients: How Self-Attention Works in LLMs
Unveiling the Magic of Self-Attention: A Journey Through the Gatekeepers to the Mirror of Awareness
The Profound Role of 6,144 in Grok-1 - A Deep Dive into Neurons, Weights, and Dimensions
Illuminating AI: How Grok-1 Uses 'Prismatic' Layers to Understand the Universe
DeepSeek's Chain of Thought: Rescuing Astronauts from Space
DeepSeek: The Disruptive Force in AI with Its Cost-Efficient "Mini-Me" Model
DeepSeek's AI Dojo: Harnessing the Power of Reinforcement Learning
DeepSeek's Recipe for AI Success: A Masterclass in Efficiency and Innovation
The Detective’s Dilemma: Solving the Murder Mystery with Transformers
Unlocking the Secrets of Embeddings and Transformer Blocks: The Key and the Castle
The Journey of the Hidden State in LLMs: From Horse-Drawn Carriages to Space Shuttles
The Profound Inquiry: A Monk's Question to Buddha
Drawing AI: The Magic of Dynamic Computational Graphs
The Symphony of AI: Understanding PyTorch and TensorFlow through the Lens of an Orchestra
Backpropagation Explained: The Art of Perfecting Recipes
Understanding Logits and Softmax: The Heart of AI's Prediction Magic
Decoding LLM Training: A Test-Taking Analogy
From Plateau to Panorama: Sculpting the AI Landscape
Navigating the Maze: How Loss Functions Guide AI Models Like Grok to Excellence
Learning Rate Decay: The Misnomer
Understanding Gradients in AI Model Training: The Grok Example
Understanding Convergence in Training AI Models Like Grok
What's an Epoch in AI Model Training? A Simple Explanation for Grok Model Training
Understanding the Core Concepts of Grok Model Training and Tuning
Understanding the "Temperature" in Grok-1's Routing Strategy
Navigating the Maze: Training the Open Source Grok-1 with its Intricate Routing Mechanism
Microsoft's Phi-4 Goes Open Source: A Deep Dive and Comparison with xAI's Grok
Harnessing Real-Time Insights: How Grok's Integration with X and Qdrant Sets It Apart
Enhancing AI with Knowledge Graphs: How Grok Surpasses ChatGPT
Unraveling the Synergy: BPE and RoPE in Grok's Input Layer
Decoding Embeddings: The Hidden Language of AI in NLP
Exploring Byte Pair Encoding (BPE) with Grok: The Art of Tokenization
Understanding Grok's Mixture-of-Experts: A Symphony of Specialized Knowledge
A Deep Dive into the Architecture of Grok: Unveiling the Next Generation of AI
Democratizing AI: The Open Source Quest to Quantize and Localize Grok
Grok's Journey Through Mathematical and Logical Landscapes: Comparing with Other LLMs
Grok Models vs. The Giants: Performance on Leading AI Benchmarks
The Evolution of Grok: Unpacking xAI's AI Models from Grok-0 to Grok-2 and Beyond
Real-World Business Applications of Grok Model Training

