Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 4 - Latent Space & Guidance

Learn more details about this course: https://online.stanford.edu/courses/cme296-diffusion-and-large-vision-models
To follow along with the course schedule and syllabus, visit: https://cme296.stanford.edu/syllabus/
Chapters:
00:00:00 Introduction
00:07:05 Pixel space
00:12:39 Semantic vs perceptual similarity
00:14:27 Autoencoder
00:22:56 Variational autoencoders
00:31:19 ELBO derivation
00:46:43 Blurriness issue of VAEs
00:47:43 Reconstruction loss
00:48:54 KL regularization loss
00:50:17 Perceptual loss
00:54:17 Adversarial loss
00:57:01 Latent diffusion models
01:00:31 Encoder vs decoder trade-off
01:05:55 Text representation with Transformers
01:12:20 Image representation with ViT
01:18:35 Contrastive learning with CLIP
01:27:44 Classifier-based guidance
01:34:34 Classifier-free guidance
For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education
Afshine Amidi is an Adjunct Lecturer at Stanford University.
Shervine Amidi is an Adjunct Lecturer at Stanford University.
View the course playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNdy8rt2rZ4T2xM0OjADnfu
