
Chasing Real AGI: Inside ARC Prize 2025 with Chollet & Knoop
04/03/25 • 60 min
In this fascinating episode, we dive deep into the race towards true AI intelligence, AGI benchmarks, test-time adaptation, and program synthesis with star AI researcher (and philosopher) François Chollet, creator of Keras and the ARC-AGI benchmark, and Mike Knoop, co-founder of Zapier and now co-founder with François of both the ARC Prize and the research lab Ndea. With the launch of ARC Prize 2025 and ARC-AGI 2, they explain why existing LLMs fall short on true intelligence tests, how new models like O3 mark a step change in capabilities, and what it will really take to reach AGI.
We cover everything from the technical evolution of ARC 1 to ARC 2, the shift toward test-time reasoning, and the role of program synthesis as a foundation for more general intelligence. The conversation also explores the philosophical underpinnings of intelligence, the structure of the ARC Prize, and the motivation behind launching Ndea, a new AGI research lab that aims to build a "factory for rapid scientific advancement." Whether you're deep in the AI research trenches or just fascinated by where this is all headed, this episode offers clarity and inspiration.
Ndea
Website - https://ndea.com
X/Twitter - https://x.com/ndea
ARC Prize
Website - https://arcprize.org
X/Twitter - https://x.com/arcprize
François Chollet
LinkedIn - https://www.linkedin.com/in/fchollet
X/Twitter - https://x.com/fchollet
Mike Knoop
X/Twitter - https://x.com/mikeknoop
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
(00:00) Intro
(01:05) Introduction to ARC Prize 2025 and ARC-AGI 2
(02:07) What is ARC and how it differs from other AI benchmarks
(02:54) Why current models struggle with fluid intelligence
(03:52) Shift from static LLMs to test-time adaptation
(04:19) What ARC measures vs. traditional benchmarks
(07:52) Limitations of brute-force scaling in LLMs
(13:31) Defining intelligence: adaptation and efficiency
(16:19) How O3 achieved a massive leap in ARC performance
(20:35) Speculation on O3's architecture and test-time search
(22:48) Program synthesis: what it is and why it matters
(28:28) Combining LLMs with search and synthesis techniques
(34:57) The ARC Prize structure: efficiency track, private vs. public
(42:03) Open source as a requirement for progress
(44:59) What's new in ARC-AGI 2 and human benchmark testing
(48:14) Capabilities ARC-AGI 2 is designed to test
(49:21) When will ARC-AGI 2 be saturated? AGI timelines
(52:25) Founding of Ndea and why now
(54:19) Vision beyond AGI: a factory for scientific advancement
(56:40) What Ndea is building and why it's different from LLM labs
(58:32) Hiring and remote-first culture at Ndea
(59:52) Closing thoughts and the future of AI research
Previous Episode

Why This Ex-Meta Leader is Rethinking AI Infrastructure | Lin Qiao, CEO, Fireworks AI
In 2022, Lin Qiao decided to leave Meta, where she was managing several hundred engineers, to start Fireworks AI. In this episode, we sit down with Lin for a deep dive on her work, starting with her leadership on PyTorch, now one of the most influential machine learning frameworks, powering research and production at scale across the AI industry.
Now at the helm of Fireworks AI, Lin is leading a new wave in generative AI infrastructure, simplifying model deployment and optimizing performance to empower all developers building with Gen AI technologies.
We dive into the technical core of Fireworks AI, uncovering their innovative strategies for model optimization, Function Calling in agentic development, and low-level breakthroughs at the GPU and CUDA layers.
Fireworks AI
Website - https://fireworks.ai
X/Twitter - https://twitter.com/FireworksAI_HQ
Lin Qiao
LinkedIn - https://www.linkedin.com/in/lin-qiao-22248b4
X/Twitter - https://twitter.com/lqiao
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
(00:00) Intro
(01:20) What is Fireworks AI?
(02:47) What is PyTorch?
(12:50) Traditional ML vs GenAI
(14:54) AI’s enterprise transformation
(16:16) From Meta to Fireworks
(19:39) Simplifying AI infrastructure
(20:41) How Fireworks clients use GenAI
(22:02) How many models are powered by Fireworks
(30:09) LLM partitioning
(34:43) Real-time vs pre-set search
(36:56) Reinforcement learning
(38:56) Function calling
(44:23) Low-level architecture overview
(45:47) Cloud GPUs & hardware support
(47:16) VPC vs on-prem vs local deployment
(49:50) Decreasing inference costs and its business implications
(52:46) Fireworks roadmap
(55:03) AI future predictions
Next Episode

Snowflake CEO on Winning the AI Arms Race
In this episode, we sit down with Sridhar Ramaswamy, CEO of Snowflake, for an in-depth conversation about the company’s transformation from a cloud analytics platform into a comprehensive AI data cloud. Sridhar shares insights on Snowflake’s shift toward open formats like Apache Iceberg and why monetizing storage was, in his view, a strategic misstep.
We also dive into Snowflake’s growing AI capabilities, including tools like Cortex Analyst and Cortex Search, and discuss how the company scaled AI deployments at an impressive pace. Sridhar reflects on lessons from his previous startup, Neeva, and offers candid thoughts on the search landscape, the future of BI tools, real-time analytics, and why partnering with OpenAI and Anthropic made more sense than building Snowflake’s own foundation models.
Snowflake
Website - https://www.snowflake.com
X/Twitter - https://x.com/snowflakedb
Sridhar Ramaswamy
LinkedIn - https://www.linkedin.com/in/sridhar-ramaswamy
X/Twitter - https://x.com/RamaswmySridhar
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
(00:00) Intro and current market tumult
(02:48) The evolution of Snowflake from IPO to today
(07:22) Why Snowflake’s earliest adopters came from financial services
(15:33) Resistance to change and the philosophical gap between structured data and AI
(17:12) What is the AI Data Cloud?
(23:15) Snowflake’s AI agents: Cortex Search and Cortex Analyst
(25:03) How did Sridhar’s experience at Google and Neeva shape his product vision?
(29:43) Was Neeva simply ahead of its time?
(38:37) The Epiphany mafia
(40:08) The current state of search and Google’s conundrum
(46:45) “There’s no AI strategy without a data strategy”
(56:49) Embracing Open Data Formats with Iceberg
(01:01:45) The Modern Data Stack and the future of BI
(01:08:22) The role of real-time data
(01:11:44) Current state of enterprise AI: from PoCs to production
(01:17:54) Building your own models vs. using foundation models
(01:19:47) DeepSeek and open source AI
(01:21:17) Snowflake’s 1M Minds program
(01:21:51) Snowflake AI Hub