Thomas Wolf of Hugging Face on Dario's "On DeepSeek and Export Controls"

01/31/25 • 6 min

Gist: Hugging Face's Thomas Wolf challenges closed-source AI model comparisons by emphasizing the global, innovative potential of open-source AI technologies like DeepSeek.

Thomas Wolf's original tweet at: https://x.com/Thom_Wolf/status/1885093269022834943

Summary: Thomas Wolf of Hugging Face critically reviews Dario Amodei's essay on DeepSeek and export controls, expressing skepticism about its claims that closed-source AI models are superior. He challenges the comparison between DeepSeek and other frontier models, pointing out that the arguments rely heavily on internal, unpublished evaluations and on vague comparisons that lack substantial evidence.

Wolf emphasizes the significance of open-source AI models, arguing that the open nature of DeepSeek fundamentally undermines the notion of a closed, geographically constrained AI race. He highlights that open-source models can be downloaded and used globally, with contributors from diverse regions adapting and improving upon the original model, which promotes technological innovation and accessibility.

The discussion extends to the broader implications of open-source technology for AI's future, with Wolf advocating for a global perspective on AI development. He stresses that open-source models offer crucial advantages like resilience, distributed computing, and the ability to run models locally, which will become increasingly important as AI becomes more deeply integrated into society's technological infrastructure.

Key Figures & Topics: Anthropic, Mistral, Hugging Face, CrowdStrike, Claude, Open Source, WhatsApp, Allen AI, Thomas Wolf, DeepSeek, AI, Export Controls, resilience, technology

1-liners:

"Open source knows no border both in its usage and its creation. Every company in the world, be it in Europe, Africa, South America or the USA, can now directly download and use DeepSeek without sending data to a specific country." - Thomas Wolf
"Open source has many advantages like shared training costs, tunability, control, ownership, privacy. But one of its most fundamental virtues in the long term as AI becomes deeply embedded in our world will likely be its strong resilience." - Thomas Wolf
"More than national prides and competitions, I think it's time to start thinking globally about the challenges and social changes that AI will bring everywhere in the world." - Thomas Wolf
"Without access to the Internet we lose all our social media news feeds, can't order a taxi, book a restaurant or reach someone on WhatsApp." - Thomas Wolf
"Open source technology is likely our most important asset for safely transitioning to a resilient digital future where AI is integrated into all aspects of society." - Thomas Wolf

tldr; / tldlisten;

Thomas Wolf critiques Dario's essay comparing DeepSeek and closed-source AI models, arguing the comparison relies too heavily on unpublished internal evaluations
Open-source AI models like DeepSeek offer global accessibility, allowing companies worldwide to download and use the technology without geographic restrictions
Open-source technology provides crucial resilience in AI development, preventing over-reliance on single companies or data centers
The AI ecosystem is increasingly global, with contributors and model developments emerging from teams across regions such as the US, Europe, and elsewhere
Open-source AI models offer multiple advantages including shared training costs, tunability, control, ownership, and privacy
As AI becomes more integrated into daily life, open-source approaches will be critical for creating robust, distributed technological infrastructure
Recent open-source model releases by teams like Allen AI and Mistral demonstrate the rapid innovation happening outside closed-source environments
National competition in AI should be replaced by a more global perspective focused on safely integrating AI technologies across societies

Previous Episode

On DeepSeek and Export Controls by Dario Amodei

Summary: DeepSeek's AI advancements demonstrate the ongoing evolution of AI technology and underscore the strategic importance of export controls in managing global technological competition.

"On DeepSeek and Export Controls" is on Dario Amodei's blog at: https://darioamodei.com/on-deepseek-and-export-controls

Deeper Summary: Dario Amodei discusses the recent developments of DeepSeek, a Chinese AI company that has produced models approaching the performance of US AI models at a lower cost. He explains three key dynamics of AI development: scaling laws, continuous innovation that shifts efficiency curves, and emerging paradigms like reinforcement learning for improving model reasoning. The key point is that while DeepSeek's achievements are impressive, they are largely within expected technological progression rather than a revolutionary breakthrough.

Amodei argues that DeepSeek's models, particularly DeepSeek V3 and R1, represent an expected point on the ongoing AI cost reduction curve. While the company has achieved notable efficiency in model training, their performance is roughly in line with historical trends of cost reduction in AI development. He emphasizes that DeepSeek is not fundamentally changing the economics of large language models, but is instead demonstrating the first time a Chinese company has been at the forefront of these expected technological improvements.

The speaker's primary focus is on the geopolitical implications of AI development and the critical importance of US export controls on advanced chips. Amodei argues that these controls are essential in determining whether the world will be unipolar (with the US leading) or bipolar (with both US and China having powerful AI). He contends that well-enforced export controls can prevent China from obtaining millions of advanced chips, potentially preserving a technological advantage for democratic nations and mitigating risks of an authoritarian government gaining transformative AI capabilities.

Key Figures & Topics:
Artificial Intelligence, OpenAI, Anthropic, Nvidia, GPT-4, xAI, DeepSeek, Dario Amodei, H100, Claude 3.5 Sonnet, Export Controls, AI, Geopolitics, Scaling, Technology

1-liners:
"Export controls serve a vital purpose keeping democratic nations at the forefront of AI development." - Dario Amodei

"Making AI that is smarter than almost all humans at almost all things will require millions of chips, tens of billions of dollars at least, and is most likely to happen in 2026-2027." - Dario Amodei

"We could end up in one of two starkly different worlds in 2026-2027: a bipolar world where both the US and China have powerful AI models, or a unipolar world where only the US and its allies have these models." - Dario Amodei

"Well enforced export controls are the only thing that can prevent China from getting millions of chips and are therefore the most important determinant of whether we end up in a unipolar or bipolar world." - Dario Amodei

"The economic value of training more and more intelligent models is so great that any cost gains are more than eaten up almost immediately. They're poured back into making even smarter models for the same huge cost we were originally planning to spend." - Dario Amodei

tldr; / tldlisten;
DeepSeek's recent AI model releases demonstrate China's growing technological capabilities, but are largely within expected cost reduction trends for AI development

Export controls on advanced computer chips are crucial in determining whether the global AI landscape will be unipolar (US-dominated) or bipolar (US and China at parity)

The current AI development trajectory suggests models approaching human-level intelligence could emerge around 2026-2027, requiring billions of dollars and millions of chips

AI scaling follows a predictable pattern where cost efficiency gains are typically reinvested into training even more advanced models, not reducing overall spending

Reinforcement learning for improving AI reasoning is currently in an early stage, allowing multiple companies to quickly develop competitive models

While DeepSeek demonstrates impressive engineering, their models are not fundamentally revolutionizing AI economics, but represent an expected incremental advancement

China could potentially gain significant strategic advantages if they match US AI capabilities, particularly in military and technological applications

Well-enforced export controls can prevent China from acquiring millions of advanced AI chips, potentially maintaining a US technological lead

Next Episode

What It Takes To Onboard Agents by Anna Piñol at NfX

Gist: Explores the challenges of AI agent adoption, identifying critical infrastructure needs like accountability, context understanding, and coordination to transform AI from experimental technology to practical, trustworthy workplace tools.

An AI voice reading of: "What It Takes To Onboard Agents" by Anna Piñol at NfX

Key Figures & Topics: Gemini, GPT-4, Large language models, McKinsey, UiPath, Claude, NFX, ElevenLabs, Robotic Process Automation, Blue Prism, Anna Piñol, David Villalon, Manuel Romero, Misa, Workfusion, AI, automation, Agents, infrastructure, Enterprise

Summary:
The podcast explores the current state of AI agents and the challenges in their widespread adoption. Despite rapid technological progress in AI capabilities, there is a significant gap between the intent to implement AI in organizations and actual implementation. The NFX representatives discuss how moving from traditional Robotic Process Automation (RPA) to Agentic Process Automation (APA) requires solving key infrastructure challenges.

To bridge the adoption gap, the episode identifies three critical layers needed for AI agent implementation: the accountability layer, the context layer, and the coordination layer. The accountability layer focuses on creating transparency and verifiable work, allowing organizations to understand and audit AI decision-making processes. The context layer involves developing systems that help AI agents understand a company's unique culture, goals, and unwritten knowledge, making them more adaptable and intelligent.

The final discussions center on the future of AI agents, emphasizing the need for interoperability, tools, and a collaborative ecosystem. The speakers predict a future where businesses will manage teams of AI agents across various functions, with the potential for agents to communicate, collaborate, and even exchange services. They highlight that solving these infrastructural challenges will be crucial in transforming AI agents from experimental technology to trusted, everyday tools.

1-liners:

"We are moving from robotic process automation to an agentic process automation."
"The world where we are all using AI agents each day is an inevitability."
"63% of leaders thought implementing AI was a high priority, but 91% of those respondents didn't feel prepared to do so."
"The key is reducing the risks, real and perceived, associated with implementation."
"A lot of what we learn at a new job isn't written down anywhere. It's learned by observation, intuition, through receiving feedback and asking clarifying questions."

tldr; / tldlisten;

The AI agent ecosystem is currently missing three critical infrastructure layers: accountability, context, and coordination, which are necessary for widespread enterprise adoption
Unlike Robotic Process Automation (RPA), AI agents powered by Large Language Models (LLMs) can handle more complex, unstructured tasks with greater adaptability
Enterprises need transparency in AI processes, requiring a 'chain of work' that shows exactly how and why an AI agent makes specific decisions
Successful AI agents must understand an organization's unique culture, communication style, and unwritten knowledge, not just follow rigid rules
The future of work will likely involve managing teams of AI agents across different business functions, requiring robust inter-agent communication and coordination systems
Building trust is crucial for AI agent adoption: organizations want systems that reduce implementation risks and provide verifiable, auditable outcomes
The emerging 'Business to Agent' (B2A) tooling ecosystem will be critical in empowering AI agents to become more autonomous and capable
While AI agent technology is progressing rapidly, there remains a significant gap between technological potential and actual enterprise implementation