
Joscha Bach and Connor Leahy on AI risk
06/20/23 • 91 min
Support us! https://www.patreon.com/mlst
MLST Discord: https://discord.gg/aNPkGUQtc5
Twitter: https://twitter.com/MLStreetTalk
The first 10 mins of audio from Joscha isn't great; it improves after.
Transcript and longer summary: https://docs.google.com/document/d/1TUJhlSVbrHf2vWoe6p7xL5tlTK_BGZ140QqqTudF8UI/edit?usp=sharing
Dr. Joscha Bach argued that general intelligence emerges from civilization, not individuals. Given our biological constraints, humans cannot achieve a high level of general intelligence on our own. Bach believes AGI may become integrated into all parts of the world, including human minds and bodies. He thinks a future where humans and AGI harmoniously coexist is possible if we develop a shared purpose and incentive to align. However, Bach is uncertain about how AI progress will unfold or which scenarios are most likely. Bach argued that global control and regulation of AI are unrealistic. While regulation may address some concerns, it cannot stop continued progress in AI. He believes individuals determine their own values, so "human values" cannot be formally specified and aligned across humanity. For Bach, the possibility of building beneficial AGI is exciting, but much work is still needed to ensure a positive outcome.
Connor Leahy believes we have more control over the future than the default outcome might suggest. With sufficient time and effort, humanity could develop the technology and coordination to build a beneficial AGI. However, the default outcome likely leads to an undesirable scenario if we do not actively work to build a better future. Leahy thinks finding values and priorities most humans endorse could help align AI, even if individuals disagree on some values. Leahy argued that a future where humans and AGI harmoniously coexist is ideal but will require substantial work to achieve. While regulation faces challenges, it remains worth exploring. Leahy believes limits to progress in AI exist but that we are unlikely to reach them before humanity is at risk. He worries even modestly superhuman intelligence could disrupt the status quo if misaligned with human values and priorities.
Overall, Bach and Leahy expressed optimism about the possibility of building beneficial AGI but believe we must address risks and challenges proactively. They agreed substantial uncertainty remains around how AI will progress and what scenarios are most plausible. But developing a shared purpose between humans and AI, improving coordination and control, and finding human values to help guide progress could all improve the odds of a beneficial outcome. With openness to new ideas and willingness to consider multiple perspectives, continued discussions like this one could help ensure the future of AI is one that benefits and inspires humanity.
TOC:
00:00:00 - Introduction and Background
00:02:54 - Different Perspectives on AGI
00:13:59 - The Importance of AGI
00:23:24 - Existential Risks and the Future of Humanity
00:36:21 - Coherence and Coordination in Society
00:40:53 - Possibilities and Future of AGI
00:44:08 - Coherence and alignment
01:08:32 - The role of values in AI alignment
01:18:33 - The future of AGI and merging with AI
01:22:14 - The limits of AI alignment
01:23:06 - The scalability of intelligence
01:26:15 - Closing statements and future prospects
Previous Episode

Neel Nanda - Mechanistic Interpretability
In this wide-ranging conversation, Tim Scarfe interviews Neel Nanda, a researcher at DeepMind working on mechanistic interpretability, which aims to understand the algorithms and representations learned by machine learning models. Neel discusses how models can represent their thoughts using motifs, circuits, and linear directional features which are often communicated via a "residual stream", an information highway models use to pass information between layers.
Neel argues that "superposition", the ability for models to represent more features than they have neurons, is one of the biggest open problems in interpretability. This is because superposition thwarts our ability to understand models by decomposing them into individual units of analysis. Despite this, Neel remains optimistic that ambitious interpretability is possible, citing examples like his work reverse engineering how models do modular addition. However, Neel notes we must start small, build rigorous foundations, and not assume our theoretical frameworks perfectly match reality.
The conversation turns to whether models can have goals or agency, with Neel arguing they likely can, based on heuristics like models executing long-term plans towards some objective. However, we currently lack techniques to build models with specific goals, meaning any goals would likely be learned or emergent. Neel highlights how induction heads, circuits models use to track long-range dependencies, seem crucial for phenomena like in-context learning to emerge.
On the existential risks from AI, Neel believes we should avoid overly confident claims that models will or will not be dangerous, as we do not understand them enough to make confident theoretical assertions. However, models could pose risks through being misused, having undesirable emergent properties, or being imperfectly aligned. Neel argues we must pursue rigorous empirical work to better understand and ensure model safety, avoid "philosophizing" about definitions of intelligence, and focus on ensuring researchers have standards for what it means to decide a system is "safe" before deploying it. Overall, a thoughtful conversation on one of the most important issues of our time.
Support us! https://www.patreon.com/mlst
MLST Discord: https://discord.gg/aNPkGUQtc5
Twitter: https://twitter.com/MLStreetTalk
Neel Nanda: https://www.neelnanda.io/
TOC
[00:00:00] Introduction and Neel Nanda's Interests (walk and talk)
[00:03:15] Mechanistic Interpretability: Reverse Engineering Neural Networks
[00:13:23] Discord questions
[00:21:16] Main interview kick-off in studio
[00:49:26] Grokking and Sudden Generalization
[00:53:18] The Debate on Systematicity and Compositionality
[01:19:16] How do ML models represent their thoughts
[01:25:51] Do Large Language Models Learn World Models?
[01:53:36] Superposition and Interference in Language Models
[02:43:15] Transformers discussion
[02:49:49] Emergence and In-Context Learning
[03:20:02] Superintelligence/XRisk discussion
Transcript: https://docs.google.com/document/d/1FK1OepdJMrqpFK-_1Q3LQN6QLyLBvBwWW_5z8WrS1RI/edit?usp=sharing
Refs: https://docs.google.com/document/d/115dAroX0PzSduKr5F1V4CWggYcqIoSXYBhcxYktCnqY/edit?usp=sharing
Next Episode
[SPONSORED] The Digitized Self: AI, Identity and the Human Psyche (YouAi)
Sponsored Episode - YouAi
What if an AI truly knew you: your thoughts, values, aptitudes, and dreams? An AI that could enhance your life in profound ways by amplifying your strengths, augmenting your weaknesses, and connecting you with like-minded souls. That is the vision of YouAi.
YouAi founder Dmitri Shapiro believes digitizing our inner lives could unlock tremendous benefits. But mapping the human psyche also poses deep questions. As technology mediates our self-understanding, what do we risk by rendering our minds in bits and algorithms? Could we gain a new means of flourishing or lose something intangible? There are no easy answers, but YouAi offers a vision balanced by hard thinking.
Shapiro discussed YouAi's app, which builds personalized AI assistants by learning how individuals think through interactive questions. As people share, YouAi develops a multidimensional model of their mind. Users get a tailored feed of prompts to continue engaging and teaching their AI.
YouAi's vision provides a glimpse into a future that could unsettle or fulfill our hopes. As technology mediates how we understand ourselves and others, will we risk losing what makes us human or find new means of flourishing? YouAi believes that together, we can build a future where our minds contain infinite potential, and that their technology helps unlock it. But we must proceed thoughtfully, upholding human dignity above all else. Our minds shape who we are. And who we can become.
Digitise your mind today:
YouAi - https://YouAi.ai
MindStudio - https://YouAi.ai/mindstudio
YouAi Mind Indexer - https://YouAi.ai/train
Join the MLST discord and register for the YouAi event on July 13th: https://discord.gg/ESrGqhf5CB
TOC:
0:00:00 - Introduction to Mind Digitization
0:09:31 - The YouAi Platform and Personal Applications
0:27:54 - The Potential of Group Alignment
0:30:28 - Applications in Human-to-Human Communication
0:35:43 - Applications in Interfacing with Digital Technology
0:43:41 - Introduction to the Project
0:44:51 - Brain digitization and mind vs. brain
0:49:55 - The Extended Mind and Neurofeedback
0:54:16 - Personalized Learning and the Future of Education
1:02:19 - Privacy and Data Security
1:14:20 - Ethical Considerations of Digitizing the Mind
1:19:49 - The Metaverse and the Future of Digital Identity
1:25:17 - Digital Immortality and Legacy
1:29:09 - The Nature of Consciousness
1:34:11 - Digitization of the Mind
1:35:06 - Potential Inequality in a Digital World
1:38:00 - The Role of Technology in Equalizing or Democratizing Society
1:40:51 - The Future of the Startup and Community Involvement