Sebastian Mallaby is the author of several books including The Infinity Machine: Demis Hassabis, DeepMind, and the Quest for Superintelligence. A former Financial Times contributing editor and two-time Pulitzer Prize finalist, Mallaby is the Paul A. Volcker Senior Fellow for International Economics at the Council on Foreign Relations.
In this week’s conversation, Yascha Mounk and Sebastian Mallaby discuss why AI developers simultaneously fear and advance potentially dangerous technology, whether open-source AI models pose unacceptable security risks, and how China and the United States differ in their approaches to AI safety.
This transcript has been condensed and lightly edited for clarity.
Yascha Mounk: There’s something that struck me about Silicon Valley in general, and that in your book you really get to the heart of through the lens of one specific character, which is that so many people in Silicon Valley both seem to believe that artificial intelligence is a miraculous technology with lots of good things, but also a really dangerous technology—technology that could potentially kill all of humanity. Some of the major efforts at advancing artificial intelligence were actually motivated by trying to understand this technology and make it develop in such a way that it would be safe. Yet those same people seem to be at the forefront of developing the very technology that they warn could destroy the world.
How should we think about that? If it was just one person, you might think it’s slightly schizophrenic, but you see it emerging again and again as a theme in different contexts—through the founding story of OpenAI and Sam Altman, but obviously also in the story of DeepMind, which you tell in your new book.
Sebastian Mallaby: Yeah, you’re quite right. It was stunning to me that, at any of the early labs—take DeepMind, founded in 2010—safety was there from the start: the two scientific founders, Shane Legg and Demis Hassabis, met each other at a safety lecture. Then you look at 2015: the early discussions between Elon Musk and Sam Altman about founding OpenAI were all about safety, about being responsible, about being safer than DeepMind. It was like a competitive safety thing.
Then you go forward and the next lab that gets started is Anthropic, which is started as a splinter group that thinks that OpenAI is not safe enough. So they kind of repeat the story that had happened earlier in the rivalry between OpenAI and DeepMind. Each of these labs began with the idea that they were going to be safer. Even Elon Musk, when he does Grok or xAI, comes in with a record of having proclaimed his terror of existential risk from AI, going back to 2012 when he first met Demis Hassabis.
So you’re right, they’re all schizophrenic and it’s a pattern. How do we think about that? Well, in the end, my feeling is that it’s an enlarged version of all of us. You and I are excited by technology and also scared by it. Yet we take the trade, we move ahead. Why do we do this? Because we’re human. If we didn’t do it, humans would still be living in caves. We do accept technological risk, and that’s what these guys are doing.
Mounk: One of the leitmotifs of your book is this line that Geoffrey Hinton—a past guest of this podcast—says, alluding to Oppenheimer, that the thrill of discovery is so big that even if you’re very worried about its implications, it’s impossible to resist. I wonder, from your conversations with Demis and with others in the space, whether there is an ability to govern this technology.
I was really interested when I had Nate Soares on the podcast—co-author with Eliezer Yudkowsky of the book If Anyone Builds It, Everyone Dies. He seemed to me to be too pessimistic about the prospects of technological annihilation; he basically thought AI is definitely going to try and kill us. Then I thought he was really optimistic about our ability to stand up to that through public policy—that we’ll get just the right incentives in place so that nobody builds the machine that definitely would kill us if we did build it. I was struck by how pessimistic he was on the first point and how optimistic he was on the second point.
OpenAI gets founded to be really safe, Anthropic is a spin-off of OpenAI in a sense, a hostile spin-off of people who get worried about OpenAI, and now Anthropic is in some ways the leader of the pack. Does it just mean that our fate depends on what the natural tendency of this technology is going to be?
Mallaby: No, I think that’s too fatalistic. I agree with you that there is, in the Yudkowsky view, an extreme caricaturing of both the level of the risk—a 100% probability of doom, which to my mind is just ridiculously high—and at the same time, too much optimism about the ability of our policies to do something about it. It reminds me of Jeffrey Sachs arguing about development aid 20 years ago, where he would always stress how deeply poor and dysfunctional developing countries were, and then he would say, but if you give them a lot of aid, we can fix all of it. Both sides were wrong. I think that’s a classic posture of somebody who’s arguing for radical government action: to exaggerate the problem and then underplay the policy challenges of getting it right.
But I don’t think it’s correct to say we are hostage to some technology over which we have zero control, because there are ways of controlling it—you can control both the coding and the algorithmic design of it. There’s a whole field of alignment research devoted to making large language models align with human priorities, and if more investment were going into it, we would have a better shot at aligning them. That is something that can be done and worked on.
An example of this would be the UK AI Security Institute, which does some of this alignment research. They once discovered a way to hack any of the leading large language models with a specific phrase that would unlock them to do things that were supposed to be off limits. Once you discover that, you tell the labs and they fix the vulnerability. The point is you can do something on that algorithmic front. You can also do things in policy terms.
Open source and open weight AI models are ridiculously dangerous. Why would you ever allow this kind of technology to circulate without any ability to call it back if somebody starts to use it for a massive cyber attack on infrastructure? It’s crazy.
Mounk: Can you explain to people a little bit about open source? I think people may not be so aware both about what it is in general and what it is in relation to AI.
In general, open source has always been the idealistic, do-gooder approach to software—you can customize it yourself, and you’re not in the hands of some corporate conglomerate. We’ve seen a tendency for AI labs that are not quite at the cutting edge—labs that can create very powerful models, just a little less powerful than the frontier ones—to go open source, because it is a way to attract people to those models and get them used.
As you’re saying, it means that basically you can download a model to your own machine and run it off your own machine in a way that is no longer subject to control by its original creator. In lots of contexts, that’s going to have positive elements—it makes it cheaper to use the technology. It means that, for example, if I wanted to handle some really sensitive information, if I wanted to create an index for a publisher and they don’t want to have any risk of somebody being able to train their AI models on that information, I can do that with an open source model because it would be closed loop. I know that the information is not being communicated back to anybody because I’m not running it on the internet.
So there are lots of positive things. But you are saying, interestingly, that in terms of risk mitigation, it’s really bad—because those same exact features also mean that if somebody starts using this to develop a really potent bioweapon, there’s no way that anybody can track that that’s what they’re using it for, or stop them from doing it.
Mallaby: There was a cyber attack in Mexico recently on a mass scale—pretty much everybody’s electoral records were hacked. This was done with the help of Anthropic’s models. I’ve heard accounts that it was more than just help; basically a group used Claude to carry out this attack and Claude did most of the hacking for them. There was a bit of OpenAI’s ChatGPT model being used as well. Once the labs discovered that this attack was going on, they shut it down, because they had the ability to do that—it was not open weight. It was not on the machines of the bad guys; it was being used through a server controlled by the labs. That’s a real-life case of how you could shut down an attack.
I’ve been to war-gaming exercises organized by the RAND Corporation, and the classic nightmare scenario is that you have rolling waves of attacks on Western infrastructure by some unidentified attacker—it could be the Chinese government, it could be a terrorist group, you just don’t know. All you know is that all your infrastructure is not working: nobody has clean water, nobody has electricity, everybody’s panicking, and you don’t know how to stop it because you don’t know who the assailant is.
Why would we risk this? We have war games telling us how this would work. We’re staring this in the face and we should be doing something about open weight models. It’s not easy because they circulate already, but at least stop the more powerful ones which are going to be created starting now.
Mounk: Llama, the models by Meta, have been open source. The Chinese have released a lot of open source models as well. Companies like DeepSeek, as well as the many other companies in China that now have pretty powerful AI models, have created models that are very capable and very powerful, but they’re not as capable as the latest versions of ChatGPT, Claude, and Gemini. There is a kind of competitive reason to release open source models if you’re a little bit behind, because nobody is going to pay a premium dollar to have access to your model—they’re going to use the most high-performing one. But these models are powerful enough that if you can make them available to people much cheaper, they’re going to get a lot of use, and that’s good for visibility and all kinds of other things.
It is striking that the Chinese government, which in general is obviously quite intent on controlling what’s going on in its country and in some ways around the world, has allowed that to go forward. That speaks to a broader conventional wisdom: that one of the dynamics here is a competitive race between the United States and China, and that supposedly China is less interested in AI safety than the United States is. I think you’re not so certain about that conventional wisdom.
Mallaby: A standard view in U.S. government circles is: the Soviet Union and the United States lived through the Cuban Missile Crisis, they understood the existential risk of nuclear technology, they get that some weapons can be existential. The Chinese, on the other hand—their idea of catastrophic 20th century risk is the Great Famine, the Cultural Revolution, a politically-originated disaster. It’s not technological near-misses. To the contrary, in China, technology is associated with miraculous growth over the last 25 years. They want technology, they love technology. That’s the classic story, and if you try to talk to them about limiting technology, forget it.
I just spent eight days there and was really struck by how both top research academics and leaders of the industrial AI companies were talking about safety. When I was there, there was this furore over OpenClaw—an agent you can download onto your computer and let it do agentic things: manage your email, do your shopping, whatever. It’s pretty good technically, but it’s also dangerous because you have to make your computer naked to this agent and you don’t know what it’s going to do with all your data. It’s some sort of open source code, so who knows what it’s going to do.
All of the Tsinghua professors and other AI researchers I was talking to were saying people shouldn’t be downloading this—and yet there were lines of ordinary Chinese people outside the Tencent headquarters, waiting for engineers to help install it on their laptops. In the end, the government weighed in and said people should not be installing this. I think the debate is tipping in China. The Chinese state is not averse to regulating things—it has regulated the internet a great deal. Why wouldn’t they want to control open source models, which could become dangerous? It seems to me too defeatist to just assume they won’t.
Mounk: Part of the point here may simply be that the word “safety” is a very broad and capacious term, and the argument has been made that people just mean very different things by it. Part of it is that the United States is very influenced by science fiction, and so the kind of safety we imagine is the rise of the robots—will the robots try to kill us? Even at a slightly lower level, things like the risk of engineered bioweapons and so on.
In China, when people talk about AI safety, part of what they mean is political safety—that these machines need to be aligned with a particular set of views, not give too much information about particular historical events, and portray a positive image of the Chinese Communist Party. Of course, a lot of what happens in post-training in the United States is to make sure that Gemini, ChatGPT, Claude, and so on don’t step on various social and political taboos as well. A lot of the work in post-training goes into making sure that chatbots know not to use certain kinds of slurs and not to wade into territory that might be politically sensitive.
But have you found in your conversations in China that when you say the words “AI safety,” people just mean something different by it? Or do you think that difference has been overstated?
Mallaby: There are lots of definitions of safety, but I was talking specifically about alignment risk—the idea that the robots will not be aligned with what humans want and will actually act against humans. That’s what people were talking to me about. That was the point about OpenClaw: people were installing an agent which might just take it upon itself to do something which is not good for the user.
It’s not about political debates on what the large language model should or shouldn’t say. It’s actually about this more existential thing. Closely related to that is the point that it’s not just the machine that can be a bad actor—a human bad actor can get hold of it and use it to make a bioweapon or something of that kind.
Mounk: Those two risks are obviously interlinked but important to distinguish. What kind of levers does public policy have to govern the existential risks of AI safety, or the more day-to-day, bread-and-butter risks?
One problem is simply one of competition. I’ve had conversations with people in the European Union who I found to be slightly naive about this. They believe that the Brussels effect—which allows the EU to set a standard for how a car works, thereby creating a big incentive for car manufacturers to apply the same standards across different manufacturing contexts—means that the EU, as a significant market, can force a regulation that will by and large be heeded in the United States and other places as well. In the context of AI, where there simply aren’t very significant European AI companies so far—certainly not on a global scale—that logic breaks down. DeepMind is an interesting case because it started in the UK, but it’s now owned by a US conglomerate. Certainly among members of the European Union, there aren’t many significant AI players. There’s Mistral, and there’s the new venture that Yann LeCun is founding, but relative to US and Chinese entities, these are very small players. If somebody designs a bioweapon on one of those open source models, that’s not going to stop at the frontier of the European Union.
There’s also a broader problem of how do you actually make the technical rules. What kind of rules would allow us to achieve alignment? Do you have a view on how to even begin thinking about this policy space?
Mallaby: These are two very different points. Let me just focus on the open source question.
The first thing to say about the Europeans is that they’re happy to tolerate open source because Mistral is producing open source—Mistral is in precisely the position mentioned earlier: if you’re not at the frontier and you can’t compete on quality, you compete on availability and you make it open source. That is the French strategy. They haven’t even gotten serious about the open source question, yet they are already proclaiming the extraterritoriality of their regulation.
Trying to stop open source is non-trivial. There’s a huge lobby of companies like Meta that create it, and in Europe, the French government wants Mistral to succeed and so wants to support its open source tactics. But if there were a policy shift and governments decided they wanted to control it, you would simply say to labs that were in your jurisdiction, or that wanted to do business in your jurisdiction, that they can’t be open source—at least if it’s a frontier model. There are going to be academic models that are experimental and much smaller, and those will be open source, and that’s probably fine. But the big frontier models are a different matter.
If you’re Mistral and you want to sell to U.S. consumers, you probably want to raise capital in the United States—you have lots of touch points with the United States, so you’re going to comply with an American regulatory decision. The big problem is you have to get China on board, because they are an ecosystem unto themselves. The assumption in the United States is there’s no point even trying that conversation. Based on just having been there, I’m saying: no, there is a point. I don’t think Trump is going to pursue that, because he’s not interested in anything other than AI acceleration.
Mounk: He also fundamentally believes in a zero-sum world, right? He thinks that most deals have a very clear winner and a clear loser, which doesn’t make it very appealing to try and strike a deal the point of which would be that both sides can win from it.
Mallaby: There’s an opportunity here for a Mark Carney-style middle powers initiative. He made a famous speech at Davos where he said that Canada and the Europeans can’t rely on the United States—they need to get together and act collectively rather than wait for America, because that’s just not going to be useful for the next three years. On AI, that is something where Europe, and maybe Britain in particular, could start a discussion with the Chinese about how to think about open source. The UK has some credibility on this, both because DeepMind is based there and because their AI Security Institute is extremely good. You begin the discussion—you can’t consummate it until there is a new American president, but that will come at some point. It’s worth getting that conversation started.
Mounk: Open source is one element of this. What about more broadly? If you imagined a real deal between a new U.S. president and perhaps a new leader in China—both serious about these things and recognizing that this is a genuine risk to humanity across borders—what could actually govern these technologies? We probably don’t want to shut AI down: it’s not feasible, and there are also tremendous benefits that can come from it, in medicine and science more broadly, that we don’t want to foreclose. At the same time, the people who have created these technologies are themselves extremely worried about how harmful the technology could be to humanity.
Put aside for the moment all political constraints and imagine that we can write the deal and it will be approved. Do we actually know what that framework would entail? Because it feels like there are very, very knotty intellectual questions even about what kind of rules and regulation would, A) effectively control those existential risks, and B) do that without forestalling all of the potential benefits from the technology.
Mallaby: Demis Hassabis, the central character in my book about AI, has for quite a long time advocated what he calls a CERN—as in the European Organization for Nuclear Research—a kind of governing body which would oversee AI on a multilateral basis, propose policies and enforce them, or at least set the policies while national governments do the enforcing.
The elements of that policy would be: no open source, at least not for big models; and more investment in alignment, a branch of science and engineering that needs resources. You might also need mandates or tax incentives to ensure that private labs, whenever they spend a billion dollars on a training run to make a model more powerful, set aside, let’s say, 20% of that specifically for safety research.
The point is that safety is both a private good and a public one. If you’re Google, you don’t want your customers to feel the technology is unsafe, so you have some private incentives to invest in safety. But there are also spillovers into broader societal risk—infrastructure collapsing, terrorist groups being empowered—and guarding against those is a public good, not a private one. Therefore the public authority needs to ensure that the level of investment in safety rises to the socially optimal amount. That implies either government spending on research and engineering into alignment, or taxing the labs and nudging them to do it.
So there are two important things: don’t do open source for frontier models, and do more research on alignment. The third thing is that, just as the Food and Drug Administration looks at drugs and determines if they’re safe to be released to market, there should be an equivalent body for AI models—they should be assessed, red-teamed, and only then released to consumers. That doesn’t exist anywhere in the world at the moment, which is crazy. But I think we’re going to get there in the end, because either governments will change their minds and do it, or there will be a Three Mile Island-type disaster and the public will demand it. It’s really a question of whether we do it before or after we suffer some AI catastrophe.
Mounk: You mentioned P(doom) earlier—the probability that this incredibly capable new technology will lead, however you want to define doom, at the most extreme level to the death of humanity, or at the somewhat less extreme level to the enslavement of humans by our new AI overlords, which ironically we ourselves as a species have created. Where do you put P(doom) after thinking about these topics for the last few years?
Mallaby: I’ll tell you a little bit about the journey. I began researching my book on DeepMind right around the time ChatGPT came out in late 2022. I already thought that machines were obviously going to be more intelligent than humans, and that normally more intelligent beings or agents will dominate the less intelligent ones. I could see that there was a theoretical risk, but I consoled myself with the idea that although the machines are more intelligent than us, or will be soon, they don’t have an incentive to dominate us. We have evolved over centuries and centuries to want to survive, pass on our DNA, and to fight viciously to be able to do that. Machines don’t reproduce in that way, they’re not evolved in that way, they don’t have a survival instinct—therefore, even if they’re cleverer, they won’t dominate us.
The moment I lost my faith in that argument was when I went to see Geoffrey Hinton, who was on your podcast, and sat in his kitchen in Toronto for two hours. What he pointed out to me is: you’re going to have a powerful AI in the future, and you’re going to be worried that your enemy is going to mount a cyber attack on it. As a human, you’re way too slow to respond to a cyber attack, so you’re going to empower your own AI to defend itself. Once you’ve done that, you’ve necessarily given it a sense of self-preservation, a sense of pain, a sense of fear, a sense of proactive defense. You’ve erased the distinction on which I had based my confidence. Now it does have a survival instinct—and it’s cleverer than us.
The bottom line is that the P(doom) question is really a Rorschach test. People give a high P(doom) if they are temperamentally pessimistic about life and the world. Hinton—one of his former PhD students told me—was always worried about bioweapons finishing off humanity before he thought AI would do it. He always had doom about something. In my case, analytically I see the case for being very worried; temperamentally and emotionally, I just can’t get there.
This came up when I was writing my book. I wanted to explore what it feels like to be creating an existential technology—what is the sensation that you might be destroying humanity with what you’re doing? What was at first crazy, but on reflection not surprising, is that people would give safety lectures describing the possibility of human annihilation while smiling, even laughing at points. Contemplating the annihilation of humans feels absurd, and the absurd is a close cousin of humor. There is a sort of fascination—people are drawn like moths to the fire by catastrophe scenarios. It’s something deep in the human tradition: all the second coming predictions, the apocalypse, the religious iconography.
The first revelation of my book research was that not only Demis Hassabis and DeepMind, but all the other labs began—as we said earlier—with a strong perception that this could be disastrous. At the same time, they processed this feeling of disaster sometimes with laughter, and at other times, even when they were being serious and trying to think about how to fix it, they would go through experiment after experiment, hypothesis after hypothesis about how to safeguard it—and none actually worked.
With Demis, he wanted to negotiate with Google when Google bought DeepMind in 2014. He wanted to oblige them to have an independent safety oversight board that would have the final say on the rollout of AI, so that it wouldn’t be in the hands of a corporate board driven by profit. Initially Google agreed and held a first meeting, but that was a disaster because it was chaired by Elon Musk, who had set up OpenAI to compete with DeepMind. What followed was a secret negotiation—something called Project Mario—which I discovered and about which I was given leaked documents from inside the company. What you see there is three years of negotiations between Mountain View and DeepMind in London on how to put AI governance in place within a for-profit company. In the end it didn’t work, because the for-profit company couldn’t accept the idea of empowering outside independent figures who would have a say over its proprietary technology. There are other iterations of these experiments throughout the story of DeepMind—how do you make AI good, how do you make it beneficial for the world? It’s extraordinarily difficult. That’s why, in the end, we correctly talk about governments intervening. It needs to be policy, and it needs to be policy on an international level, with the U.S. and China talking to each other.
Mounk: One of the things that strikes me as interesting is that humanity has always thought that doom is upon us. From the story of the flood in the Bible to basically every junction in history, there has been some kind of millenarian cult saying the destruction of humanity is upon us—and often probably with a smile on their face. There is probably something in the human psyche which makes that thought both scary and exhilarating. Part of what’s exhilarating, of course, is that you stand at the cusp of history, that it’s your generation that is going to be involved in the final battle, and therefore your life matters. There’s a strange set of assumptions there.
It would be easy to use the historical context to ridicule any of these concerns. Humanity has been around for however many million years—human history is at least 10,000 years old—why should it so happen that you and I are alive at the very moment when the technology that is literally going to bring about doom is being created? Isn’t that just our ridiculous need for significance in the world seducing us into making those assumptions?
At the same time, you look at how quickly this technology is evolving, how powerful it is, the fact that it is the first time there is—or is going to be, depending on your exact interpretation—a technology that is objectively more capable than humans. The thought that that could all go horribly wrong is not exactly far-fetched. It’s hard to reconcile this fear of chronocentricity—the tendency to treat our own time as the most important in history—with the very cold, rational recognition that humanity now, from nuclear weapons to biotechnological capabilities to AI, has created tools and machines so vastly more powerful than anything that existed for 99.9999% of human history that there is real reason to think that this time it may actually be different.
Mallaby: Chronocentricity—we exaggerate the importance and uniqueness of our own time. It’s a kind of “this time is different”-ism.
The reason why this time could be different is precisely because this technology is different. It’s a new form of cognition—we haven’t had that before. A machine that can invent more machines is something we haven’t had before. In my book I suggest this could be the biggest thing since the arrival of the human ability to do abstract thought, which is thought to have emerged around 70,000 years ago. Now we have a second form of cognition that can do abstract thought.
But even if you think that’s exaggerated and say it’s more like the Industrial Revolution—that’s still pretty big. The Industrial Revolution brought about social and political convulsions that led to Marxism, The Communist Manifesto in 1848, and a slew of revolutions across Europe. I’m willing to be Marxian in the sense that technological change drives social and political change, and it can be fairly revolutionary and bloody. We shouldn’t be sitting here in the 21st century forgetting the lesson of the 19th: that the Industrial Revolution was highly disruptive.
Mounk: Speaking of the Industrial Revolution, I want to ask you not about P(doom), but about the probability that our society gets really screwed up—which is the other obvious analogy to the Industrial Revolution and previous technological transformations.
The first thing to say about many of these previous technological transformations is that they were in fact terrible for many people at the time. In retrospect, the Industrial Revolution is a very positive thing—humanity is thriving across a huge number of dimensions to a vastly larger extent than before it. But for about 50 years, the living standards of average people did not go up, there were huge economic disruptions, and many people whose skills were displaced suffered greatly. Certain kinds of craft skills were automated away, and people who had invested real energy into learning those skills—often to a very impressive standard—no longer had a use for them.
But there was always a reservoir of demand for the kinds of things that humans could do. If you lost your job as a peasant or a hand weaver, you or your children could attain a higher level of schooling and move into the rapidly growing number of jobs in offices or, more broadly, jobs involving cognitive skills. Now we’re at a point where that escape route may no longer be available. There is obviously a huge debate—from some people in Silicon Valley who think every job is going to be gone in three years, which I find very naive, to certain economists who think it’s not going to have any impact on the job market at all.
What do you think the economic impact of all of this is going to be, including on the job market? How has talking to Demis and other people in the space shaped and perhaps changed your view on this?
Mallaby: One thing to start with is that this debate has been going on inside AI circles for a long time. When DeepMind was acquired by Google, they did have one safety oversight meeting, which took place in 2015. At that meeting, Mustafa Suleyman, the co-founder of DeepMind, made the argument that the pitchforks were going to come—that people were going to be displaced from their jobs and would come after the makers of AI. Eric Schmidt, who was the CEO of Google at the time, said: no, you don’t understand economic history—when you displace some jobs, new ones get created. That familiar argument. This is not a new debate, and it’s worth recording that, because we’re still having it—and we need to move on to actually doing something about it.
My view is that humans can retain a role in some areas of the economy and also in some areas of living. Human-to-human interaction is probably something we’re going to remain better at. AIs can be actors and therapists, but I think humans probably provide better-quality companionship. That bleeds into things like enterprise sales—I suspect humans will retain an edge in areas where emotional intelligence, and being a biological intelligence who sits down next to somebody and looks them in the eye, remains powerful. Then there’s goal setting: we don’t want machines setting goals for us. Whether that’s a political leader or a volunteer organizer at a local level, goal-setting up and down society will remain a human thing. Entrepreneurship is clearly another good example.
Then there’s going to be a whole range of changes in how we spend our time, which have the character of chess. A computer defeated Garry Kasparov in 1997, so we’ve run the experiment over almost three decades of computers being better than humans at chess. The number of chess players has gone up. The number of people who watch human chess players has gone up. No human fans watch machines play against machines. Instead, human champions train with the help of machine champions, get better that way, and discover new strategies thanks to AI. People’s passion for chess and the time they spend on it have gone up, not down. I think we’re going to be quite good at discovering new hobbies, activities, ways of competing with each other, ways of getting fulfillment and finding meaning in life that are not in the traditional category of paid activity—not a job, but a passion that can keep us going.
Mounk: We do need jobs, at least in our current economic system, in order to have a livelihood. I really don’t believe the line that all jobs are going to go away—I think that’s terribly naive about both what actual jobs consist of and about all the regulatory and other obstacles to many of those substitutions. But it would be enough for two things to happen: A) that a lot of people who are now in very well-remunerated employment lose their jobs and are upset about that; and B) as a result, there is now an oversupply of highly skilled human cognitive labor, which depresses the wages even of those who do retain a job, because the number of people who could plausibly substitute for them has gone up significantly.
If you imagine that playing out for even 50 years—which would roughly equal the duration of the disruptive phase of the Industrial Revolution, let alone forever—that would have profoundly troubling consequences for our political system, for our ability to sustain social peace, and for people’s sense that institutions are working. How should we think about that?
Mallaby: A key point that more people need to understand is that you don’t need to posit that all people in some category lose their jobs. If only 20% lose their jobs, they will compete down the wages of the others, and that will create mass unhappiness. Think about COVID—unemployment spiked to something like 10 or 12 percent at some point, and that triggered an enormous fiscal response from the government. Stimulus checks were mailed out to every single American. The implication is that even 12% of the workforce losing their jobs is politically and socially unacceptable, judging by the government’s response during COVID. So yes, I think it’s very troubling.
In a funny way, this is actually the reason I got to write my book. I went to Demis Hassabis at the beginning of my project and said: you may not particularly want to spend 30 hours speaking to an author so they can get deep access and write a book about you, but you don’t have a choice. First, if you’re right—as you say in all your speeches—that AI is going to be the most important technological invention in human history, it follows that you, as its creator, are one of the most important people in human history. If that’s the case, there will be a book about you. Furthermore, you should welcome a book because if you’re going to disrupt people’s lives with your technology—change the way they bring up their children, change the way they conceive of themselves as human, because there are now machines that can think—you had better explain your motives for doing this to the world, because otherwise it won’t be accepted.
If I go down the list of the leading AI lab leaders: Sam Altman at OpenAI, especially following the debacle with the Pentagon, is viewed by many as a slippery opportunist—perfectly willing to undercut safety principles in order to snag another government contract. Elon Musk has his own vainglory, and I don’t think most people would trust him. Dario Amodei does stand out for principle, but he views everything through a scientific lens. He’s a very deep scientist, very smart—but I think he sometimes underweights the difficulty of turning what he calls “a country of geniuses in a datacenter” (his expression for AI) into actual gains in economic productivity, or whatever else you want to do with that intelligence. There are so many social and institutional frictions, and perhaps that is not what he thinks about because he’s such a pure scientist.
Demis Hassabis, whom I wrote about, is by far the most relatable, normal, and reassuring figure in the field. It’s good that people get a chance to understand what he’s like—it perhaps makes it easier to accept what’s going to happen.
In the rest of this conversation, Yascha and Sebastian discuss the broader economic changes triggered by AI, what would happen if OpenAI went bust, and why there isn’t an AI bubble. This part of the conversation is reserved for paying subscribers…