
The Obvious Problem That No One Can Agree On


Two boxes, one choice, and $1,000,000. Sponsored by Brilliant - To learn for free on Brilliant for a full 30 days, go to https://brilliant.org/veritasium. Our viewers also get 20% off an annual Premium subscription, which gives you unlimited daily access to everything on Brilliant.

If you’re looking for a molecular modelling kit, try Snatoms, a kit I invented where the atoms snap together magnetically - https://ve42.co/SnatomsV

Sign up for the Veritasium newsletter for weekly science updates - https://ve42.co/Newsletter

▀▀▀
A huge thank you to Dr. Arif Ahmed, Dr. Adam Elga, Dr. Kenny Easwaran, Dr. Peter Slezak, Dr. David Wolpert, Dr. Scott Aaronson & Dr. Michael Huemer for their invaluable expertise and contributions to this video on Newcomb’s Paradox.

The causal expected utility calculation was based on a post by Professor Huemer here - https://ve42.co/Huemer

▀▀▀
0:00 What is Newcomb’s Paradox?
3:24 Pick 1 Box!
5:24 Pick Both!
6:38 What is decision theory?
11:27 What does Newcomb’s Paradox say about free will?
13:25 What does it mean to be rational?
16:49 Mutually Assured Destruction
20:02 Precommitment Is The Ultimate Strategy

▀▀▀
References can be found here - https://ve42.co/NewcombRefs

▀▀▀
Special thanks to our Patreon supporters:
Albert Wenger, Sam Lutfi, Michael Krugman, Sinan Taifour, Marinus Kuivenhoven, Lee Redden, Richard Sundvall, Ubiquity Ventures, David Johnston, Juan Benet, Paul Peijzel, Meekay, Evgeny Skvortsov, Blake Byers, Dave Kircher, Gnare, Anton Ragin, KeyWestr, meg noah, Tj Steyn, Orlando Bassotto, Adam Foreman, Balkrishna Heroor, Jesse Brandsoy, Garrett Mueller, Kyi, Ibby Hadeed, Bertrand Serlet, wolfee, David Tseng, Bruce, Alexander Tamas, Alex Porter, Jon Jamison, Charles Ian Norman Venn, armedtoe, Jeromy Johnson, Hayden Christensen, Robson, EJ Alexandra, Daniel Martins, Shalva Bukia, Moebiusol - Cristian, Martin Paull, Data Don, Vahe Andonians, Mark Heising, Hong Thai Le, Parsee Health, Kelcey Steele

▀▀▀
Writers: Sulli Yost & Casper Mebius
Producer & Director: Sulli Yost
Editor: Jack Saxon
Additional Editing: Peter Nelson
Camera Operators: Lukas Guderjahn, Danijela Čavlović, Casper Mebius, Gregor Čavlović & Derek Muller
Presenters: Casper Mebius, Gregor Čavlović, Henry van Dyck & Derek Muller
Animators: Emma Wright, Andrew Neet, Alex Drakoulis, Domonkos Józsa & Fabio Albertelli
Assistant Editor: James Stuart
Researchers: Darius Garewal & Gabe Strong
Thumbnail Designers: Abdullah Rabah, Ren Hurley & Ben Powell
Production Team: Jess Bishop-Laggett, Matthew Cavanagh, Anna Milkovich & Josh Pitt
Executive Producers: Casper Mebius & Derek Muller

Additional video/photos supplied by Getty Images & Storyblocks
Music from Epidemic Sound


Clipped from youtube.com on 2026-03-10T17:38:47+00:00.


Summary

This video explores Newcomb’s Paradox, a deceptively simple thought experiment that divides people almost evenly and reveals deep disagreements about rationality, free will, and decision-making. The setup: you enter a room with two boxes — one openly contains $1,000, and the other is a mystery box that a near-perfect supercomputer has either filled with $1 million (if it predicted you’d take only the mystery box) or left empty (if it predicted you’d take both). The prediction was made before you entered, and the boxes are already set.

Two camps emerge with seemingly airtight arguments. “One-boxers” reason using evidential decision theory: since the supercomputer is almost always right, the expected value of taking just the mystery box far exceeds taking both, as long as the predictor’s accuracy is above 50.05%. “Two-boxers” counter with causal decision theory: since the boxes are already set, your current choice can’t change the past, and taking both boxes always nets you $1,000 more regardless of what’s inside the mystery box — a strategy known as dominance.

The paradox isn’t just an academic curiosity — it connects to profound real-world questions. The video draws a striking parallel to mutually assured destruction during the Cold War, where precommitting to a devastating retaliation (a seemingly irrational stance) is what prevents the attack in the first place. Similarly, in the game of chicken, visibly removing your steering wheel — committing to a worse outcome — forces the other player to swerve. In both Newcomb’s Paradox and MAD, the best outcome follows from credible precommitment to a seemingly worse option.

The resolution the video arrives at is that rationality may not be about making the optimal choice in any single moment, but about deciding what kind of person you want to be — what rules you would wire yourself to follow. If you commit to being the kind of person who honors ideal precommitments, you’d one-box, because that’s the precommitment that would have been best to make. This reframing turns the one-shot paradox into an iterated life strategy: just as cooperation beats defection in repeated prisoner’s dilemmas, being a principled one-boxer pays off across the many prediction-like situations life presents.

The video also raises the question of free will: if a perfect predictor exists, does that mean our choices are predetermined? Derek Muller suggests that even if free will is an illusion, society must operate as though it’s real, because the consequences of abandoning that assumption are untenable. At its core, Newcomb’s Paradox asks whether a strong correlation you know isn’t causal should still influence your decisions — a question with implications far beyond the thought experiment itself.

Transcript

There is a problem that I can’t bring up without starting a fight. No. What? It just seems so obvious to me. Now I’m all screwed up, man. It has infiltrated every single Veritasium meeting in the last 2 months. It’s trivial. I didn’t think you would fall for this side.

Just makes sense. Let’s go. Creepy. And I even argued with Derek about it. There’s no way. You’re trying to convince me. I don’t care. So, here’s the setup.

You walk into a room and there’s a supercomputer and two boxes on the table. One box is open and it’s got $1,000 in it. There’s no trick. You know it’s $1,000. The other box is a mystery box. You can’t see inside. You also know that this supercomputer is very good at predicting people. It has correctly predicted the choices of thousands of people in the exact problem you’re about to face.

Now, you don’t know what that problem is yet, but you do know that it has been correct almost every time. Now, the supercomputer says you can either take both boxes, that is the mystery box and the $1,000, or you can just take the mystery box. So, what’s in that mystery box? Well, the supercomputer tells you that before you walked into the room, it made a prediction about your choice. If the supercomputer predicted you would just take the mystery box and you’d leave the $1,000 on the table, well then it put a million dollars into the mystery box. But if the supercomputer predicted that you would take both boxes, then it put nothing in the mystery box. The supercomputer made its prediction before you knew about the problem. And it has already set up the boxes.

It’s not trying to trick you. It’s not trying to deprive you of any money. Its only goal is to make the correct prediction. So what do you do? Do you take both boxes or do you just take the mystery box? Don’t worry about how the supercomputer is making its prediction. Instead of a computer, you could think of it as a super intelligent alien, a cunning demon, or even a team of the world’s best psychologists. It really doesn’t matter who or what is making the prediction.

All you need to know is that they are extremely accurate and that they made their prediction before you walked into the room. Pause the video now if you want to think about it. Got your answer? So, I should just take two boxes. Like, obviously, I’m going to say I’m just going to take the million dollars and go with it. Of course you take two boxes. I would pick both boxes. I think I would not get the two boxes.

I think I’m taking both boxes. Let’s go. Okay, this is seeming less paradoxical than I thought because I should just go in and take the mystery box only. No. What? There are two camps. One boxers who would take only the mystery box and two boxers who take both. But as American philosopher Robert Nozick wrote, to almost everyone, it is perfectly clear and obvious what should be done.

The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposite half is just being silly. This is known as Newcomb’s paradox, named for its inventor, William Newcomb. The Guardian newspaper polled over 31,000 people about this problem in 2016: 53.5% were one boxers and 46.5% were two boxers. Now, if you find it hard to see why anyone would pick the opposite side, well, here are the arguments for each camp. Look, I’m a reasonable guy and I like money. So, I’m going to do whatever gets me the most money. So, let’s go weigh the outcomes of both of these decisions.

First, I’m going to say that the probability that the computer predicted my decision correctly is C. So, the computer got it right. And because of that, the probability it got it wrong is 1 minus C. So let’s look at what happens if I try to two box. There is a C chance of me getting $1,000 and a 1 minus C chance of me getting $1,001,000. If I add these two together, I get a weighted sum, which is going to tell me how much I can expect to get if I try to two box. This is also known as expected utility or the EU of two boxing. And I can just simplify this expression a tiny bit.

So let’s look at what happens if I try to one box. Now there’s a C chance of me getting a million dollars and there’s a 1 minus C chance of me getting nothing. So we can cancel this out and simplify it to just $1 million times C. If I equate these two expressions, I’m going to get the C at which these two expected utilities are equal. And it turns out that the C at which this happens is 0.5005 or 50.05%. Which means that if the computer is even slightly better at predicting than random chance, then the expected utility of one boxing is going to be higher. Now, I know that the computer is much better at predicting than that because it accurately predicted thousands of people before me, which means I’m sticking with my one box.
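As a minimal sketch, here is that evidential calculation restated in Python (this is a restatement of the on-screen arithmetic, not code from the video):

```python
# Evidential expected utility, following the video's setup.
# C = probability the supercomputer predicted your choice correctly.

def eu_two_box(c):
    # Predicted correctly (mystery box empty): $1,000.
    # Predicted wrongly (mystery box full):    $1,001,000.
    return c * 1_000 + (1 - c) * 1_001_000

def eu_one_box(c):
    # Predicted correctly: $1,000,000. Predicted wrongly: $0.
    return c * 1_000_000

# Breakeven accuracy: set the two expected utilities equal and solve for C.
# 1,000*C + 1,001,000*(1 - C) = 1,000,000*C  =>  C = 1,001,000 / 2,000,000
breakeven = 1_001_000 / 2_000_000
print(breakeven)                            # 0.5005
print(eu_one_box(0.99), eu_two_box(0.99))   # 990000.0 vs 11000.0
```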

Here’s my million dollars. Take that, Casper. Thank you very much. So, I should go in and take the mystery box and leave the other because I’m assuming that its prediction is pretty good — that it’s a supercomputer. Okay. So, you’re saying it’s not paradoxical because the choice is obvious. I’m very surprised.

Why? Because to me, the answer is also obvious. And to me, the answer is you take both boxes. So, here’s how I think about the problem in a way that actually makes sense. You know that the supercomputer has already set up the boxes. So, whatever I decide to do now, it doesn’t change whether there’s zero or a million dollars in that mystery box. And that gives us four possible options that I’ve written down here. If there is $0 in the mystery box, then I could one box and get $0, or I could two box and get $1,000.

But there could also be $1 million in the mystery box. And in that case, I would get $1 million if I one box. Or I would get a million and $1,000 if I two box. So I’m always better off by picking both boxes. This is known as strategic dominance, where I always pick the dominant strategy, which in this case is to two box. So give me those boxes. The two boxer argument makes a lot of sense to me once you explain it. I’m like okay.
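A minimal sketch of the dominance table (again a restatement of the argument, not code from the video): whatever is already in the mystery box, two boxing pays exactly $1,000 more.

```python
# Dominance: the mystery box is already set, so compare payoffs in each state.
for mystery in (0, 1_000_000):
    one_box = mystery            # take only the mystery box
    two_box = mystery + 1_000    # take the mystery box plus the visible $1,000
    print(f"mystery box ${mystery:,}: one box ${one_box:,}, two box ${two_box:,}")
```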

Yeah, I can see why exactly you’re right. But I can also see that just having those thoughts in your brain is what might allow the computer to give you nothing. I think I’m grateful that I just don’t have those thoughts. It’s so funny because I was totally expecting you to go two box. No way. No way, man. It seems like there are two perfectly reasonable approaches that give two completely different answers. And that’s because your choice actually reveals something fundamental about how you make decisions.

It comes down to these two statements. First, as far as you know, basically everyone who has taken one box has walked away a millionaire, and everyone who has taken two has walked away poor. Second, the supercomputer made its prediction before you even knew about the problem. The boxes are already set up, so your decision now can’t change whether the million is in there or not. Both of these statements are true, but there’s a hidden assumption in each that divides people. We pick both and the computer picks both. We just get the $1,000.

I think I would just pick the mystery box. Just the mystery box, probably. Yeah, I might be taking the mystery box. Just the mystery box. Yeah, might be. Okay. Why? I don’t know.

I guess the supercomputer is right. No. Here’s the hidden assumption for us one boxers. My expected utility calculations are based on probabilities that use the prior evidence of how accurate the supercomputer is, because the thousands of people that it accurately predicted before me are evidence enough for me that when I go for one box, there’s going to be a million dollars waiting in there for me. This is based on something called evidential decision theory. And using this decision theory, you get these expected utilities and my choice is obvious from there. And it turns out a lot of you actually thought the same way.

We polled our audience, got more than 24,000 responses, and it turns out two-thirds of you are one boxers. I know it’s crazy. I don’t trust this. It’s looking like you’re less and less rational with these results, Casper. So, your argument is kind of that if you one box, it will have predicted that you were going to one box and you’ll walk out with the money. That is pretty convincing. Casper, I’m starting to really doubt this two-box side of things. How could you not be a two-boxer?

Gregor has this really funny way of thinking about things, but I make my decision based off something else, something a little more rational. Because I believe that whatever I do now can’t influence and change the past. I only take into account things that I can actually influence. And clearly, whatever I do now, whatever I think now is not going to change whether that million dollars is going to be in the mystery box or not because it was already set up before I learned about the problem. This is known as causal decision theory where you only take into account things that you can actually cause. And so with this, your expected utility calculation changes. And that’s because you need to use a different probability, one where you could actually cause that $1 million to be in the mystery box or not. So right before the supercomputer made its prediction, there was some probability that it thought I was going to one box.

I know it’s weird, but bear with me. So let’s say that probability is P. Then that’s the probability that I’m going to use in my expected utility calculation. And the expected utility to one box is just going to be 0 plus $1 million times P. That’s pretty good. But the expected utility for two boxing is going to be $1,000 plus $1 million times P. But that’s just the same as the expected utility for one boxing plus an extra $1,000. So no matter what the computer predicted, my expected utility is always higher by picking both boxes.
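Here is the same kind of minimal sketch for the causal calculation (a restatement, not code from the video; P is fixed before you enter the room, so your choice cannot move it):

```python
# Causal expected utility: P = probability, fixed before you entered,
# that the computer predicted one boxing (so the million is in the box).

def eu_one_box(p):
    return p * 1_000_000                       # million with probability P, else $0

def eu_two_box(p):
    return p * 1_001_000 + (1 - p) * 1_000     # always eu_one_box(p) + 1,000

for p in (0.0, 0.5, 1.0):
    print(p, eu_one_box(p), eu_two_box(p))     # two boxing leads by $1,000 at every P
```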

So of course you’re going to two box. Anyone in their right mind would pick both boxes. It’s made the prediction before you’re in there. Whether I have faith in this thing being 90%, 100% — it actually doesn’t really matter. I think you guys are imposing your will. Yes, it does. He’s cooking, bro.

Like yeah. You guys think that your decision, whatever you think now, is going to change the past. That’s called wishful thinking. Yeah. Like come on. I’m two boxes. I’m two boxes. It only makes sense.

I’m back with Casper. Exactly. Welcome to camp two boxers. I’m not losing a million dollars. It was never in the room, man. You’re going to walk into a room and there’s either money in the room or there isn’t money in the room. Your question is, do you pick it up?

Of course, we’re trying to do you a favor here. You’re not doing me a favor because your decision does not affect — you’re like, your little thought does not change God’s mind, bro. Henry, if you convince yourself that you’re a one boxer, you’ve convinced the machine. I don’t believe that. What do you mean you don’t? I don’t believe you convincing yourself should impact what the machine thinks of you, because often you’re just walking into a room and it’s already made the prediction. You can’t impact it, you know.

So, whatever you think is more important, whether that’s the evidence of the supercomputer’s accuracy or the fact that the boxes are already set up, well, that affects how you calculate the expected utility. And because both of those assumptions are true, both camps have valid answers. But if there’s no right answer, then is this just a meaningless problem? Well, not really. Because it actually reveals a surprising amount about three important questions. Does free will exist? What does it mean to be rational? And is there an ideal way to act in life?

For example, the only way you’re going to win this game is to already be the kind of person to one box, but then two box at the last second anyways. That’s the only way you’re going to get $1 million and $1,000. But some would argue that that itself is impossible. If the predictor is so good, let’s say it’s 100% accurate, then that’s not even possible. Would you say that’s true? Yeah. Then a follow-up question is, if such a perfect predictor exists, does that mean that free will doesn’t exist? Because you’re saying there’s nothing you can do in between walking into that room and making your decision that ends up changing what was predetermined.

That’s right. And maybe this reveals where I’m coming from. And I think where I’m coming from is maybe free will doesn’t exist. I come down on this point of like free will is an illusion. But our world operates in a way that is indistinguishable from free will being real. And therefore, you have to act as though it’s real, as though it’s 100% real. Interesting. If we think that free will is not real and it’s an illusion, and then you have someone who’s committing crimes and then you want to say, well, that’s not his fault.

Therefore, instead of putting murderers in jail for 25 years, we’re just going to give them some gardening classes or something — the problem is that then changes the environment where everyone knows you can kill someone and you can go do the gardening. So, you can’t change the system based on the knowledge that it’s an illusion. Whether we do or don’t have free will, you have to live as though it exists. So, you’ve still got to make a choice. One box or two boxes. Which brings us to our second question. What does it mean to be rational? And the guy who acted rationally and doesn’t believe his thoughts can influence the past.

But your rational choice will have given you $1,000. I know. Yeah, it’s tough. It’s tough. Let’s see what I can buy with a million dollars. None of these look great. I think we can be more creative. Private island sounds pretty nice.

All right, Casper, what do you want to go buy with a thousand, man? This is known as the “why ain’t you rich” argument, which boils down to one super annoying question. If you’re so smart, then why aren’t you rich? You know, if winning is getting more money, then of course the one boxers are going to end up better off than the two boxers. But maybe it’s not about who wins, but about what’s rational. In their 1978 paper, philosophers Gibbard and Harper argue that the rational choice is to pick both boxes, although they do admit that two boxers will fare worse. They instead say that the game is rigged.

And if someone is very good at predicting behavior and rewards predicted irrationality richly, then irrationality will be richly rewarded. But I think that’s a bit of a copout because really Newcomb’s paradox reveals something surprising — that sometimes in order to be a rational person you must act irrationally. There’s one question of what’s a rational person. There’s another question of what’s a rational act. Most of the time rational people do rational acts. Sometimes they just don’t. And I think this is analogous to the situation in the prisoner’s dilemma. In the prisoner’s dilemma, you and another player compete for money by either cooperating or defecting.

If you both cooperate, then you get three coins each. But if you defect and your opponent cooperates, then you get five coins and they get nothing. And if you both defect, you get one coin each. So no matter what your opponent does, you are always better off by defecting. But if you play this game not once, but repeatedly, then everything changes. All of a sudden, you’re better off by cooperating. What’s a rational society? A rational society is full of cooperators.
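As a minimal sketch of why repetition flips the answer, here is a simulation using the payoffs from the video (3/3 for mutual cooperation, 5/0 for defecting against a cooperator, 1/1 for mutual defection); tit-for-tat is a standard strategy chosen here for illustration, not one the video names:

```python
# Iterated prisoner's dilemma with the video's payoffs.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(strategy_a, strategy_b, rounds=100):
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(history_b)    # each strategy sees the opponent's past moves
        b = strategy_b(history_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        history_a.append(a)
        history_b.append(b)
    return score_a, score_b

def always_defect(opponent_history):
    return "D"

def tit_for_tat(opponent_history):
    # Cooperate first, then copy the opponent's last move.
    return opponent_history[-1] if opponent_history else "C"

print(play(always_defect, always_defect))   # (100, 100): mutual defection
print(play(tit_for_tat, tit_for_tat))       # (300, 300): cooperation pays
print(play(always_defect, tit_for_tat))     # (104, 99): one early win, then stalemate
```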

What’s a rational person? Maybe a rational person is a defector. And normally, you might expect a rational society is made up of rational people. But I think it’s a familiar idea that rationality at one level isn’t compatible with rationality at the other level. So, while the rational act is to two box, a rational society would actually be full of one boxers. Now, I’m a fervent two boxer, but there are three ways that you can get me to one box. The first is if my choices now can actually change the past. For example, say the supercomputer makes its prediction by opening up some tiny wormhole to see the future.

Well, in that case, if I choose one box now, that actually causes the million dollars to be placed in there in the past. So, I one box. Second, if there are multiple trials, because now with every game, each of my choices builds up a reputation. So, if I one box, then I’ll be predicted as the kind of person to one box and so I get the million dollars either in this round or a later one. And third is if I can precommit. If I can talk to the computer to make my case before it makes its prediction, then I will 100% one box because staying true to my word is important to me and the supercomputer would know that. If I put my word on it, I’ll take the one box. Yeah, 100% one box.

Yeah. But there are some realistic scenarios where staying true to a worse option could have deadly consequences. On the 29th of August 1949, the Soviet Union detonated the RDS-1 bomb as part of their first nuclear weapons test. This sent the US and the USSR into a furious arms race. By the mid-1960s, the US had over 30,000 nuclear warheads and the USSR had just over 6,000. Both sides were more than capable of destroying the other. The US Secretary of Defense at the time, Robert McNamara, didn’t advocate for disarmament. Instead, he recommended a strategy of assured destruction where the US should be able to deter a deliberate nuclear attack by maintaining a highly reliable ability to inflict an unacceptable degree of damage upon any single aggressor.

This strategy eventually became known as mutually assured destruction, or MAD. If either country attacked first, the other would surely retaliate, leading to the total annihilation of both sides. So having that commitment to retaliate is beneficial. It stops the attack from happening in the first place. Now say you are the US president during the Cold War. You have publicly committed to retaliate if the US is ever attacked, but you’ve just received word that the Soviets have launched their missiles. It’s not a system error. It is a real attack.

So what do you do? If you launch now, then at best everyone in the US and USSR dies, and at worst you get a nuclear winter that kills nearly everyone on the planet. Everyone in the whole world dies. Then I probably don’t launch. But you did pre-commit. Yep. I don’t like the outcome. No, it’s terrible.

I don’t like the outcome of everyone on Earth dying. So I’m going to just not. When the country is electing their leader, which leader do you want to elect? Do you want to elect someone who’s crazy and is always going to press that button? Or do you want to elect someone who makes the seemingly rational choice of saving people and not pressing that button? I think you want someone who maintains the posture of always pushing that button. That’s right. And then you want someone who secretly will not actually push that button.

There is an inherent risk there, which is that if anyone finds out, you’re exposed. There’s this other game theory interaction, the game they call chicken. You’re both driving your cars at each other. The worst thing is if neither of you swerves because then you both die. But you win if the other person swerves and you don’t. The best strategy in this game is to visibly take the steering wheel out of your car and throw it out the window so that the opponent can see that you’ve done that. Now they know you cannot swerve. You’re this mad dog that’s just going straight ahead.

And now they realize my best action is just to swerve. And that’s similar to what you want with the nuclear deterrent, as they set up in Dr. Strangelove. In the 1964 film Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb, the Russians built the perfect doomsday device. As soon as it detects a nuclear attack or any tampering, it automatically triggers a large enough nuclear explosion to kill everyone on the planet. The tampering kill switch isn’t there to prevent enemies from disabling it. It’s to prevent the Russians themselves from having second thoughts. Now, the whole point of the device is to be so devastating and automatic that the US would never even think about launching an attack.

But this only works if everyone knows that the device exists, which is the whole point of the movie. In both Newcomb’s Paradox and MAD, the best outcome follows from a pre-commitment to a worse option. That’s what gets you at least a million dollars in the former and a tense but stable peace in the latter. It’s the commitment that’s important. So maybe being rational isn’t about deciding what to choose in the moment, but about deciding what rules you’re going to live by. The question isn’t how to act. The question is what rules ought one to follow, or how does one even decide what rules to follow. Sometimes it’s put in the form of: if you knew that you were a robot with programming that you could set, and you could rewire yourself to make yourself obey one set of rules rather than another.

The question is what sort of rules would you wire yourself to obey? And what you would do is you would make yourself into the kind of creature that always acts in line with the commitments that would have been good to form had you known about the problem. When you’re in a situation like the Newcomb case, you would end up finding yourself thinking: if I had been able to make a pre-commitment, what pre-commitment would have been the good one to make? The good pre-commitment to make would have been to be a one boxer. And since I’ve already wired myself up to be the kind of person that lives up to all the pre-commitments I would have made, then I’m already in effect committed to one boxing even though I didn’t realize it. I love this approach because for me it kind of makes it an iterated problem — but maybe more an iterated problem in life. If I don’t look at it as a single case and I sort of almost think about it as for every future predictor, for every future case, or like almost building your own reputation. Right, exactly — like you always want to live up to the commitments you’ve made, so even if you haven’t heard of it before you’d want to stick to those ideal pre-commitments so you are acting in line with the best version of yourself. Yeah, that would convince me to be a one boxer. It’s rare to convince anyone to switch on the Newcomb problem.

The thing is, even if I never run into another generous supercomputer again, life doesn’t end after I walk out of that room. Like, I should always defect in a one-shot prisoner’s dilemma because I can only gain by betraying the other player. But when I play multiple rounds of the game, like in life or in society, everything changes. Suddenly, it pays to cooperate. So being the kind of person that sticks to an ideal pre-commitment is beneficial. So maybe I was just a one boxer kind of guy all along. All it took was a little reframing and a new perspective. The core of Newcomb’s paradox is deciding if a strong correlation that you know isn’t causal should matter in your decision.