Today I’d like to talk about a particular problem: What to do when there is just way more information out there than we can possibly hope to deal with, most of it wrong, but also we still need to learn true things. I’ll run through some informative examples, and talk about how I think this often works in practice.
Oh no not again
If you are a famous mathematician,2 chances are your inbox is full of cranks telling you they've figured out how to square the circle or trisect an angle. You might also have people send you their elementary proofs of Fermat's last theorem or that P=NP. If you're a physicist you'll get perpetual motion machines and proofs that Einstein was wrong about everything, if you're a medievalist you'll get translations of the Voynich manuscript.
Some of these have a particularly distinctive character, which is that you know that they're wrong without reading them, because the things they are claiming to have done have been proven impossible. You can't square a circle or trisect an angle, you can't build a perpetual motion machine, and while relativity is probably incomplete there's really rather a lot of experimental data that you have to explain and that most of these “refutations” cannot. To the degree that we can know anything, we know the claims being made are wrong, no matter how good the argument in favour of them might seem.
For a more down to earth example, it's like someone sending you a long essay demonstrating that the sky is green and the moon is made of cheese. It doesn't matter how good their argument is: the sky is not green and the moon is not made of cheese, and as a result you know the argument is wrong without even reading it.
The others are different: The argument might be right! Fermat's last theorem might have an elementary proof, P might equal NP, and plausibly the Voynich manuscript can be translated (I personally favour the theory that it's an elaborate hoax, but that's more because it amuses me than because I have strong evidence of this).
The “Oh, come on” Test
This is important because figuring out exactly why the argument is wrong might be quite hard. It might, for example, be fifty pages of dense text in complex technical language. There's guaranteed to be an error in there somewhere, but it might be very hard to find, and you don't need to find it to know that the argument is wrong, because the conclusion is wrong and that is sufficient evidence of error.
But come on, really. You know full well that these people are cranks. It's not that it's literally impossible that what they are saying is true, but it's close enough that you might as well treat it as such and ignore their email (unless it would amuse you to figure out why it's false). Regardless of the truth value of their narrow conclusion (e.g. Fermat's Last Theorem) their full conclusion (e.g. “I have proven Fermat's Last Theorem”) is so wildly unlikely that you might as well assume that it's false, and from there you don't need to read the argument to conclude that it's in error, it just is.
So why spend time trying to find the flaw at all?
This leads to what I call the “Oh come on” test. If some conclusion, or claim, makes you roll your eyes and go “Oh come on”, you’re probably fine to just treat it as false, you don’t have to spend your time looking for the error.
The logical conclusion of the “oh come on” test is that we should embrace confirmation bias and never entertain the possibility of ever being wrong, but oh come on, that’s obviously not what we should do. Instead, we should think of it in terms of load management.
I used people claiming elementary proofs of Fermat’s Last Theorem as an example of failing the “oh come on” test, but suppose your close personal friend and confidant Andrew Wiles comes to you and says sheepishly “So, I was thinking about my original 129 page proof of the Taniyama–Shimura conjecture for semi-stable elliptic curves3, and I had a bit of an epiphany. This is really embarrassing, but it turns out there is an elementary proof of Fermat’s Last Theorem that takes only three pages. Would you mind helping me check it to see if I’ve done something stupid?” What do you say to him? It’s probably not “Oh come on, you can’t possibly have done that. Stop wasting my time”, right? It’s probably much closer to “Sure, come right in, let me clear my calendar for the rest of the day and we’ll get to the bottom of this.”
What’s the difference in this situation?
Well, primarily, he’s Andrew Wiles. You have a far higher expectation of competence than you do for some random internet crank. He’s worth your time.
Now suppose instead that an unknown retired professor, Thomas Royen, sends you a proof of the Gaussian Correlation Inequality.4 Well, you don’t have to pay attention to him, but it’s not obvious that you should treat it the same way as you do proofs of Fermat’s Last Theorem, for the simple reason that this is a new one to you. You probably don’t get many proofs of the Gaussian Correlation Inequality in your inbox, so maybe it’s at least worth a cursory glance. Thomas Royen is an unknown to you, but the situation is a unique one.
A good rule of thumb is this: on average, spend no more time per event than you could afford to spend on every “equivalent” event, where equivalence is judged on some salient details, relative to the results that spending that time would give you.
All your Fermat Last Theorem cranks are equivalent, and you get a lot of them, and they’re within rounding error all wrong, so there is no point in spending time on them. In contrast, Andrew Wiles and Thomas Royen are in interesting enough categories that they might be worth your time.
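The rule of thumb above is really just an expected-value calculation. Here is a toy sketch of it in Python; the function name and all the numbers are invented for illustration, not a real decision procedure:

```python
# Toy model of the triage rule of thumb: engage with a category of
# event only if the expected payoff of one event exceeds the time it
# costs to evaluate. All figures are in minutes and entirely made up.

def worth_engaging(minutes_each, p_valuable, value_minutes):
    """Is one event in this category worth the evaluation time?"""
    return p_valuable * value_minutes > minutes_each

# A Fermat crank email: the payoff of a correct proof would be huge,
# but the odds are so tiny that 90 minutes of checking never pays off.
print(worth_engaging(minutes_each=90, p_valuable=1e-6,
                     value_minutes=10_000))   # False

# Andrew Wiles at your door: plausible and high value, so even a
# full day of checking is clearly worth it.
print(worth_engaging(minutes_each=480, p_valuable=0.3,
                     value_minutes=100_000))  # True
```

The point of the sketch is only that the same time cost can be obviously worth paying or obviously not, depending entirely on which category the event falls into.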
I love thinking about running hiring pipelines as a model for decision making.5 They contain so much interesting and relevant detail that is important to consider for the world at large.
A typical hiring pipeline consists of multiple stages.6 For example, it might look like:
You read their CV and cover letter.
You have a phone screen.
You invite them in for an interview.
Success at each of the earlier stages in the pipeline translates to making it through to the next stage. If your CV looks good, you get invited for a phone screen; if your phone screen goes well, you might get invited in for the formal interview.
The formal interview is the actual decision making process, with the earlier parts having minimal impact on the final result, except in the sense that they decide whether you get to be the sort of person who is even considered as a candidate for it.
The “Oh come on” test is best considered as analogous to the CV screen: It’s the thing you do before wasting time on anything more involved.
Why do we structure hiring pipelines this way? Well, it’s load management again. You get too many CVs to interview everyone, so you spend a certain amount of time triaging CVs to filter out the people you’re pretty sure would fail the final interview: people who don’t have the right experience, people whose history as a war criminal would make you uncomfortable working with them, people who have misunderstood what the job is, etc. Then you do a phone screen to catch the people who, about five minutes in, you’d realise you were going to regret spending the next two hours with. Then you interview them.
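The load-management logic of those stages can be sketched as a chain of increasingly expensive filters. This is a toy model; the per-candidate costs and pass rates are invented for illustration:

```python
# Toy model of a three-stage hiring pipeline: each stage is cheaper
# than the next, and exists to shrink the pool before the expensive
# stage runs. Costs (minutes per candidate) and pass rates are made up.

stages = [
    ("CV screen",    5,  0.20),  # (name, minutes each, pass rate)
    ("phone screen", 30, 0.50),
    ("interview",   120, 0.25),
]

def pipeline_cost(n_applicants):
    """Total minutes spent, and candidates surviving the final stage."""
    total_minutes = 0
    remaining = n_applicants
    for name, minutes, pass_rate in stages:
        total_minutes += remaining * minutes
        remaining = int(remaining * pass_rate)
        print(f"after {name}: {remaining} candidates remain")
    return total_minutes, remaining

# With 200 applicants you run 200 cheap CV screens but only a handful
# of two-hour interviews: 4,600 minutes total, versus 24,000 minutes
# if you had interviewed everyone.
total, hires = pipeline_cost(200)
```

The numbers make the asymmetry obvious: almost all of the filtering happens in the cheap stages, which is the whole reason they exist.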
These tests are never perfect, and you’ll always have some loss at each stage, filtering out someone who, had you spent longer on them, you’d have let through - but such losses are inevitable. You can never get the error rate to zero with a bounded amount of effort; all you can do is your best with the resources available to you.
One problem with this solution is that you can never easily test whether your first pass filters are rejecting too many people. If they let someone through that they shouldn’t, the next pass will catch that. If they reject someone that they shouldn’t have, you will never know. You only have access to one of the error rates (cf. There’s no single error rate), and so you end up in a situation analogous to the situation I describe in You should try bad things, where your tastes can only narrow over time.
A solution I have proposed but never seen implemented in practice is to let a random subset of candidates through each of the early stages of the pipeline (you could also, e.g., do affirmative action at the early stages if you’re worried about bias in your pipeline), so that you’re still doing load management but you have some group of people who are there partly to test the next stage of the pipeline. You treat them like any other candidate, and still hire them if they pass the interview of course, but they have the added benefit of helping you learn where your early filters are too strict.
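That proposal amounts to a small tweak to an early-stage filter: occasionally pass a candidate you would have rejected, and flag them so you can later see how many such candidates survive the more thorough stage. A minimal sketch, where the function name and the 5% audit rate are arbitrary placeholders of mine:

```python
import random

# Sketch of the "let a random subset through" idea: an early filter
# that occasionally passes a would-be reject, so the later, more
# thorough stage can measure the filter's false-reject rate.
AUDIT_RATE = 0.05  # fraction of rejects let through anyway (arbitrary)

def early_filter(candidate, looks_good, rng=random):
    """Return (passed, audited).

    `audited` marks candidates let through purely to calibrate the
    filter, not because the filter liked them. `looks_good` stands in
    for whatever cheap judgement the stage normally makes.
    """
    if looks_good(candidate):
        return True, False
    if rng.random() < AUDIT_RATE:
        return True, True  # would-be reject, passed for calibration
    return False, False
```

Every audited candidate who goes on to pass the final interview is direct evidence that the early filter is too strict - which is exactly the error rate you otherwise can never observe.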
Distributed Plausibility Checking
The problem with this strategy of filtering your information consumption with a quick plausibility test is that it would quite likely have stopped you from being forewarned of the COVID-19 pandemic.
If you looked at the early media reports of it, it was all a bit eye roll inducing. Honestly I generally barely read the news in the first place - it rarely improves my understanding of the world and just makes me angry and sad - so it’s easy to imagine a scenario in which basically all of this information is filtered out until it’s too late.
But I think the seeds of the solution to this are contained in the above examples.
Despite my aggressive filtering out of news, I was more on top of the pandemic than most people I knew, and was reasonably well prepared for it. I wasn’t months ahead of the curve, but I was enough ahead of the curve to result in the following exchange on flatmate WhatsApp:
[18:34, 10/03/2020] (Flatmate): London is more or less out of toilet paper, and we have about a week left if we start conserving...
[18:34, 10/03/2020] (Flatmate): Can I ask that we all keep an eye out for toilet paper when were out?
[18:37, 10/03/2020] David MacIver: Are you counting the two packs in the big cupboard in that estimate?
[18:39, 10/03/2020] David MacIver: (I anticipated this scenario about two weeks ago so bought extra)
How was I well prepared when a pandemic would have failed the “oh come on” test? Well, I let other people do the work.
People whose opinion I trusted enough to listen to were more on the ball regarding the pandemic than me, and so when they started tweeting about how maybe this virus we were hearing about in Wuhan was going to be something of a big deal for 2020 I, eventually, listened, checked in on whether I thought they were right, and concluded that it was at least worth taking seriously enough to prepare.
We live in a world where too much information is coming in to possibly attempt to understand all of it ourselves, so we filter most of it out at the plausibility level before ever really looking into it. Fortunately, we each have slightly different plausibility filters, and we all have some degree of randomness in our lives that causes things to slip through the filters anyway - someone goes “huh, that’s funny” at something they’d normally ignore and decides to look into it, someone is particularly interested in Wuhan because they have family there, etc.
Once the information has crossed their plausibility threshold and they’ve decided that no actually this checks out, their friends are now in the situation where they have someone they trust acting as a source of credibility that this is actually worth looking into, and they can look into it themselves (or just trust their friends on the subject, which is extremely common and sometimes even the right thing to do). As a result, the signal of plausibility which starts in a small number of people spreads outwards along networks of trust, passing from person to person like some sort of… Hmm, I’m sure there’s an appropriate metaphor here but it’s not quite coming to me, sorry.
I mentioned in my last email that I’d like to get paid subscriber counts up by an order of magnitude. For disclosure, there are currently 22 (I love you all), and I’d like there to be somewhere in the region of 200-300 (more would be better of course, but this seems like a reasonable goal).
I’ve no plans to stop or reduce the frequency of the free content for now, don’t worry. My current plans are:
Start writing actual paid content. I’m going to try to write a second email per week for paid subscribers only, with a bit more exploratory content that shows more of my working than necessarily makes it into the publicly available version. It will take a little while to get the style and pacing of these right, so they may not be on a super regular schedule initially, but the first one should hopefully be going out this weekend. So if you’re interested and can easily afford it, please do consider a paid subscription.
Try to get the number of free subscribers up. I’d like to do this anyway, just on the grounds that presumably this stuff is actually useful to people so more people reading it is a social good, but also free subscribers are presumably the people most likely to become paid subscribers. Right now my marketing strategy mostly involves tweeting about the newsletter content a bit more aggressively. I’d also appreciate it if you helped by telling everyone about how great my newsletter is of course (or, more realistically, sharing links to anything you particularly liked).
This footnote is just here to see if you’re paying attention. I’ve recently discovered that you can add footnotes in substack and I’m, unfortunately for you, very fond of footnotes. You can safely ignore them, they’re just random asides as I write.
I am not a famous mathematician, I am at best a moderately well known person with a maths degree, but I’ve heard this from famous mathematicians.
This is the actual result Wiles proved. The fact that it implies Fermat’s Last Theorem is an earlier result, which is what led to Wiles being interested in it.
If you’re not a mathematician this example probably won’t make much sense to you. Backstory: The Gaussian Correlation Inequality is a highly technical conjecture that you don’t care about. It went unproven for about 50 years until Thomas Royen, a retired mathematics professor whom, like you, almost nobody had ever heard of, published a badly formatted proof of it in an obscure journal that nobody read. It took about three years for the community at large to notice this, and it was then reworked in collaboration with someone else and published in an actually good journal.
I’m less keen on actually running hiring pipelines.
Each of these stages may itself consist of multiple stages. e.g. when reading papers I often suggest that you spend 30 seconds to decide whether to spend 5 minutes to decide whether to read the paper. CVs are like that too.