A Surplus of Reasoning-the-Noun Consumes a Scarcity of Reasoning-the-Verb
Some unexpected conclusions on AI and Education
I used to be an AI skeptic. Not that long ago, in fact.
This video is the sort of thing I found compelling 10 months ago. A doctor takes a particular example of a diagnosis that is the bread and butter of being a general practitioner, plugs it into ChatGPT and ChatGPT flubs it up:
The core of the problem, he points out, is that ChatGPT can think in a tangential, verbal way, but to solve this particular diagnosis you have to think mechanistically — that is, think about the chain of effects a given condition would cause, not as a word-association game but as a real physical model:
The speaker puts these symptoms into ChatGPT 4, and ChatGPT says no problem, it’s COPD. And from one standpoint that makes sense — wheezing, smoker, etc. But if you look at everything that is off here (and I’m really getting this from the speaker, not my own analysis) and think mechanistically and predictively, you can reason to the conclusion that what’s actually going on here is heart failure. If I get his point, the hypertension and diabetes are risk factors for diastolic dysfunction. And once you know that, you can reason your way to understanding what you’re seeing. You ask: OK, what would we see if there were diastolic heart failure? And it maps out. ChatGPT, on the other hand, is — according to this way of thinking — statistical, a pattern matcher, and so it fails.
Of Course, Newer Models Handle This Just Fine
So the speaker checked this sometime around ten months to a year ago. I decided to put the same symptoms into ChatGPT o1 pro, the newer paid model. Here’s what it replied:
Most Probable Single Diagnosis
Putting all the findings together—the long history of hypertension (30 years), obesity, mild diabetes, an S4 on auscultation (indicating a stiff left ventricle), mildly elevated JVP, and progressive exertional dyspnea—the most likely unifying diagnosis is:
Heart Failure with Preserved Ejection Fraction (HFpEF)
Also called “diastolic heart failure,” HFpEF is driven by decreased compliance of the left ventricle (often due to LV hypertrophy from chronic hypertension). Patients commonly present with exertional dyspnea, elevated filling pressures (reflected in an elevated JVP), and an S4 gallop. They may wheeze on exertion (sometimes termed “cardiac asthma”). Edema may be absent in earlier stages.
Again, I’m not a doctor, but this sounds like what the speaker was getting at — reasoning from risk factors to symptoms, and noting the link to exertion, which fits those risk factors. Bottom line: o1 pro does the “impossible for AI” diagnosis just fine.
This sort of thing keeps happening, again and again, and people are noticing. The thing I heard recently in an AI community-of-practice session with teachers was that the difference between last quarter and the previous ones was this: before, the AI-assisted work students were turning in was OK, but run of the mill. Teachers could still say to students, look, don’t make me read your boring AI assignments. The stuff students are turning in now is, in the words of one professor, shockingly good. The AI isn’t just producing better answers than most students; it’s producing answers that are incredible.
Stop Focusing on AI Leftovers, Start Focusing on the AI Surplus
What if we’re thinking about this all wrong? What if this constant boundary-drawing, where we say “on this side AI, on this side us,” is beside the point?
Herbert Simon had a famous observation about technology: surpluses consume. His best-known application of the idea came in the early 1970s, when he predicted that the information age would directly produce the attention economy. His reasoning was that a surplus of one thing produces a scarcity of something else. If you want to know what will be valuable in the future, look for what the surpluses of technology consume. Information technology consumes attention, so attention will be the scarcity.
Perhaps it won’t always be this way, but at the moment I think “surpluses consume” is a better way of thinking about what skills are valuable in a generative AI world than all this current thinking about what is uniquely human.
I’ve talked before about how generative AI may not be able to reason, but it produces compelling chains of reasoning. I call this “reasoning-the-noun” as opposed to “reasoning-the-verb”. The medical example above is just one instance of this, but you see it happening in many places at once. Whether or not the AI “reasons”, we can agree that in the past it produced bad reasoning-the-noun, and at present it produces good reasoning-the-noun.
Seeing that, the natural human reaction is to go looking for the current human endeavors where AI’s chains of reasoning are still faulty and say: aha, on this beachhead we make our stand.
But what if we take a page from Simon instead? What if, instead of trying to circumscribe uniquely human domains out of a natural defensiveness, we start with a simpler question: if ChatGPT can provide good mechanistic reasoning-the-noun about pathology, what does that surplus consume? Not what does that surplus make irrelevant — that’s the wrong end of the stick. What is the surplus, and what does it eat up?
What I think is evident by now to anyone looking at the newer models is that the cost of producing high-quality chains of reasoning-the-noun is about to fall through the floor. And what reasoning-the-noun consumes is reasoning-the-verb. To evaluate reasoning you need to reason, and to be able to reason in the forms presented.
As an example, I’ve been realizing something about my Toulminator project. This little experiment I stumbled into does surprisingly well at walking through implied arguments and evidence online and finding the weak spots, even on completely novel questions. To do that, it uses a model that millions of students have been taught in undergraduate classes and then quickly forgotten — the Toulmin method of argument analysis.
The reason they’ve forgotten the method is that doing a Toulmin analysis of a claim is quite time-consuming. And it’s a tricky business. The core of the method — surfacing the unstated assumptions that make the evidence relevant to a claim — is exactly the sort of thing that, in the space of a busy day, you just aren’t going to find the time to do. Consequently, Toulmin models are pretty scarce in the wild.
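For those who have forgotten it entirely, here’s a minimal sketch of the standard six-part Toulmin scheme, expressed as a small Python structure. The class and the bike-lane argument are invented purely for illustration; this isn’t how the Toulminator works under the hood, just a picture of what a finished model contains:

```python
from dataclasses import dataclass, field

@dataclass
class ToulminModel:
    """One argument, broken into the six standard Toulmin parts."""
    claim: str       # the conclusion being argued for
    grounds: str     # the evidence offered in support of the claim
    warrant: str     # the (often unstated) assumption that makes the grounds relevant
    backing: str = ""                                   # support for the warrant itself
    qualifier: str = "presumably"                       # how strongly the claim is asserted
    rebuttals: list[str] = field(default_factory=list)  # conditions that would defeat the claim

    def weak_spots(self) -> list[str]:
        """The 'poke at it' step: the questions the model makes easy to ask."""
        spots = [
            f"Is the warrant actually true? ({self.warrant})",
            f"Are the grounds themselves solid? ({self.grounds})",
        ]
        if not self.backing:
            spots.append("No backing is offered for the warrant at all.")
        if not self.rebuttals:
            spots.append("No conditions that would defeat the claim are considered.")
        return spots

# An invented example argument, just for illustration
argument = ToulminModel(
    claim="the city should add protected bike lanes downtown",
    grounds="cycling injuries downtown rose 20% last year",
    warrant="the rise in injuries comes from unsafe road design, not from a rise in ridership",
)

for question in argument.weak_spots():
    print(question)
```

Notice where the work is: the claim and grounds are usually stated outright, but that warrant field is the part you have to dig out yourself, and it’s exactly the part worth poking at.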
Now, once you have the thing all modeled out, if you know the method it’s a really useful device for poking at an argument and its evidence.
And here we hit the Simon observation: a machine that produces half-decent Toulmin models makes understanding Toulmin models worth more, not less. If you can generate a Toulmin model at will, the value of being able to walk through one and think critically about it goes up, from roughly zero today to something north of zero tomorrow.
That’s not the precise pattern with medicine, where there’s already lots of value in a doctor being able to think through the mechanics of pathology. But the same principle holds — the introduction of a machine that can quickly produce good chains of reasoning about pathology increases the value of people who can reason effectively about pathology. Reasoning-the-noun consumes reasoning-the-verb, and the cost of reasoning-the-noun is about to fall through the floor.
There are some places where this does not hold true, of course, but anywhere a human needs to make a decision (which is still the case for most important decisions), it’s a pretty solid principle. And it seems to me a much better starting point for thinking about education in the age of AI than gathering up the tasks AI does not do well and saying “this is what we’ll teach now, the leftovers of AI.” Because if this guess is right, it’s likely that the stronger AI’s reasoning capabilities are in an area, the more in demand human reasoning skills in that area will be. Something to think about, at least.
Hopefully you're onto something. I'm graduating with my master's in library and information studies in May and hoping to get a late-start career as an academic librarian, where one of the main job responsibilities would be to teach students information literacy. I earned my undergraduate degree in international studies right when the Great Financial Crisis hit, and over the past year and a half of grad school I keep having panic flashbacks of graduating into a decimated job market, except this time it's because all of the administrators have decided that AI can replace us. If you're right, it should make sense to argue that academic librarianship is more important than ever. Though I suppose there's still the very real chance of another good old-fashioned market crash to cause me anxiety.
I’m going in circles thinking about this. Maybe you can help.
I understand the idea that surpluses consume. As you point out, this can lead to scarcity, e.g. of attention.
But I can’t quite figure out how you are predicting or assigning value from this phenomenon.
Is a thing valuable because the surpluses consume it? I.e. a thing is valuable because it is made scarce?
I don’t think that’s true. While some things are made more valuable because of scarcity, that is primarily because they held value to begin with. There are lots of things that are scarce but aren’t necessarily valuable. Rare diseases, antisocial behavior, useless products, etc.
It sounds like you’re saying, more particularly, that some things which do hold value, like the Toulmin model, may not be broadly or socially “valued” because they are inaccessible, unwieldy, or too costly. If the scalable nature of AI can overcome inaccessibility, unwieldiness, or cost, this may increase their (at least social) value.
From here, I don’t know how to tie this to what’s valuable to teach or learn or do in education.
Perhaps the argument is that some things of value that were undervalued (e.g. the Toulmin model) due to “costliness” should still be taught in school because AI will proliferate these things in the future and… people will need to evaluate them in order to accept and best utilize them?
Anyway, if you can help me understand the how, why, and to whom of “value” in education, I’d love to hear more.