Great work here again! I'm shocked at how good Claude is at this, even with the mistakes.
One link both you and Claude missed was the implied correlation between costs and number of employees. Underpaid teachers are often given as a reason for the decline in educational outcomes, but salaries and other costs are missing from the "zero-sum" argument... as is any measure of an actual decline in outcomes.
Even so, I get the feeling Claude did better than most students would.
Again, when looking at this one, I found myself thinking: what if I showed it material with a quote and data that have never been discussed online before? Would it be able to make all those connections? My thinking is that *similar* things are in its training data, rather than that Claude is reasoning. I could be totally wrong. But I am always trying to test the limits of whatever looks "good", to see how well it would perform on something less expected.
Good catch re the mistakes in the data, but you do realize you're the one who made the catches, not Claude, right? I guess senior or grad students probably would catch something like that, but perhaps not younger students?
BTW, I assume you read the original article from which the chart is taken to see which arguments it gives. Were they similar, or uncritical? Were there response articles to it somewhere?