Rewinder: A Film-Aware Fact-Checker
Yes, I have a new superprompt, a variation on the old superprompt
Just want the prompt? It’s here.
So I was watching the incredible 4K UHD disc release of The Mummy last night, and there’s a scene in it where Brendan Fraser walks into a dark treasure room. He sees some light coming from the ceiling but it’s in the wrong place. So cool as a cucumber he shoots a mirror in the room, rotating it into place to catch the beam of sunlight, which then through a series of reflections lights up the whole room.
I wondered to myself if AI Overview could answer a related question, so I put one in and got a rather typical confabulation. In this newly invented scene by AI Overview, the shooting of the mirror is a part of a complex ruse involving vampires, illusions, a shotgun and who knows what else. I don’t see that many outright hallucinations these days but this sure seems to be one.
I think these sorts of errors are interesting, but of course it’s pretty time-consuming to break down a response like that, so over the past couple weeks I’ve been working on a detail-oriented, film-focused fact-checker that can process such responses in excruciating detail and show me what the AI got wrong. And I’ve done it. Here is my new program/prompt Rewinder’s markup of that response.
Just as importantly it can capture minor errors as well. Or maybe I should say moderate errors?
I know people think it’s weird I’ve become obsessed with answers about films, but part of the initial thing that attracted me to this problem space was responses like the one below, on one of the biggest plot points in the Academy Award-nominated film Working Girl.
If you know the film but saw it ages ago, you’ll read this result and think it’s pretty good. It feels really solid. If you do know the film, from a long time ago, go ahead and read it. Good, right?
Of course, what you find out when you dig into it is it’s not right at all. It doesn’t have the big confabulations everyone seems to get fixated on. But it has so many small to moderate errors. Again, I know that reading error lists off an image isn’t riveting, but look at how many small errors there are in here. And how they kind of disappear because they are so close to what a fuzzy sense of the movie would be:
Here are the errors. She isn’t left to mind the office; she’s left to housesit. She doesn’t find the proposal being prepared; she finds voice notes that indicate Katharine is starting to pursue this idea and trying to cut her (Tess) out of it. She does not impersonate Katharine; she pretends to be an executive while maintaining her own identity and name as Tess McGill, a point that is pretty core to the plot. The proposal is not being “actively pursued” with Jack Trainer (the Harrison Ford character): the voice memos indicate Katharine’s intention to reach out to Trainer about it, but she hasn’t yet. It’s McGill who makes the reach-out on the idea, which is again pretty crucial to the plot.
I know when I say this stuff many people roll their eyes like I’m a Star Trek Comic-Conner talking about how technically “warp 9 would be about 1,167 times the speed of light, not 1,000 times.” But that’s not what this is at all. Every error here would be evident to someone who has just watched the film, and every error is such that, were it true, the plot wouldn’t work.
I’m fascinated by this, these deep but invisible errors that the free versions of LLMs tend to make. And when it comes to film, for a variety of reasons, it happens all the time.
Part of the reason I think they are so convincing is that there is genre bias here. Most of the things described happen this way in some other film. It’s more common to discover documents at the office than while housesitting, and it’s more common to find documents than voice memos. The more famous comedy trope, in a more screwball version of an office comedy, is not the generic pretending to be an executive but rather impersonation of a specific executive.
Anyway, I’m giving you all my modification of Deep Background for verifying AI responses on film, which I’m calling Rewinder, because things are more amusing when you give them names, and more productive when people share. It’s suitable for running on the paid versions of Claude or ChatGPT. To use it, ask an AI a somewhat specific question about a film:
Then paste the prompt into Claude Opus 4.5 or ChatGPT 5.1, paid versions only, as this will burn through all your free tokens before it’s even a third of the way through loading it.
Then paste the question you asked the AI into chat with an “equals” (=) sign in front of it, hit return, and then paste the answer you got.
It will start working. Go get a coffee if you want. When you come back it’ll be done. Here it doesn’t discover much, but a character name is spelled wrongly, the fire was not inadvertent but intentional, and it’s not with a candelabra but with a small candle on his desk. Also, he doesn’t stop it; it’s Evelyn and Jonathan who put out the fire.
Ok, actually, that’s quite a lot?
You just do this with any question and response you get about film. When it is done, I recommend going up to that copy button and downloading the HTML file.
After that you can read through the granular fact-checking where every single fact is checked…
Or just jump to the markup — there’s even a helpful little link at the top of the file to jump you there. The markup looks like this:
People sometimes say that the tools I produce provide too much information — but even if I don’t show the information it discovers, it has to go through the process of finding it so it doesn’t make sense to me (for my purposes) to throw it away.
If it bothers you, however, you can spin up a very simple programmatic wrapper in Node or Flask and feed the prompt to the Claude API (make sure to use the “web search” function and “stream” the response) and hide all returned output besides the markup and footnotes. If you’ve done any coding in your life with APIs this is literally a 30 minute project once you have the prompt. I’m not going to build it for you, because it should be your API key and your localhost webserver. But there aren’t really many programming projects simpler than an API wrapper.
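For anyone who wants a starting point, here is a minimal Python sketch of the filtering half of such a wrapper. It is an illustration under stated assumptions, not part of Rewinder itself. It assumes the model's report uses markdown-style heading lines and that the sections you want are titled "Markup" and "Footnotes"; both are guesses, so adjust them to whatever the prompt actually emits. The names `build_input`, `keep_sections`, `check`, and `call_model` are hypothetical. The API call is left as a function you supply, so your own key, the web search setting, and streaming stay in your own code.

```python
import re
from typing import Callable, Iterable

# ASSUMPTION: hypothetical section titles; the real report may name them differently.
KEEP_HEADINGS = ("Markup", "Footnotes")


def build_input(question: str, ai_answer: str) -> str:
    """Build the chat message: '=' plus the question, a newline, then the AI's answer."""
    return f"={question.strip()}\n{ai_answer.strip()}"


def keep_sections(report: str, headings: Iterable[str] = KEEP_HEADINGS) -> str:
    """Keep only the sections whose heading starts with one of `headings`.

    ASSUMPTION: sections begin with a markdown-style heading line such as '## Markup'.
    """
    wanted = [h.lower() for h in headings]
    kept, keeping = [], False
    for line in report.splitlines():
        match = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if match:
            # A new heading decides whether the lines that follow are kept.
            title = match.group(1).lower()
            keeping = any(title.startswith(h) for h in wanted)
        if keeping:
            kept.append(line)
    return "\n".join(kept).strip()


def check(question: str, ai_answer: str, call_model: Callable[[str], str]) -> str:
    """Send the formatted input through a caller-supplied model function, then filter.

    `call_model` is a placeholder for your own code that sends the Rewinder prompt
    plus this message to the API and returns the full response text.
    """
    return keep_sections(call_model(build_input(question, ai_answer)))
```

In a Flask or Node route you would call something like `check(question, answer, call_model)` and return only its result to the page, discarding the rest of the report.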
If you can’t code, and want to do it as simply a prompt in the web interface, you’re just going to have to ignore the other output.
Prompt or code, you can then use it to explore the structure of AI error, which is what I plan to do for the next several weeks. More soon!
Can it fact-check other stuff? Of course!
Honestly the underlying prompt here, with minor modifications, can fact-check anything. The reason I am choosing film first is that films are hard to check, and if I can successfully check them other stuff is easy. Also it gives me a vast number of test prompts — I really should develop a benchmark here.

So cool! I've just shared this in the comments to Margaret Atwood's latest post, where she tangoes with the free version of Claude over a Father Brown episode:
https://margaretatwood.substack.com/p/claude-you-are-a-cutie-pie
Just to say I get only a small percent but still find this, and this person, very interesting