LLMs Often Conflate Images In Bizarre Ways…

Mike Caulfield

Sep 15

A short demonstration

6 Comments

Jack

Sep 15

Super cool!

I actually have no idea what the models are technically doing here…

One trick that helped Info Sleuth a lot was to add the AboutThisImage results to the context, then prompt around whether we expected to see similar images or not.

Mike Caulfield

Sep 15

Yeah I just realized I called it "why LLMs conflate images" and I actually have no idea why, I just know that they do! A lot! In this particular circumstance, and Gemini somewhat more than others, but the why is still elusive to me.

Gerben Wierda

Sep 15

I really like the down-to-earthness of your posts. In a sea of 'hype' (often nonsense of what GenAI can do or extrapolations from some ad hoc example) on one side and 'look how the tool doesn't understand' (yes, yes, true, but we already knew that, let's take that as a given, please) on the other, the realistic (and experimental) 'working with the LLMs' (warts and all) is rare and refreshing.

Mike Caulfield

Sep 15

Thank you! Yeah, "things are surprisingly useful but also somewhat broke, now what?" is definitely my problem space.

Gerben Wierda

Sep 15 (edited)

That is actually quite a nice summing up. At some point I may use this one (with attribution of course).

My own ‘now what’ so far has been inspired by Brian Merchant’s comparison with the start of the ‘physical automation’ (aka Industrial Revolution) and the (misunderstood) Luddites. https://ea.rna.nl/2024/07/27/generative-ai-doesnt-copy-art-it-clones-the-artisans-cheaply/

Prof. Attilio

Sep 15

Very interesting; thank you!

The End(s) of Argument
