Can a Language Model Paint?


TLDR

  • The author iteratively prompts VLMs (Claude Opus/Sonnet and Mistral Large) to paint stroke by stroke, testing whether process-driven generation feels more artistically sincere than one-shot output.

Key Takeaways

  • Frontier models (Claude Opus 4.6/4.7) produce recognizable images; smaller or older models like Mistral Large trend toward abstract scribbles unless given 50-stroke batches instead of 5.
  • VLMs fail at iterative fine detail: a single bad stroke triggers a cascade, with each repair attempt more destructive than the last.
  • This mirrors LLM-assisted codebases: broad-stroke output is competent, but iterative fine edits near capability limits degrade the whole structure irreversibly.
  • The CLI app passes current canvas plus concept to a VLM loop; the model reasons per stroke and self-terminates when it judges the painting complete or hits a max stroke limit.
  • The author still describes the output as “soulless derivative digital illustration”; the iterative process did not produce the sincerity the experiment sought.
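
The loop described above (canvas plus concept in, one stroke out, self-termination or a stroke cap) can be sketched as follows. This is a hypothetical reconstruction, not the author's CLI code: `Canvas`, `model_step`, and all names are stand-ins, and `model_step` stubs out the real VLM API call.

```python
# Hypothetical sketch of the stroke-by-stroke VLM painting loop.
# All names are invented for illustration; model_step stands in for
# a real vision-language-model API call.
from dataclasses import dataclass, field


@dataclass
class Canvas:
    strokes: list = field(default_factory=list)

    def render(self) -> str:
        # Stand-in for rasterizing the canvas into an image the VLM can see.
        return f"<canvas with {len(self.strokes)} strokes>"


def model_step(canvas_image: str, concept: str) -> tuple[str, bool]:
    """Placeholder for the VLM call: returns (next_stroke, done).

    A real implementation would send the rendered canvas and the concept
    prompt to the model, let it reason about the next stroke, and parse a
    stroke description plus a completion judgment from the reply."""
    stroke = f"stroke toward '{concept}'"
    return stroke, False  # this stub never judges the painting complete


def paint(concept: str, max_strokes: int = 200) -> Canvas:
    """Run the iterative painting loop until the model self-terminates
    or the maximum stroke limit is reached."""
    canvas = Canvas()
    for _ in range(max_strokes):
        stroke, done = model_step(canvas.render(), concept)
        canvas.strokes.append(stroke)
        if done:
            break
    return canvas
```

Per the takeaway above, batch size is a tuning knob: weaker models like Mistral Large reportedly needed ~50 strokes per call rather than 5 to produce anything recognizable, which in this sketch would mean `model_step` returning a list of strokes instead of one.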

Hacker News Comment Review

  • Discussion is thin; the main concrete observation from commenters is that stroke-by-stroke LLM output looks more human-made than diffusion model output.
  • The imitation-vs-creation framing surfaced briefly but was not developed with technical depth.
  • Related resources were flagged: Simon Willison’s pelican-on-bicycle LLM progress tracker and Sam Collins’s “underdrawings” post on accurate text generation, both relevant prior art.

Notable Comments

  • @bizer: notes iterative paintings look more human-made than diffusion output, a concrete perceptual distinction worth testing further.
  • @baCist: “LLMs can draw… but they imitate, not create”, a sharp framing of the ceiling the experiment hits.

Original | Discuss on HN