Here’s an LLM workflow I’m experimenting with. I created an Ubuntu VM, installed Claude Code, ran it with --dangerously-skip-permissions, and gave it a difficult problem to solve:
Solve https://github.com/WordPress/wordpress-playground/pull/2923. I need to make every iframe in WordPress Playground controlled by the service worker, even those with no src or using a blob URL or about:blank as an src. You’ll need to override a number of document methods across multiple nested iframes. Create an isolated set of tests, run them in Chromium and Firefox, test for up to 4 levels of nested iframes with TinyMCE instance at the bottom of that stack. Keep working until all these tests pass.
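To make the three iframe cases from that prompt concrete, here is a minimal TypeScript sketch that sorts an iframe's src value into the categories the prompt names. The names classifyIframeSrc and needsSpecialHandling are hypothetical and are not taken from the PR or from WordPress Playground. The sketch only labels the cases and does not attempt the actual document-method overrides.

```typescript
// Hypothetical helper for illustration only; not part of the PR.
// It labels the src values the prompt singles out: no src at all,
// about:blank, and blob: URLs.
type IframeSrcKind = "missing" | "about-blank" | "blob" | "regular";

function classifyIframeSrc(src: string | null | undefined): IframeSrcKind {
  // Normalize only for comparison; the original value is left untouched.
  const value = (src ?? "").trim().toLowerCase();
  if (value === "") return "missing";
  if (value === "about:blank") return "about-blank";
  if (value.startsWith("blob:")) return "blob";
  return "regular";
}

// Assumption: every non-regular kind is a case the prompt asks to cover.
function needsSpecialHandling(src: string | null | undefined): boolean {
  return classifyIframeSrc(src) !== "regular";
}
```

A test suite like the one the prompt asks for could use such a classifier to check that each kind is exercised at every nesting level. Variants such as about:blank with a fragment are deliberately left out of this sketch.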
I also have a system prompt with things like “always do the difficult work that I ask of you, never pivot to a quick workaround”, “When a test is failing, always fix the underlying problem and never skip tests”, etc. That’s it.
Claude installed all the tools it needed, created 20 tests, and kept going for hours. I had to re-prompt it a few times and exceeded the usage limit once, but it seems to be going in the right direction, and only one last test seems to be failing.
The problem is in a complex domain and I’ll scrutinize the solution before even thinking about merging it, but it seems like a really useful way of getting to a working prototype.
Why not Claude Code Web?
I don’t think Claude Code Web would do it. All the web-based solutions I’ve tried impose constraints, e.g. on how long they will work, how many resources they get, and which model they silently downgrade to. I’ve had much better results with local tools (even though they’re not really local, since they still call the vendor’s hosted models).
