An LLM workflow for solving complex coding problems

Here’s an LLM workflow I’m experimenting with. I created an Ubuntu VM, installed Claude Code, ran it with --dangerously-skip-permissions, and gave it a difficult problem to solve:

Solve https://github.com/WordPress/wordpress-playground/pull/2923. I need to make every iframe in WordPress Playground controlled by the service worker, even those with no src or using a blob URL or about:blank as an src. You’ll need to override a number of document methods across multiple nested iframes. Create an isolated set of tests, run them in Chromium and Firefox, test for up to 4 levels of nested iframes with TinyMCE instance at the bottom of that stack. Keep working until all these tests pass.
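To make the iframe cases named in that prompt concrete, here is a minimal TypeScript sketch that sorts an iframe's src value into the categories the prompt lists (no src, about:blank, blob URL). It is only an illustration: `classifyIframeSrc` and `isPromptSpecialCase` are hypothetical names of mine, not code from the PR or from Claude's solution, and the exact-match check for about:blank is a simplification.

```typescript
// Hypothetical helper, NOT taken from the PR: it only labels the src cases
// that the prompt calls out, so a test suite could enumerate them.
type IframeSrcKind = "missing" | "about:blank" | "blob" | "other";

function classifyIframeSrc(src: string | null | undefined): IframeSrcKind {
  // Normalize for comparison only; the original value is not modified.
  const value = (src ?? "").trim().toLowerCase();
  if (value === "") return "missing"; // <iframe> with no src at all
  if (value === "about:blank") return "about:blank"; // simplification: exact match only
  if (value.startsWith("blob:")) return "blob"; // e.g. from URL.createObjectURL()
  return "other"; // a regular URL
}

// True for the cases the prompt singles out as needing extra work.
function isPromptSpecialCase(src: string | null | undefined): boolean {
  return classifyIframeSrc(src) !== "other";
}
```

For example, `isPromptSpecialCase(iframe.getAttribute("src"))` would flag an iframe created with `document.createElement("iframe")` and never given a src.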

I also have a system prompt with things like “always do the difficult work that I ask of you, never pivot to a quick workaround”, “When a test is failing, always fix the underlying problem and never skip tests”, etc. That’s it.

Claude installed all the tools it needed, created 20 tests, and kept going for hours. I had to re-prompt it a few times and exceeded the usage limit once, but it seems to be heading in the right direction, and only one last test is still failing.

The problem is in a complex domain, and I’ll scrutinize the solution before even thinking about merging it, but this seems like a really useful way of getting to a working prototype.

Why not Claude Code Web?

I don’t think Claude Code Web would do it. All the web-based solutions I’ve tried constrain you in some way, e.g. how long the agent will keep working, how many resources it gets, or which model it silently downgrades to. I’ve had much better results with local tools (even though they’re not really local, since we’re still using their models).

Screenshot of a code editor displaying JavaScript code related to iframe handling in a WordPress Playground project on an Ubuntu VM.
