- Explore, Plan, Execute
- Phase 1: The Flawless (and Flawed) Plan
- Phase 2: The $10 False Start
- The Pivot: Verifiability is Key
- The AI Team: AMP Code + Devin
- The Code & The Cost
- Switching Tools: AMP vs. Windsurf
- A Hidden Gem: Devin’s “Deep Wiki”
- What’s Next: Full Automation
- Final Lessons
I have a simple, annoying problem. I make a lot of blunders in my chess games. I also love using Chessable to drill puzzles. But there was no easy way to take my specific blunders and turn them into puzzles I could practice. So, I decided to build a tool to do it. My goal: a script that could pull my Lichess games, turn each blunder into a puzzle, and format it for Chessable. I decided to try this using the “AI Coding Accelerator” methodology (Explore, Plan, Execute) and a few AI agents I’d been wanting to test. The process involved several missteps, a $50 credit, and ultimately, a functional tool.
Explore, Plan, Execute
Here’s an overview of the three phases:
Explore
- Understand the problem
- Gather context
- Explore approaches
- Fill context window
Plan
- Create blueprint
- Step-by-step instructions
- Testing strategy
- Success criteria
Execute
- Build one step at a time
- Commit incrementally
- Monitor progress
- Update the plan
Phase 1: The Flawless (and Flawed) Plan
My first “Explore” phase seemed simple. I used Gemini Deep Research to dig into the Lichess API for exporting my annotated games. I found an endpoint, it looked right, and I confidently built my plan.
The problem? I had planned to use the “export your imported games” API. As I’d later discover, the one I actually needed was the “export your bookmarked games” API.
Phase 2: The $10 False Start
With my bad plan in hand, I moved to “Execute.” I fed my plan to the AMP code agent and asked it to build the tool. It produced an implementation, complete with a streaming parser, but the parser was buggy, and the API itself didn’t return the data I needed.
This first attempt cost me about $10. The lesson was immediate and painful: Garbage in, garbage out. An AI agent executing a bad plan perfectly still leads to a bad outcome.
The Pivot: Verifiability is Key
I scrapped the first attempt and went back to the drawing board. I found the correct API (“export your bookmarked games”) and rewrote my plan.
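The fetch itself is tiny once you have the right endpoint. Here’s a minimal sketch in Python, assuming a Lichess personal access token; the bookmarked-games export path and the `evals`/`analysed` query parameters are my best reading of the Lichess API docs, so double-check them before relying on this:

```python
import os
import requests

# Assumed path for "export your bookmarked games" -- verify against the Lichess API docs.
BOOKMARKS_URL = "https://lichess.org/api/games/export/bookmarks"

def fetch_bookmarked_games(token: str) -> str:
    """Download bookmarked games as annotated PGN, including engine evals."""
    response = requests.get(
        BOOKMARKS_URL,
        headers={
            "Authorization": f"Bearer {token}",   # personal access token
            "Accept": "application/x-chess-pgn",  # PGN rather than NDJSON
        },
        params={"evals": "true", "analysed": "true"},  # only games with computer analysis
        timeout=60,
    )
    response.raise_for_status()
    return response.text

if __name__ == "__main__":
    print(fetch_bookmarked_games(os.environ["LICHESS_TOKEN"])[:500])
```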
But this time, I focused on “verifiability.” Before I wrote a single line of puzzle-extraction logic, I had the agent set up the full project scaffolding: CI/CD, linters, and type checks. This was a game-changer. It allowed the agent to “self-heal” its own code and, more importantly, gave me a clear “pass/fail” signal.
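To make the “pass/fail” idea concrete: even one tiny test against a known position gives CI an unambiguous signal to report back to the agent. Here’s a sketch of what such a test could look like (the `extract_puzzles` function and its return shape are hypothetical placeholders, not the repo’s actual API):

```python
# test_extractor.py -- run by CI (pytest) on every PR.
# `extract_puzzles` and its return shape are illustrative placeholders,
# not the actual interface of chess-puzzle-extractor.
from extractor import extract_puzzles

ANNOTATED_PGN = """
[Event "Rated blitz game"]
[Site "https://lichess.org/abcdefgh"]
[Result "0-1"]

1. e4 { [%eval 0.3] } e5 { [%eval 0.25] } 2. Qh5 { [%eval -0.3] } Nc6
{ [%eval -0.2] } 3. Qxf7+ { [%eval -6.2] } Kxf7 { [%eval -6.3] } 0-1
"""

def test_blunder_becomes_puzzle():
    puzzles = extract_puzzles(ANNOTATED_PGN, eval_drop_threshold=2.0)
    assert len(puzzles) == 1    # exactly one blunder in this game
    assert puzzles[0].ply == 5  # 3. Qxf7+ is the losing move
```

Once a test like this turns red or green in CI, the agent can iterate against it without me babysitting every run.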
The AI Team: AMP Code + Devin
This is where the workflow got really interesting.
- AMP Code: I used this as my primary “coder.” I’d give it a task (e.g., “implement the puzzle extractor”), and it would do the work and open a pull request.
- Devin: I brought in Devin as my “AI review agent.” The code generated by AMP was… large. One PR was ~800 lines. Having Devin do a first pass to analyze the PR for bugs and performance issues before I even looked at it was incredibly helpful.
The Code & The Cost
This successful implementation cost about $42 in total, including the initial $10 false start, and resulted in a working tool.
This isn’t just a story; it’s a real project. You can see the (admittedly messy) code, the AI-generated PRs, and the review process over at the GitHub repo: chess-puzzle-extractor.
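If you don’t want to wade through the repo, the heart of the tool is just a loop over eval-annotated moves, flagging any move where the evaluation swings past a threshold. Here’s a rough sketch using python-chess; the threshold and the output format are simplified stand-ins, not the repo’s actual code:

```python
import io
import chess.pgn

BLUNDER_THRESHOLD_CP = 200  # eval swing (in centipawns) that we treat as a blunder

def find_blunders(pgn_text: str):
    """Yield (fen_before_blunder, blunder_move_uci) for each big eval swing."""
    game = chess.pgn.read_game(io.StringIO(pgn_text))
    if game is None:
        return

    prev_node = game
    for node in game.mainline():
        prev_eval, curr_eval = prev_node.eval(), node.eval()
        if prev_eval is not None and curr_eval is not None:
            mover = prev_node.board().turn  # the player who made this move
            # Score both positions from the mover's point of view.
            before = prev_eval.pov(mover).score(mate_score=10_000)
            after = curr_eval.pov(mover).score(mate_score=10_000)
            if before - after >= BLUNDER_THRESHOLD_CP:
                yield (
                    prev_node.board().fen(),  # position before the blunder: the puzzle start
                    node.move.uci(),          # the move actually played (the mistake)
                )
        prev_node = node
```

From there, each (position, mistake) pair just needs to be formatted the way Chessable expects.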
Switching Tools: AMP vs. Windsurf
That $42 bill made me pause. For the remaining work, I decided to switch to Windsurf. It’s definitely cheaper, but I found the experience less smooth. It required more manual intervention and hand-holding to get things like tests to run properly, but it eventually got the job done.
A Hidden Gem: Devin’s “Deep Wiki”
One of the coolest things I discovered wasn’t even part of the main coding. I wanted to understand how Lichess itself implements its “LEARN FROM YOUR MISTAKES” feature. I used Devin’s “Ask Devin” and “Deep Wiki” features, pointing them at the massive Lichess codebase (lichess-org/lila).
It was amazing. It dug through the repo and came back with a clear, detailed explanation of how the feature works, complete with code references. As a tool for understanding a huge, unfamiliar codebase, this was invaluable.
What’s Next: Full Automation
The tool works, but it’s still manual. The goal is to make it fully automated, similar to Chessable’s “Puzzle Connect” feature.
The next step is to have this tool automatically pull my games from Lichess and Chess.com right after I play, request the computer evaluation, and generate the new puzzles.
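On the Lichess side, the polling step might look something like the sketch below; it uses the standard game-export endpoint and its `since`/`analysed`/`evals` parameters, and leaves out the scheduling, the Chess.com half, and triggering analysis:

```python
import requests

EXPORT_URL = "https://lichess.org/api/games/user/{username}"

def fetch_new_analysed_games(username: str, token: str, since_ms: int) -> str:
    """Fetch PGN for games played since `since_ms` that already have server analysis."""
    response = requests.get(
        EXPORT_URL.format(username=username),
        headers={"Authorization": f"Bearer {token}"},
        params={
            "since": since_ms,   # only games started after this timestamp (ms)
            "analysed": "true",  # only games with computer analysis available
            "evals": "true",     # include [%eval] annotations in the PGN
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.text

# e.g. run hourly from a scheduler, feed the PGN into the blunder extractor,
# and append the resulting puzzles to the Chessable import file.
```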
Final Lessons
So, what did I learn?
- Start Small: Don’t try to make the agent build everything at once.
- The Plan is Everything: A detailed and correct plan is the most critical part. An AI agent can’t read your mind, and it can’t save a bad plan.
- Build for Self-Healing: Linters, tests, and CI/CD aren’t “nice to haves”; they are essential for managing AI-generated code.
- You Are Still the Architect: You have to review the code, guide the process, and, most importantly, know what you want to build.
Even with the blunders (both mine and the AI’s), it’s pretty amazing that I could build a 1000-line project that solves a real problem for me in just a few fragmented hours.