Fighting 502s, Writing YAMLs, and Testing the Skill Tree
008 - Behind the scenes of debugging Docker, drafting skill definitions, and validating them with a robust test suite.
Hey folks, welcome back!
First off — apologies that this post is a couple of days late. My weekend was derailed by a stubborn 502 error in production that I’m sure has now left me with a few grey hairs. Thankfully, I was finally able to solve the issue by untangling the port mappings and Dockerfile placements across the backend and AI engine containers.
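For context, this class of 502 usually boils down to a mismatch between the port a container actually listens on and the port everything else expects to reach it at. A hypothetical docker-compose sketch of the shape of the fix (service names, paths, and ports here are my own illustration, not Rookify's actual setup):

```yaml
# Hypothetical compose sketch — names and ports are illustrative only.
services:
  backend:
    build: ./backend        # Dockerfile must sit inside the build context it references
    ports:
      - "8000:8000"         # host:container — the reverse proxy must target 8000
  ai-engine:
    build: ./ai-engine
    expose:
      - "5000"              # internal only; backend reaches it at ai-engine:5000
```

The key distinction is `ports` (published to the host) versus `expose` (visible only on the internal network): a proxy pointed at a port that was never published is a classic source of 502s.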
A key lesson I took from this experience is that knowing where to look when debugging is half the battle; it saves a ton of time (and stress). Over-relying on AI for elusive production bugs like this can actually slow you down. These are the moments when you need to lean on your own technical chops first, along with good old-fashioned developer forums and community support.
📝 Drafting the Skill Leaves
This week’s big milestone was getting all the YAML formulas drafted for every skill leaf from Beginner through Advanced. That’s 60+ definitions covering openings, tactics, positional play, and endgames.
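To make that concrete, here is a hypothetical sketch of what a skill-leaf definition could look like. The field names (`formula`, `metrics`, `weight`, and so on) are my own invention for illustration, not the actual Rookify schema:

```yaml
# Hypothetical skill leaf — field names are illustrative, not the real schema.
id: develop_fast
tier: beginner
category: openings
description: Develop minor pieces before move 10.
formula:
  metrics:
    - name: minor_pieces_developed_by_move_10
      weight: 0.7
    - name: early_queen_moves      # penalised behaviour
      weight: -0.3
  score_range: [0, 100]
reliability:
  min_games: 10                    # low-confidence below this sample size
```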
I’ve left the Master tier alone for now — mostly because I don’t feel qualified (yet!) to write those formulas as a 1300-rated player. That’s something I’d rather co-design with someone who’s actually earned that title.
🧪 Building a Testing System
Of course, definitions are only half the story. To make sure these formulas actually behave as intended, I put together a comprehensive testing framework for the skill tree.
Here’s how it works under the hood:
Skill Tree Service → loads all skill definitions from YAML and applies the scoring logic.
Acceptance Test Runner → runs through test cases for every skill across all tiers, using mock game data. Each case checks if the resulting score and reliability fall within the expected range.
Test Case Files → YAML files that simulate different scenarios (good performance, poor performance, mixed performance) for each skill.
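A test-case file for that kind of setup might look roughly like this. Again, this is a sketch with invented field names, not the actual format:

```yaml
# Hypothetical acceptance test case — structure is illustrative.
skill: develop_fast
cases:
  - name: good_performance
    mock_stats: { minor_pieces_developed_by_move_10: 3.8, early_queen_moves: 0.1 }
    expect: { score: [70, 100], reliability: [0.8, 1.0] }
  - name: poor_performance
    mock_stats: { minor_pieces_developed_by_move_10: 1.2, early_queen_moves: 1.5 }
    expect: { score: [0, 40], reliability: [0.8, 1.0] }
```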
Running the tests looks something like this:
```shell
cd skilltree/v1
python tests/run_acceptance.py
```
This command automatically loads all skills, executes every acceptance test, and spits out a detailed pass/fail report.
I also layered in two extra safety nets:
Basic Functionality Tests → make sure the service loads, finds skills, and can score them without error.
Monotonicity Tests → ensure that better play always leads to higher scores. (E.g., if you develop knights earlier, your “Develop Fast” score should never go down!)
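The monotonicity idea can be sketched in a few lines of Python. The `score_develop_fast` formula below is a toy stand-in I invented for illustration (the real scoring logic lives in the skill tree service); the point is the property check: feeding in progressively worse play must never produce a higher score.

```python
# Sketch of a monotonicity test. score_develop_fast() is a toy formula,
# not Rookify's actual scoring logic — only the property check matters here.

def score_develop_fast(avg_knight_dev_move: float) -> float:
    """Toy scoring formula: earlier knight development -> higher score."""
    # Clamp into [4, 20]: both knights can't be out before move 4,
    # and anything after move 20 scores zero.
    clamped = min(max(avg_knight_dev_move, 4.0), 20.0)
    return round(100.0 * (20.0 - clamped) / 16.0, 1)

def assert_monotonic(scores: list) -> None:
    """Scores ordered best-play-first must never increase."""
    for earlier, later in zip(scores, scores[1:]):
        assert earlier >= later, f"monotonicity violated: {earlier} < {later}"

# Inputs ordered from best play (knights out by move 5) to worst (move 18):
inputs = [5.0, 7.0, 10.0, 14.0, 18.0]
scores = [score_develop_fast(m) for m in inputs]
assert_monotonic(scores)
print(scores)
```

If a YAML edit ever breaks this property (say, a weight flips sign), the test fails immediately instead of the bug surfacing as a confusing score for a real user.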
Together, these tests act like guardrails — catching YAML mistakes, formula bugs, or weird regressions before they creep into production.
📊 Why This Matters
Building Rookify’s skill tree isn’t just about coding a few “if this then that” rules. It’s about creating a system that:
Responds logically and consistently to player behaviour
Feels trustworthy to users (good play rewarded, bad play penalized)
Can scale across hundreds of games and thousands of players
With this testing framework in place, I can now confidently iterate on skill definitions, expand detection logic, and eventually feed in real game data.
📈 Analytics + Compliance
On a more practical note, I also set up Google Tag Manager, GA4, and a CookieBot banner. This way, I can start measuring how early testers are engaging with the Explore feature — and do it in a GDPR-compliant way.
Free tester spots are still available 👉🏿 rookify.io/app/explore
🔮 What’s Next
Next week, the plan is to:
Stress-test the skill leaves with synthetic PGN data first,
Then run the tree against real chess games (ideally a couple hundred from each Elo level, if I can build the right data pipeline).
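As a rough idea of what "synthetic PGN data" could mean in practice, here is a minimal, stdlib-only sketch that assembles a PGN string from invented headers and moves. A real pipeline would more likely generate and parse games with a library like python-chess; everything here (function name, headers, moves) is my own illustration.

```python
# Minimal synthetic-PGN sketch. Headers and moves are invented test data;
# a real pipeline would likely use python-chess instead of string assembly.

def make_pgn(white_elo: int, black_elo: int, moves: list) -> str:
    """Build a PGN string: a tag section, a blank line, then numbered movetext."""
    headers = [
        ("Event", "Synthetic Test"),
        ("WhiteElo", str(white_elo)),
        ("BlackElo", str(black_elo)),
        ("Result", "*"),
    ]
    tag_section = "\n".join(f'[{k} "{v}"]' for k, v in headers)
    # Prefix each White move with its move number; Black moves follow unprefixed.
    numbered = " ".join(
        f"{i // 2 + 1}. {mv}" if i % 2 == 0 else mv
        for i, mv in enumerate(moves)
    )
    return f"{tag_section}\n\n{numbered} *\n"

pgn = make_pgn(1300, 1350, ["e4", "e5", "Nf3", "Nc6"])
print(pgn)
```

Generating batches of these at controlled Elo labels would make it easy to check that each skill leaf reacts sensibly before any real player data touches the tree.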
This is where the system will start moving from “theory” to “practice.” If all goes well, I’ll have the first real insights into how players’ strengths and weaknesses map onto the tree.
Thanks as always for following along. The deeper I go into this project, the more I realise just how much scaffolding you need to make something that feels simple and elegant on the surface.
See you next week,
– Anthon
Chief Vibes Officer @ Rookify
By the way, if you’re a chess player, I’d love to hear about your own improvement journey.
What’s been frustrating? What’s actually helped? And what kind of innovations do you wish existed in the chess world? If you’ve got 3–5 minutes to spare, please fill out this short survey—it would mean a lot and will directly help shape how Rookify evolves.
Thanks again