Writing tests with Metaphor

Dave Hudson - 2025-06-01

Over the last few months, quite a few people have asked me whether it's possible to get an AI to build good tests for software. I've always told them LLMs can do a pretty amazing job, but you need to be very clear about what you want them to do. I've also been asked how I manage to get large amounts of working code from Metaphor prompts.

I figured a good example might go a long way!

Over the past month or so, I've built a new Markdown parser (an abstract syntax tree builder) for Humbug. I needed to add some tests, so I recorded a video of me adding them, using just a couple of Metaphor prompts and Humbug itself.

Why Humbug has its own Markdown parser

You might be asking why Humbug has a special Markdown parser. After all, there are lots of good open-source ones around. The answer is that Humbug has a few unusual requirements:

  • There are a lot of parsing capabilities inside Humbug, and I want them all to work in a consistent way to make the code easier to understand.
  • Most Markdown parsers assume a complete Markdown file, but Humbug has to deal with streaming responses from LLMs. That means it can end up with content that doesn't make sense until more data arrives, and it has to handle that gracefully.
  • Markdown doesn't have a very clean syntax and has some interesting quirks. An important one for Humbug is the handling of code fence blocks (denoted by three backticks). Humbug needs to handle the scenario where a code fence appears inside a code block (e.g. in a multi-line string or comment block); there's a sketch of both quirks after this list.
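
To make the streaming and fence-nesting quirks concrete, here's a minimal Python sketch. To be clear, this is an illustration written for this post, not Humbug's actual parser: the `StreamingFenceTracker` name and its nesting heuristic (a fence line carrying a language tag opens a nested block; a bare fence closes the innermost one) are assumptions made for the example.

```python
class StreamingFenceTracker:
    """Toy incremental scanner that tracks code-fence nesting.

    This is an illustration for this post, not Humbug's parser.
    It consumes text chunk by chunk, the way it would arrive from
    a streaming LLM response, and only scans complete lines, so a
    fence split across two chunks is still recognised.
    """

    def __init__(self) -> None:
        self._buffer = ""  # trailing partial line, not yet scanned
        self._depth = 0    # number of currently-open fences

    def feed(self, chunk: str) -> None:
        """Consume the next chunk of streamed text."""
        self._buffer += chunk
        lines = self._buffer.split("\n")
        self._buffer = lines.pop()  # last piece may be incomplete
        for line in lines:
            stripped = line.strip()
            if not stripped.startswith("```"):
                continue

            if len(stripped) > 3 and self._depth > 0:
                # A tagged fence (e.g. one ending in "python") seen
                # inside an open block: treat it as nesting deeper
                # rather than closing the block.
                self._depth += 1
            elif self._depth > 0:
                # A bare fence closes the innermost open block.
                self._depth -= 1
            else:
                # Any fence at the top level opens a new block.
                self._depth += 1

    def in_code_block(self) -> bool:
        """True if the text seen so far leaves a block unterminated."""
        return self._depth > 0


# The fence is deliberately split across chunk boundaries here, as
# can happen with a streaming response.
tracker = StreamingFenceTracker()
for chunk in ["Here's code:\n``", "`python\nprint('", "hi')\n```\n"]:
    tracker.feed(chunk)
print(tracker.in_code_block())  # False: the block was closed
```

Note the limit of this toy heuristic: a bare fence inside a multi-line string would still close the block here, which is exactly the kind of case where a real parser needs more context about the embedded content.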

Building tests with AI assistance

The video has two halves. The first walks through setting up the original test design and shows how to have an LLM build something new within a set of constraints. The second shows those original tests being enhanced.

By the end there's more than 90% test coverage and about 1,400 lines of commented tests and test support code.
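
To give a flavour of what those tests look like, here are two pytest-style sketches. They're written against the toy `StreamingFenceTracker` from earlier (the module name is made up for the example), not copied from Humbug's real test suite.

```python
# pytest-style sketches. "fence_sketch" is a made-up module name for
# the StreamingFenceTracker class shown earlier; these are not
# excerpts from Humbug's actual tests.
from fence_sketch import StreamingFenceTracker


def test_fence_split_across_chunks() -> None:
    """A fence split across two streamed chunks is still recognised."""
    tracker = StreamingFenceTracker()
    tracker.feed("``")           # partial line: nothing scanned yet
    tracker.feed("`python\nx = 1\n")
    assert tracker.in_code_block()


def test_tagged_fence_nests_rather_than_closes() -> None:
    """A tagged fence inside an open block nests instead of closing it."""
    tracker = StreamingFenceTracker()
    tracker.feed("```python\n```markdown\n")
    assert tracker.in_code_block()
    tracker.feed("```\n```\n")   # close the inner block, then the outer
    assert not tracker.in_code_block()
```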

Video: Writing tests with Metaphor demonstration

Key takeaways

This demonstration shows several important principles for successful AI-assisted development:

  • Clear constraints and requirements lead to better AI output. By being specific about what the tests needed to do, the AI could generate appropriate test cases.
  • Metaphor's structured approach helps maintain consistency across different AI interactions, making it easier to build on previous work.
  • Iterative enhancement works well with AI assistance. Starting with a solid foundation and then building on it produces better results than trying to create everything at once.
  • Good test coverage is achievable with AI assistance when you provide proper context and clear expectations.

What's next?

This example demonstrates how Metaphor and Humbug can work together to produce substantial, high-quality code with AI assistance. The ability to generate comprehensive test suites quickly and reliably is a significant productivity multiplier for any development team.

If you're interested in trying this approach yourself, check out our getting started guide and join us on Discord to share your experiences and learn from others in the community.