AI
Eval-driven development for LLM features
Treat prompts like code: write the test first, diff the scores, and stop shipping on vibes.
Feb 2026 · 8 min read
evals, RAG
Full article coming soon. Check back later or reach out if you want a preview.