Design evaluation suites with test cases, grading rubrics, and metrics for AI systems
e.g., Customer support chatbot, Code review assistant
List categories of test cases, e.g.: - Happy path queries - Edge cases and ambiguous inputs - Adversarial prompts - Multi-turn conversations
This is a Pro template. Upgrade to access.