Standards for building agents, better
-
Updated
Jan 13, 2026 - TypeScript
Standards for building agents, better
Agentic testing for agentic codebases
Ship agents you can audit.
Security scanner for AI agents
Qualitative benchmark suite for evaluating AI coding agents and orchestration paradigms on realistic, complex development tasks
Agent testing automation 🤖 by simulating users 👥 and agents 🤝 with judge ⚖️(langwatch-scenario)
𝘈 𝘔𝘶𝘭𝘵𝘪-𝘈𝘨𝘦𝘯𝘵 𝘚𝘺𝘴𝘵𝘦𝘮 𝘧𝘰𝘳 𝘊𝘳𝘰𝘴𝘴-𝘊𝘩𝘦𝘤𝘬𝘪𝘯𝘨 𝘗𝘩𝘪𝘴𝘩𝘪𝘯𝘨 𝘜𝘙𝘓𝘴.
AI Agent Security Testing — 112 attacks across 14 categories. Prompt injection, jailbreaks, MCP poisoning, agency hijacking & more. Test any AI agent in 5 minutes.
AI Agent Evaluation and Monitoring Guide
Regression and evaluation toolkit for prompt and agent output quality
Demonstration of testing and evaluation patterns for AI agents using Azure AI evaluation tools with custom evaluators
🧮 Solve mathematical problems and write proofs in natural language using this easy-to-use reasoning harness. Enhance your problem-solving skills effortlessly.
Add a description, image, and links to the agent-testing topic page so that developers can more easily learn about it.
To associate your repository with the agent-testing topic, visit your repo's landing page and select "manage topics."