agent-testing

AI Agent Security Testing — 112 attacks across 14 categories. Prompt injection, jailbreaks, MCP poisoning, agency hijacking & more. Test any AI agent in 5 minutes.

Updated Feb 13, 2026
TypeScript

vysotin / agentic_evals_docs

AI Agent Evaluation and Monitoring Guide

documentation monitoring best-practices guidelines agents evaluation-metrics evals agentic-ai agent-testing

Updated Jan 27, 2026

Personaz1 / prompt-qa-lab

Regression and evaluation toolkit for prompt and agent output quality

python open-source regression-testing ai-evaluation llm prompt-engineering agent-testing

Updated Feb 8, 2026
Python

corradocavalli / agentic_evaluation

Demonstration of testing and evaluation patterns for AI agents using Azure AI evaluation tools with custom evaluators

evaluation ai-agents azure-ai agent-framework azure-ai-evaluations agent-testing

Updated Feb 9, 2026
Python

gur3245singh / nomos

🧮 Solve mathematical problems and write proofs in natural language using this easy-to-use reasoning harness. Enhance your problem-solving skills effortlessly.

java spring-boot frontend rule-engine forms declarative joi codechef ros gazebo plagiarism plagiarism-detection plagiarism-detector react-hook-form agent-testing

Updated Feb 14, 2026
Python

Here are 12 public repositories matching this topic...

langwatch / better-agents

langwatch / scenario

dowhiledev / nomos

inkog-io / inkog

pyros-projects / agent-comparison

kimtth / agent-auto-eval-azure-aoai-sk

vksundararajan / cross-check

ClawdeRaccoon / pwnclaw

vysotin / agentic_evals_docs

Personaz1 / prompt-qa-lab

corradocavalli / agentic_evaluation

gur3245singh / nomos
