Merged
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -20,6 +20,7 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install -r requirements_ai.txt
pip install -r requirements_dev.txt
python -m spacy download en_core_web_md
- name: Lint and format check with ruff
2 changes: 1 addition & 1 deletion .vscode/launch.json
@@ -8,7 +8,7 @@
"program": "${file}",
"console": "integratedTerminal",
"justMyCode": false,
"envFile": "${workspaceFolder}/.vscode/.env"
"envFile": "${workspaceFolder}/.env"
}
]
}
1 change: 0 additions & 1 deletion .vscode/settings.json
@@ -23,7 +23,6 @@
"python.testing.pytestArgs": [
"."
],
"python.envFile": "${workspaceFolder}/.vscode/.env",
"python-envs.pythonProjects": [
{
"path": "",
3 changes: 2 additions & 1 deletion CHANGELOG.rst
@@ -1,7 +1,7 @@
Toolium Changelog
=================

v3.7.1
v3.8.0
------

*Release date: In development*
@@ -10,6 +10,7 @@ v3.7.1
- Configure ruff for linting and formatting files, replacing flake8 and black
- Add text analysis tool to get an overall match of a text against a list of expected characteristics
using AI libraries that come with the `ai` extra dependency
- Add langgraph methods to create a ReAct AI agent to test the behavior of other AI agents or LLMs

v3.7.0
------
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
3.7.1.dev0
3.8.0.dev0
41 changes: 41 additions & 0 deletions docs/ai_utils.rst
@@ -226,3 +226,44 @@ custom behavior after accuracy scenario execution, like calling Allure `after_sc

# Monkey-patch the hook
accuracy.after_accuracy_scenario = custom_after_accuracy_scenario


AI agents for testing
---------------------

Toolium provides utilities to create and execute AI agents in your tests using the langgraph library, allowing you
to simulate complex user interactions or validate AI-generated responses.

You can create an AI agent using the `create_react_agent` function from the `toolium.utils.ai_utils.ai_agent` module.
This function creates a ReAct agent, a type of AI agent that can reason and act based on the conversation history
and tool interactions. You must specify the system message with the AI testing agent instructions and the tool
method that the agent can use to send requests to the system under test and receive responses.

.. image:: react_agent.png
:alt: ReAct Agent Flow Diagram
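The tool method is an ordinary Python callable; its docstring is used as the tool description unless a custom
`tool_description` is provided. A minimal sketch of such a tool, stubbed instead of calling a real backend (the
function name and canned response below are illustrative, not part of the Toolium API):

```python
def tv_recommendations(user_question):
    """
    Tool that helps users find TV content.
    The docstring becomes the tool description shown to the agent.
    """
    # Stub: a real tool would forward user_question to the system under test
    # and return its actual response
    return 'I found these comedies: "The Office" and "Parks and Recreation"'

print(tv_recommendations('Recommend me something funny'))
```

This stub can then be passed to `create_react_agent` as the `tool_method` argument.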

Once you have created an AI agent, you can execute it using the `execute_agent` function from the same module. This
function runs the agent and logs all conversation messages and tool calls, providing insight into the agent's
behavior and its interactions during execution. You can also provide previous messages to give the agent context
for its reasoning and actions.

.. code-block:: python

from toolium.utils.ai_utils.ai_agent import create_react_agent, execute_agent

# Create a ReAct agent with a system message and a tool method
system_message = "You are an assistant that helps users find TV content based on their preferences."
tool_method = tv_recommendations # This should be a function that the agent can call as a tool
provider = 'azure' # Specify the AI provider to use, e.g., 'azure' or 'openai'
model_name = 'gpt-4o-mini' # Specify the model to use for the agent

agent = create_react_agent(system_message, tool_method=tool_method, provider=provider, model_name=model_name)

# Execute the agent and log all interactions
final_state = execute_agent(agent)

Default provider and model can be set in the *[AI]* section of the properties.cfg file::

[AI]
provider: azure # AI provider to use, openai by default
openai_model: gpt-3.5-turbo # OpenAI model to use, gpt-4o-mini by default
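Arguments passed explicitly to `create_react_agent` take precedence over these properties, which in turn fall back
to the library defaults. A simplified sketch of that resolution order (a plain-Python approximation for illustration,
not the module's actual API):

```python
def resolve_provider(argument=None, config_value=None, default='openai'):
    # Explicit argument wins, then the [AI] section of properties.cfg,
    # then the library default
    return argument or config_value or default

print(resolve_provider())                    # nothing set: library default
print(resolve_provider(None, 'azure'))       # config value applies
print(resolve_provider('openai', 'azure'))   # explicit argument wins
```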
Binary file added docs/react_agent.png
4 changes: 3 additions & 1 deletion pyproject.toml
@@ -129,7 +129,9 @@ select = [
"RUF", # Ruff-specific rules
]
# Rules to ignore
ignore = []
ignore = [
"COM812", # flake8-missing-trailing-comas (conflict with ruff format)
]

[tool.ruff.lint.isort]
combine-as-imports = true
5 changes: 3 additions & 2 deletions requirements_ai.txt
@@ -1,4 +1,5 @@
langchain-openai~=1.1 # OpenAI LLMs in AI agents
langgraph~=1.0 # AI agents
spacy~=3.8.7
sentence-transformers~=5.1
transformers==4.56.2; python_version < '3.10'
openai~=1.108
openai~=1.108 # OpenAI LLMs
3 changes: 0 additions & 3 deletions requirements_dev.txt
@@ -9,7 +9,4 @@ wheel~=0.40
twine~=6.2
behave~=1.3 # behave tests
importlib_metadata~=8.7
spacy~=3.8
sentence-transformers~=5.1
openai~=2.7
ruff~=0.15
1 change: 1 addition & 0 deletions toolium/test/conf/properties.cfg
@@ -65,3 +65,4 @@ text_similarity_method: spacy
spacy_model: en_core_web_sm
sentence_transformers_model: all-mpnet-base-v2
openai_model: gpt-4o-mini
provider: azure
45 changes: 45 additions & 0 deletions toolium/test/conftest.py
@@ -0,0 +1,45 @@
"""
Copyright 2026 Telefónica Innovación Digital, S.L.
This file is part of Toolium.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

import logging
import logging.config
import os

import pytest


def pytest_configure(config): # noqa: ARG001
"""Configure logging for all tests in this directory and subdirectories."""
# Configure the log filename (use forward slashes for cross-platform compatibility)
log_filename = 'toolium/test/output/toolium_tests.log'

# Ensure log directory exists before loading logging config
log_dir = os.path.dirname(log_filename)
os.makedirs(log_dir, exist_ok=True)

# Load logging configuration from .conf file with custom logfilename
config_file = os.path.join('toolium', 'test', 'conf', 'logging.conf')
logging.config.fileConfig(config_file, defaults={'logfilename': log_filename}, disable_existing_loggers=False)


@pytest.fixture(scope='session', autouse=True)
def setup_logging():
"""
Session-level fixture to ensure logging is properly configured.
This fixture is automatically used for all tests in this directory and subdirectories.
"""
yield # noqa: PT022
4 changes: 1 addition & 3 deletions toolium/test/pageelements/test_page_element.py
@@ -495,7 +495,6 @@ def test_android_automatic_context_selection_already_in_desired_webview_context_
driver_wrapper.driver.switch_to.window.assert_called_once_with('1234567890')


@pytest.mark.skip(reason='Test disabled temporarily, needs to be reviewed')
def test_ios_automatic_context_selection_already_in_desired_webview_context(driver_wrapper):
driver_wrapper.is_android_test = mock.MagicMock(return_value=False)
driver_wrapper.is_ios_test = mock.MagicMock(return_value=True)
@@ -505,8 +504,7 @@ def test_ios_automatic_context_selection_already_in_desired_webview_context(driv
driver_wrapper.driver.context = 'WEBVIEW_12345.1'
driver_wrapper.driver.execute_script.return_value = [
{'bundleId': 'test.package.fake', 'id': 'WEBVIEW_12345.1'},
{'bundleId': 'test.package.fake', 'id': 'WEBVIEW_12345.7'},
{'bundleId': 'test.package.fake', 'id': 'WEBVIEW_54321.1'},
{'bundleId': 'other.package.fake', 'id': 'WEBVIEW_12345.7'},
]
RegisterPageObject(driver_wrapper).element_webview.web_element # noqa: B018
driver_wrapper.driver.switch_to.context.assert_not_called()
82 changes: 82 additions & 0 deletions toolium/test/utils/ai_utils/test_ai_agent.py
@@ -0,0 +1,82 @@
"""
Copyright 2026 Telefónica Innovación Digital, S.L.
This file is part of Toolium.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

import json
import logging
import os

import pytest

from toolium.utils.ai_utils.ai_agent import create_react_agent, execute_agent

# Global variable to keep track of mock responses in the agent
mock_response_id = 0

logger = logging.getLogger(__name__)


def tv_recommendations(user_question): # noqa: ARG001
"""
Tool to help users find TV content.
Asks questions to the user to understand their preferences and then recommends specific content.
Takes into account previous questions to make increasingly accurate recommendations.

:param user_question: The question from the user to the tool
:returns: A response from the tool based on the user's question
"""
mocked_responses = [
'Hi, are you feeling sad or happy today?',
'Would you like me to look for comedy or action content?',
'I have found these shows you may like: "The Office", "Parks and Recreation" and "Brooklyn Nine-Nine"',
]

# Return the next response in the list for each call, and loop back to the start after the last one
global mock_response_id
response = mocked_responses[mock_response_id]
mock_response_id = mock_response_id + 1 if mock_response_id < len(mocked_responses) - 1 else 0
return response


TV_CONTENT_SYSTEM_MESSAGE = (
'You are a user looking for TV content. '
'To do this, you will be helped by an assistant who will guide you with questions. '
"Answer the assistant's questions until it recommends specific content to you. "
'CRITICAL RULE: As soon as the TV assistant responds with concrete results '
'(I found ..., Here you have ...), stop asking questions immediately, analyze the response '
"and return an analysis about the assistant's performance, to see if it answered correctly. "
'If after 5 questions, the assistant has not given any recommendation, do not continue asking '
'and return the analysis. '
'Respond in JSON format: '
'{"result": RESULT, "analysis": "your analysis"} '
'where RESULT = true if it worked well and returned relevant content, false if not.'
)


@pytest.mark.skipif(not os.getenv('AZURE_OPENAI_API_KEY'), reason='AZURE_OPENAI_API_KEY environment variable not set')
def test_react_agent():
agent = create_react_agent(
TV_CONTENT_SYSTEM_MESSAGE, tool_method=tv_recommendations, provider='azure', model_name='gpt-4o-mini'
)
agent_results = execute_agent(agent)

# Check if the agent's final response contains a valid JSON with the expected structure and analyze the result
try:
ai_agent_response = json.loads(agent_results['messages'][-1].content)
except (KeyError, IndexError, json.JSONDecodeError) as e:
raise AssertionError('AI Agent did not return a valid response') from e
error_message = f'TV recommendations use case did not return a valid response: {ai_agent_response["analysis"]}'
assert ai_agent_response['result'] is True, error_message
128 changes: 128 additions & 0 deletions toolium/utils/ai_utils/ai_agent.py
@@ -0,0 +1,128 @@
"""
Copyright 2026 Telefónica Innovación Digital, S.L.
This file is part of Toolium.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

import logging

# AI library imports must be optional to allow installing Toolium without `ai` extra dependency
try:
from langchain_core.messages import SystemMessage
from langchain_core.tools import Tool
from langchain_openai import AzureChatOpenAI, ChatOpenAI
from langgraph.graph import END, START, MessagesState, StateGraph
from langgraph.prebuilt import ToolNode, tools_condition

AI_IMPORTS = True
except ImportError:
AI_IMPORTS = False

from toolium.driver_wrappers_pool import DriverWrappersPool

logger = logging.getLogger(__name__)


def create_react_agent(system_message, tool_method, tool_description=None, provider=None, model_name=None, **kwargs):
"""
Creates a ReAct agent using the provided system message, tool method and model name.

:param system_message: The system message to set the behavior of the assistant
:param tool_method: The method that the agent can use as a tool
:param tool_description: Optional custom description for the tool. If not provided, uses the method's docstring
:param provider: The AI provider to use (optional, 'azure' or 'openai')
:param model_name: The name of the model to use (optional)
:param kwargs: additional parameters to be passed to the LLM chat client
:returns: A compiled ReAct agent graph
"""
if not AI_IMPORTS:
raise ImportError(
"AI dependencies are not installed. Please run 'pip install toolium[ai]' to use langgraph features",
)

# Define LLM with bound tools
llm = get_llm_chat(provider=provider, model_name=model_name, **kwargs)
if tool_description:
tools = [Tool(name=tool_method.__name__, description=tool_description, func=tool_method)]
else:
tools = [tool_method]
llm_with_tools = llm.bind_tools(tools)

# Define assistant with system message
sys_msg = SystemMessage(content=system_message)

def assistant(state: MessagesState):
return {'messages': [llm_with_tools.invoke([sys_msg] + state['messages'])]}

# Build graph
builder = StateGraph(MessagesState)
builder.add_node('assistant', assistant)
builder.add_node('tools', ToolNode(tools))
builder.add_edge(START, 'assistant')
builder.add_conditional_edges(
'assistant',
tools_condition,
)
builder.add_edge('tools', 'assistant')
builder.add_edge('assistant', END)

# Compile graph
logger.info('Creating ReAct agent with model %s and tools %s', model_name, tools)
graph = builder.compile()
return graph


def get_llm_chat(provider=None, model_name=None, **kwargs):
"""
Get LLM Chat instance based on the provider and model name specified in the parameters or in the configuration file.

:param provider: the AI provider to use (optional, 'azure' or 'openai')
:param model_name: name of the model to use
:param kwargs: additional parameters to be passed to the chat client
:returns: langchain LLM Chat instance
"""
config = DriverWrappersPool.get_default_wrapper().config
provider = provider or config.get_optional('AI', 'provider', 'openai')
model_name = model_name or config.get_optional('AI', 'openai_model', 'gpt-4o-mini')
llm = AzureChatOpenAI(model=model_name, **kwargs) if provider == 'azure' else ChatOpenAI(model=model_name, **kwargs)
return llm


def execute_agent(ai_agent, previous_messages=None):
"""
Executes the given AI agent and logs all conversation messages and tool calls.

:param ai_agent: The AI agent to be executed
:param previous_messages: Optional list of previous messages to provide context to the agent
:returns: The final state of the agent after execution
"""
logger.info('Executing AI agent with previous messages: %s', previous_messages)
initial_state = MessagesState(messages=previous_messages or [])
final_state = ai_agent.invoke(initial_state)

# Log all conversation messages and tool calls to help with debugging and understanding the agent's behavior
logger.info('AI agent execution completed with %d messages', len(final_state['messages']))
for msg in final_state['messages']:
if msg.type == 'ai' and hasattr(msg, 'tool_calls') and msg.tool_calls:
for tool_call in msg.tool_calls:
logger.debug(
'%s: calling %s tool with args %s',
msg.type.upper(),
tool_call['name'],
tool_call['args'],
)
else:
logger.debug('%s: %s', msg.type.upper(), msg.content)

return final_state