Document and implement per-test execution output and ensure suite headings are deduplicated. Updates include: docs/specs to describe per-test lines (passed/failed, model, tokens), tests in src/lib/it/it.test.ts that cover failed async test logging and suite-heading deduplication, change in src/lib/it/it.ts to log based on actual test result (didPass()), and refactor src/lib/output/testLogging.ts to track logged suites with a Set and reset it. These changes ensure accurate pass/fail messages and that each suite header is printed only once.
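The Set-based suite tracking described above might look roughly like the following. This is a minimal sketch, not the code in src/lib/output/testLogging.ts: the names loggedSuites, suiteHeadingOnce, and resetLoggedSuites are hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch: these names are illustrative, not the project's actual API.
const loggedSuites = new Set<string>();

/** Returns the heading the first time a suite is seen, and null on repeats. */
function suiteHeadingOnce(suiteName: string): string | null {
  if (loggedSuites.has(suiteName)) {
    return null; // heading already printed for this suite
  }
  loggedSuites.add(suiteName);
  return suiteName;
}

/** Clears the tracked suites, e.g. between runs, so headings print again. */
function resetLoggedSuites(): void {
  loggedSuites.clear();
}
```

A caller would print the returned heading only when it is non-null, and call the reset function wherever a fresh run begins.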
This reverts commit e63404d.
Replace per-test pass/fail lines with a unified "Finished in <N ms>" message and adjust duration formatting (adds space before "ms"). Remove didPass parameter from logging APIs and simplify logCurrentContextExecution/logTestExecution to always emit timing (and include model/token lines when present). Update CLI summary layout for aligned labels and cyan-bolded values. Bump spec example model to gpt-5.2 and update docs/specs/tests to match the new output format; remove some getFailedTestCount usage in it.ts.
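As an illustration of the unified timing line, the sketch below builds a "Finished in <N ms>" message without any pass/fail input. It assumes ANSI escape codes for the cyan bold styling; the ExecutionInfo shape, the formatExecutionLines name, and the "Model:" and "Tokens:" labels are hypothetical and may not match the real logging APIs.

```typescript
// Hypothetical sketch: the interface, function name, and the "Model:"/"Tokens:"
// labels are assumptions, not the project's actual output format.
const cyanBold = (text: string): string => `\x1b[1;36m${text}\x1b[0m`;

interface ExecutionInfo {
  durationMs: number;
  model?: string;
  totalTokens?: number;
}

/** Always emits the timing line; model and token lines appear only when present. */
function formatExecutionLines(info: ExecutionInfo): string[] {
  // Note the space before "ms", matching the adjusted duration formatting.
  const duration = cyanBold(`${info.durationMs} ms`);
  const lines = [`- Finished in ${duration}`];
  if (info.model !== undefined) {
    lines.push(`- Model: ${info.model}`);
  }
  if (info.totalTokens !== undefined) {
    lines.push(`- Tokens: ${info.totalTokens}`);
  }
  return lines;
}
```

Because the function takes no didPass-style argument, the same line is produced whether the test passed or failed; the outcome is left to the summary.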
This pull request updates the test execution logging format and summary output for the CLI and test framework. The changes standardize log lines to use "Finished in" instead of "Passed in" or "Failed in", remove explicit pass/fail indicators from per-test logs, and update summary formatting to highlight only the counts in cyan and bold. The implementation and test files are updated for consistency, and related documentation/specs are revised to match the new output.
Logging and Output Format Changes
- Per-test log lines now read "- Finished in <cyan bold N ms>" instead of "- ✅ Passed in <cyan bold Nms>" or "- ❌ Failed in <cyan bold Nms>", removing explicit pass/fail indicators.
- The summary output now renders only the counts (e.g., "1 passed") in cyan and bold, not the labels (e.g., Files, Evals), with formatting updated for alignment and clarity.
Implementation Refactoring
- Removed pass/fail tracking from test logging (the didPass parameter and outcome calculation), simplifying logTestExecution and related APIs.
Documentation and Spec Updates
- Docs and specs updated to match the new output format; the spec example model is bumped to gpt-5.2.
Test Updates
- Tests updated to match the new output format.
Code Cleanup
- Removed some getFailedTestCount usage in it.ts.
Together, these changes make test logging more consistent, with clearer output and simpler logging APIs.
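The summary layout change (plain, aligned labels with only the counts in cyan and bold) could be sketched as follows. This is an assumption-laden illustration: formatSummary and highlightValue are hypothetical names, the two-space gap is arbitrary, and ANSI escape codes stand in for whatever styling helper the CLI actually uses.

```typescript
// Hypothetical sketch: function names and the two-space gap are assumptions.
const highlightValue = (text: string): string => `\x1b[1;36m${text}\x1b[0m`;

/** Pads labels to a common width and styles only the values. */
function formatSummary(rows: Array<[string, string]>): string[] {
  // Width of the longest label, so all values start in the same column.
  const width = rows.reduce((max, [label]) => Math.max(max, label.length), 0);
  return rows.map(
    ([label, value]) => `${label.padEnd(width)}  ${highlightValue(value)}`
  );
}

// Example usage with labels mentioned in the PR summary.
const summaryLines = formatSummary([
  ["Files", "1 passed"],
  ["Evals", "1 passed"],
]);
summaryLines.forEach((line) => console.log(line));
```

Padding is applied to the unstyled label, so the escape sequences around the value do not disturb the column alignment.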