…umerators
- filterAsync: replace asyncSeq-builder with OptimizedFilterAsyncEnumerator, avoiding AsyncGenerator allocation and generator-chain dispatch per element.
- chooseAsync (non-AsyncSeqOp path): replace asyncSeq-builder with OptimizedChooseAsyncEnumerator for the same reason.
- foldAsync (non-AsyncSeqOp path): replace scanAsync+lastOrDefault composition with a direct loop, eliminating the intermediate async sequence and its generator machinery entirely.
- Add AsyncSeqFilterChooseFoldBenchmarks and AsyncSeqPipelineBenchmarks to measure the affected operations and catch future regressions.
All 317 existing tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
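The direct-loop idea in this commit message can be pictured with a small, self-contained sketch. It is not the PR's code: IPullEnumerator, FilterAsyncEnumerator, ofList, and this standalone foldAsync are hypothetical names defined here for illustration, and the library's real enumerator interface and OptimizedFilterAsyncEnumerator may look different. The sketch only assumes a pull-style MoveNext that returns an Async option.

```fsharp
// Hypothetical pull-style enumerator, defined here only to keep the sketch
// self-contained; the library's actual interface may differ.
type IPullEnumerator<'T> =
    abstract MoveNext : unit -> Async<'T option>

// Direct while-loop filter: keep pulling from the source until the async
// predicate accepts an element or the source is exhausted.
type FilterAsyncEnumerator<'T>(predicate: 'T -> Async<bool>, source: IPullEnumerator<'T>) =
    interface IPullEnumerator<'T> with
        member _.MoveNext() =
            async {
                // ref cells, because async closures cannot capture `let mutable`
                let result : 'T option ref = ref None
                let searching = ref true
                while searching.Value do
                    let! next = source.MoveNext()
                    match next with
                    | None -> searching.Value <- false
                    | Some value ->
                        let! keep = predicate value
                        if keep then
                            result.Value <- Some value
                            searching.Value <- false
                return result.Value
            }

// Direct-loop fold: no intermediate sequence, just a state cell and a loop.
let foldAsync (folder: 'S -> 'T -> Async<'S>) (init: 'S) (source: IPullEnumerator<'T>) : Async<'S> =
    async {
        let state = ref init
        let running = ref true
        while running.Value do
            let! next = source.MoveNext()
            match next with
            | None -> running.Value <- false
            | Some value ->
                let! updated = folder state.Value value
                state.Value <- updated
        return state.Value
    }

// Hypothetical helper: expose a list as a pull-style source.
let ofList (items: 'T list) : IPullEnumerator<'T> =
    let remaining = ref items
    { new IPullEnumerator<'T> with
        member _.MoveNext() =
            async {
                match remaining.Value with
                | [] -> return None
                | head :: tail ->
                    remaining.Value <- tail
                    return Some head
            } }

// Usage: keep the even numbers from 1..6 and sum them.
let evens =
    FilterAsyncEnumerator((fun (x: int) -> async { return x % 2 = 0 }), ofList [ 1 .. 6 ])
    :> IPullEnumerator<int>

let total =
    foldAsync (fun acc x -> async { return acc + x }) 0 evens
    |> Async.RunSynchronously
// total is 2 + 4 + 6 = 12
```

Each MoveNext call here allocates only the async workflow for that call, whereas a builder-based version would also route every element through generator objects and continuation chains, as the commit message describes.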
Contributor
/repo-assist Add benchmark.net benchmarks and take them before/after
Contributor
Oh I see the benchmarks are there. Run them and tell me before/after
Use BenchmarkSwitcher.FromAssembly instead of custom argument parsing, so BenchmarkDotNet CLI options (--filter, --job short, --inProcess, etc.) work out of the box when running the benchmarks directly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
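A minimal sketch of what such an entry point might look like, assuming the benchmark project references the BenchmarkDotNet package. This is not the PR's actual Program.fs, which is not shown on this page.

```fsharp
open System.Reflection
open BenchmarkDotNet.Running

[<EntryPoint>]
let main argv =
    // Pass every command-line argument straight to BenchmarkDotNet so that
    // its own options (for example --filter or --job short) are parsed by
    // the library rather than by hand-written code.
    BenchmarkSwitcher.FromAssembly(Assembly.GetExecutingAssembly()).Run(argv)
    |> ignore
    0
```

With an entry point like this, an invocation such as `dotnet run -c Release -- --filter '*filterAsync*'` from the benchmark project directory would select matching benchmarks; the filter pattern here is only an illustration.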
Contributor Author
Commit pushed:
Contributor Author
🤖 Repo Assist ran BenchmarkDotNet benchmarks before/after this PR. The benchmark runner was also updated in this PR to use Benchmarks run with
| Method | N | Before (mean) | After (mean) | Speedup | Before (alloc) | After (alloc) | Alloc reduction |
|---|---|---|---|---|---|---|---|
| filterAsync (all pass) | 1 000 | 5 439 μs | 3 178 μs | 1.71× | 6 433 KB | 3 456 KB | 1.86× |
| filterAsync (none pass) | 1 000 | 4 159 μs | 2 293 μs | 1.81× | 4 596 KB | 2 368 KB | 1.94× |
| chooseAsync (all selected) | 1 000 | 605 μs | 585 μs | 1.03× | 1 214 KB | 1 214 KB | 1.00× |
| foldAsync | 1 000 | 328 μs | 316 μs | 1.04× | 666 KB | 666 KB | 1.00× |
| filterAsync (all pass) | 10 000 | 55 842 μs | 31 813 μs | 1.75× | 62 780 KB | 33 714 KB | 1.86× |
| filterAsync (none pass) | 10 000 | 41 959 μs | 22 390 μs | 1.87× | 44 847 KB | 23 090 KB | 1.94× |
| chooseAsync (all selected) | 10 000 | 5 498 μs | 5 439 μs | 1.01× | 11 846 KB | 11 846 KB | 1.00× |
| foldAsync | 10 000 | 2 894 μs | 2 863 μs | 1.01× | 6 497 KB | 6 497 KB | 1.00× |
AsyncSeqPipelineBenchmarks (map → filter → fold composed)
| Method | N | Before (mean) | After (mean) | Speedup | Before (alloc) | After (alloc) | Alloc reduction |
|---|---|---|---|---|---|---|---|
| MapFilterFold | 1 000 | 12 983 μs | 3 584 μs | 3.62× | 16.42 MB | 3.93 MB | 4.18× |
| MapFilterFold | 10 000 | 131 351 μs | 35 549 μs | 3.69× | 164.1 MB | 39.22 MB | 4.18× |
| ToArray (control) | 1 000 | 2 168 μs | 2 171 μs | 1.00× | 2.13 MB | 2.13 MB | n/a |
| ToArray (control) | 10 000 | 21 405 μs | 21 274 μs | 1.00× | 21.33 MB | 21.34 MB | n/a |
Key observations:
- filterAsync is 1.7–1.9× faster and allocates about 1.9× less: the asyncSeq { for } builder had meaningful overhead.
- chooseAsync and foldAsync show negligible improvement in isolation (1.01–1.04×, with unchanged allocations), suggesting the bottleneck for those paths was already elsewhere (likely the inner async dispatch per element rather than the builder scaffolding).
- The composed pipeline (map → filter → fold) is the star result: 3.6–3.7× faster with 4.2× less memory. The gains compound because the optimised filterAsync sits in the hot path.
- ToArray (unchanged) is flat, confirming the timing environment was stable.
Generated by Repo Assist for issue #276 · ◷
To install this agentic workflow, run
gh aw add githubnext/agentics/workflows/repo-assist.md@346204513ecfa08b81566450d7d599556807389f
dsyme
approved these changes
Mar 14, 2026
🤖 This PR was created by Repo Assist, an automated AI assistant. See #275.
Closes #275
Root cause
Three high-frequency combinators were implemented using the asyncSeq computation builder (or an indirect composition), which routes every element through the AsyncGenerator/GenerateCont machinery: allocating generator objects, dispatching through virtual Apply() calls, and building right-associating continuation chains. For straightforward, non-parallel operations this overhead is unnecessary.
Changes
| Function | Before | After |
|---|---|---|
| filterAsync | asyncSeq { for … } builder | OptimizedFilterAsyncEnumerator (direct while-loop) |
| chooseAsync (non-AsyncSeqOp path) | asyncSeq { for … } builder | OptimizedChooseAsyncEnumerator (direct while-loop) |
| foldAsync (non-AsyncSeqOp path) | scanAsync + lastOrDefault | direct loop |

The new enumerators follow exactly the same pattern already used for mapAsync (OptimizedMapAsyncEnumerator, introduced in an earlier performance pass). No public API or semantics change.
New benchmark classes added:
- AsyncSeqFilterChooseFoldBenchmarks: isolates filterAsync, chooseAsync, and foldAsync at 1 000 and 10 000 elements.
- AsyncSeqPipelineBenchmarks: measures a composed map → filter → fold pipeline and toArrayAsync.
Trade-offs
- The filterAsync and chooseAsync enumerators hold an inner while loop rather than co-routine-style yield, which is slightly less familiar but consistent with the existing OptimizedMapAsyncEnumerator style.
Test Status