Immutable elementor and parallel MRT records parsing#262
Immutable elementor and parallel MRT records parsing#262digizeph merged 12 commits intobgpkit:mainfrom
Conversation
$ cargo run --release --example parallel_records_to_elem
Compiling bgpkit-parser v0.15.0 (/home/kockt/src/projects/network/bgpkit-parser)
Finished `release` profile [optimized] target(s) in 0.66s
Running `target/release/examples/parallel_records_to_elem`
Total number of routes: 2589146
Time elapsed: 2.4400434s
Total number of records: 966279
Time elapsed: 1.601405902ss (to chunk into raw records)
Total records: 2589146
Time elapsed: 1.75834702s
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #262 +/- ##
==========================================
- Coverage 91.23% 91.09% -0.15%
==========================================
Files 83 83
Lines 14939 14963 +24
==========================================
Hits 13630 13630
- Misses 1309 1333 +24 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
3ab4651 to
6d4532d
Compare
475bd92 to
b055524
Compare
Add BgpkitParser helper methods to simplify PeerIndexTable extraction: - into_elementor_and_raw_record_iter() for raw records - into_elementor_and_record_iter() for parsed records These methods handle the peek/consume pattern automatically, returning an Elementor pre-initialized with PeerIndexTable and an iterator over remaining records.
There was a problem hiding this comment.
Pull request overview
This PR introduces a non-mutating (“immutable”) Elementor conversion path to enable sharing an initialized Elementor across threads and parsing MRT records into BgpElems in parallel, primarily to speed up large RIB/TableDumpV2 processing.
Changes:
- Added
Elementor::record_to_elems_iter(&self, ...)plus new lazy iterator types (RecordElemIter,BgpUpdateElemIter) and explicitElemErrorfor non-mutating conversion. - Added convenience APIs on
BgpkitParserto return anElementorpre-initialized from an initialPeerIndexTablealong with record/raw-record iterators for parallel processing. - Added an example and changelog entry documenting the parallel parsing approach and expected performance gains.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
src/parser/mrt/mrt_elem.rs |
Introduces immutable/lazy element conversion APIs and iterator implementations; refactors existing Vec-based conversion to reuse iterator path. |
src/parser/mod.rs |
Re-exports new public iterator/error types from mrt_elem. |
src/parser/iters/mod.rs |
Adds helper methods to pre-extract PeerIndexTable and return (Elementor, iterator) pairs for parallel workflows. |
src/parser/iters/default.rs |
Tweaks record filtering logic when filters are present. |
examples/parallel_records_to_elem.rs |
New example demonstrating multi-threaded parsing with shared immutable Elementor. |
CHANGELOG.md |
Documents the new immutable/lazy Elementor APIs and the parallel processing example. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/parser/mrt/mrt_elem.rs
Outdated
| RecordElemIter::RibAfi { entries, .. } => (0, Some(entries.len())), | ||
| } |
There was a problem hiding this comment.
RecordElemIter::RibAfi::size_hint returns a lower bound of 0 even though entries is an IntoIter with an exact remaining length. This underestimates iteration size and can lead to missed preallocation opportunities. Consider returning (entries.len(), Some(entries.len())).
src/parser/iters/default.rs
Outdated
| if self.parser.filters.is_empty() | ||
| || elems.iter().any(|e| e.match_filters(&self.parser.filters)) | ||
| { |
There was a problem hiding this comment.
In RecordIterator::next, this branch is only reached when filters is non-empty (due to the earlier if filters.is_empty() { ... } else { ... }). The added self.parser.filters.is_empty() || check is therefore redundant and makes the filter logic harder to follow; consider simplifying to just the any(match_filters) check here.
| if self.parser.filters.is_empty() | |
| || elems.iter().any(|e| e.match_filters(&self.parser.filters)) | |
| { | |
| if elems.iter().any(|e| e.match_filters(&self.parser.filters)) { |
examples/parallel_records_to_elem.rs
Outdated
| } | ||
| println!("\nTotal number of routes (sequential mutable Elementor): {cnt}"); | ||
| println!( | ||
| "Time elapsed: {:?}s (parser -> RawRecord -> Elementor::record_to_elems)", |
There was a problem hiding this comment.
The format string appends a literal s after {:?} for a Duration. Duration's Debug output already includes the unit suffix (e.g., 2.20s), so this will print ...ss. Consider removing the extra s or formatting with as_secs_f64() if you want a numeric seconds value.
| "Time elapsed: {:?}s (parser -> RawRecord -> Elementor::record_to_elems)", | |
| "Time elapsed: {:?} (parser -> RawRecord -> Elementor::record_to_elems)", |
| pub fn record_to_elems_iter(&self, record: MrtRecord) -> Result<RecordElemIter<'_>, ElemError> { | ||
| let timestamp = { | ||
| let t = record.common_header.timestamp; | ||
| if let Some(micro) = &record.common_header.microsecond_timestamp { | ||
| let m = (*micro as f64) / 1000000.0; | ||
| t as f64 + m |
There was a problem hiding this comment.
New public APIs (record_to_elems_iter, bgp_to_elems_iter, bgp_update_to_elems_iter) don’t appear to have direct unit tests, while this module already has tests for record_to_elems. Adding tests that validate iterator output equivalence vs the existing Vec-based methods, plus the new error cases (UnexpectedPeerIndexTable, MissingPeerTable), would help prevent regressions in this parsing-critical code.
- Fix RecordElemIter::RibAfi::size_hint to return accurate lower bound - Remove redundant empty filters check in default iterator - Fix Duration formatting in parallel example - Add comprehensive tests for new immutable Elementor APIs - record_to_elems_iter equivalence tests - Error handling tests (UnexpectedPeerIndexTable, MissingPeerTable) - BGP message type tests - size_hint correctness test
Elementor took
&mut selfwhen convering a MrtRecord to BgpElem. This branch has a version that does not require mutating state.This allows an Elementor to be shared between threads after the peer index table is set.
examples/parallel_records_to_elem.rsshows how this can be used and shows a speedup. In my testing, the speedup is significant. This feels logical to me, the heavy work of parsing the content can now be done on many threads.