predict_proba functionality for kNN. Small bugfix. More concise and e… by skywardfire1 · Pull Request #362 · smartcorelib/smartcore

skywardfire1 · 2026-03-20T14:39:15Z

This PR adds:

Bugfix: robust class selection. The running maximum was initialized as let mut max_c = 0f64, so if something went wrong the result was always class 0. It is now initialized with the first computed probability.
Refactoring and DRY: predict_for_row is now a thin wrapper around predict_proba_for_row. The latter serves as the main routine for two or three functions, so the code is no longer repeated.
predict_proba functionality for kNN: returns class probability distributions for input samples. Works with both search algorithms.
Extensive test suite: two tests containing roughly 15-20 distinct assertions, covering:
- Validity of probability distributions (sum to 1.0, range [0, 1])
- Consistency between predict() and predict_proba()
- Equivalence of results across search algorithms
- Edge cases: zero-weight sums, extreme k values, multiclass labels
- Behavior differences between Uniform and Distance weighting
- Batch prediction correctness
Backward compatibility: no breaking changes to the public API. All existing tests pass.
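For readers who want a picture of the wrapper relationship described above, here is a minimal, self-contained sketch. It is not the code from this PR and does not use smartcore types: the signatures, the (class index, weight) neighbor representation, and the uniform fallback for a zero weight sum are all assumptions made for illustration. It also assumes every class index is smaller than n_classes.

```rust
/// Hypothetical stand-in for `predict_proba_for_row` (not the PR's code):
/// turns the (class index, weight) pairs of the k nearest neighbors
/// into a probability distribution over `n_classes`.
fn predict_proba_for_row(neighbors: &[(usize, f64)], n_classes: usize) -> Vec<f64> {
    let mut proba = vec![0.0; n_classes];
    for &(class, weight) in neighbors {
        proba[class] += weight;
    }
    let total: f64 = proba.iter().sum();
    if total > 0.0 {
        // Normalize so the distribution sums to 1.0.
        for p in proba.iter_mut() {
            *p /= total;
        }
    } else if !neighbors.is_empty() {
        // Zero weight sum: fall back to uniform votes
        // (an assumption of this sketch, not necessarily the PR's behavior).
        let share = 1.0 / neighbors.len() as f64;
        proba.iter_mut().for_each(|p| *p = 0.0);
        for &(class, _) in neighbors {
            proba[class] += share;
        }
    }
    proba
}

/// Hypothetical thin wrapper: picks the class with the highest probability.
/// The running maximum is seeded with the first computed probability
/// instead of the constant 0.0.
fn predict_for_row(neighbors: &[(usize, f64)], n_classes: usize) -> Option<usize> {
    let proba = predict_proba_for_row(neighbors, n_classes);
    let mut iter = proba.iter().copied().enumerate();
    let (mut best_class, mut best_p) = iter.next()?;
    for (class, p) in iter {
        if p > best_p {
            best_class = class;
            best_p = p;
        }
    }
    Some(best_class)
}
```

With neighbors [(0, 1.0), (2, 1.0), (2, 2.0)] and three classes, the sketch yields the distribution [0.25, 0.0, 0.75] and predicts class 2, so predict and predict_proba agree by construction.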

I also ran a real-world test.

These are the predictions from the previous implementation:

 [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 5, 0, 1, 1, 5, 0, 1, 1, 2, 0, 2, 2, 0, 0, 4, 3, 3, 3, 3, 3, 0, 1, 0, 4, 4, 4, 4, 4, 4, 4, 5, 5, 1, 5, 5, 5, 5, 5, 0, 0, 7]
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 5, 0, 1, 3, 1, 2, 2, 2, 2, 2, 0, 3, 3, 3, 3, 3, 0, 0, 0, 4, 2, 4, 4, 4, 4, 4, 1, 5, 0, 5, 5, 5, 5, 5, 5, 0, 7]
 [0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 3, 1, 1, 4, 1, 1, 0, 1, 2, 4, 2, 4, 2, 0, 0, 3, 3, 3, 3, 0, 4, 0, 4, 0, 4, 0, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 0, 5, 0]
 [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 5, 1, 1, 1, 1, 1, 2, 4, 2, 2, 4, 1, 2, 3, 3, 3, 3, 3, 4, 0, 4, 4, 0, 4, 3, 4, 4, 4, 5, 5, 1, 5, 5, 0, 5, 5, 5, 0]
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 2, 0, 0, 2, 0, 0, 4, 3, 3, 3, 0, 0, 0, 4, 0, 0, 3, 4, 4, 0, 4, 0, 0, 5, 5, 5, 5, 4, 5, 5, 2]

These are the predictions from the new implementation:

 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 5, 0, 1, 1, 5, 0, 1, 1, 2, 0, 0, 2, 0, 0, 4, 3, 3, 3, 3, 3, 0, 1, 0, 4, 0, 4, 4, 4, 4, 4, 5, 5, 1, 0, 1, 5, 5, 5, 0, 2, 7]
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 3, 1, 2, 2, 2, 2, 2, 0, 3, 3, 3, 3, 3, 0, 0, 0, 4, 2, 4, 4, 0, 0, 4, 1, 0, 0, 5, 0, 5, 5, 5, 5, 0, 0]
 [0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 1, 0, 0, 1, 1, 0, 1, 2, 4, 2, 1, 0, 0, 0, 3, 3, 3, 3, 0, 4, 0, 4, 0, 4, 0, 4, 4, 4, 0, 5, 5, 0, 5, 5, 5, 0, 0, 1, 7]
 [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 2, 0, 4, 0, 0, 3, 3, 3, 3, 3, 4, 0, 4, 4, 0, 0, 3, 4, 4, 4, 5, 5, 1, 5, 5, 0, 5, 5, 5, 0]
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 0, 0, 2, 4, 2, 4, 0, 3, 3, 0, 0, 0, 4, 0, 0, 0, 4, 4, 0, 4, 0, 0, 1, 5, 5, 5, 1, 5, 5, 1]

Please note that they are not identical. Keep in mind that every result shown is the outcome of 18 kNN models voting, each of them created using smartcore.

The difference, in my opinion, comes from floating-point computation issues. I did my research to make sure the new kNN implementation is slightly more concise and exact, and I have made some computations more mathematically correct.
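As a generic illustration of the floating-point hypothesis above (not a diagnosis of this PR), the sketch below shows that f64 addition is not associative: summing the same three values in a different order gives results that differ in the last bit. The function names are hypothetical. If neighbor weights are accumulated in a different order after a refactor, a near-tie between two classes could in principle flip for the same reason.

```rust
/// Adds three values left to right: (a + b) + c.
fn sum_left_to_right(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

/// Adds the same three values right to left: a + (b + c).
fn sum_right_to_left(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}

/// Returns true when the two summation orders disagree bit for bit.
fn order_changes_result(a: f64, b: f64, c: f64) -> bool {
    sum_left_to_right(a, b, c) != sum_right_to_left(a, b, c)
}
```

For the inputs 0.1, 0.2, 0.3 the two orders disagree, although the gap is far below any tolerance a test would normally use.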

…xact computation. Many tests.

codecov · 2026-03-20T14:41:53Z

Codecov Report

❌ Patch coverage is 62.06897% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.35%. Comparing base (70d8a0f) to head (9b94198).
⚠️ Report is 11 commits behind head on development.

Files with missing lines	Patch %	Lines
src/neighbors/knn_classifier.rs	62.06%	11 Missing ⚠️

Additional details and impacted files

@@               Coverage Diff               @@
##           development     #362      +/-   ##
===============================================
- Coverage        45.59%   44.35%   -1.24%     
===============================================
  Files               93       95       +2     
  Lines             8034     8017      -17     
===============================================
- Hits              3663     3556     -107     
- Misses            4371     4461      +90



skywardfire1 requested a review from Mec-iS as a code owner March 20, 2026 14:39

skywardfire1 wants to merge 1 commit into smartcorelib:development from skywardfire1:feature/predict_proba_knn
