Skip to content

CLI Improvement#146

Open
BrianSchiller wants to merge 15 commits intomainfrom
cli-improvement
Open

CLI Improvement#146
BrianSchiller wants to merge 15 commits intomainfrom
cli-improvement

Conversation

@BrianSchiller
Copy link
Collaborator

Just 2 small fixes

Copilot AI review requested due to automatic review settings February 27, 2026 10:42
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request significantly enhances the ASF (Algorithm Selection Framework) with major documentation additions, code refactoring, and CLI improvements. Despite the title "CLI Improvement" and description "Just 2 small fixes", this PR contains extensive changes across 11 files.

Changes:

  • Comprehensive documentation for all algorithm selectors (789 lines covering 19 selectors with detailed descriptions, parameters, and usage guidance)
  • Major refactoring of the SNNAP selector from k-NN with Euclidean distance to a Jaccard-distance-based approach using per-algorithm regression models
  • CLI enhancements including ASlib scenario support, proper boolean flag handling, and improved data alignment
  • API improvements including parameter renaming in SATzilla for clarity, random_state addition to SurvivalAnalysis, type hint corrections, and fallback handling in ISAC

Reviewed changes

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

Show a summary per file
File Description
mkdocs.yml Adds navigation entry for the new selectors documentation page
docs/features/selectors.md New comprehensive documentation covering all 19 algorithm selectors with parameters, workflows, and usage guidance
asf/selectors/survival_analysis.py Adds configurable random_state parameter for differential evolution optimization
asf/selectors/snnap.py Major refactoring: replaces k-NN with Euclidean distance with Jaccard distance on predicted top-n algorithms; uses per-algorithm regression models on z-score normalized performance
asf/selectors/satzilla.py Renames parameter from model_class to label_classifier_model and adds epm_regressor_model for clarity; updates hyperparameter definitions
asf/selectors/rpc_selector.py Adds type hint Any to **kwargs parameter
asf/selectors/parallel_portfolio_selector.py Adds type hint Any to **kwargs parameter
asf/selectors/meta_selector.py Adds comprehensive docstring explaining parameters and requirements
asf/selectors/isac.py Adds global fallback algorithm to handle edge cases where clusters have no valid performance data
asf/selectors/hybrid_decision_tree.py Adds type hint Any to **kwargs parameter and imports Any from typing
asf/selectors/abstract_epm_based_selector.py Simplifies parameter filtering logic by removing unused em_* and use_log10 parameters
asf/cli/cli_train.py Major improvements: adds ASlib scenario support, fixes boolean flag handling with mutually exclusive groups, adds preprocessor/presolver choices, implements data alignment checks, adds random_state parameter for tuning
Comments suppressed due to low confidence (5)

asf/cli/cli_train.py:131

  • The maximize argument handling has changed from accepting a boolean type to using mutually exclusive action flags. However, the default is now None for both --maximize and --minimize, which means when neither flag is provided, the maximize variable will be None. Later in the code (line 387 and 402), there's logic to handle this None case, but it might be clearer to set a proper default value or document this behavior more explicitly to avoid confusion.
    maximize_group = parser.add_mutually_exclusive_group()
    maximize_group.add_argument(
        "--maximize",
        action="store_true",
        default=None,
        help="Maximize the objective",
    )
    maximize_group.add_argument(
        "--minimize",
        action="store_false",
        dest="maximize",
        default=None,
        help="Minimize the objective (default)",
    )

asf/cli/cli_train.py:378

  • The error message should be updated from "--feature-data and --performance-data are required" to something like "--feature-data and --performance-data are required when --aslib-scenario is not provided" to be more accurate about when these arguments are needed.
        if args.feature_data is None or args.performance_data is None:
            parser.error("--feature-data and --performance-data are required")

asf/cli/cli_train.py:51

  • The preprocessor_list construction may include private or internal sklearn classes that start with underscores or are not intended for direct use. Consider adding a filter to exclude names starting with underscore (e.g., and not name.startswith('_')) to avoid exposing internal implementation details.
preprocessor_list = sorted(
    name
    for name in dir(preprocessing)
    if isinstance(getattr(preprocessing, name), type)
    and issubclass(getattr(preprocessing, name), TransformerMixin)
)

asf/cli/cli_train.py:446

  • After aligning features and performance DataFrames to their common index (lines 445-446), the features_running_time DataFrame (if it exists from ASlib scenario) may still have the original, unaligned index. This could cause index mismatch issues when features_running_time is passed to tune_selector. Consider also aligning features_running_time after the index alignment: if features_running_time is not None: features_running_time = features_running_time.loc[common_index]
    if not features.index.equals(performance.index):
        common_index = features.index.intersection(performance.index)
        if common_index.empty:
            parser.error(
                "features and performance indices do not overlap; check your input files"
            )
        print(
            "[WARN] Aligning features/performance indices to shared instances;"
            f" kept {len(common_index)} rows"
        )
        features = features.loc[common_index]
        performance = performance.loc[common_index]

asf/cli/cli_train.py:426

  • If presolver_names is not empty but presolver_budget is 0 (which can happen when presolver_ratio is 0 or very small), the division on line 422 could result in each presolver having a budget of 0. While this doesn't cause a runtime error, it's semantically incorrect to initialize presolvers with zero budget. Consider checking if presolver_budget > 0 before creating presolver instances, or skipping presolver instantiation when the budget is insufficient.
        presolver_classes = []
        if presolver_names:
            presolver_per_budget = presolver_budget / len(presolver_names)
            presolver_classes = [
                getattr(presolving, name)(budget=presolver_per_budget)
                for name in presolver_names
            ]

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link

codecov bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #146      +/-   ##
==========================================
+ Coverage   84.02%   84.03%   +0.01%     
==========================================
  Files          91       91              
  Lines        7644     7643       -1     
==========================================
  Hits         6423     6423              
+ Misses       1221     1220       -1     
Flag Coverage Δ
full 84.03% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
asf/cli/cli_train.py 56.74% <ø> (+0.31%) ⬆️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants