split index.Tensor converter for bool vs int indexing#4123

Open
wenbingl wants to merge 1 commit into pytorch:main from wenbingl:fix-index-tensor-converter

Conversation


@wenbingl wenbingl commented Mar 6, 2026

Description

Add index_has_bool_indices validator and register two separate converters for torch.ops.aten.index.Tensor:

  • Integer indexing (HIGH priority, no output allocator): output shape is deterministic based on index tensor shape.
  • Boolean indexing (requires output allocator): uses nonzero() internally, producing data-dependent output shapes.

Fixes # (issue)
Supports the CUDA graph feature for some beam-search-based models that previously failed due to the output-allocator limitation.
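The split hinges on telling the two indexing modes apart. Below is a minimal sketch of the `index_has_bool_indices` check (the name comes from the description above, but the signature and body here are assumed for illustration, not the actual implementation), together with the shape behavior that motivates the split:

```python
import torch

def index_has_bool_indices(indices):
    # Hypothetical sketch of the validator described above: boolean (mask)
    # indices lower through nonzero(), so the output shape depends on the
    # data and the converter must use the output allocator.
    return any(i is not None and i.dtype == torch.bool for i in indices)

x = torch.arange(12).reshape(3, 4)

# Integer indexing: output shape is fully determined by the index tensor's
# shape, so it is known at compile time.
rows = torch.tensor([0, 2])
assert x[rows].shape == (2, 4)
assert not index_has_bool_indices([rows])

# Boolean indexing: output shape depends on how many entries are True,
# i.e. on the data itself.
mask = torch.tensor([True, False, True])
assert x[mask].shape == (2, 4)  # 2 rows selected by the data
assert index_has_bool_indices([mask])
```

A validator like this lets the integer-only converter be registered at HIGH priority without the output allocator, while the boolean path keeps the allocator-backed fallback.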

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

…ter for bool vs int indexing

Add `index_has_bool_indices` validator and register two separate converters
for `torch.ops.aten.index.Tensor`:
- Integer indexing (HIGH priority, no output allocator): output shape is
  deterministic based on index tensor shape.
- Boolean indexing (requires output allocator): uses nonzero() internally,
  producing data-dependent output shapes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@meta-cla meta-cla bot added the cla signed label Mar 6, 2026
@github-actions github-actions bot added labels: component: conversion (Issues re: Conversion stage), component: core (Issues re: The core compiler), component: api [Python] (Issues re: Python API), component: dynamo (Issues relating to the `torch.compile` or `torch._dynamo.export` paths) Mar 6, 2026
@github-actions github-actions bot requested a review from zewenli98 March 6, 2026 22:10

wenbingl commented Mar 6, 2026

@narendasan, please take a look at whether this PR is needed for the CUDA graph feature.

@narendasan narendasan (Collaborator) left a comment


Cool, I think this is a totally reasonable approach to WAR (work around) DDS (data-dependent shape) cases. One thing that came to mind is whether there is a way to make this validator generic, because I think it makes sense to have "different" converters that share the same internal implementation but carry different restrictions and codepaths (dynamic shape, DDS, etc.), registered multiple times with validators to distinguish the restrictions. So a validator could take an arg index plus a type allow list and deny list, and pass or fail the converter. @zewenli98 @apbose what do you think?

Maybe an API similar to enforce_tensors
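The generic validator suggested above could be sketched as a small factory. Everything here (the factory name, its signature, and how the argument is looked up) is assumed for illustration and is not an existing Torch-TensorRT API:

```python
import torch

def make_dtype_validator(arg_index, allow=None, deny=None):
    # Hypothetical factory: builds a validator that inspects the dtypes of
    # the index tensors at position `arg_index` in a node's args and passes
    # or fails the converter based on allow/deny lists. In a real converter
    # the tensors would come from the FX node's args/meta, not a raw tuple.
    def validator(args):
        tensors = args[arg_index]
        dtypes = {t.dtype for t in tensors if t is not None}
        if allow is not None and not dtypes <= set(allow):
            return False
        if deny is not None and dtypes & set(deny):
            return False
        return True
    return validator

# For aten.index.Tensor the index list is the second argument (index 1).
int_only = make_dtype_validator(1, deny=[torch.bool])
bool_only = make_dtype_validator(1, allow=[torch.bool])
```

Registering the same converter twice, once with `int_only` and once with `bool_only`, would then split the integer and boolean paths (with their different priority and output-allocator restrictions) without duplicating the implementation.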
