Skip to content

Conversation

@kshyatt
Copy link
Member

@kshyatt kshyatt commented Jan 15, 2026

We can probably use these PB functions for ChainRules as well though I'm less familiar with that library, so I might need to ask someone else to do the plumbing :)

@kshyatt kshyatt requested review from Jutho and lkdvos January 15, 2026 16:18
@codecov
Copy link

codecov bot commented Jan 16, 2026

Codecov Report

❌ Patch coverage is 0% with 78 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...orOperationsEnzymeExt/TensorOperationsEnzymeExt.jl 0.00% 78 Missing ⚠️
Files with missing lines Coverage Δ
...orOperationsEnzymeExt/TensorOperationsEnzymeExt.jl 0.00% <0.00%> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lkdvos
Copy link
Member

lkdvos commented Jan 16, 2026

I'll gladly update the chainrules implementation once we are happy with the names and signatures etc

@kshyatt
Copy link
Member Author

kshyatt commented Jan 16, 2026

Since I'm already modifying the Mooncake stuff let's also wait for Jutho/StridedViews.jl#19 to land so that we can test the complex stuff with those rules.

Copy link
Member

@lkdvos lkdvos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw you removed the import of the chainrules overloads for tensoralloc and tensorfree. I think we still actually need this, for example in the ManualAllocator case we are both capturing the intermediate tensors in the closures and we might be freeing them at the end of the contraction, so this would lead to a use after free (and possibly segfaults).

For Chainrules, we have taken the cautious approach and simply disabled having temporary tensors and freeing them (which is exactly these overloads, i.e. temp=Val(true) is replaced by temp = Val(false)), and I would think we need this here too?

function ChainRulesCore.rrule(
::typeof(TensorOperations.tensorfree!), allocator = DefaultAllocator()
)
tensorfree!_pullback(Δargs...) = (NoTangent(), NoTangent())
return nothing, tensorfree!_pullback
end
function ChainRulesCore.rrule(
::typeof(TensorOperations.tensoralloc), ttype, structure,
istemp, allocator = DefaultAllocator()
)
output = TensorOperations.tensoralloc(ttype, structure, Val(false), allocator)
function tensoralloc_pullback(Δargs...)
return (NoTangent(), NoTangent(), NoTangent(), NoTangent(), NoTangent())
end
return output, tensoralloc_pullback
end

I'm guessing it will be better to start using the shared pullbacks for Mooncake and Chainrules in separate PRs, so probably best to leave that for now as is?

Otherwise very cool!

@kshyatt
Copy link
Member Author

kshyatt commented Jan 20, 2026

Running the Ubuntu test locally to see if I can reproduce, it's quite odd imo

@kshyatt
Copy link
Member Author

kshyatt commented Jan 20, 2026

Can't repro the ubuntu failure locally... annoying

@kshyatt kshyatt force-pushed the ksh/enzyme branch 2 times, most recently from 8286110 to 2d0a628 Compare January 21, 2026 20:28
@kshyatt
Copy link
Member Author

kshyatt commented Jan 22, 2026

@lkdvos I think is ready to review. 1.10 isn't working but the Enzyme crew are working on figuring it out, I think we can enable those tests once they have this fixed.

@kshyatt kshyatt changed the title Add Enzyme rules and move some logic into pullbacks Add Enzyme rules Jan 23, 2026
@kshyatt
Copy link
Member Author

kshyatt commented Jan 26, 2026

Suggestions seem to have created new test failures, I'll see if I can figure out why

@kshyatt
Copy link
Member Author

kshyatt commented Jan 26, 2026

OK, it looks like the chances to cache_C are responsible, and annoyingly, doing something like

cache_C = !iszero(β_dβ.val) ? copy(C_dC.val) : C_dC.val (for type stability)

doesn't help things. So I might undo the changes but leave some comments to revisit.

@kshyatt
Copy link
Member Author

kshyatt commented Jan 26, 2026

Seems like things are working now :)

kshyatt and others added 4 commits January 27, 2026 11:34
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
kshyatt and others added 3 commits January 27, 2026 11:59
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
Jutho
Jutho previously approved these changes Jan 27, 2026
Copy link
Member

@Jutho Jutho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all tests pass, I think this is good to go.

@Jutho
Copy link
Member

Jutho commented Jan 27, 2026

OK, it looks like the chances to cache_C are responsible, and annoyingly, doing something like

cache_C = !iszero(β_dβ.val) ? copy(C_dC.val) : C_dC.val (for type stability)

doesn't help things. So I might undo the changes but leave some comments to revisit.

So this approach did work in the end? Was there another cause for type instabilities then?

@kshyatt
Copy link
Member Author

kshyatt commented Jan 27, 2026

So this approach did work in the end? Was there another cause for type instabilities then?

This did work, it seems to have fixed compiler issues on 1.12. On 1.10, there's a weird memory corruption bug I was unable to diagnose but the Enzyme devs are digging into it.

Jutho
Jutho previously approved these changes Jan 27, 2026
@Jutho
Copy link
Member

Jutho commented Jan 27, 2026

I might have created a merge conflict by merging the compat thing (which jsut reorganized the order of the compat entries).

@kshyatt kshyatt enabled auto-merge (squash) January 27, 2026 12:51
@kshyatt kshyatt merged commit d26b592 into master Jan 27, 2026
12 of 13 checks passed
@kshyatt kshyatt deleted the ksh/enzyme branch January 27, 2026 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants