Skip to content

Conversation

@kshyatt
Copy link
Member

@kshyatt kshyatt commented Jan 21, 2026

Added more tests to test_pullbacks_match to make sure the state of the arguments is restored, and the final argument derivatives match between inplace and non in place methods.

Unfortunately, the Mooncake FD tester doesn't work well for our functions, because A becomes a scratch space, and the inputs are also the outputs (so get incremented twice under the FD scheme).

@kshyatt
Copy link
Member Author

kshyatt commented Jan 21, 2026

I'd like to do a bit of tidying up with this one but it can be merged if tests pass and people feel like it, I can just make a separate PR.

@kshyatt kshyatt requested review from Jutho and lkdvos January 21, 2026 19:40
@codecov
Copy link

codecov bot commented Jan 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
...gebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl 96.78% <100.00%> (+1.56%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kshyatt kshyatt force-pushed the ksh/mooncake_inplace branch 3 times, most recently from 8a1283e to 772b61b Compare January 28, 2026 20:20
Copy link
Member

@lkdvos lkdvos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me!

I have two remaining questions:

  1. This also bumps the Mooncake version, should this come before or after #164 ?
  2. I might be getting this wrong, but I am kind of confused now about why we are zero!ing the dargs. Aren't we supposed to accumulate the contributions of the gradient into that, not setting them to? I.e., should this not just be darg1 += zero(darg1), or effectively we don't have to do that?

[EDIT] ignore this last part, I'm stupid.

@lkdvos lkdvos force-pushed the ksh/mooncake_inplace branch from facdda6 to 4528a73 Compare January 29, 2026 15:28
@kshyatt kshyatt merged commit 62688ed into main Jan 29, 2026
10 checks passed
@kshyatt kshyatt deleted the ksh/mooncake_inplace branch January 29, 2026 16:38
@lkdvos lkdvos mentioned this pull request Jan 29, 2026
Comment on lines +67 to +85
function copy_tangent(var::Mooncake.CoDual, Δargs)
dargs = make_mooncake_fdata(deepcopy(Δargs))
copyto!(Mooncake.tangent(var), dargs)
return
end

function copy_tangent(var::Mooncake.CoDual, Δargs::Tuple)
dargs = make_mooncake_fdata.(deepcopy(Δargs))
for (var_tangent, darg) in zip(Mooncake.tangent(var), dargs)
if var_tangent isa Mooncake.FData
for (var_f, darg_f) in zip(Mooncake._fields(var_tangent), Mooncake._fields(darg))
copyto!(var_f, darg_f)
end
else
copyto!(var_tangent, darg)
end
end
return
end
Copy link
Member

@Jutho Jutho Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these functions have been called copy_tangent! for consistency with Julia naming guidelines?

end

function _get_inplace_derivative(f!, A, ΔA, args, Δargs, ::Nothing, rdata)
function _get_inplace_derivative(f!, A, ΔA, args, Δargs, ::Nothing, rdata; ȳ = Δargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the role of Δargs and ȳ. Is Δargs is just the allocated space for the shadow variables of args, whereas ȳ contains the actual cotangents? But in most cases they are the same, and inplace_out contains Δargs as tangents and copy_tangent(inplace_out, ȳ) ends up copying ȳ into Δargs, i.e. into itself?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, this is just a question to make sure I understand. I don't take an issue with copying into itself in these test methods.

dA_copy_ = Mooncake.arrayify(A, dA_copy)[2]
@test dA_inplace_ ≈ dA_copy_
@test copy_args == inplace_args
if dargs_copy isa Tuple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do the dargs_... variables contain at the end of the _get_..._derivative ? They are all set to zero in the pullback, right? Does it make sense to just test that both dargs_copy and dargs_inplace are both zero, rather than simply testing that they are equal (which is true if they are both zero)?

lkdvos referenced this pull request Jan 30, 2026
* update changelog

* bump v0.6.4

* add list of contributors

* Update Project.toml

Co-authored-by: Jutho <Jutho@users.noreply.github.com>

---------

Co-authored-by: Jutho <Jutho@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants