Add evaluator interface with AD backend extensions #155
Conversation
Implement the named evaluator interface with DerivativeOrder, capabilities, prepare, value_and_gradient, and dimension. Add NamedTuple <-> flat vector utilities shared by all AD extensions. Package extensions for four AD backends:

- AbstractPPLForwardDiffExt: ForwardDiff native API (GradientConfig)
- AbstractPPLMooncakeExt: Mooncake native API (prepare_gradient_cache)
- AbstractPPLEnzymeExt: Enzyme native API (autodiff ReverseWithPrimal)
- AbstractPPLDifferentiationInterfaceExt: DI generic fallback

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Prune evaluator interface to minimum load-bearing changes
- Fix silent vector truncation, Float64 coercion, and DerivativeOrder ordering
- Add isolated Julia environments per AD backend under test/ext/
- Add AbstractPPLFiniteDifferencesExt and test_autograd utility
- Run ext tests in separate parallel CI jobs; allow Enzyme to fail
- Apply JuliaFormatter pass

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit
JuliaFormatter v1.0.62

[JuliaFormatter v1.0.62] reported by reviewdog 🐶
AbstractPPL.jl/test/varname/varname.jl, line 151 in fbcde33

[JuliaFormatter v1.0.62] reported by reviewdog 🐶
AbstractPPL.jl/test/varname/varname.jl, lines 190 to 191 in fbcde33
Codecov Report ❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #155      +/-   ##
==========================================
- Coverage   84.51%   78.08%    -6.44%
==========================================
  Files          10       14        +4
  Lines         562      721      +159
==========================================
+ Hits          475      563       +88
- Misses         87      158       +71
```

☔ View full report in Codecov by Sentry.
AbstractPPL.jl documentation for PR #155 is available at:
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
force-pushed from d3c32d3 to 011c123
…ecks

Replace the Any[]-based _unflatten with a recursive peel approach that avoids Union types Enzyme cannot differentiate through. Add @inline to all value_and_gradient methods to prevent boxing the (value, grad) sret tuple. Add DimensionMismatch length checks to all vector adapters.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
force-pushed from b41e68c to fe53b23
@wsmoses you may want to take a look at this Enzyme ext, especially since the test doesn't fully cover it yet?
```julia
evaluator = AbstractPPL.prepare(problem, prototype)
x0 = AbstractPPL.flatten_to_vec(prototype)
f_vec = let evaluator = evaluator, prototype = prototype
    x -> evaluator(AbstractPPL.unflatten_from_vec(prototype, x))
```
Don't make a closure / capture the evaluator here; just call evaluator directly in autodiff with all of the arguments. Also don't add this extra shim to flatten/unflatten.
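For illustration, here is a minimal sketch of the direct-call pattern being described, assuming a callable `evaluator` that maps a flat `Vector{Float64}` to a scalar log density; the wrapper name is mine, not the PR's:

```julia
using Enzyme

# Hedged sketch: the evaluator is passed to autodiff as a Const argument
# rather than being captured in a closure, so Enzyme treats it as
# non-differentiated state and no flatten/unflatten shim is involved.
function value_and_gradient_direct(evaluator, x::Vector{Float64})
    dx = zero(x)
    # ReverseWithPrimal returns (derivatives, primal); the gradient is
    # accumulated in place into the shadow vector dx.
    _, val = Enzyme.autodiff(
        Enzyme.ReverseWithPrimal,
        Enzyme.Const(evaluator),
        Enzyme.Active,
        Enzyme.Duplicated(x, dx),
    )
    return val, dx
end
```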
Enforce prepare-time runtime compatibility for structured evaluator inputs and add direct floating-point vector dispatch across AD backends, including Mooncake. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gdalle left a comment
This 1000-LOC PR introduces a lot of duplication across backends, for reasons that I still don't fully understand. Again, if DI is insufficient in any way for Turing's purposes, feel free to point it out here and I'll work on it: JuliaDiff/DifferentiationInterface.jl#992
```julia
x = AbstractPPL.flatten_to_vec(values)
val, dx = DI.value_and_gradient(p.f_vec, p.prep, p.backend, x)
grad_nt = AbstractPPL.unflatten_from_vec(p.prototype, dx)
```
Why flatten and unflatten when you could keep the structured representation?
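One hedged sketch of what keeping the structured representation could look like, using ComponentArrays (my choice of illustration, not something the PR uses): a ComponentVector is flat storage with named access, so DI can differentiate it directly.

```julia
using ComponentArrays
using DifferentiationInterface
import ForwardDiff

nt = (mu = 0.5, sigma = [1.0, 2.0])
cx = ComponentVector(nt)              # flat storage, named access via cx.mu
f = x -> x.mu^2 + sum(abs2, x.sigma)  # toy log-density stand-in

backend = AutoForwardDiff()
prep = prepare_gradient(f, backend, cx)
val, grad = value_and_gradient(f, prep, backend, cx)
grad.sigma                            # the gradient keeps the same structure
```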
```julia
struct EnzymePrepared{E,F,T<:Real,P}
    evaluator::E
    f_vec::F
    gradient_buffer::Vector{T}
```
Why a Vector instead of whatever array type is appropriate (e.g. on GPU)?
```julia
    dim::Int
end
```
```julia
AbstractPPL.capabilities(::Type{<:DIPrepared}) = DerivativeOrder{1}()
```
```julia
f_vec = let evaluator = evaluator, prototype = prototype
    x -> evaluator(AbstractPPL.unflatten_from_vec(prototype, x))
end
grad_buf = zeros(eltype(x0), length(x0))
```
```diff
- grad_buf = zeros(eltype(x0), length(x0))
+ grad_buf = zero(x0)
```
```julia
@inline function AbstractPPL.value_and_gradient(p::EnzymePrepared, values::NamedTuple)
    x = AbstractPPL.flatten_to_vec(values)
    fill!(p.gradient_buffer, 0.0)
```
```diff
- fill!(p.gradient_buffer, 0.0)
+ fill!(p.gradient_buffer, zero(eltype(x)))
```
```julia
    return offset
end

function flatten_to_vec(nt::NamedTuple)
```
There are probably packages for flattening stuff more cleanly
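For example, ParameterHandling.jl ships a flatten/unflatten pair; a small hedged illustration of the point:

```julia
using ParameterHandling

nt = (a = 1.0, b = [2.0, 3.0])
# flatten returns the flat vector plus a closure that inverts the mapping.
v, unflatten = ParameterHandling.flatten(nt)
v             # [1.0, 2.0, 3.0]
unflatten(v)  # (a = 1.0, b = [2.0, 3.0])
```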
```julia
function flatten_to_vec(nt::NamedTuple)
    n = _scalar_count(nt)
    vec = Vector{_vec_eltype(nt)}(undef, n)
```
Why use a Vector to flatten e.g. a named tuple of GPU arrays?
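A hedged sketch of the `similar`-based alternative the question hints at; `flat_buffer` is an illustrative helper, not the PR's code:

```julia
# similar allocates in the same storage family as its argument (a CuArray
# input yields a CuVector), whereas Vector{T}(undef, n) is always host memory.
flat_buffer(x::AbstractArray, n::Integer) = similar(x, float(eltype(x)), n)

flat_buffer(zeros(Float32, 2, 2), 5)  # Vector{Float32}; CuVector for CuArray input
```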
```julia
end
function _unflatten(proto::AbstractArray{<:Real}, vec::AbstractVector, offset::Int)
    n = length(proto)
    result = reshape(@view(vec[offset:(offset + n - 1)]), size(proto))
```
The unflattening doesn't have the same type as the initial input
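A small REPL illustration of the mismatch (mine, not from the PR):

```julia
proto = rand(2, 2)                   # Matrix{Float64}
vec = collect(1.0:4.0)
result = reshape(@view(vec[1:4]), size(proto))
typeof(result) == typeof(proto)      # false: a ReshapedArray over a SubArray
```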
```julia
# FiniteDifferences has no native AbstractPPL extension, so AbstractPPLDifferentiationInterfaceExt
# is the only applicable dispatch path for this backend.
```
That is obviously not true; you (or Claude?) literally added that extension above.
```julia
const fdm = FiniteDifferences.central_fdm(5, 1)
const adtype = ADTypes.AutoFiniteDifferences(; fdm)

@testset "AbstractPPLDifferentiationInterfaceExt" begin
```
The test code is duplicated across all backends, which is a bit of a shame
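A hedged sketch of how the duplication could be factored out: one shared test function taking the backend as an argument, so each backend environment reduces to a single call. The `prepare`/`value_and_gradient` calls follow the API named in this PR, but the fixture names (`problem`, `prototype`) are placeholders:

```julia
using Test

function test_backend(adtype, problem, prototype)
    @testset "$(nameof(typeof(adtype)))" begin
        p = AbstractPPL.prepare(adtype, problem, prototype)
        val, grad = AbstractPPL.value_and_gradient(p, prototype)
        @test val isa Real
        @test grad isa typeof(prototype)  # gradient mirrors the input structure
    end
end
```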
gdalle left a comment
Forgot to pick the correct review option
Tighten the evaluator interface and backend implementations for structured and vector inputs, expand targeted backend coverage, and restore CI matrix job names so required checks report again. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… core interface. This moves the evaluator surface into a self-contained ADProblems module and replaces the old finite-difference test helper with test_autograd backed by AutoFiniteDifferences. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
force-pushed from 80fff03 to 3bbe3aa
… core interface. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
force-pushed from 3bbe3aa to 4932058
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
force-pushed from e58de0a to e1bccf7
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
force-pushed from 39f5b9c to dc795a4
penelopeysm left a comment
As requested, I've not looked at the AD extensions. The biggest comments I have are:

- Consider moving to another package, please
- Docs would be really useful
```julia
function AbstractPPL.prepare(
    adtype::ADTypes.AbstractADType,
    problem,
    x::AbstractVector{<:AbstractFloat};
```
Is there a reason why AbstractFloat rather than Real? A looser type bound of Real would allow for e.g. higher-order or sparse AD without affecting the common use case of Float64
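The concrete case behind the question (my illustration): ForwardDiff's `Dual` is a `Real` but not an `AbstractFloat`, so the tighter bound rules out nested and higher-order AD inputs.

```julia
import ForwardDiff

d = ForwardDiff.Dual(1.0, 1.0)
d isa Real           # true
d isa AbstractFloat  # false: a Vector of these misses the AbstractFloat method
```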
```julia
    problem,
    x::AbstractVector{<:AbstractFloat};
    check_dims::Bool=true,
    mode::Symbol=:gradient,
```
I think this argument shouldn't be a Symbol (notice that if you pass mode=:NotARealMode, it will do the Jacobian, which is probably not what was intended). A custom type would be better:

```julia
abstract type Mode end
struct Gradient <: Mode end
struct Jacobian <: Mode end
struct Foo{F<:Mode}
    ...
end
```

```julia
using .ADProblems: prepare, value_and_gradient, value_and_jacobian, test_autograd
export prepare, value_and_gradient, value_and_jacobian, test_autograd
```
Is it mandatory to re-export at the top level? Would it be sufficient to export from the submodule and call it a day?
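The submodule-only option would look roughly like this (a hedged sketch, not the PR's actual layout):

```julia
module ADProblems
export prepare, value_and_gradient, value_and_jacobian, test_autograd
# ...definitions...
end

# No top-level re-export; downstream code opts in explicitly with
# using AbstractPPL.ADProblems: prepare, value_and_gradient
```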
```yaml
- logdensityproblems
version:
  - '1'
  - 'lts'
```
Is there a reason to use lts instead of min, which the original CI job uses?
```diff
@@ -0,0 +1,295 @@
+module ADProblems
+
+using ADTypes: ADTypes
```
Is this dep actually used in src/? I couldn't see any usage. In which case, this doesn't seem like it's needed here or in Project.toml.
```julia
# follows the most specific positional method, so this fallback cannot forward to
# a backend method that doesn't itself accept these kwargs.
function prepare(
    adtype,
```
Like, if this were adtype::AbstractADType then I could see us needing the ADTypes dep. But right now that's not the case.
```julia
throw(
    ArgumentError(
        "`prepare($(nameof(typeof(adtype)))(), ...)` requires loading the corresponding AD backend.",
    ),
)
```
These methods can be handled with error hints too, see e.g. https://github.com/JuliaDiff/DifferentiationInterface.jl/blob/5a3ea3845c90aa69d4d23720f5c299af25268a9a/DifferentiationInterface/src/init.jl#L1-L28 for some ideas perhaps?
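For reference, a hedged sketch of the error-hint mechanism the link points to, adapted to `prepare`; the exact predicate is an assumption:

```julia
function __init__()
    # Attach context to MethodErrors on `prepare` whose first argument is an
    # AD backend type, instead of defining throwing fallback methods.
    Base.Experimental.register_error_hint(MethodError) do io, exc, argtypes, kwargs
        if exc.f === prepare && !isempty(argtypes) && argtypes[1] <: ADTypes.AbstractADType
            print(io, "\nHint: load the package for this AD backend ")
            print(io, "(e.g. `using ForwardDiff`) so its extension is activated.")
        end
    end
end
```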
Thanks for the review. Changes made:

- Removed explicit derivative mode — dropped the mode kwarg
- Removed LDP capabilities — avoided evaluating user code inside
- Error handling — removed throwing fallback methods for prepare/value_and_gradient/value_and_jacobian
- NT inputs in LDP ext — removed
- ADTypes dep — removed
- Docs — added a dedicated evaluator/AD docs page covering structural vs AD preparation, vector inputs, NamedTuple inputs
- Docstring typo (line 28), CI (lts → min)

Not changed:
- Broaden AbstractVector{<:AbstractFloat} to {<:Real} across all AD ext
files and ADProblems.jl
- Rename Checked→Validate on VectorEvaluator/NamedTupleEvaluator; add
Trivial type-stability note to docstring
- Replace throwing prepare/value_and_gradient/value_and_jacobian fallback
methods with register_error_hint in __init__; removes method-ambiguity
hazard while keeping informative messages
- Restructure _assert_gradient_capability as two methods; change
test_autograd missing-backend throw from ArgumentError to error()
- Remove unused `using ADTypes: ADTypes` from src/ADProblems.jl; tell
Aqua to ignore the stale-dep finding (ADTypes is used by ext files)
- Add type-level capabilities(::Type{<:AbstractPrepared{...}}) to LDP
ext per LDP convention; keep value-level for NT-backed gradient case
- Remove NamedTupleEvaluator-specific logdensity/dimension/capabilities
from LDP ext (LDP expects flat-vector interface)
- CI: lts→min in ext job; cleaner julia --project=. test/run_ext_tests.jl
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Could you (or Claude) explain why you are rejecting the parts of the review that you rejected?
(I think it's important for future readers to understand why the decisions were made that way)
- Change `AbstractVector{<:AbstractFloat}` to `<:Real` in all test-local
prepare/callable methods to match the branch-wide convention change
- Fix @test_throws ArgumentError -> MethodError for value_and_gradient
on jacobian-mode prepared objects; the throwing fallback was replaced
by an error hint, so the exception is now MethodError
- Fix minor docstring line break in prepare docstring
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New docs/src/adproblems.md: explains the prepare/value_and_gradient API with runnable @example blocks covering vector inputs, NamedTuple inputs, Jacobians, the no-AD structural path, and a supported-backends table
- Move the API @docs block from pplapi.md to adproblems.md; replace with a cross-reference
- Add ForwardDiff to docs/Project.toml and trigger its extension in make.jl so the @example blocks execute correctly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use eval(Meta.parse("public ...")) inside @static if VERSION >= v"1.11.0"
so the declaration is invisible to Julia 1.10's parser. On 1.10 the symbols
are still accessible as AbstractPPL.prepare etc. but not injected into the
caller's namespace via `using AbstractPPL`.
Update all callers that relied on the unqualified export:
- test/ADProblems.jl: add explicit `using AbstractPPL: prepare, ...`
- test/ext/ad_tests.jl: qualify test_autograd as AbstractPPL.test_autograd
- docs/src/adproblems.md: add explicit import in setup block; use
fully-qualified names in @docs blocks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove the explicit gradient/jacobian mode knob so prepared evaluators choose the derivative API from the prototype output, while keeping misuse errors and LDP capabilities explicit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Thanks, Penny. Docs have been added. All other raised issues should be addressed or explained in the comments above. As this will likely only be implemented within TuringLang, I don't think registering a new package for it would be worth it.

EDIT: fixed a few edge cases for AbstractPPLDIExt: (1) AutoReverseDiff needs to respect the provided compiled tag, (2) AutoEnzyme need
- Add type-level capabilities(::Type{<:VectorEvaluator}) to LDP ext,
satisfying the LDP convention that capabilities must be defined at both
type and value level (AGENTS.md)
- Use p.f in _test_autograd_ref(NamedTuple) instead of recreating an
equivalent closure
- Inline unflatten into unflatten_to!!, removing the redundant wrapper
- Revise docs/src/adproblems.md for clarity, consistency, and precision:
rewrite intro, rename section, tighten prose, add Jacobian shape note,
fix DifferentiationInterface table row
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_test_autograd_ref is called on any AbstractPrepared, not only FDPrepared. MooncakePrepared (and others) have no f field, so p.f FieldErrors at runtime. Reconstruct the flat closure from p.evaluator instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use the generic DifferentiationInterface extension for Enzyme so backend-specific Enzyme support can live in the integration test path instead of package extensions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Route DifferentiationInterface fallback calls through an explicit constant context so Enzyme does not differentiate evaluator/model state. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nsistency
- Remove unreachable `_value_and_jacobian(DIPrepared{false})` (jacobian path
always creates DIPrepared{true})
- Remove unused `UnknownPrepared`, `run_shared_namedtuple_tests`, and
`QuadraticNTPrepared` test fixtures
- Add `capabilities(::Type{T}) where {T<:VectorEvaluator{<:Any,true}}` so
type-level and value-level dispatch agree (AGENTS.md requirement)
- Avoid redundant `flat_length` recomputation in `flatten_to!!(::Nothing, x)`
- Flatten single-element for-loop in missing-extension test
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
```toml
version = "0.14.2"

[deps]
ADTypes = "47edcb42-4c32-4615-8424-f2b9edc5f35b"
```
Claude is also wrong about keeping this as a dep. If you need something that is only loaded in the extensions, it should go under weakdeps.
ADTypes is actually needed as a hard dependency for robust error handling -- it is on the fence, but I think it is useful as a hard dependency since it simplifies weak-dep triggering; e.g., the ForwardDiff extension only needs `using AbstractPPL, ForwardDiff`.
```julia
function unflatten_to!!(x, buf::AbstractVector)
    n = flat_length(x)
    length(buf) == n || throw(
        DimensionMismatch("Expected a vector of length $n, but got length $(length(buf))."),
    )
    value, _ = _unflatten(x, buf, 1)
    return value
end
```
This function is not type-stable for NamedTuples, which will probably translate into perf issues:

```julia
julia> using AbstractPPL

julia> @inferred AbstractPPL.Utils.unflatten_to!!((a=1, b=2, c=3), zeros(3))
ERROR: return type @NamedTuple{a::Float64, b::Float64, c::Float64} does not match inferred return type NamedTuple

julia> @inferred AbstractPPL.Utils.unflatten_to!!((a=1, b=[2, 3], c=3), zeros(4))
ERROR: return type @NamedTuple{a::Float64, b::Vector{Float64}, c::Float64} does not match inferred return type NamedTuple
```

It's very likely that you need a generated function to ensure type stability for NamedTuples.

Also, tests to check type stability for the various inputs would probably be a good thing. There are loads of examples in DynamicPPL's VNT tests which might be helpful inspiration.
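A hedged sketch of the `@generated` rewrite (assuming the scalar/array `_unflatten(proto, vec, offset) -> (value, next_offset)` methods quoted earlier): unrolling over `Names` at expansion time keeps the concrete NamedTuple type visible to inference.

```julia
@generated function _unflatten(
    proto::NamedTuple{Names}, vec::AbstractVector, offset::Int
) where {Names}
    assignments = Expr[]
    vals = Symbol[]
    for name in Names
        val = gensym(name)
        # Each field peels its slice and threads the offset forward.
        push!(assignments, :(($val, offset) = _unflatten(proto.$name, vec, offset)))
        push!(vals, val)
    end
    return quote
        $(assignments...)
        (NamedTuple{Names}(($(vals...),)), offset)
    end
end
```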
- src/utils.jl: rewrite `_unflatten(::NamedTuple)` as `@generated` to unroll over `Names`. The previous recursive `merge`/`Base.tail(Names)` pattern lost the `Names` parameter under inference, breaking `@inferred` callers on NamedTuple inputs.
- test/utils.jl: add `@inferred` cases for unflatten_to!! across scalar, mixed, nested, NT-with-tuple, empty-NT, and tuple-with-array inputs.
- src/ADProblems.jl: use `ADTypes.AbstractADType` to narrow the prepare/MethodError hint so it only fires for AD backends.
- test/Aqua.jl: drop the stale-deps ignore for ADTypes; the dep is now used in src/.
- .github/workflows/CI.yml: remove `continue-on-error` from the Ext job. Each matrix entry is its own PR check, so the flag was masking only the workflow-run summary in the Actions tab — Enzyme failures should surface as a red check rather than be silently green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/ADProblems.jl, ext/*: reintroduce `_supports_gradient` trait so
`LogDensityProblems.capabilities` advertises `LogDensityOrder{1}` only
for prepared shapes that actually implement gradients. Previous
evaluator-type-based dispatch advertised gradient capability for any
`VectorEvaluator`-backed `AbstractPrepared`, even ones without a
gradient implementation. The DI extension overrides the trait to track
whether `gradient_prep` is non-`nothing`.
- src/ADProblems.jl: replace integer-input `MethodError` with a clear
`ArgumentError` explaining how to fix the call.
- src/ADProblems.jl: drop redundant `public` declaration; the same
declaration in `src/AbstractPPL.jl` already covers the user-facing
surface.
- src/ADProblems.jl, ext/AbstractPPLDifferentiationInterfaceExt.jl: WHY
comments on the integer-vector rejection and on the missing
`_value_and_jacobian{false}` overload (unreachable by construction).
- test/ext/logdensityproblems/: split the capability test into the
default (no-gradient) case and an override case; the prior test
asserted the over-eager behaviour.
- test/utils.jl, test/ADProblems.jl: move the flatten/unflatten edge
cases next to the rest of the utility tests.
- test/ADProblems.jl, test/ext/ad_tests.jl: assert against the new
ArgumentError message instead of `MethodError`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`AbstractPrepared` is the AD-aware shape by design; subtyping it asserts
that `value_and_gradient` is implemented. The trait was redundant
machinery for a contract the type hierarchy already encodes.
- src/ADProblems.jl: remove `_supports_gradient` and document the
contract on the `AbstractPrepared` docstring.
- ext/AbstractPPLLogDensityProblemsExt.jl: simplify capability dispatch
to always return `LogDensityOrder{1}` for `AbstractPrepared`. Bare
`VectorEvaluator` keeps its trivial-dim distinction.
- ext/AbstractPPLDifferentiationInterfaceExt.jl: drop the
`DIPrepared`-specific override.
- test/ext/logdensityproblems/: replace the override-based test with a
single assertion that any `AbstractPrepared` subtype reports order 1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Type-parameterised dispatch on UseContext was a runtime-determined Bool encoded into the type signature; a runtime field with a single-method branch is equivalent in performance (one well-predicted compare) and removes one parameter from the struct's type signature. Hot-path type stability still passes `@inferred` for both gradient and jacobian specialisations. Also tighten the integer-rejection comment in src/ADProblems.jl from four lines to three. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the principle that VectorEvaluator/NamedTupleEvaluator are not
themselves differentiable — gradient capability is the contract of the
wrapping `AbstractPrepared`.
- src/ADProblems.jl: drop the `Trivial` type parameter from
`VectorEvaluator` and remove the trivial-dim
`value_and_gradient`/`value_and_jacobian` methods.
- ext/AbstractPPLLogDensityProblemsExt.jl: bare `VectorEvaluator`
unconditionally reports `LogDensityOrder{0}`; remove the
`{V,true}` capability override and the matching
`logdensity_and_gradient` overload.
- ext/AbstractPPLDifferentiationInterfaceExt.jl: empty inputs now
return a `DIPrepared` with `nothing` preps; `value_and_gradient`
and `value_and_jacobian` short-circuit when `length(x) == 0`,
bypassing DI (many backends fail on length-zero arrays).
- test/ADProblems.jl: drop the bare-VectorEvaluator zero-dim test.
- test/ext/differentiation_interface/: replace it with an end-to-end
empty-input test that goes through `prepare(adtype, ...)`.
- test/ext/logdensityproblems/: replace the trivial-dim VectorEvaluator
gradient test with the corrected order-0 assertion.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds a prepared evaluator API to AbstractPPL that decouples log-density evaluation from automatic differentiation, following the interface sketched in docs/src/interface.md.

Autograd backends are optional. For example, removing the Enzyme package extension would simply fall back to DifferentiationInterface for AutoEnzyme. The broader aim is to rely only on public APIs of autodiff backends, while keeping gradient definitions independent of any particular backend implementation.

Related: LogDensityFunction to NamedLogDensity (DynamicPPL.jl#880)

TODOs