Skip to content

Enzyme illegal type analysis on many Turing models #2510

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mhauru opened this issue Mar 17, 2025 · 9 comments
Closed

Enzyme illegal type analysis on many Turing models #2510

mhauru opened this issue Mar 17, 2025 · 9 comments
Assignees

Comments

@mhauru
Copy link
Member

mhauru commented Mar 17, 2025

With the long-running conversation in #1887 and some recent Slack comments, there seems to be some confusion as to what is the current state of integrating Enzyme with Turing. I figured it would help to separate out the current blocker into its own issue.

The current problem is this:

julia> module MWE

       using Turing, Enzyme, ADTypes

       @model function gdemo_copy()
           s ~ Beta()
       end

       adtype = AutoEnzyme(; mode=set_runtime_activity(Forward))
       sample(gdemo_copy(), NUTS(; adtype=adtype), 10)

       end
WARNING: replacing module MWE.
Sampling 100%|█████████████████████████████████████████████████████████████████████████████████| Time: 0:00:05
ERROR: Enzyme compilation failed due to illegal type analysis.
 This usually indicates the use of a Union type, which is not fully supported with Enzyme.API.strictAliasing set to true [the default].
 Ideally, remove the union (which will also make your code faster), or try setting Enzyme.API.strictAliasing!(false) before any autodiff call.
 To toggle more information for debugging (needed for bug reports), set Enzyme.Compiler.VERBOSE_ERRORS[] = true (default false)

Caused by:
Stacktrace:
 [1] logpdf
   @ ~/.julia/packages/Distributions/fi8Qd/src/univariates.jl:645
 [2] invlink_with_logpdf
   @ ~/.julia/packages/DynamicPPL/senfM/src/abstract_varinfo.jl:791

Nothing special about Beta here, you can replace it with many other distributions, Exponential, LKJ, Gamma, InverseGamma etc., and get the same result. Maybe any distribution with a non-trivial bijector?

This problem comes up so much that it makes it pointless to mark all failing tests as broken: We would have to mark most AD tests. We should also prioritise this issue over any other failures, since this affects so many models.

@wsmoses says that this sort of illegal type analysis should only come about if there's a type instability. However, we would think that we've made things like invlink_with_logpdf type stable, since otherwise there would be a massive hit to performance.

Action items:

  • @wsmoses, can you comment on why this would have started happening since some time in the fall? I think some of these tests used to pass with earlier versions of Enzyme and Turing. Has Enzyme gotten stricter about these sorts of type issues? I can't think of any relevant changes that would have happened in Turing, these DynamicPPL internals are pretty stable.
  • @mhauru to check with e.g. Cthulhu the type stability of the relevant functions and post results here.
@wsmoses
Copy link
Collaborator

wsmoses commented Mar 17, 2025

yeah for sure! for ease do you mind showing a version with just the single autodiff call?

@wsmoses
Copy link
Collaborator

wsmoses commented Mar 17, 2025

there's also a way to get a super verbose log of things if you want. Add Enzyme.API.printall!(true) and Enzyme.API.printtype!(true), and it will print the LLVM functions, and each internal step in the type analysis lattice. Essentially we can then look for why it deduces two different types as dataflowing to the same location (aka a union in Julia). That's going to be rather verbose though, so minimizing the insides of this would likely be useful for making reading that log manageable

@yebai
Copy link
Member

yebai commented Mar 17, 2025

The same errors are now observed on Bijectors, which also worked with Enzyme previously. This has very likely been observed elsewhere in the Julia ecosystem, e.g. EnzymeAD/Enzyme.jl#2135, and EnzymeAD/Enzyme.jl#2108

EDIT: https://github.com/EnzymeAD/Enzyme.jl/issues?q=is%3Aissue%20state%3Aopen%20illegal%20type%20analysis

@wsmoses
Copy link
Collaborator

wsmoses commented Mar 17, 2025

@yebai that just means there was a union type from julia. Another test case likely doesn't have bearing here unless they both call the same libraries.

@yebai
Copy link
Member

yebai commented Mar 17, 2025

My (general) point is that if Enzyme had better unit and integration tests against Julia syntax, standard libraries, and a select few infrastructure libraries, these would be caught much more quickly and save the community the trouble of minimising them. It is a short-term pain, but it would save the Enzyme developers and the community lots of time.

@wsmoses
Copy link
Collaborator

wsmoses commented Mar 17, 2025

@yebai go for it, PRs welcome!

@yebai
Copy link
Member

yebai commented Mar 17, 2025

@yebai go for it, PRs welcome!

I explained this elsewhere: the Turing.jl team doesn't have the capacity for it. My comments above is a pure suggestion, take it or ignore it.

@mhauru
Copy link
Member Author

mhauru commented Mar 18, 2025

After some Cthulhu, this turned out to be a type stability issue in DPPL as @wsmoses proposed. Fix is incoming here: TuringLang/DynamicPPL.jl#851

@penelopeysm
Copy link
Member

Closed by TuringLang/DynamicPPL.jl#851

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants