Skip to content

Re-run errored cells after soft_definitions become available#3513

Open
PatrickHaecker wants to merge 1 commit intoJuliaPluto:mainfrom
PatrickHaecker:rerun_errored_cells_when_available
Open

Re-run errored cells after soft_definitions become available#3513
PatrickHaecker wants to merge 1 commit intoJuliaPluto:mainfrom
PatrickHaecker:rerun_errored_cells_when_available

Conversation

@PatrickHaecker
Copy link
Copy Markdown

@PatrickHaecker PatrickHaecker commented Apr 6, 2026

When a cell referencing a soft-exported symbol (e.g. January from using Dates) is forced to run before the using cell due to a hard dependency, it errors with UndefVarError. After the using cell executes and provides soft_definitions, the errored cell was not re-executed because it was already in already_run.

Fix this by filtering errored cells that are newly downstream of soft_definitions out of already_run before restarting the reactive loop.

Also merge two nearly-identical if/elseif branches (needs_resolve and has_implicit_usings) into a single unified block, and rename new_new_topology to resolved_topology.

This fix is needed to get a notebook to work with Julia 1.12 which worked with 1.11.

This was developed with support from Claude Opus 4.6.

Try this Pull Request!

Open Julia and type:

julia> import Pkg
julia> Pkg.activate(temp=true)
julia> Pkg.add(url="https://github.com/PatrickHaecker/Pluto.jl", rev="rerun_errored_cells_when_available")
julia> using Pluto

When a cell referencing a soft-exported symbol (e.g. `January` from
`using Dates`) is forced to run before the `using` cell due to a hard
dependency, it errors with UndefVarError. After the `using` cell
executes and provides soft_definitions, the errored cell was not
re-executed because it was already in `already_run`.

Fix this by filtering errored cells that are newly downstream of
soft_definitions out of `already_run` before restarting the reactive
loop.

Also merge two nearly-identical if/elseif branches (needs_resolve and
has_implicit_usings) into a single unified block, and rename
`new_new_topology` to `resolved_topology`.

This was developed with support from Claude Opus 4.6.
Comment thread test/React.jl
Comment on lines +340 to +344
Cell("x = January"),
Cell("""begin
using Dates
y = x
end"""),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have a more complete reproducer of that notebook that fails in 1.12 and not in 1.11? while you can run this notebook because of the cyclic definition bailing on these "using" definitions, it is essentially a cycle. And there is for example no way to store this notebook in a julia script without having an error

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for having a look at the PR, @Pangoraw !

The document is proprietary and quite large. I was really glad that I was finally able to fix it, because the error depends on the evaluation order, which is not visible in the document itself. I lack the time to trace this for the real document.

I think the Pluto model inherently needs to be able to cover dependencies which could by cyclic. If you have

@x b
@y a

both macros could define only a, only b, a and b or neither one. So it's not clear which macro to evaluate first and this probably just needs to be tried, but it needs to be done if there was any progress in evaluation. Of course, as soon as there is no progress being made, another evaluation cycle of all failed expressions does not make any sense anymore and should be stopped.

I am not sure whether all these cases are fully covered, but I am pretty sure that the cyclic nature is the price of the Pluto model that cell order shall not matter in combination with a dynamic language. I don't think we should require that all Pluto scripts can be executed as a script without error. If this were the case, no dependency analysis would be needed in Pluto, because everything would need to be executed in order anyway.

This PR definitely fixes some cases. At least according to the test suite, it does not break anything. Do you have an example of what it breaks? Otherwise it seems to be a clear improvement.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With some bisecting, I managed to come up with an example in finite time. I hope this is more complete without being excessively large.
real-world-example.zip

Copy link
Copy Markdown
Author

@PatrickHaecker PatrickHaecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this example helps

@fonsp
Copy link
Copy Markdown
Member

fonsp commented Apr 9, 2026

Hi! Thanks for taking the time to contribute to Pluto! I haven't looked at the zip you sent yet, but for this PR i would like a more realistic test case. Could you look into that?

For the current test case, I think I disagree that this should run without error: you explicitly tell Pluto to run cell 1 before 2, which should give undefvar.

@PatrickHaecker
Copy link
Copy Markdown
Author

Thanks for taking a first look. Please look at the zip where I stripped down a real notebook to contain one of the problems this PR fixes while still being a "real" notebook. The error without the PR is in TableOfContents().

My mental image about how the Pluto notebook is processed has more gaps than non-gaps, so I have a hard time stripping these things down and an even harder time creating a real test case out of it. I thought that

Notebook([Cell("A"), Cell("B")])

would be the same as if this were written in Pluto so that the order does not matter. However, do I get both of you right, that this is the order after dependency resolution? If so, this is a bit related to #3179 in the sense that Pluto should also work correctly if the cells are not in the dependency-resolved order although the notebook file can then no longer be uses as a Julia script. But if this is the case, this is a different issue maybe to be discussed in #3179 and therefore I understand why the current test is not suitable.

Can you please check whether the notebook in the zip file qualifies as "should work" according to the current definition? It was, after all, created with Pluto.

@Pangoraw
Copy link
Copy Markdown
Member

would be the same as if this were written in Pluto so that the order does not matter. However, do I get both of you right, that this is the order after dependency resolution

this is as cells would be in Pluto, the execution order is computed for each run. It's just that with your example, there is no "right" order.

Cell("x = January"), # undefvar error (January not defined)
Cell("""begin
    using Dates
    y = x
end"""

Cell("""begin
    using Dates
    y = x # undefvar error (x not defined)
end
Cell("x = January"),

in such cases, pluto prefers the explicit order (the first one).

@PatrickHaecker
Copy link
Copy Markdown
Author

Ok, thanks, understood. So shall we proceed with the example from the zip? Any idea how to best create a minimal @test out of it?

@fonsp
Copy link
Copy Markdown
Member

fonsp commented Apr 20, 2026

Hi! I looked at the notebook in the zip, it's indeed surprising that this does not run, but it still seems unrealistic to me – why not just place the using PlutoUI in its own cell? You could also make the MWE smaller by removing the Pkg-related code and files. MWE means that you gradually remove things from your example, see if the problem still exists, and iterate.

I'm focusing on the example right now, instead of the PR change itself, because we can't accommodate for every use case of Pluto. If a bug is too specific/unrealistic, we can't afford to spend time on reviewing the PR, maintaining it in the future, etc. You might just need to continue using Julia 1.11, or change your notebook to make it work in Julia 1.12.

@PatrickHaecker
Copy link
Copy Markdown
Author

Thanks for looking into the notebook!

why not just place the using PlutoUI in its own cell

This is only a simplified version of one of several occurrences of the problem. In this case they are in the same cell because multiple of these definitions are at the top of the document and normally hidden to not have an even larger gap caused by hidden cells.

You could also make the MWE smaller by removing the Pkg-related code and files. MWE means that you gradually remove things from your example, see if the problem still exists, and iterate.

When I provided an MWE @Pangoraw wanted to have a "more complete reproducer". When I provided a more complete reproducer I should provide something more minimal. I understand that you have detailed requirements about the exact test case, but as my knowledge about Pluto is so much smaller than yours, the requirements look conflicting to me and I need guidance here if you think this is faster than just fine-tuning the example yourself:
Removing everything Pkg related and keeping everything else would be both "complete" and MWE?

You might just need to continue using Julia 1.11, or change your notebook to make it work in Julia 1.12.

Unfortunately, Julia 1.11 is no longer supported, so that is no long-term option for us. We failed in adapting the notebook to Julia 1.12. We haven't identified any plausible procedure to do so, because without specific error messages, the only way we know about is to gradually delete content from the notebook until it works again. With multiple of these problems interlaced and long processing times, this just has not converged when we tried it.
Julia 1.10 does not work as we use Julia 1.11 features in this notebook. The notebook, however, might only be one ExpressionExplorer fix away from building on Julia 1.13 with all PRs combined which we handed in.

If a bug is too specific/unrealistic, we can't afford to spend time on reviewing the PR, maintaining it in the future, etc.

This seems to be an especially unfortunate case, as from a user's point of view is feels like a really bad regression (document no longer working at all, very hard to debug, no documentation about this not being supported). But I also understand that it was never your intention to had this working in 1.10 and 1.11 and therefore you do not want to support this in 1.12 or in the future.

I understand that you do this voluntarily and I am grateful for it. Everyone's time is limited and the motivation to fix use cases from others only goes so far. We wanted to reduce your effort as much as possible by analyzing and fixing the problem ourself so that you only need to do the review. However, reviewing is still some work and there is probably no way how we can free you of this if the change should be merged eventually.

So how should we proceed from here on?
We neither have the knowledge nor the capacity in-house to maintain a fork updating it to new Julia versions and we don't have the means to support external development beyond filling the occasional issue and providing some PRs using limited internal development capacity. An upgrade to 1.13 seems to be doable, however.
As we now have a version which works for 1.12 for us, we can move this over into our company-internal registry, try to fix the remaining 1.13 problem and use the remaining support time of Julia 1.12 (and hopefully 1.13) to mitigate our notebooks away from Pluto.
This would allow to close this issue. Shall we do it like this?

@PatrickHaecker
Copy link
Copy Markdown
Author

I've checked what the effort were to make the most troublesome notebook Julia 1.13-compatible with Pluto. It turned out that there are indeed multiple problems. However, with the PRs we handed in for Pluto.jl, PlutoDependencyExplorer.jl and ExpressionExplorer.jl it runs with Julia 1.13 out-of-the box!

I learned multiple things from that:

  • Thanks for your work! Great that a complex notebook already works with Julia 1.13.0-beta3.
  • We'd have enough time to port away from Pluto.
  • Maybe we don't have to, because supporting a new Julia version with Pluto seems to be less effort than what the exhausting 1.12 porting suggested. Maybe 1.13 is just especially lucky, but probably 1.12 was just especially unlucky. On average efforts will probably somewhere in between both.
  • It looks like supporting the logic of this PR were a good service for Pluto users and maintenance efforts for new Pluto versions might actually not be so bad. Speaking for myself as a personal user: I'd really like to have the functionality of this PR in Pluto, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants