doc: Add learned closure demo, add T-torch to landing page + docs #626
dionhaefner wants to merge 7 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #626 +/- ##
==========================================
+ Coverage 77.95% 77.99% +0.04%
==========================================
Files 39 39
Lines 4635 4635
Branches 754 754
==========================================
+ Hits 3613 3615 +2
+ Misses 716 714 -2
Partials 306 306
I realise this is a draft, just checking in as I was curious (I've been working on a universal differential equation approach which has some similarities). For this case it looks like you have a Tesseract that does a single timestep of a very cheap simulation. Does this not go against our advice on sensible runtimes for a Tesseract, as latency/overhead cost will likely dominate here?
@jpbrodrick89 I agree, see the most recent push for a complete refactor :) The new version is still bottlenecked by overhead, but at least it fits the narrative now - in the wild you'd use a much more expensive solver where the balance shifts, this is just the cheap demo you can run in a few minutes. Wdyt?
…core into dion/demo-closure
These tests load the solver via ``Tesseract.from_tesseract_api`` (in-process, no
Docker) so they run fast as a local smoke check. The demo notebook itself uses
``Tesseract.from_image`` to serve the solver in a container over HTTP — the same
``apply_tesseract`` call path works either way. This is also the same pattern
that would work with a Fortran solver Tesseract backed by Enzyme or a
hand-written adjoint: the solver just needs apply + VJP with the interface
(u, nu_field, dt) -> u_next. The closure stays ordinary PyTorch.
I don't mind it (especially if this is just a private test file not part of the demo itself), but just wanted to note there's a strong AI smell about this paragraph.
I do mind, thanks for catching it.
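To make the "apply + VJP with the interface (u, nu_field, dt) -> u_next" idea from the quoted docstring concrete, here is a minimal sketch in plain PyTorch. It is not the demo's solver and does not use any Tesseract API: `DiffusionStep` is a hypothetical stand-in that replaces the Burgers step with a simpler explicit diffusion step, takes `dx` as an extra argument, and supplies its adjoint by hand, the way an external (for example Fortran) solver would have to.

```python
import math

import torch


class DiffusionStep(torch.autograd.Function):
    """Hypothetical stand-in for an external solver step with a hand-written VJP."""

    @staticmethod
    def forward(ctx, u, nu_field, dt, dx):
        # Interior Laplacian; boundary values of u are left unchanged.
        lap = torch.zeros_like(u)
        lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
        ctx.save_for_backward(nu_field, lap)
        ctx.dt, ctx.dx = dt, dx
        return u + dt * nu_field * lap

    @staticmethod
    def backward(ctx, grad_out):
        nu_field, lap = ctx.saved_tensors
        dt, dx = ctx.dt, ctx.dx
        # d(u_next)/d(nu_field) is diagonal with entries dt * lap.
        grad_nu = grad_out * dt * lap
        # d(u_next)/du = I + dt * diag(nu) * L, so the VJP applies L^T to w.
        w = dt * nu_field * grad_out
        w[0] = 0.0  # boundary rows of L are zero
        w[-1] = 0.0
        zero = torch.zeros(1, dtype=w.dtype, device=w.device)
        wp = torch.cat([zero, w, zero])
        grad_u = grad_out + (wp[:-2] - 2 * wp[1:-1] + wp[2:]) / dx**2
        # dt and dx are plain floats here, so they get no gradient.
        return grad_u, grad_nu, None, None


torch.manual_seed(0)
n = 32
dx, dt = 1.0 / (n - 1), 1e-3
x = torch.linspace(0.0, 1.0, n, dtype=torch.float64).unsqueeze(-1)

# The closure is an ordinary PyTorch module; softplus keeps nu positive.
closure = torch.nn.Sequential(
    torch.nn.Linear(1, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1)
).double()

u0 = torch.sin(math.pi * x.squeeze(-1))
nu_field = torch.nn.functional.softplus(closure(x)).squeeze(-1)
u1 = DiffusionStep.apply(u0, nu_field, dt, dx)

# Gradients reach the closure parameters through the hand-written VJP.
u1.square().sum().backward()
```

The point of the sketch is only that autograd never needs to see inside the solver: as long as `backward` returns the correct vector-Jacobian products, the closure upstream trains like any other module.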
"3. **Train a neural network closure end-to-end** through the containerized solver, differentiating through the entire time-stepping loop.\n",
"4. **Compare the learned closure against baselines** (a pure physics model and a pure ML model).\n",
"\n",
"We will replace the viscosity model of a Burgers' equation solver -- normally a hand-tuned constant -- with a small neural network, and train it so that it recovers the true (unknown) viscosity profile from solution data alone.\n",
nit: just use an em dash here and elsewhere (unless ruff or the ipynb tooling forbids them)
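For readers skimming this thread without the notebook open, the "small neural network" that replaces the constant viscosity could look roughly like the sketch below. This is an illustration only, not the notebook's code: the class name `ViscosityClosure`, the default `nu_max=0.1`, the hidden width, and the choice of input features are all assumptions.

```python
import torch
from torch import nn


class ViscosityClosure(nn.Module):
    """Hypothetical closure: maps per-point features to a bounded viscosity."""

    def __init__(self, in_features=1, hidden=16, nu_max=0.1):
        super().__init__()
        self.nu_max = nu_max  # assumed upper bound on the viscosity
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, features):
        # features: (n_points, in_features), e.g. just x, or (x, u, du/dx).
        # The sigmoid keeps the output between 0 and nu_max.
        return self.nu_max * torch.sigmoid(self.net(features)).squeeze(-1)


torch.manual_seed(0)
closure_model = ViscosityClosure()
x = torch.linspace(0.0, 1.0, 64).unsqueeze(-1)
nu_field = closure_model(x)  # shape (64,), one viscosity value per grid point
```

Bounding the output is one simple way to keep an explicit time stepper within its stability limit while the network is still poorly trained.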
# Spatial derivatives via central differences
dudx = torch.zeros_like(u)
dudx[1:-1] = (u[2:] - u[:-2]) / (2 * DX)
you might want a comment here about why you're not upwinding, using conservative form, ETDRK methods or anything else more fancy (e.g. low Reynolds, no shocks, role of nu_max etc.)
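As a rough illustration of the alternative this comment alludes to, the sketch below puts the central difference from the diff next to a first-order upwind difference. The function names and the test profile are made up for this example, and `dx` is passed explicitly instead of using the module-level `DX`.

```python
import torch


def ddx_central(u, dx):
    # Second-order central difference, as in the diff; boundaries left at zero.
    dudx = torch.zeros_like(u)
    dudx[1:-1] = (u[2:] - u[:-2]) / (2 * dx)
    return dudx


def ddx_upwind(u, dx):
    # First-order upwind: difference taken from the side the flow comes from.
    backward = torch.zeros_like(u)
    forward = torch.zeros_like(u)
    backward[1:-1] = (u[1:-1] - u[:-2]) / dx
    forward[1:-1] = (u[2:] - u[1:-1]) / dx
    return torch.where(u > 0, backward, forward)


# A steep but smooth front (hypothetical profile, not from the demo).
x = torch.linspace(0.0, 1.0, 101, dtype=torch.float64)
dx = float(x[1] - x[0])
u = 1.0 - torch.tanh((x - 0.5) / 0.05)
max_gap = (ddx_central(u, dx) - ddx_upwind(u, dx)).abs().max()
```

On smooth, well-resolved profiles the two agree closely; the gap grows as the front steepens, which is where the choice of scheme starts to matter.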
Thanks for putting this together @dionhaefner! I think it reads fairly well, but a few things come across as surprising or questionable.

Model behaviour and training

Firstly, the recovered viscosity profile is not very impressive (even though the model appears to be very "predictive"):

And the fact that a constant viscosity model does almost as well seems a poor sell for learned closures (and is a big hint your closure would probably train a lot better with a . Part of this, I think, comes down to the fact that you allowed your closure to be so open-minded about what viscosity could depend on (it only depends on space, but you've allowed it to depend on u and du/dx) and that your training data is fairly limited (

Given that your loss function is a sum of squared residuals and that you can fit your entire batch into memory, I'd be very tempted to attempt a least-squares solve on the Jacobian of the point-wise residuals using forward-mode AD (it can work, even for MLPs of this size, and least squares should very efficiently teach it that u and du/dx are not drivers of nu), or at least use L-BFGS rather than Adam. However, I accept that, for Torch users, a recipe for the standard "train an MLP with Adam" workflow using Tesseracts is the story you're committed to telling.

Single step simulator

I think this is the bit we need to think very, very carefully about. This demo could essentially come to be considered a recipe for how to do neural ODEs with Tesseracts. Do we really want to suggest that a single-timestep Tesseract be called potentially thousands of times to pass the output of the closure through (especially without any hint of how one could plug checkpointing in)? I'm not sure this is the best use case for Tesseracts unless you have simulators in mind that don't need very many timesteps but where each timestep takes a while. My impression is that the problem you're trying to solve is how to do neural ODEs for differentiable solvers not written in an ML-friendly framework (e.g. Enzyme). (Because neural ODEs in JAX are essentially a solved problem with diffrax.) Is the huge overhead one will inevitably incur with this approach worth saving the pain of just writing an MLP in Enzyme-compatible Fortran/Julia and wrapping the whole simulator+closure model in a single Tesseract?

If you just want a simple sell for Tesseract-Torch, it might be better to do something more vanilla, where we don't do a true "neural ODE" but instead solve the "inverse problem" of learning a "viscosity" field which we know a priori is only a function of space. In that case you don't need to pass an updated version of the viscosity field at every simulation timestep; you can just pass it through at initialisation and the Tesseract could run a whole simulation with

Batching

Is it possible to have a tesseract-torch vmap batching rule in the future, so your Tesseract can run all training simulations at once instead of with Python list comprehensions? Or is this currently fundamentally the torchfun vs autograd limitation?
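For reference, the "use L-BFGS rather than Adam" suggestion from the training section of this comment amounts to a small change in the optimizer loop. The sketch below is a toy under stated assumptions: it fits an MLP directly to a synthetic viscosity profile (`nu_target` is made up), whereas in the demo the loss would go through the solver rollouts. Only the `torch.optim.LBFGS` usage pattern is the point.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in for the quantity being fitted (not the demo's data).
x = torch.linspace(0.0, 1.0, 64).unsqueeze(-1)
nu_target = 1.0 + 0.5 * torch.sin(math.pi * x) ** 2

model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
optimizer = torch.optim.LBFGS(
    model.parameters(), lr=1.0, max_iter=20, line_search_fn="strong_wolfe"
)


def loss_fn():
    # Sum-of-squares style loss; in the demo this would wrap the solver calls.
    return (model(x) - nu_target).square().mean()


def closure():
    # L-BFGS re-evaluates the loss several times per step, so it needs a closure.
    optimizer.zero_grad()
    loss = loss_fn()
    loss.backward()
    return loss


initial_loss = loss_fn().item()
for _ in range(10):
    optimizer.step(closure)
final_loss = loss_fn().item()
```

Each `optimizer.step(closure)` can trigger several forward and backward evaluations, so with a containerized solver in the loop the per-call overhead discussed above would apply to every one of them.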

Relevant issue or PR
n/a
Description of changes
This celebrates the release of Tesseract-Torch 🎉
It adds Tesseract-Torch to all places where Tesseract-JAX is mentioned (except a few demos that are explicitly built on top of T-JAX). It also features a brand new demo ("learned closure") that uses Tesseract-Torch.
Testing done
Docs builds pass on CI, demo runs end-to-end on my machine and on CI. Docs preview: https://pasteur-labs-docs--626.com.readthedocs.build/projects/tesseract-core/626/