[ml-kem] improve auditability of Lean proof by robinhundt · Pull Request #190 · celabshq/libcrux-iot

robinhundt · 2026-07-01T15:30:34Z

In this PR, I tried to improve the auditability of the top-level theorems of the ML-KEM proof with the help of claude to address my review comment on #171.
I'm not sure whether I succeeded with this and if this is actually better.

@abentkamp what are your thoughts on this?

The idea is to reduce the trust that needs to be put into the lift operations. There are now several theorems that essentially state that lift behaves as expected. Additionally, the top level theorems now don't rely on lift in their post-conditions, but explicitly compare the outputs of the spec and impl.

abentkamp · 2026-07-02T07:41:44Z

There are really two issues that you are trying to mitigate here:

1. We need to trust that the definitions of the `lift` functions are correct.

Since the APIs of the spec and the impl don't agree, I don't think there is a way around trusting something here. So ideally that something is small and simple. I think the definitions of the lift functions are already pretty okay, but if there is room to make them simpler, let's do that.

Proving theorems about the lift functions is certainly helpful as a sanity check, but I don't think those theorems should replace the definitions as the source of trust. I am not sure which of the two was your intention, but the introduction of the lift audit seems to claim that we should trust the theorems rather than the definition.

The theorems seem to me more complicated than the definitions because there are two of them (faithfulness & injectivity) and because they contain other definitions such as zmodOfFE, which we need to inspect to audit the theorem statement.

So in short: I like the theorems, but not how the comments describe their purpose.

Another idea: We could implement the lift functions in Rust, which might make it easier to audit them.

2. The top-level lemmas should ensure that the implementation is not off by Q.

I'm not sure I understand this fully: Is it correct that the impl result is always between -3329 and 3329, so that there are always two possible correct values for the impl to return?

Then what you are suggesting seems good. Just two comments:

I think there is a small gotcha around toNat, which will convert negative numbers to 0, I believe. So if the impl is out of its range in the negative and the correct answer happens to be 0, the theorem would not catch that.

I am not a fan of putting all of the logic into the theorem statement. I'd rather put this into a separate definition, i.e., a predicate taking an impl polynomial and a spec polynomial and checking whether they match. Then that predicate could be used in all of the top level statements.

robinhundt · 2026-07-02T10:03:19Z

Regarding 1, the intention was to have theorems about lift that would be easier to understand than the definitions of the functions. I agree that this did not succeed and the theorems are actually more complicated.

Regarding 2, If I understand correctly, the impl result should be in [-q/2, q/2] (i.e. [-1664, 1664] if the impl is correct as it works on the "barret representative" of x mod q. By adding q if negative, we should get the actual value mod q that is computed by the spec. Currently, all theorems only state that the impl values satisfy |x| <= 3328 which seems to be weaker than what the impl actually does.

So I guess, no, there are not two possible correct values for the impl to return as I understand it. But this isn't captured by the proof. Thinking about it, it seems what I actually want is to prove that the impl values stay in [-q/2, q/2] and then prove in the top level theorem that the coefficient values mod q is the same as the spec. As the proof is currently written, the impl could return two different values that map to the same x mod q, and that would not be caught by the proof.

abentkamp · 2026-07-02T10:41:17Z

If there is really only one possible value that the impl could return, then the theorem statement in this PR seems overly complicated (and probably doesn't even capture it fully).

I see two options that seem reasonable:

A. As you suggest, revert to using lift on the output and make an additional claim that the impl output is within the expected bounds.
B. Use the inverse function of lift on the output, i.e. essentially inverse_lift(spec(lift(x))) = impl(x) for all inputs x.

We need to find out whether there is really only one possible output. I believe at least some variants of Barrett do not guarantee that.

abentkamp · 2026-07-02T14:56:52Z

I just talked to @karthikbhargavan, and he says that the implementation may be off by Q, which is why he wrote the main theorem as stated.

robinhundt added 3 commits June 29, 2026 17:39

mlkem/lean: change hax-evit git url to ssh

7fa5ee3

flake: update hax-evit requirement for ml-kem proof

a174e24

flake: add lean-lsp-mcp

da00d78

robinhundt marked this pull request as ready for review July 1, 2026 16:04

robinhundt force-pushed the robin/ml-kem-lean-proof-fixes branch from b9c428e to 13339b9 Compare July 1, 2026 16:06

sha3: improve ml-kem lean theorems

f40207d

robinhundt force-pushed the robin/ml-kem-lean-proof-fixes branch from b31eb4f to f40207d Compare July 2, 2026 17:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[ml-kem] improve auditability of Lean proof#190

[ml-kem] improve auditability of Lean proof#190
robinhundt wants to merge 5 commits into
alex/mlkem-verification3from
robin/ml-kem-lean-proof-fixes

robinhundt commented Jul 1, 2026 •

edited

Loading

Uh oh!

abentkamp commented Jul 2, 2026

Uh oh!

robinhundt commented Jul 2, 2026

Uh oh!

abentkamp commented Jul 2, 2026

Uh oh!

abentkamp commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

robinhundt commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abentkamp commented Jul 2, 2026

1. We need to trust that the definitions of the lift functions are correct.

2. The top-level lemmas should ensure that the implementation is not off by Q.

Uh oh!

robinhundt commented Jul 2, 2026

Uh oh!

abentkamp commented Jul 2, 2026

Uh oh!

abentkamp commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

robinhundt commented Jul 1, 2026 •

edited

Loading

1. We need to trust that the definitions of the `lift` functions are correct.