Skip to content

[ml-kem] improve auditability of Lean proof#190

Open
robinhundt wants to merge 5 commits into
alex/mlkem-verification3from
robin/ml-kem-lean-proof-fixes
Open

[ml-kem] improve auditability of Lean proof#190
robinhundt wants to merge 5 commits into
alex/mlkem-verification3from
robin/ml-kem-lean-proof-fixes

Conversation

@robinhundt

@robinhundt robinhundt commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

In this PR, I tried to improve the auditability of the top-level theorems of the ML-KEM proof with the help of claude to address my review comment on #171.
I'm not sure whether I succeeded with this and if this is actually better.

@abentkamp what are your thoughts on this?

@robinhundt robinhundt marked this pull request as ready for review July 1, 2026 16:04
The idea is to reduce the trust that needs to be put into the lift
operations. There are now several theorems that essentially state
that lift behaves as expected.
Additionally, the top level theorems now don't rely on lift in their
post-conditions, but explicitly compare the outputs of the spec and
impl.
@robinhundt robinhundt force-pushed the robin/ml-kem-lean-proof-fixes branch from b9c428e to 13339b9 Compare July 1, 2026 16:06
@abentkamp

Copy link
Copy Markdown
Collaborator

There are really two issues that you are trying to mitigate here:

1. We need to trust that the definitions of the lift functions are correct.

Since the APIs of the spec and the impl don't agree, I don't think there is a way around trusting something here. So ideally that something is small and simple. I think the definitions of the lift functions are already pretty okay, but if there is room to make them simpler, let's do that.

Proving theorems about the lift functions is certainly helpful as a sanity check, but I don't think those theorems should replace the definitions as the source of trust. I am not sure which of the two was your intention, but the introduction of the lift audit seems to claim that we should trust the theorems rather than the definition.

The theorems seem to me more complicated than the definitions because there are two of them (faithfulness & injectivity) and because they contain other definitions such as zmodOfFE, which we need to inspect to audit the theorem statement.

So in short: I like the theorems, but not how the comments describe their purpose.

Another idea: We could implement the lift functions in Rust, which might make it easier to audit them.

2. The top-level lemmas should ensure that the implementation is not off by Q.

I'm not sure I understand this fully: Is it correct that the impl result is always between -3329 and 3329, so that there are always two possible correct values for the impl to return?

Then what you are suggesting seems good. Just two comments:

I think there is a small gotcha around toNat, which will convert negative numbers to 0, I believe. So if the impl is out of its range in the negative and the correct answer happens to be 0, the theorem would not catch that.

I am not a fan of putting all of the logic into the theorem statement. I'd rather put this into a separate definition, i.e., a predicate taking an impl polynomial and a spec polynomial and checking whether they match. Then that predicate could be used in all of the top level statements.

@robinhundt

Copy link
Copy Markdown
Contributor Author

Regarding 1, the intention was to have theorems about lift that would be easier to understand than the definitions of the functions. I agree that this did not succeed and the theorems are actually more complicated.

Regarding 2, If I understand correctly, the impl result should be in [-q/2, q/2] (i.e. [-1664, 1664] if the impl is correct as it works on the "barret representative" of x mod q. By adding q if negative, we should get the actual value mod q that is computed by the spec. Currently, all theorems only state that the impl values satisfy |x| <= 3328 which seems to be weaker than what the impl actually does.

So I guess, no, there are not two possible correct values for the impl to return as I understand it. But this isn't captured by the proof. Thinking about it, it seems what I actually want is to prove that the impl values stay in [-q/2, q/2] and then prove in the top level theorem that the coefficient values mod q is the same as the spec. As the proof is currently written, the impl could return two different values that map to the same x mod q, and that would not be caught by the proof.

@abentkamp

Copy link
Copy Markdown
Collaborator

If there is really only one possible value that the impl could return, then the theorem statement in this PR seems overly complicated (and probably doesn't even capture it fully).

I see two options that seem reasonable:

  • A. As you suggest, revert to using lift on the output and make an additional claim that the impl output is within the expected bounds.
  • B. Use the inverse function of lift on the output, i.e. essentially inverse_lift(spec(lift(x))) = impl(x) for all inputs x.

We need to find out whether there is really only one possible output. I believe at least some variants of Barrett do not guarantee that.

@abentkamp

Copy link
Copy Markdown
Collaborator

I just talked to @karthikbhargavan, and he says that the implementation may be off by Q, which is why he wrote the main theorem as stated.

@robinhundt robinhundt force-pushed the robin/ml-kem-lean-proof-fixes branch from b31eb4f to f40207d Compare July 2, 2026 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants