ci(probe): pinpoint MB_PB11_ADJUST_NAMES PCH ↔ fragment asymmetry #6080

Closed
Fedr wants to merge 34 commits into master from ci/probe-mb-pb11-adjust-names-asymmetry

Conversation

@Fedr
Contributor

@Fedr Fedr commented May 9, 2026

Draft / probe — DO NOT MERGE.

Goal

Pinpoint the exact tool responsible for the MB_PB11_ADJUST_NAMES PCH ↔ fragment macro-mismatch error that surfaced in #6060 and motivated the workaround in #6077.

We know:

The trigger is somewhere in the make → bash → clang pipeline. make --trace shows identical recipe text for the PCH build and the per-fragment build, so the asymmetry must arise downstream of the recipe text that make echoes.

What this PR does

  1. Reverts the fix from #6060 (ci(windows): replace S3 MSYS2 zip with cached pacman install of clang 18.1.8) so the macro is again passed via -D and the bug reproduces. The revert touches compiler_only_flags.txt and generate.mk.
  2. Adds two argv-logging shims under scripts/diag/:
    • clang-argv-shim.sh is installed into C:\diag\bin as clang++ and prepended on PATH. Logs every argv element it receives (text + hex) to C:\diag\clang-argv.log, then exec's the real /c/msys64/clang64/bin/clang++.exe. Captures what bash actually passes to clang after quote-stripping.
    • diag-bash-shim.sh is passed to make as SHELL=/c/diag/bin/diag-bash. Logs every argv element bash itself receives (text + hex) to C:\diag\shell-recipes.log, then exec's the real /usr/bin/bash. Captures the recipe text make passes to bash before quote-stripping.
  3. Three new workflow steps in build-test-windows.yml (Windows only), inserted before the existing Python-bindings step:
    • PROBE — install shims
    • PROBE — generate bindings under shims (continue-on-error: true)
    • PROBE — dump tool versions + shim logs (if: always())
  4. The existing Generate and build Python bindings step still runs after — it'll fail with the macro mismatch, which is expected. Job ends red on purpose.
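
The argv logging described in item 2 could look roughly like the sketch below. It is only an illustration: the function names `hex_of` and `log_argv` and the log line layout are assumptions, not the contents of the actual scripts under scripts/diag/.

```shell
# Hypothetical sketch of an argv-logging shim. Function names and the log
# layout are assumptions, not the real scripts/diag/clang-argv-shim.sh.

# hex_of STRING: print the bytes of STRING as space-separated hex pairs.
hex_of() {
    printf '%s' "$1" | od -An -v -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}

# log_argv LOGFILE ARG...: append one text line and one hex line per argument.
log_argv() {
    shim_log=$1
    shift
    shim_i=0
    for shim_arg in "$@"; do
        printf 'argv[%d] text: %s\n' "$shim_i" "$shim_arg" >> "$shim_log"
        printf 'argv[%d] hex:  %s\n' "$shim_i" "$(hex_of "$shim_arg")" >> "$shim_log"
        shim_i=$((shim_i + 1))
    done
}

# A real shim would end with something like:
#   log_argv /c/diag/clang-argv.log "$@"
#   exec /c/msys64/clang64/bin/clang++.exe "$@"
```

Logging the hex form next to the text form matters here because a backslash shows up unambiguously as `5c`, whatever later layers do when rendering the log.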

What we expect to learn

| clang argv (PCH) vs (fragment) | shell recipe (PCH) vs (fragment) | conclusion |
| --- | --- | --- |
| differ | differ in the same way | make is asymmetric |
| differ | identical | bash is asymmetric (quote handling) |
| identical | (any) | clang's -D storage / PCH validator is asymmetric: bytes-on-the-wire are equal but clang stores/renders differently between PCH-build and PCH-import |
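
The decision table can be restated as a small helper, shown here as a sketch. The function name `classify` and its one-word outputs are hypothetical. It also simplifies the first row: it only checks that the shell recipes differ, not that they differ "in the same way" as the clang argv.

```shell
# classify CLANG_PCH CLANG_FRAG SHELL_PCH SHELL_FRAG
# Each argument is the hex dump of the macro value captured at that layer.
# Prints which tool the table above would blame (hypothetical helper).
classify() {
    if [ "$1" = "$2" ]; then
        echo "clang"   # identical argv: clang's -D storage / PCH validator
    elif [ "$3" = "$4" ]; then
        echo "bash"    # identical recipe text but different argv
    else
        echo "make"    # recipe text already differs before bash sees it
    fi
}
```

For example, `classify '5c 62' '5c 5c 62' "$recipe_hex" "$recipe_hex"` prints `bash`, since the recipe text matched but the argv did not.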

We also dump bash, make, msys2-runtime, and clang versions for the record.

Test plan

  • Run completes the probe step, dumps logs, then fails on the regular Python-bindings step with the expected macro-mismatch error.
  • Both C:\diag\clang-argv.log and C:\diag\shell-recipes.log are non-empty.
  • The log dump shows MB_PB11_ADJUST_NAMES_HEX lines for at least one PCH-build invocation and one fragment-build invocation, ready for byte-by-byte comparison.

After the probe

Fedr and others added 30 commits April 30, 2026 17:00
Stop fetching the ~725 MB msys64_meshlib_mrbind.zip from S3 (which
has been seen returning 404 in CI) and stop relying on the
install-msys2-mrbind composite action. The windows-2025 runner
image already ships:

  * MSYS2 base at C:\msys64 (pacman 6.1.0)
  * Standalone LLVM 20.1.8 at C:\Program Files\LLVM (unused by
    MRBind, kept here for reference)

For MRBind we still need the MSYS2 -clang64 environment (mrbind
links against MSYS2's libclang/libc++, not the Windows-native
LLVM build). So this change pacman-installs the minimal toolchain
into the preinstalled C:\msys64:

  pacman -Sy --noconfirm --needed make \
    mingw-w64-clang-x86_64-{clang,clang-tools-extra,cmake,ninja,libc++}

and points all subsequent MRBind/binding-generation steps at
C:\msys64 by setting MSYS2_DIR (read by install_mrbind_windows_msys2.bat
and generate_win.bat). GETTEXT_ROOT moves from
C:\msys64_meshlib_mrbind\clang64 to C:\msys64\clang64.

The install-msys2-mrbind composite action is no longer used and is
deleted. install_deps_windows_msys2.bat (local-developer install
path) is unchanged: it still creates a separate
C:\msys64_meshlib_mrbind tree, untouched by this CI change.

Note: this swaps an S3-mirror dependency for an MSYS2-mirror
dependency. It also moves clang from a pinned 18.1.8 to whatever
mingw-w64-clang-x86_64-clang is currently in the clang64 repo
(22.x at the time of writing). MRBind's Windows path uses bare
`clang++` (no version suffix) so this works without further
changes; if a specific clang version becomes required later, pin
via pacman or use the MSYS2 archive.
mrbind's CMakeLists does find_package(Clang REQUIRED), which
transitively requires LLVMConfig.cmake from the llvm dev package.
That package is listed only as an *optional* dep of clang, so
pacman did not pull it in automatically — Build MRBind failed
with:

  Could not find a package configuration file provided by "LLVM"
  (requested version 22.1.4)

Add mingw-w64-clang-x86_64-llvm to the install list explicitly.
Two bugs combined to make a real failure look green in
https://github.com/MeshInspector/MeshLib/actions/runs/25392220595/job/74469566838:

1. The 'Generate and build Python bindings' step in
   build-test-windows.yml didn't carry the MSYS2_DIR=C:\msys64 env
   override that the other generate_win.bat / install_mrbind_windows_msys2.bat
   callers on this branch already have. So the script saw the
   default MSYS2_DIR=C:\msys64_meshlib_mrbind, which doesn't exist
   on the runner anymore.

2. generate_win.bat printed 'MSYS2 was NOT found' and then fell
   through to a normal exit, returning code 0. The step's `call`
   inherited that 0, GitHub Actions marked the step green, the
   .pyd never got built, and the failure only surfaced four steps
   later when 'Unit Tests' couldn't load mrmeshpy.pyd
   (LoadLibrary error 126).

Fix both:

* Add `MSYS2_DIR: C:\msys64` to the 'Generate and build Python
  bindings' step's env block, matching Build MRBind / Generate C
  bindings / Generate C# bindings.

* Replace generate_win.bat's silent fall-through with `exit /b 1`
  after the missing-MSYS2 message. Mirrors what
  install_mrbind_windows_msys2.bat already does on the same
  condition. Future workflow callers that forget MSYS2_DIR will
  fail at the right step instead of being papered over until
  something later trips on the missing artifacts.
Add a post-Unit-Tests step (always(), continue-on-error) that
captures state when MRTest.exe's embedded python smoke test
trips `ImportError: initialization failed` from `<string>(2):
<module>`. CPython's wrapper hides the real cause; this dumps:

* contents of source\x64\<config>\ (DLLs, .pyd, EXE) and
  source\x64\<config>\meshlib\
* dumpbin /exports of MR{Test,EmbeddedPython,Python}.{exe,dll}
  and each *.pyd, filtered to PyInit_* — confirms the module
  init function is exported
* dumpbin /dependents of the same, filtered to python*/MR*/CRT —
  shows which python.dll the bindings and the embedded
  interpreter resolve (catches ABI mismatches between vcpkg's
  Python 3.12 and any other python.dll on PATH)
* `py -0p` inventory of available pythons
* a standalone `py -3.12 -c "import meshlib.mrmeshpy"` with
  full traceback — gives the real Python exception instead of
  CPython's wrapped "initialization failed"
* python*.dll listings under C:\vcpkg\installed\...\bin and
  `where.exe` for python3.dll / python312.dll

Step is purely informational; it never fails the job.
Diagnostic on the previous run (PR #6021) revealed that MRBind's
bindings regress when generated with the current MSYS2 mirror's
clang (22.1.4): the `std_vector_const_Mesh` registration is
missing from the generated `mrmeshpy.pyd`, so module init fails
with `AttributeError: module 'meshlib.mrmeshpy' has no attribute
'std_vector_const_Mesh'` — wrapped by CPython as the opaque
`ImportError: initialization failed` MRTest's embedded-python
smoke test surfaces. The historical msys64_meshlib_mrbind.zip
bundle pinned clang 18.1.8-2 specifically because mrbind is
sensitive to libclang's AST shape.

Reproduce that pin without depending on the S3 bundle:

* `Install MRBind toolchain` step now downloads the 9 clang-stack
  .pkg.tar.zst files from `repo.msys2.org/mingw/clang64/` (those
  versioned packages are still served): clang, clang-libs,
  clang-tools-extra, llvm, llvm-libs, libc++, compiler-rt, lld,
  libunwind — all 18.1.8-2-any.
* `pacman -Sy` to refresh DBs, `pacman -S` cmake/ninja/make from
  the current mirror (build tools don't influence binding output
  and don't depend on clang), then `pacman -U` (no --needed) of
  the staged 18.1.8-2 archives so any newer libc++/etc. pulled in
  as cmake/ninja deps gets downgraded to match the pinned clang.
* Same change applied to `pip-build.yml`'s `windows-pip-build`
  job.

Trades the original `vcpkg-export.s3` dependency for
`repo.msys2.org` and locks clang at 18.1.8-2 indefinitely
(unless MSYS2 GCs the historical packages, in which case the
URLs need updating to a still-served version or to a private
mirror).
Previous run hit a dep-graph conflict downgrading libc++ from 22
back to 18:

  error: failed to prepare transaction (could not satisfy dependencies)
  :: installing mingw-w64-clang-x86_64-libc++ (18.1.8-2) breaks
     dependency 'mingw-w64-clang-x86_64-cc-libs' required by
     mingw-w64-clang-x86_64-cmake (and 10 other packages)

`mingw-w64-clang-x86_64-cc-libs` is a virtual marker recently
added to MSYS2 that newer libc++ provides; libc++ 18.1.8-2 predates
it. Tell pacman to treat cc-libs as still installed via
`--assume-installed mingw-w64-clang-x86_64-cc-libs=22.1.4-1` so the
downgrade transaction goes through. The actual runtime symbols
(libc++.dll etc.) ship with libc++ 18 regardless; cc-libs is just
a marker.
Previous run downgraded libc++ from 22 to 18 successfully (with
--assume-installed cc-libs), but cmake.exe and ninja.exe — built
against libc++ 22 ABI — couldn't load against libc++ 18 and exited
127 inside the -clang64 shell, taking down Build MRBind:

  Found MSYS2 at `C:\msys64`.
  The system cannot find the file specified.
  ##[error]Process completed with exit code 127.

libc++'s ABI is forward-loose (older binaries against newer libc++
work) but not backward (newer binaries depend on symbols/layout
not in old libc++). So drop libc++ and libunwind from the pinned
list and only keep the clang frontend / llvm / lld / compiler-rt
packages at 18.1.8-2:

* clang/clang-libs/clang-tools-extra
* llvm/llvm-libs
* compiler-rt
* lld

That keeps mrbind running through the clang 18 frontend (which is
what changed binding output and caused the std_vector_const_Mesh
loss) while leaving the runtime fresh enough that cmake/ninja keep
working. Also drop the --assume-installed cc-libs trick — no longer
needed since we no longer downgrade the libc++ that provides cc-libs.
Use `pacman -U --needed` so any future already-installed package
isn't downgrade-attempted.
Pinning only clang/llvm/lld at 18.1.8-2 wasn't enough — clang's
cc.exe failed to start with STATUS_DLL_NOT_FOUND (exit 127) when
the rest of the runtime (libwinpthread-git, crt-git, headers-git,
libc++) was at current versions. clang 18 binaries were built
against a specific late-2024 ABI snapshot that includes those
matching git revisions, and that snapshot is internally consistent
in a way "clang 18 + everything else current" isn't.

Pin the WHOLE mingw/clang64 toolchain to the era of clang 18.1.8-2
— the same 47-package set the historical msys64_meshlib_mrbind.zip
bundle shipped with (matching libwinpthread-git/crt-git/headers-git
12.0.0.r406, libc++ 18.1.8-2, libunwind 18.1.8-2, cmake 3.31.1-1,
ninja 1.12.1-1, etc.). All 47 URLs are still served on
mirror.msys2.org.

* New file: scripts/mrbind/msys2_package_urls_clang18.txt — the
  pinned URL list, extracted from the msys2_package_urls.txt
  snapshot at commit f0fe83c (filtered to mingw/clang64 only).
* Workflow steps: download all 47, then a single
  `pacman -U --noconfirm --needed --overwrite '*'` so dep
  resolution sees a self-consistent set with no version skews and
  no cc-libs marker confusion. --overwrite '*' tolerates files
  already on disk from the runner image's preinstalled MSYS2.
The two mrbind bumps on this branch (1f98820 'Update mrbind.' and
8070543 'Hopefully fix ambiguous Python names.') brought in a newer
mrbind that is coupled to clang 22; the maintainer reflected the same
clang frontend change in the lockfile bump. Pinning clang back to
18.1.8-2 in CI while keeping the newer mrbind led to incompatibilities:
the `AttributeError: module 'meshlib.mrmeshpy' has no attribute
'std_vector_const_Mesh'` failure resurfaced as a clang-version-specific
PCH macro-mismatch error from `MB_PB11_ADJUST_NAMES`.

Reset thirdparty/mrbind back to commit 2978653c3 (master's tip
before this PR) so the binding generation sees the same mrbind it
was working with under the original bundle's clang 18.1.8-2.
PR #6060's previous form had the download + install logic open-coded
in the workflow YAML — a PowerShell foreach loop calling
Invoke-WebRequest per package, then a single inline `pacman -U`
shelled out via msys2_shell.cmd. That duplicates the role of the
existing `scripts/mrbind/msys2_download_packages.sh` and
`scripts/mrbind/msys2_install_packages.sh` and skips the sha256
verification the install script does.

Generalize both scripts to take an optional suffix argument (default
empty, preserving the existing master-lockfile behavior). Suffix
`_clang18` → reads `msys2_package_urls_clang18.txt` /
`msys2_package_hashes_clang18.txt` instead.

* `scripts/mrbind/msys2_download_packages.sh _clang18`
  → `wget` the URLs from `msys2_package_urls_clang18.txt` into
    `msys2_packages/`.
* `scripts/mrbind/msys2_install_packages.sh _clang18`
  → `sha256sum -c msys2_package_hashes_clang18.txt`,
    then `pacman -U --noconfirm --needed` over the verified files.

Generate `msys2_package_hashes_clang18.txt` once now (47 entries,
sha256s computed from the live mirror.msys2.org content).

Workflow steps in build-test-windows.yml and pip-build.yml shrink to
two short msys2_shell.cmd invocations: one to seed pacman's DBs
(`pacman -Sy`), and one to run download+install via the scripts.
The PowerShell loop, the staging dir, and the inline pacman -U are
gone.
Last run failed with `bash: scripts/mrbind/msys2_download_packages.sh:
No such file or directory` — by default `msys2_shell.cmd -c` starts
the shell in the user's MSYS2 home (~ → /home/runneradmin), not the
workflow's checkout dir. Add `-here` to keep the checkout dir as the
shell's cwd so the relative `scripts/mrbind/...` paths the workflow
passes resolve.

Same `-here` flag the existing install_mrbind_windows_msys2.bat
already uses for the same reason.
The lockfile is stored as LF in git but Windows runners' default
autocrlf checks it out as CRLF. sha256sum -c then fails to open all
47 listed files because each filename has a trailing carriage return.
Strip CRs as we read the lockfile so it's robust to either checkout
encoding.
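
A minimal sketch of the CR stripping, assuming a hypothetical helper named `read_lockfile`; the actual script may do this inline instead.

```shell
# read_lockfile FILE: print FILE with every carriage return removed, so the
# output is the same whether git checked the file out as LF or CRLF.
read_lockfile() {
    tr -d '\r' < "$1"
}
```

The cleaned text can then be handed to the checksum tool on stdin, for example `read_lockfile msys2_package_hashes_clang18.txt | sha256sum -c -`, so no filename carries a trailing `\r`.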

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…f -D

Defining MB_PB11_ADJUST_NAMES as a -D flag with backslashes inside
nested shell quotes ('"s/\bMR:://g"') is fragile under MSYS2: different
bash/make versions strip a different number of backslashes between the
PCH-build recipe and the per-fragment-build recipe. Clang then refuses
to reuse the PCH because the macro definitions don't match textually:

    error: definition of macro 'MB_PB11_ADJUST_NAMES' differs between
    the precompiled header ('"s/\bMR:://g"') and the command line
    ('"s/\bMR:://g"')

Move the define into a force-included header, so the value travels
through C source — which has its own well-defined string-literal
quoting — instead of through bash. -include takes a bare filename
(no shell-quoting concerns), and scripts/mrbind/ is already on the
include path via -I$(makefile_dir) in generate.mk.

Other -D macros in compiler_only_flags.txt are unaffected because
their values contain no backslashes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… -include

The previous attempt routed the macro via `-include mrbind_pb11_defines.h`
in compiler_only_flags.txt, but clang requires the PCH-import `-include`
to be the FIRST `-include` on the command line. Adding any other
`-include` ahead of it silently disables the PCH:

    warning: precompiled header '<module>.combined_pch.hpp.gch' was
    ignored because '-include <module>.combined_pch.hpp' is not first
    '-include'
    fatal error: '<module>.combined_pch.hpp' file not found

Inject `#include <mrbind_pb11_defines.h>` into the generated
`<module>.combined.hpp` instead. That file is the PCH source, so the
macro is baked into the PCH itself and every fragment that imports the
PCH inherits the definition — no `-include` needed, no flag-order
conflict, and the macro value still travels via C source rather than
through the bash-quoting layer where backslashes get unevenly stripped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The mrbind parser is invoked over <module>.combined.hpp using
$(COMPILER_FLAGS), which doesn't have -I$(makefile_dir) — only
$(COMPILER) does. The unguarded `#include <mrbind_pb11_defines.h>` I
added to .combined.hpp therefore broke parsing:

    fatal error: 'mrbind_pb11_defines.h' file not found

The parser doesn't need MB_PB11_ADJUST_NAMES (it parses the C++ AST,
not the regex-walking code path inside mrbind's pybind11/core.h).
Wrap the include in `#ifndef MR_PARSING_FOR_PB11_BINDINGS`, mirroring
the existing guard around `<pybind11/pybind11.h>` two lines down. PCH
build (`MR_COMPILING_PB11_BINDINGS`, no parser flag) still pulls it in
and bakes the macro into the PCH; fragments inherit via PCH-import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The C bindings target shares the same .combined.hpp recipe but its
parser sets MR_PARSING_FOR_C_BINDINGS, not MR_PARSING_FOR_PB11_BINDINGS.
My `#ifndef MR_PARSING_FOR_PB11_BINDINGS` guard was therefore TRUE
during the C parser pass, so the include fired and broke C bindings
generation:

    fatal error: 'mrbind_pb11_defines.h' file not found

Move the include inside the existing `$(if $(is_py),...)` block in
the recipe so the line is only emitted into the Python target's
.combined.hpp. The MB_PB11_ADJUST_NAMES macro is pybind11-specific
anyway — it has no business showing up in the C bindings TU.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The MRBind-toolchain install step takes ~110 s; ~70 s of that is the
sequential `wget` of 47 pinned .pkg.tar.zst files (~700 MB total) and
~5 s is `pacman -Sy` syncing 6 mirror DBs that we then never use
because we install via `pacman -U` on local files.

Add a GitHub Actions cache around `scripts/mrbind/msys2_packages/`
keyed on the lockfile hash, so the lockfile contents are the cache
identity. On a cache hit `wget -c` (already in the download script)
sees every file present and skips it; the install script's sha256
verify is the safety net. On a miss we re-download and the cache
populates for the next run.

Drop the leading `pacman -Sy --noconfirm`. `pacman -U` operates on
local package files and doesn't need a fresh mirror DB — only that the
runner's preinstalled DB exists, which it does.

Expected: install step ~110 s → ~45 s on cache hit, ~105 s on miss.
Lockfile bumps invalidate naturally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On a cache hit (which works correctly — the cache restores in ~1.6 s)
the install step then spent ~28 s in `wget -c`, which issues an HTTPS
HEAD request per file to check whether the server thinks the local
file is still complete. 47 sequential TLS handshakes × ~0.6 s.

Switch the download script to `wget -nc` so existing local files are
skipped without any network call. Partial / corrupt files are caught
by the sha256 verify in msys2_install_packages.sh — fix is to delete
the cache entry. The cache key is the lockfile hash so the cached
filenames can never be stale relative to what we want.
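
The skip-if-present behavior that `wget -nc` provides can be pictured with a purely local check like the sketch below. The helper name `missing_urls` is hypothetical and is not part of the download script; it only shows that the decision needs no network access.

```shell
# missing_urls DIR: read URLs on stdin and print only those whose basename
# is not already present in DIR. No network access is involved.
missing_urls() {
    dl_dir=$1
    while IFS= read -r dl_url; do
        [ -n "$dl_url" ] || continue                      # skip blank lines
        [ -e "$dl_dir/${dl_url##*/}" ] || printf '%s\n' "$dl_url"
    done
}
```

On a full cache hit this prints nothing, which corresponds to the download step doing no work at all.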

Also bump actions/cache@v4 → @v5 to match install-cuda's usage.

Expected install-step time on cache hit: ~50 s → ~20 s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hex-dumping the failing build log showed bash received bytewise
identical `-D...='"s/\bMR:://g"'` on both PCH-build and per-fragment
compile, so neither make nor bash is the culprit. The asymmetry is
internal to clang: its PCH validator re-renders the `-D` macro value
when comparing against the new TU, and the re-rendered form drops a
backslash, producing the textual mismatch we hit. Update the comments
in generate.mk and mrbind_pb11_defines.h to reflect the actual root
cause; the fix (bake the macro into the PCH source via #include) is
unchanged and remains correct — it side-steps the `-D`-flag round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Master uses clang 18.1.8 from its frozen S3 MSYS2 snapshot and does
NOT hit the MB_PB11_ADJUST_NAMES macro-mismatch error, despite using
the same `-D` flag and same generate.mk recipes. So clang itself is
not the asymmetry — when paired with the S3 snapshot's bash/make,
the round-trip is consistent. The trigger is the runner's preinstalled
`C:\msys64`'s bash/make/coreutils being newer/different than the S3
snapshot. The exact tool causing the asymmetry wasn't pinpointed.

The fix (bake macro into PCH source via #include) is unchanged and
correct regardless: it bypasses the entire `-D`-flag pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WIP/draft probe — DO NOT MERGE. Built off PR #6060's branch with the
fix temporarily reverted so the macro-mismatch error reproduces, plus
two argv-logging shims that capture exactly what bytes flow at each
layer of the make → bash → clang pipeline:

  1. `scripts/diag/clang-argv-shim.sh` is symlinked into PATH as
     `clang++` and logs argv (text + hex) before exec'ing the real
     `/c/msys64/clang64/bin/clang++.exe`. Captures what bash actually
     passes to clang after quote-stripping.

  2. `scripts/diag/diag-bash-shim.sh` is passed to make as
     `SHELL=/c/diag/bin/diag-bash` and logs argv (text + hex) before
     exec'ing the real `/usr/bin/bash`. Captures the recipe text make
     passes to bash before quote-stripping.

Three new workflow steps in build-test-windows.yml, before the
existing Generate-Python-bindings step:

  * PROBE — install diagnostic shims
  * PROBE — generate bindings under shims (continue-on-error)
  * PROBE — dump tool versions and shim logs

The fix revert is in `compiler_only_flags.txt` (re-add
`-DMB_PB11_ADJUST_NAMES`) and `generate.mk` (drop the
`#include <mrbind_pb11_defines.h>` injection in the .combined.hpp
recipe).

Expected diagnostic outcomes:

  * If clang-argv hex differs between PCH-build and per-fragment-build
    invocations → bash or make is the culprit (something downstream
    of `--trace`-echoed recipe text).
  * If clang-argv hex matches → the asymmetry is internal to clang's
    `-D` macro storage / PCH validator.
  * Comparing shell-recipe hex against clang-argv hex pins it to bash
    (if make's bytes match clang's bytes but they all differ between
    PCH and fragment paths) vs make (if make passes different bytes
    to bash for the two recipes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bash shim (passed via `make SHELL=...`) broke make's `$(shell ...)`
calls during make's parse phase, aborting the build with
`Command failed with exit code 1: echo '3.12' | tr ' ' '\n' | sort -V`
before reaching any clang invocation. Without the bash shim the
clang shim never ran either, so we got no diagnostics.

Drop the bash shim and keep only the clang++ argv-logging shim
(installed at /c/diag/bin/clang++ with /c/diag/bin first on PATH).
We lose the make→bash byte capture but keep the bash→clang capture,
which is the more important data: comparing the bytes clang receives
in the PCH-build and per-fragment-build invocations directly tells
us whether bash/make is delivering different bytes (→ make+bash
together are the culprit) or identical bytes (→ clang's `-D` storage
or PCH validator is the culprit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fedr added a commit that referenced this pull request May 9, 2026
Per the instrumented probe in PR #6080: a clang++ shim that logged
argv (text + hex) showed the PCH-build invocation receives
`-DMB_PB11_ADJUST_NAMES="s/\bMR:://g"` (one backslash) while the
per-fragment invocation receives `-DMB_PB11_ADJUST_NAMES="s/\\bMR:://g"`
(two), despite make's `--trace` echoing identical recipe text for both.
Since bash's `'…'` quote-stripping is deterministic, identical input
must produce identical argv — so the asymmetric input has to come from
make.

The PCH recipe is nested inside `$(if $(is_py), …)` in a
`define module_snippet_build_py` block (which goes through `$(eval)`),
while the fragment recipe is a top-level rule in the same block. The
extra `$(eval)` round of make-variable expansion on the PCH path
appears to consume one backslash that the fragment path keeps.

Master's frozen S3 MSYS2 snapshot ships an older make that evidently
constructs recipe text without the extra strip — same clang 18.1.8
and same `-D` flag work fine there. The fix (bake macro into PCH
source via #include) sidesteps the make recipe pipeline entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fedr added a commit that referenced this pull request May 9, 2026
Per the instrumented probe in PR #6080: a clang++ shim that logged
argv (text + hex) showed the PCH-build invocation receives
`-DMB_PB11_ADJUST_NAMES="s/\bMR:://g"` (one backslash) while the
per-fragment invocation receives `-DMB_PB11_ADJUST_NAMES="s/\\bMR:://g"`
(two), despite make's `--trace` echoing identical recipe text for both.
Since bash's `'…'` quote-stripping is deterministic, identical input
must produce identical argv — so the asymmetric input has to come from
make.

The PCH recipe is nested inside `$(if $(is_py), …)` in a
`define module_snippet_build_py` block (which goes through `$(eval)`),
while the fragment recipe is a top-level rule in the same block. The
extra `$(eval)` round of make-variable expansion on the PCH path
appears to consume one backslash that the fragment path keeps.

Master's frozen S3 MSYS2 snapshot ships an older make that evidently
constructs recipe text without the extra strip — same clang 18.1.8
and same `-D` flag work fine there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Fedr
Contributor Author

Fedr commented May 9, 2026

Probe complete — see results in #6060 / #6077 commit messages and mrbind_pb11_defines.h comment.

Summary: the clang++ argv shim showed the PCH-build invocation receives -DMB_PB11_ADJUST_NAMES="s/\bMR:://g" (one backslash) while the per-fragment invocation receives -DMB_PB11_ADJUST_NAMES="s/\\bMR:://g" (two), even though make's --trace echoes identical recipe text for both. Since bash's '…' quote-stripping is deterministic, the asymmetric input must come from make. Specifically, the extra $(eval) round on the PCH recipe (nested inside $(if $(is_py), …)) appears to consume one backslash that the top-level fragment recipe keeps.
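
The general effect, where one extra round of expansion consumes one backslash, can be reproduced in plain shell. The sketch below is only an analogy: it uses the shell's own `eval`, not make's `$(eval)`, and the helper names are hypothetical. It does not show what make does on the runner.

```shell
# count_backslashes STRING: print how many backslash bytes STRING contains.
count_backslashes() {
    printf '%s' "$1" | tr -cd '\\' | wc -c | tr -d ' '
}

# direct VALUE: hand the value on unchanged, as a single-quoted word would be.
direct() {
    printf '%s' "$1"
}

# extra_round VALUE: re-parse the value inside double quotes via eval.
# That is one more round of expansion, and it turns each \\ into \.
extra_round() {
    eval "printf '%s' \"$1\""
}
```

With the input `'s/\\bMR:://g'`, `direct` keeps both backslashes while `extra_round` leaves one, which is the same shape of mismatch the shim recorded between the two invocations.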

@Fedr Fedr closed this May 9, 2026
@Fedr Fedr deleted the ci/probe-mb-pb11-adjust-names-asymmetry branch May 9, 2026 08:51
Previous dump used `msys2_shell.cmd -no-start -defterm -here -clang64`
without `-full-path`, so make wasn't on PATH and we got
`make: command not found`. The actual build invocation uses
`-full-path` (which merges Windows PATH on top of MSYS2 PATH), and
that's where make is found.

Also add `which <tool>` next to each `--version`, and `pacman -Qi`
for the relevant base packages (make/bash/coreutils/msys2-runtime),
so we can see exactly which binary supplies each version and whether
it came from MSYS2 pacman or from Windows-side tools.
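
A sketch of the per-tool dump follows. The function name `dump_tool` is hypothetical, and it uses the POSIX `command -v` in place of `which`, so it is not the checked-in script.

```shell
# dump_tool NAME: print where NAME resolves on PATH, then the first line of
# its --version output. Reports a missing tool instead of failing.
dump_tool() {
    if tool_path=$(command -v "$1" 2>/dev/null); then
        printf '%s: %s\n' "$1" "$tool_path"
        "$1" --version 2>&1 | head -n 1
    else
        printf '%s: not found on PATH\n' "$1"
    fi
}
```

Running `dump_tool make` inside the `-full-path` shell would show at a glance whether make resolves to /usr/bin/make or to a Windows-side binary.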

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Fedr Fedr reopened this May 9, 2026
Inline bash with embedded PowerShell escaping got mangled — the for
loop syntax broke with `$p:: -c: line 3: syntax error: unexpected end
of file from \`for' command on line 1`. Move the dump into a
checked-in shell script and just dispatch to it. Also strip CR before
running, since git autocrlf may have checked it out as CRLF.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Fedr
Contributor Author

Fedr commented May 9, 2026

Probe complete (run 25602340310). Tool versions captured:

| tool | master S3 zip | runner C:\msys64 |
| --- | --- | --- |
| GNU make | 4.4.1-2 (MSYS2/cygwin, /usr/bin/make) | 4.4.1 (mingw64, /c/mingw64/bin/make, NOT pacman-managed) |
| bash | 5.2.037-1 | 5.3.009-1 |
| coreutils | 8.32-5 | 8.32-5 (same) |
| msys2-runtime | 3.5.4-7 | 3.6.9-1 |

Same upstream make version (4.4.1) but different builds — master uses MSYS2's cygwin make; the runner falls through PATH to a Windows-native mingw64 make (the runner's MSYS2 base set doesn't include make). The clang++ argv shim confirms PCH-build receives the macro value with ONE backslash while fragment-build receives TWO, despite identical make --trace output — so the asymmetry is in make+bash, most likely the mingw64 make's Windows-style argv handling interacting with the $(if $(is_py), …)/$(eval) nesting on the PCH path differently than cygwin make does.

Updated comments on #6060 (commit d98e50c) and #6077 (commit 63613f6) reflect the actual data. Closing.

@Fedr Fedr closed this May 9, 2026
Fedr added a commit that referenced this pull request May 12, 2026
… 18.1.8 (#6060)

* ci(windows): use preinstalled MSYS2 + pacman-installed clang

Stop fetching the ~725 MB msys64_meshlib_mrbind.zip from S3 (which
has been seen returning 404 in CI) and stop relying on the
install-msys2-mrbind composite action. The windows-2025 runner
image already ships:

  * MSYS2 base at C:\msys64 (pacman 6.1.0)
  * Standalone LLVM 20.1.8 at C:\Program Files\LLVM (unused by
    MRBind, kept here for reference)

For MRBind we still need the MSYS2 -clang64 environment (mrbind
links against MSYS2's libclang/libc++, not the Windows-native
LLVM build). So this change pacman-installs the minimal toolchain
into the preinstalled C:\msys64:

  pacman -Sy --noconfirm --needed make \
    mingw-w64-clang-x86_64-{clang,clang-tools-extra,cmake,ninja,libc++}

and points all subsequent MRBind/binding-generation steps at
C:\msys64 by setting MSYS2_DIR (read by install_mrbind_windows_msys2.bat
and generate_win.bat). GETTEXT_ROOT moves from
C:\msys64_meshlib_mrbind\clang64 to C:\msys64\clang64.

The install-msys2-mrbind composite action is no longer used and is
deleted. install_deps_windows_msys2.bat (local-developer install
path) is unchanged: it still creates a separate
C:\msys64_meshlib_mrbind tree, untouched by this CI change.

Note: this swaps an S3-mirror dependency for an MSYS2-mirror
dependency. It also moves clang from a pinned 18.1.8 to whatever
mingw-w64-clang-x86_64-clang is currently in the clang64 repo
(22.x at the time of writing). MRBind's Windows path uses bare
`clang++` (no version suffix) so this works without further
changes; if a specific clang version becomes required later, pin
via pacman or use the MSYS2 archive.

* ci(windows): also install mingw-w64-clang-x86_64-llvm package

mrbind's CMakeLists does find_package(Clang REQUIRED), which
transitively requires LLVMConfig.cmake from the llvm dev package.
That package is listed only as an *optional* dep of clang, so
pacman did not pull it in automatically — Build MRBind failed
with:

  Could not find a package configuration file provided by "LLVM"
  (requested version 22.1.4)

Add mingw-w64-clang-x86_64-llvm to the install list explicitly.

* Update mrbind.

* Generate fresh MSYS2 lockfiles.

* We no longer upload zipped MSYS2 to S3.

* ci(windows): make generate_win.bat fail loudly on missing MSYS2

Two bugs combined to make a real failure look green in
https://github.com/MeshInspector/MeshLib/actions/runs/25392220595/job/74469566838:

1. The 'Generate and build Python bindings' step in
   build-test-windows.yml didn't carry the MSYS2_DIR=C:\msys64 env
   override that the other generate_win.bat / install_mrbind_windows_msys2.bat
   callers on this branch already have. So the script saw the
   default MSYS2_DIR=C:\msys64_meshlib_mrbind, which doesn't exist
   on the runner anymore.

2. generate_win.bat printed 'MSYS2 was NOT found' and then fell
   through to a normal exit, returning code 0. The step's `call`
   inherited that 0, GitHub Actions marked the step green, the
   .pyd never got built, and the failure only surfaced four steps
   later when 'Unit Tests' couldn't load mrmeshpy.pyd
   (LoadLibrary error 126).

Fix both:

* Add `MSYS2_DIR: C:\msys64` to the 'Generate and build Python
  bindings' step's env block, matching Build MRBind / Generate C
  bindings / Generate C# bindings.

* Replace generate_win.bat's silent fall-through with `exit /b 1`
  after the missing-MSYS2 message. Mirrors what
  install_mrbind_windows_msys2.bat already does on the same
  condition. Future workflow callers that forget MSYS2_DIR will
  fail at the right step instead of being papered over until
  something later trips on the missing artifacts.

* Try enabling the debug env.

* Hopefully fix ambiguous Python names.

* ci(windows): diagnostic step for embedded-python ImportError

Add a post-Unit-Tests step (always(), continue-on-error) that
captures state when MRTest.exe's embedded python smoke test
trips `ImportError: initialization failed` from `<string>(2):
<module>`. CPython's wrapper hides the real cause; this dumps:

* contents of source\x64\<config>\ (DLLs, .pyd, EXE) and
  source\x64\<config>\meshlib\
* dumpbin /exports of MR{Test,EmbeddedPython,Python}.{exe,dll}
  and each *.pyd, filtered to PyInit_* — confirms the module
  init function is exported
* dumpbin /dependents of the same, filtered to python*/MR*/CRT —
  shows which python.dll the bindings and the embedded
  interpreter resolve (catches ABI mismatches between vcpkg's
  Python 3.12 and any other python.dll on PATH)
* `py -0p` inventory of available pythons
* a standalone `py -3.12 -c "import meshlib.mrmeshpy"` with
  full traceback — gives the real Python exception instead of
  CPython's wrapped "initialization failed"
* python*.dll listings under C:\vcpkg\installed\...\bin and
  `where.exe` for python3.dll / python312.dll

Step is purely informational; it never fails the job.

* ci(windows): pin MRBind clang stack to 18.1.8-2 via MSYS2 archive

Diagnostic on the previous run (PR #6021) revealed that MRBind's
bindings regress when generated with the current MSYS2 mirror's
clang (22.1.4): the `std_vector_const_Mesh` registration is
missing from the generated `mrmeshpy.pyd`, so module init fails
with `AttributeError: module 'meshlib.mrmeshpy' has no attribute
'std_vector_const_Mesh'` — wrapped by CPython as the opaque
`ImportError: initialization failed` MRTest's embedded-python
smoke test surfaces. The historical msys64_meshlib_mrbind.zip
bundle pinned clang 18.1.8-2 specifically because mrbind is
sensitive to libclang's AST shape.

Reproduce that pin without depending on the S3 bundle:

* `Install MRBind toolchain` step now downloads the 9 clang-stack
  .pkg.tar.zst files from `repo.msys2.org/mingw/clang64/` (those
  versioned packages are still served): clang, clang-libs,
  clang-tools-extra, llvm, llvm-libs, libc++, compiler-rt, lld,
  libunwind — all 18.1.8-2-any.
* `pacman -Sy` to refresh DBs, `pacman -S` cmake/ninja/make from
  the current mirror (build tools don't influence binding output
  and don't depend on clang), then `pacman -U` (no --needed) of
  the staged 18.1.8-2 archives so any newer libc++/etc. pulled in
  as cmake/ninja deps gets downgraded to match the pinned clang.
* Same change applied to `pip-build.yml`'s `windows-pip-build`
  job.

Trades the original `vcpkg-export.s3` dependency for
`repo.msys2.org` and locks clang at 18.1.8-2 indefinitely
(unless MSYS2 GCs the historical packages, in which case the
URLs need updating to a still-served version or to a private
mirror).

* ci(windows): pass --assume-installed cc-libs through pacman -U

Previous run hit a dep-graph conflict downgrading libc++ from 22
back to 18:

  error: failed to prepare transaction (could not satisfy dependencies)
  :: installing mingw-w64-clang-x86_64-libc++ (18.1.8-2) breaks
     dependency 'mingw-w64-clang-x86_64-cc-libs' required by
     mingw-w64-clang-x86_64-cmake (and 10 other packages)

`mingw-w64-clang-x86_64-cc-libs` is a virtual marker recently
added to MSYS2 that newer libc++ provides; libc++ 18.1.8-2 predates
it. Tell pacman to treat cc-libs as still installed via
`--assume-installed mingw-w64-clang-x86_64-cc-libs=22.1.4-1` so the
downgrade transaction goes through. The actual runtime symbols
(libc++.dll etc.) ship with libc++ 18 regardless; cc-libs is just
a marker.

* ci(windows): keep libc++/libunwind at current; pin only clang/llvm/lld

Previous run downgraded libc++ from 22 to 18 successfully (with
--assume-installed cc-libs), but cmake.exe and ninja.exe — built
against libc++ 22 ABI — couldn't load against libc++ 18 and exited
127 inside the -clang64 shell, taking down Build MRBind:

  Found MSYS2 at `C:\msys64`.
  The system cannot find the file specified.
  ##[error]Process completed with exit code 127.

libc++ is backward compatible (binaries built against an older libc++
run against a newer one) but not forward compatible (newer binaries
depend on symbols/layout absent from the old libc++). So drop libc++ and libunwind from the pinned
list and only keep the clang frontend / llvm / lld / compiler-rt
packages at 18.1.8-2:

* clang/clang-libs/clang-tools-extra
* llvm/llvm-libs
* compiler-rt
* lld

That keeps mrbind running through the clang 18 frontend (which is
what changed binding output and caused the std_vector_const_Mesh
loss) while leaving the runtime fresh enough that cmake/ninja keep
working. Also drop the --assume-installed cc-libs trick — no longer
needed since we no longer downgrade the libc++ that provides cc-libs.
Use `pacman -U --needed` so any future already-installed package
isn't downgrade-attempted.

* ci(windows): pin the entire clang64 toolchain to clang-18 era

Pinning only clang/llvm/lld at 18.1.8-2 wasn't enough — clang's
cc.exe failed to start with STATUS_DLL_NOT_FOUND (exit 127) when
the rest of the runtime (libwinpthread-git, crt-git, headers-git,
libc++) was at current versions. clang 18 binaries were built
against a specific late-2024 ABI snapshot that includes those
matching git revisions, and that snapshot is internally consistent
in a way "clang 18 + everything else current" isn't.

Pin the WHOLE mingw/clang64 toolchain to the era of clang 18.1.8-2
— the same 47-package set the historical msys64_meshlib_mrbind.zip
bundle shipped with (matching libwinpthread-git/crt-git/headers-git
12.0.0.r406, libc++ 18.1.8-2, libunwind 18.1.8-2, cmake 3.31.1-1,
ninja 1.12.1-1, etc.). All 47 URLs are still served on
mirror.msys2.org.

* New file: scripts/mrbind/msys2_package_urls_clang18.txt — the
  pinned URL list, extracted from the msys2_package_urls.txt
  snapshot at commit f0fe83c (filtered to mingw/clang64 only).
* Workflow steps: download all 47, then a single
  `pacman -U --noconfirm --needed --overwrite '*'` so dep
  resolution sees a self-consistent set with no version skews and
  no cc-libs marker confusion. --overwrite '*' tolerates files
  already on disk from the runner image's preinstalled MSYS2.

* Revert thirdparty/mrbind submodule to the pre-PR baseline

The two mrbind bumps on this branch (1f98820 'Update mrbind.' and
8070543 'Hopefully fix ambiguous Python names.') brought in a newer
mrbind that is coupled to clang 22; the maintainer reflected the same
clang frontend changes in the lockfile bump. Pinning clang back to
18.1.8-2 in CI while keeping the newer mrbind led to incompatibilities:
the earlier `AttributeError: module 'meshlib.mrmeshpy' has no attribute
'std_vector_const_Mesh'` resurfaced as a clang-version-specific PCH
macro-mismatch error from `MB_PB11_ADJUST_NAMES`.

Reset thirdparty/mrbind back to commit 2978653c3 (master's tip
before this PR) so the binding generation sees the same mrbind it
was working with under the original bundle's clang 18.1.8-2.

* ci(windows): replace inline pwsh download loop with the existing scripts

PR #6060's previous form had the download + install logic open-coded
in the workflow YAML — a PowerShell foreach loop calling
Invoke-WebRequest per package, then a single inline `pacman -U`
shelled out via msys2_shell.cmd. That duplicates the role of the
existing `scripts/mrbind/msys2_download_packages.sh` and
`scripts/mrbind/msys2_install_packages.sh` and skips the sha256
verification the install script does.

Generalize both scripts to take an optional suffix argument (default
empty, preserving the existing master-lockfile behavior). Suffix
`_clang18` → reads `msys2_package_urls_clang18.txt` /
`msys2_package_hashes_clang18.txt` instead.

* `scripts/mrbind/msys2_download_packages.sh _clang18`
  → `wget` the URLs from `msys2_package_urls_clang18.txt` into
    `msys2_packages/`.
* `scripts/mrbind/msys2_install_packages.sh _clang18`
  → `sha256sum -c msys2_package_hashes_clang18.txt`,
    then `pacman -U --noconfirm --needed` over the verified files.

Generate `msys2_package_hashes_clang18.txt` once now (47 entries,
sha256s computed from the live mirror.msys2.org content).

Workflow steps in build-test-windows.yml and pip-build.yml shrink to
two short msys2_shell.cmd invocations: one to seed pacman's DBs
(`pacman -Sy`), and one to run download+install via the scripts.
The PowerShell loop, the staging dir, and the inline pacman -U are
gone.

* ci(windows): pass -here to msys2_shell.cmd so script paths resolve

Last run failed with `bash: scripts/mrbind/msys2_download_packages.sh:
No such file or directory` — by default `msys2_shell.cmd -c` starts
the shell in the user's MSYS2 home (~ → /home/runneradmin), not the
workflow's checkout dir. Add `-here` to keep the checkout dir as the
shell's cwd so the relative `scripts/mrbind/...` paths the workflow
passes resolve.

Same `-here` flag the existing install_mrbind_windows_msys2.bat
already uses for the same reason.

* ci(windows): strip CR from msys2 lockfile before sha256sum -c

The lockfile is stored as LF in git but Windows runners' default
autocrlf checks it out as CRLF. sha256sum -c then fails to open all
47 listed files because each filename has a trailing carriage return.
Strip CRs as we read the lockfile so it's robust to either checkout
encoding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): route MB_PB11_ADJUST_NAMES via -include header instead of -D

Defining MB_PB11_ADJUST_NAMES as a -D flag with backslashes inside
nested shell quotes ('"s/\bMR:://g"') is fragile under MSYS2: different
bash/make versions strip a different number of backslashes between the
PCH-build recipe and the per-fragment-build recipe. Clang then refuses
to reuse the PCH because the macro definitions don't match textually:

    error: definition of macro 'MB_PB11_ADJUST_NAMES' differs between
    the precompiled header ('"s/\bMR:://g"') and the command line
    ('"s/\bMR:://g"')

Move the define into a force-included header, so the value travels
through C source — which has its own well-defined string-literal
quoting — instead of through bash. -include takes a bare filename
(no shell-quoting concerns), and scripts/mrbind/ is already on the
include path via -I$(makefile_dir) in generate.mk.

Other -D macros in compiler_only_flags.txt are unaffected because
their values contain no backslashes.
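
As a rough illustration of why only backslash-bearing values are at risk,
the sketch below passes a value through one extra round of shell word
parsing. `reparse_once` is a hypothetical helper, not part of the
repository, and it demonstrates the general hazard only; it does not claim
to reproduce the exact layer that misbehaved in CI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: feed a value through one additional round of shell
# word parsing, as an extra quoting layer in a build pipeline would.
reparse_once() {
    local value=$1
    # eval re-parses the text; an unquoted backslash is consumed here.
    eval "set -- $value"
    printf '%s' "$1"
}

reparse_once 's/\bMR:://g'; echo   # backslash is lost
reparse_once 'MR_FOO=1'; echo      # no backslash, value survives intact
```

A value without backslashes comes out unchanged no matter how many such
rounds it goes through, which matches the observation about the other -D
macros.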

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): bake MB_PB11_ADJUST_NAMES into the PCH source instead of -include

The previous attempt routed the macro via `-include mrbind_pb11_defines.h`
in compiler_only_flags.txt, but clang requires the PCH-import `-include`
to be the FIRST `-include` on the command line. Adding any other
`-include` ahead of it silently disables the PCH:

    warning: precompiled header '<module>.combined_pch.hpp.gch' was
    ignored because '-include <module>.combined_pch.hpp' is not first
    '-include'
    fatal error: '<module>.combined_pch.hpp' file not found

Inject `#include <mrbind_pb11_defines.h>` into the generated
`<module>.combined.hpp` instead. That file is the PCH source, so the
macro is baked into the PCH itself and every fragment that imports the
PCH inherits the definition — no `-include` needed, no flag-order
conflict, and the macro value still travels via C source rather than
through the bash-quoting layer where backslashes get unevenly stripped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): skip mrbind_pb11_defines.h include during parser pass

The mrbind parser is invoked over <module>.combined.hpp using
$(COMPILER_FLAGS), which doesn't have -I$(makefile_dir) — only
$(COMPILER) does. The unguarded `#include <mrbind_pb11_defines.h>` I
added to .combined.hpp therefore broke parsing:

    fatal error: 'mrbind_pb11_defines.h' file not found

The parser doesn't need MB_PB11_ADJUST_NAMES (it parses the C++ AST,
not the regex-walking code path inside mrbind's pybind11/core.h).
Wrap the include in `#ifndef MR_PARSING_FOR_PB11_BINDINGS`, mirroring
the existing guard around `<pybind11/pybind11.h>` two lines down. PCH
build (`MR_COMPILING_PB11_BINDINGS`, no parser flag) still pulls it in
and bakes the macro into the PCH; fragments inherit via PCH-import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): scope mrbind_pb11_defines.h include to Python target only

The C bindings target shares the same .combined.hpp recipe but its
parser sets MR_PARSING_FOR_C_BINDINGS, not MR_PARSING_FOR_PB11_BINDINGS.
My `#ifndef MR_PARSING_FOR_PB11_BINDINGS` guard was therefore TRUE
during the C parser pass, so the include fired and broke C bindings
generation:

    fatal error: 'mrbind_pb11_defines.h' file not found

Move the include inside the existing `$(if $(is_py),...)` block in
the recipe so the line is only emitted into the Python target's
.combined.hpp. The MB_PB11_ADJUST_NAMES macro is pybind11-specific
anyway — it has no business showing up in the C bindings TU.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): cache pinned MSYS2 packages and drop pacman -Sy

The MRBind-toolchain install step takes ~110 s; ~70 s of that is the
sequential `wget` of 47 pinned .pkg.tar.zst files (~700 MB total) and
~5 s is `pacman -Sy` syncing 6 mirror DBs that we then never use
because we install via `pacman -U` on local files.

Add a GitHub Actions cache around `scripts/mrbind/msys2_packages/`
keyed on the lockfile hash, so the lockfile contents are the cache
identity. On a cache hit `wget -c` (already in the download script)
sees every file present and skips it; the install script's sha256
verify is the safety net. On a miss we re-download and the cache
populates for the next run.

Drop the leading `pacman -Sy --noconfirm`. `pacman -U` operates on
local package files and doesn't need a fresh mirror DB — only that the
runner's preinstalled DB exists, which it does.

Expected: install step ~110 s → ~45 s on cache hit, ~105 s on miss.
Lockfile bumps invalidate naturally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): make cache-hit path skip wget HEAD requests

On a cache hit (which works correctly — the cache restores in ~1.6 s)
the install step then spent ~28 s in `wget -c`, which issues an HTTPS
HEAD request per file to check whether the server thinks the local
file is still complete. 47 sequential TLS handshakes × ~0.6 s.

Switch the download script to `wget -nc` so existing local files are
skipped without any network call. Partial / corrupt files are caught
by the sha256 verify in msys2_install_packages.sh — fix is to delete
the cache entry. The cache key is the lockfile hash so the cached
filenames can never be stale relative to what we want.
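
The skip-if-present behavior that `wget -nc` provides can be sketched
without any network access. `missing_packages` below is a hypothetical
helper, and it assumes the `<sha256> *msys2_packages/<filename>` line
format of the hashes lockfile; it only lists what would still need
fetching.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the package filenames named in a hashes
# lockfile that are not yet present in the download directory.
missing_packages() {
    local lockfile=$1 dir=$2 hash path name
    tr -d '\r' < "$lockfile" | while read -r hash path; do
        name=${path##*/}              # drop the "*msys2_packages/" part
        [ -n "$name" ] || continue    # ignore blank lines
        [ -e "$dir/$name" ] || printf '%s\n' "$name"
    done
}
```

On a full cache hit the function prints nothing, so no per-file request is
needed; integrity is still left to the sha256 verification in the install
script.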

Also bump actions/cache@v4 → @v5 to match install-cuda's usage.

Expected install-step time on cache hit: ~50 s → ~20 s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): correct comments — root cause is clang, not bash/make

Hex-dumping the failing build log showed bash received byte-for-byte
identical `-D...='"s/\bMR:://g"'` on both PCH-build and per-fragment
compile, so neither make nor bash is the culprit. The asymmetry is
internal to clang: its PCH validator re-renders the `-D` macro value
when comparing against the new TU, and the re-rendered form drops a
backslash, producing the textual mismatch we hit. Update the comments
in generate.mk and mrbind_pb11_defines.h to reflect the actual root
cause; the fix (bake the macro into the PCH source via #include) is
unchanged and remains correct — it side-steps the `-D`-flag round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): re-correct comments — MSYS2 environment is the trigger

Master uses clang 18.1.8 from its frozen S3 MSYS2 snapshot and does
NOT hit the MB_PB11_ADJUST_NAMES macro-mismatch error, despite using
the same `-D` flag and same generate.mk recipes. So clang itself is
not the asymmetry — when paired with the S3 snapshot's bash/make,
the round-trip is consistent. The trigger is the runner's preinstalled
`C:\msys64`'s bash/make/coreutils being newer/different than the S3
snapshot. The exact tool causing the asymmetry wasn't pinpointed.

The fix (bake macro into PCH source via #include) is unchanged and
correct regardless: it bypasses the entire `-D`-flag pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): de-paren the MB_PB11_ADJUST_NAMES recipe comment

GNU make's $(call,...) argument parser doesn't track nested parens
in literal text — it takes the first inline `)` as the close of
$(call,...) and feeds the rest of the line to the recipe shell. My
prior comment included the example error text which contained
balanced inline parens like ('"s/\bMR:::/g"') and that broke the
.combined.hpp recipe with:

  /usr/bin/sh: -c: line 1: syntax error near unexpected token `('
  /usr/bin/sh: -c: line 1: `and the command line ('"s/\bMR:::/g"')...

The pre-existing $(call,###...) comments in this file work because
they don't contain inline parens; mine did.

Move the example text out of the recipe comment (it lives in
mrbind_pb11_defines.h already) and add a meta-note warning future
editors not to re-introduce parens here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): pin the MB_PB11_ADJUST_NAMES asymmetry to GNU make

Per the instrumented probe in PR #6080: a clang++ shim that logged
argv (text + hex) showed the PCH-build invocation receives
`-DMB_PB11_ADJUST_NAMES="s/\bMR:://g"` (one backslash) while the
per-fragment invocation receives `-DMB_PB11_ADJUST_NAMES="s/\\bMR:://g"`
(two), despite make's `--trace` echoing identical recipe text for both.
Since bash's `'…'` quote-stripping is deterministic, identical input
must produce identical argv — so the asymmetric input has to come from
make.
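
A minimal sketch of the text + hex argv logging idea follows. `dump_argv`
and its output format are assumptions made for illustration; the actual
shim in the probe PR may log differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every argument as text and as hex bytes, so a
# single-backslash and a double-backslash value cannot be confused.
dump_argv() {
    local i=0 arg hex
    for arg in "$@"; do
        i=$((i + 1))
        # od prints space-separated hex bytes; tr compacts them.
        hex=$(printf '%s' "$arg" | od -An -tx1 | tr -d ' \n')
        printf 'argv[%d] text=%s hex=%s\n' "$i" "$arg" "$hex"
    done
}

dump_argv '-DX="s/\bMR:://g"' '-DX="s/\\bMR:://g"'
```

The hex column is what makes the difference unambiguous: each backslash
shows up as one `5c` byte regardless of how a terminal or log viewer
renders the text column.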

The PCH recipe is nested inside `$(if $(is_py), …)` in a
`define module_snippet_build_py` block (which goes through `$(eval)`),
while the fragment recipe is a top-level rule in the same block. The
extra `$(eval)` round of make-variable expansion on the PCH path
appears to consume one backslash that the fragment path keeps.

Master's frozen S3 MSYS2 snapshot ships an older make that evidently
constructs recipe text without the extra strip — same clang 18.1.8
and same `-D` flag work fine there. The fix (bake macro into PCH
source via #include) sidesteps the make recipe pipeline entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): refine MB_PB11_ADJUST_NAMES root-cause writeup

Probe captured the actual versions on both sides:

  tool           master S3 zip                  runner C:\msys64
  -------------  -----------------------------  -----------------------------
  GNU make       4.4.1-2 (MSYS2/cygwin build,   4.4.1   (mingw64 build,
                 /usr/bin/make)                  /c/mingw64/bin/make,
                                                 NOT pacman-managed)
  bash           5.2.037-1                      5.3.009-1
  coreutils      8.32-5                         8.32-5  (same)
  msys2-runtime  3.5.4-7                        3.6.9-1

Same upstream make version on both sides but DIFFERENT builds: master
uses MSYS2's cygwin make; the runner uses a Windows-native mingw64
make that came from the runner image's Windows-side MinGW install
(the runner's MSYS2 base set doesn't include make, so PATH lookup
falls through to /c/mingw64/bin/make). My earlier "extra $(eval)
round consumes a backslash" guess was too specific: make built from
the same source handles $(eval) the same way on both sides. The
asymmetric backslash handling between the PCH and fragment recipes is
more likely a side effect of the mingw64 make's Windows-style argv
quoting interacting with $(if $(is_py), …)/$(eval) differently than
the cygwin make does.

The fix (bake macro into PCH source) is unchanged and bypasses the
make recipe pipeline entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): install MSYS2 cygwin make, drop the PCH-source-include macro fix

The MB_PB11_ADJUST_NAMES PCH/fragment macro-mismatch error on this
branch came from PATH falling through to the runner image's
Windows-native mingw64 make at /c/mingw64/bin/make (the runner's
MSYS2 base set doesn't ship make). Master goes through MSYS2's
cygwin-built make at /usr/bin/make and doesn't trip the issue.
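
The PATH fall-through described above can be checked with a small sketch
that lists every candidate for a command in lookup order. `list_on_path`
is a hypothetical helper written for this note, not something the
workflow ships.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every executable named $1 found on PATH, in
# the order the shell would consider them. The first line is the winner.
list_on_path() {
    local name=$1 dir
    local IFS=:
    for dir in $PATH; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    return 0
}

list_on_path make
```

If the first line printed is not the MSYS2 make, a different make build
found later on PATH is the one the recipes actually run under.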

Add MSYS2's `make-4.4.1-2-x86_64.pkg.tar.zst` to the clang18
lockfile (URL + sha256 lifted verbatim from master's lockfile so
the version matches exactly). PATH lookup inside `msys2_shell.cmd
-clang64 -full-path` now finds /usr/bin/make first, before
/c/mingw64/bin/make from the appended Windows PATH. Bump the
"47 packages" comment to 48.

With cygwin make in play the PCH and per-fragment recipes deliver
identical bytes to clang, so the workaround is no longer needed:

  * compiler_only_flags.txt — restore -DMB_PB11_ADJUST_NAMES=...
  * generate.mk — drop the .combined.hpp injection of
    `#include <mrbind_pb11_defines.h>`
  * mrbind_pb11_defines.h — deleted

This makes PR #6077 (which carried only the workaround) obsolete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): revert the unsuffixed lockfile regeneration

The "Generate fresh MSYS2 lockfiles" commit on this branch updated
scripts/mrbind/msys2_package_urls.txt and msys2_package_hashes.txt
(244 lines each) — the unsuffixed lockfiles that drive master's
historical S3-zip refresh procedure. They're orthogonal to this PR,
which only adds the suffixed clang18 pair. Restore them to master's
content so the PR's diff focuses on what it's actually changing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): derive package URLs from the hash lockfile, drop URLs lockfile

Master kept two MSYS2 lockfiles in lockstep: msys2_package_urls.txt
(URL per line, fed to wget -i) and msys2_package_hashes.txt (sha256 +
*msys2_packages/<filename>, fed to sha256sum -c). The URL is derivable
from the filename — `msys2_remember_current_packages.sh` already
encodes the package-type → URL prefix mapping when generating the
URLs file.

Move that same mapping into msys2_download_packages.sh: parse the
hash file's filenames, compute URL = `https://mirror.msys2.org/<prefix>/<filename>`,
write to a temp URL list, and feed that to wget. With this, the
suffixed `msys2_package_urls_clang18.txt` is redundant — drop it.

The unsuffixed `msys2_package_urls.txt` still exists on master (for
historical reasons / the regenerator's output) but is no longer
read by the download script. Cleaning that up is out of scope here.

Cache key (hashFiles(msys2_package_hashes_clang18.txt)) is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): trim comments, restore step name, drop post-Unit-Tests diagnostic, restore READMEs

Per review feedback:

  1. Shorten the verbose comments in build-test-windows.yml's cache /
     install steps and in msys2_download_packages.sh /
     msys2_install_packages.sh down to 2-3 lines each.
  2. Rename `Install MRBind toolchain (clang 18.1.8 pinned) into runner
     MSYS2` back to `Install MSYS2 for MRBind` (master's name) — both
     in build-test-windows.yml and pip-build.yml.
  3. Drop the `Diagnostic — Python bindings post-Unit-Tests` step that
     was added during the std_vector_const_Mesh investigation — no
     longer needed, and not present in master.
  4. Restore scripts/mrbind/README-generating.md and
     scripts/mrbind/README-updating-clang.md to master's content.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): pin libc++ in gettext-tools install via --assume-installed

`pacman -S --needed gettext-tools` from the runner's fresh MSYS2 DB
resolves to gettext-tools-1.0 with `libc++ >= 22` as a transitive
dep, and silently upgrades our pinned libc++ 18.1.8-2 → 22.1.4-1.
The next step (Build MRBind) then compiles with clang 18 + libc++
22 — libc++ 22 headers reference `__builtin_clzg/ctzg` which clang
18 doesn't have, so the build fails with screenfuls of `'_Tp' does
not refer to a value` and `use of undeclared identifier
'__builtin_clzg'` errors out of `<bit>` and `<charconv>`.

Master's frozen S3-zip MSYS2 has a stale DB and resolves to an
older gettext-tools that doesn't pull in libc++ 22, which is why
master's distribution flow doesn't trip this.

Pass `--assume-installed mingw-w64-clang-x86_64-libc++=18.1.8-2`
so pacman accepts our pinned libc++ as satisfying gettext-tools'
dep and leaves it alone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): use pacman --ignore instead of --assume-installed for libc++

`--assume-installed mingw-w64-clang-x86_64-libc++=18.1.8-2` didn't
help — pacman still pulled in libc++ 22.1.4-1 because gettext-tools
1.0's dep is `libc++>=22`, and our --assume-installed at 18.1.8-2
doesn't satisfy that constraint.
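
The constraint failure can be approximated in a few lines. This sketch
uses GNU `sort -V` as a stand-in for pacman's own version comparison,
which is an assumption; the two orderings agree for the versions
involved here but are not guaranteed to agree in general.
`version_ge` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if version $1 >= version $2 under
# `sort -V` ordering (an approximation of pacman's comparison).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

if version_ge 18.1.8-2 22; then
    echo "pinned libc++ satisfies libc++>=22"
else
    echo "pinned libc++ does NOT satisfy libc++>=22"
fi
```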

`--ignore PKG` is the right knob: it tells pacman not to upgrade
the named package even if a dep requests a newer version. gettext-
tools and gettext-libtextstyle still install; libc++ stays at the
pinned 18.1.8-2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): drop the clang-22 std_vector_const_Mesh narrative from comment

The "newer clang (>=22) loses the std_vector_const_Mesh registration"
attribution wasn't quite right — that symptom showed up during the
mrbind experiments that motivated this work, not as an inherent
regression in clang 22 itself. Replace with a neutral note that we
pin to the same clang version master ships and that pinning only
clang/llvm causes STATUS_DLL_NOT_FOUND.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): drop msys2_package_urls.txt; URLs are derived from hashes lockfile

Since msys2_download_packages.sh derives URLs from each filename's
package-type prefix, no script reads msys2_package_urls.txt anymore.
The only remaining consumer was msys2_remember_current_packages.sh
(which writes it). Drop the file and trim the regenerator's URL
echoes to leave just hash-file output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(mrbind): drop msys2_package_urls.txt; derive URLs from hashes lockfile

The URLs lockfile was kept in lockstep with the hashes lockfile, but
each URL is fully determined by the package's filename — the
package-type prefix (`mingw-w64-clang-x86_64-…` → `mingw/clang64/`,
etc.) maps directly to the mirror path. The mapping was already
encoded in `msys2_remember_current_packages.sh` on the writer side.

Move the same mapping into `msys2_download_packages.sh`: parse each
`<sha256> *msys2_packages/<filename>` line of the hashes file,
compute `https://mirror.msys2.org/<prefix>/<filename>`, write to a
temp URL list, and feed that to `wget -i`. Drop the now-unused URL
echoes from the regenerator and delete the 139-line URLs lockfile.
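
A minimal sketch of that filename-to-URL mapping is shown below. Only
the `mingw-w64-clang-x86_64-` → `mingw/clang64/` case comes from the
text above; the other prefixes, and the function names `package_url`
and `urls_from_hashes`, are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Map a package filename to its mirror URL via the package-type prefix.
package_url() {
    local filename=$1 prefix
    case $filename in
        mingw-w64-clang-x86_64-*) prefix=mingw/clang64 ;;  # stated in the text
        mingw-w64-ucrt-x86_64-*)  prefix=mingw/ucrt64  ;;  # assumption
        mingw-w64-x86_64-*)       prefix=mingw/mingw64 ;;  # assumption
        *)                        prefix=msys/x86_64   ;;  # assumption: plain MSYS packages
    esac
    printf 'https://mirror.msys2.org/%s/%s\n' "$prefix" "$filename"
}

# Read "<sha256> *msys2_packages/<filename>" lines on stdin and print one
# URL per line, suitable for writing to a temp list for `wget -i`.
urls_from_hashes() {
    local hash path
    tr -d '\r' | while read -r hash path; do
        [ -n "$path" ] || continue
        package_url "${path##*/}"
    done
}
```

Keeping the mapping in one `case` statement means the hashes lockfile
remains the single source of truth for which packages are fetched.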

No behavior change for the developer-bootstrap flow
(`install_deps_windows_msys2.bat` calls these scripts and downloads
the same set of packages).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(mrbind): strip CR before sha256sum -c in msys2_install_packages.sh

Git autocrlf on Windows checks msys2_package_hashes.txt out as CRLF
by default, and `sha256sum -c` doesn't tolerate filenames with a
trailing \r — every entry fails as

  sha256sum: 'msys2_packages/<file>'$'\r': No such file or directory
                                       : FAILED open or read

Hit by anyone running install_deps_windows_msys2.bat manually on a
Windows checkout. Pipe the lockfile through `tr -d '\r'` before
feeding to sha256sum (and the subsequent mapfile read).
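
The CR-stripping step can be sketched as follows. `verify_lockfile` is
a hypothetical wrapper name; the real script's structure may differ,
and the paths in the lockfile are assumed to be relative to the current
directory.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: verify files against a sha256 lockfile that may
# have been checked out with CRLF line endings.
verify_lockfile() {
    local lockfile=$1
    # Strip carriage returns first so filenames carry no trailing \r,
    # then let sha256sum read the cleaned list from stdin.
    tr -d '\r' < "$lockfile" | sha256sum -c --quiet -
}
```

Because the stripping happens on the way in, the same call works whether
the lockfile was checked out as LF or CRLF.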

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(mrbind): add SUFFIX support + wget -nc to match the version PR #6060 uses

Both scripts now match PR #6060's versions byte-for-byte:

  * msys2_download_packages.sh accepts an optional SUFFIX positional
    arg, reading `msys2_package_hashes<SUFFIX>.txt` (default suffix
    `''`). Switches `wget -c` → `wget -nc` so a cache hit skips
    files already on disk without HEAD-request round-trips; the
    install script's sha256 verify is the safety net for partials.

  * msys2_install_packages.sh accepts the same optional SUFFIX arg
    in addition to the CR-strip already added in this PR.

Master's CI doesn't pass a suffix (the `install_deps_windows_msys2.bat`
local-developer flow stays unchanged, suffix `''`). PR #6060 calls
`msys2_download_packages.sh _clang18` to read its alternate
`msys2_package_hashes_clang18.txt` lockfile. Landing this in master
means PR #6060's diff for these two scripts drops to zero on rebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): pin gettext-tools-0.22.5-2 in the lockfile, drop runtime pacman -S

Master's `pacman -S gettext-tools` against its frozen pacman DB
resolved to an older gettext-tools that doesn't depend on
`cc-libs`. The runner's fresh DB resolves to gettext-tools 1.0,
which depends on `cc-libs` (an aggregator not in our lockfile) AND
on `libc++>=22` (which would upgrade our pinned libc++ 18). Earlier
attempts:

  * `--assume-installed libc++=18.1.8-2` — pacman ignored it
    because 18 doesn't satisfy `>=22`.
  * `--ignore mingw-w64-clang-x86_64-libc++` — kept libc++ at 18,
    but pacman then bailed on the `cc-libs` dep with
    `warning: cannot resolve "mingw-w64-clang-x86_64-cc-libs"` and
    silently skipped installing gettext-tools at all, so msgfmt was
    missing and translation `.po → .mo` compilation got silently
    skipped by CMake's I18nHelpers.

Pin `mingw-w64-clang-x86_64-gettext-tools-0.22.5-2-any.pkg.tar.zst`
in the clang18 lockfile. Its three deps are all already satisfied:
gettext-runtime (in our lockfile), libiconv (in our lockfile), and
gcc-libs (provided by our pinned libc++-18.1.8-2 via the `provides=`
relationship). pacman -U the lockfile installs gettext-tools cleanly
alongside everything else — no second pacman invocation, no
upload_artifacts-gated branch, msgfmt is present unconditionally.

Lockfile grows from 48 → 49 packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(pip-build): cache pinned MSYS2 packages

Mirror the cache step from build-test-windows.yml — same key
(lockfile hash) and same path (scripts/mrbind/msys2_packages), so
the two workflows share cache entries. A build-test-windows run
that populates the cache makes the next pip-build run a cache hit,
and vice versa.

Saves ~60 s on the wget step per pip-build run on cache hit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(windows): extract Cache + Install MSYS2 steps into a reusable action

build-test-windows.yml and pip-build.yml had identical two-step blocks
(`Cache pinned MSYS2 packages` + `Install MSYS2 for MRBind`). Move them
into `.github/actions/install-msys2-mrbind/action.yml` as a composite
action and have both workflows reference it via `uses:`.

  * build-test-windows.yml: keeps its `if:` guard on the outer step,
    so the composite runs only when mrbind/c-bindings/c-sharp is
    actually needed.
  * pip-build.yml: always invokes the composite (mrbind is always on
    in pip-build).

Same cache key + path as before, so the two workflows continue to
share cache entries. Net: -29 lines in workflows, +24 lines in the
new action.

(The action's path was master's previous composite-action name,
deleted earlier on this branch when the install moved inline. Now
that we have a reusable abstraction again, the path is reused.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Egor Mikhaylov <egor.mikhaylov@meshinspector.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>