xcif parser now wired to the iotbx.cif.reader by olegsobolev · Pull Request #1149 · cctbx/cctbx_project

olegsobolev · 2026-04-15T23:41:30Z

xcif — a fast C++ CIF parser — is now wired into iotbx.cif.reader with a single global switch in iotbx/cif/__init__.py (the module-level DEFAULT_ENGINE constant) plus a per-call engine= kwarg on the reader. DEFAULT_ENGINE stays "ucif" in this PR; callers who want to opt in today can pass engine="xcif" explicitly, and a one-line follow-up will flip the default after bake-in.

Whole set of tests including Phenix ones are passing with the switch flipped.

Benchmark on a 34 MB mmCIF (4v9d.cif): ~6× speedup end-to-end through iotbx.cif.reader, ~21× for the raw parser alone.

…trtod_l using a static C locale — thread-safe, zero overhead - Debug prints removed (tst_tokenizer.cpp): 3 leftover cout lines gone - Missing tests added (tst_numeric.cpp): as_int at INT_MAX/INT_MIN, overflow above/below both bounds, as_double_with_su with exponent+SU ("1.5e+3(2)" → (1500.0, 0.2)) - Python test added (tst_bindings.py): column_as_flex_int with a . cell raises ValueError - nullptr style (xcif_ext.cpp): two p(0) initializers changed to p(nullptr)

tst_numeric.cpp (3 new tests in §5b): - test_double_long_value_nan — values ≥32 chars hit the buffer limit and return NaN - test_su_missing_close_paren — "1.5(" without closing ) returns (1.5, 0.0) gracefully - test_su_nonnumeric_in_su_content — "1.5(3a5)" accumulates digits, skips 'a' → su=3.5 tst_tokenizer.cpp (3 new tests): - test_unterminated_single_quote_returns_rest — graceful degradation, no crash - test_unterminated_double_quote_returns_rest — same for double quotes - test_null_byte_truncates_unquoted_token — '\0' stops unquoted value at "val" tst_error_handling.cpp (2 new tests in §5): - test_unterminated_quoted_string_graceful — parse succeeds, tag is accessible - test_unclosed_semicolon_field_graceful — parse succeeds, tag is accessible tst_api_compat.py (new exercise_loop_category): - Empty list, single tag (with/without dot), multi-tag dot-category, multi-tag underscore-category edge cases

Root cause: locale_t, newlocale(), and strtod_l() are POSIX extensions — they exist on macOS and Linux but not on Windows/MSVC. MSVC provides equivalent but differently-named APIs: _locale_t, _create_locale(), and _strtod_l(). Fix in numeric.cpp: Replaced the single POSIX-only xcif_c_locale() function with a platform-selected xcif_strtod_c() wrapper using #ifdef _WIN32:

Two related changes, both needed for parsing the cctbx monomer library (mon_lib_list.cif and friends start with a CIF 1.1 `global_` block). - Tokenizer: the exact 7-char token `global_` (case-insensitive) now produces TOKEN_BLOCK_HEADER instead of TOKEN_VALUE. Parser preserves the full `global_` name (no `data_` prefix strip) and backs the Block name_ string_view with static storage so it survives Document copy/move. - parse()/parse_file() take bool strict=true; xcif_ext exposes it as a kwarg. With strict=false, pair items or loops appearing before the first data_ block attach to an implicit `global_` block, matching ucif's non-strict behaviour. Tests: new xcif/regression/cpp/tst_non_strict_mode.cpp and xcif/regression/tst_non_strict_mode.py. tst_error_handling and all other strict-mode assertions unchanged.

example.cif was enriched in df37a0d ("Adjust mmCIF example to conform dictionary") — gaining extra loops and tags — but the tokenize/parse demo tests were left asserting the old counts.

whitespace or EOF CIF 1.1 §2.2.7.3 specifies that the closing delimiter of a quoted string (' for single, " for double) is only treated as a close when followed by whitespace or end-of-input. A delimiter followed by any other character is part of the string content. xcif's tokenizer previously terminated on the first delimiter unconditionally, breaking real-world values like the cctbx monomer library's 'N-METHYL-PYRIDOXAL-5'-PHOSPHATE ' (mon_lib_list.cif:1688) where the inner `'` precedes `-` (non-whitespace). Fix: when encountering the delimiter, peek at the next char and include the delimiter in the string unless that next char is whitespace or EOF. Tests: new xcif/regression/cpp/tst_quoted_string_delimiter.cpp and xcif/regression/tst_quoted_string_delimiter.py cover plain strings, embedded delimiters (single and double quotes), multiple embedded apostrophes, and close-by-tab / close-by-EOL / close-by-EOF.

Add engine="ucif"|"xcif" to iotbx.cif.reader and a module-level DEFAULT_ENGINE (default "ucif"). Callers can opt in per-instance with reader(..., engine="xcif"). When engine="xcif", the reader calls xcif_ext.parse() and walks the resulting Document in Python, driving the existing cif_model_builder via its add_data_block / add_data_item / add_loop / start_save_frame / end_save_frame callbacks. Downstream code still receives mutable iotbx.cif.model.cif objects, so mutation sites (restraint_file_merge, pdb_interpretation, as_cif_block writers) are unaffected. Details: - Walker wraps both ValueError and RuntimeError from xcif_ext.parse as CifParserError. - Heading prefixes ("data_", "save_", literal "global_") are restored before calling the builder. - error_count() and show_errors() tolerate self.parser=None. Test: xcif/regression/tst_iotbx_cif_reader_parity.py parses a corpus (7 inline fixtures + 3 real files) with both engines and asserts structural equality of the resulting iotbx.cif.model. Uses the engine= kwarg directly, so it's safe under parallel test runners.

CIF 1.1 §2.2.7.4 requires the closing delimiter of a semicolon text field to appear as the first character of a line. xcif's tokenizer previously returned a partial TOKEN_VALUE on EOF and let the parse succeed — silently accepting content where the user may have mistaken a mid-line `;` for a terminator. read_semicolon_field now throws CifError when the input ends before a column-1 `;` is found. The existing "graceful" regression (tst_error_handling.cpp::test_unclosed_semicolon_field) is flipped to assert the raise. New regression fixture xcif/regression/cpp/tst_semicolon_field_strict.cpp and xcif/regression/tst_semicolon_field_strict.py cover: EOF mid-field, EOF on the opening-`;` line, a trailing mid-line `;` mistaken for a terminator (the shape that prompted this fix), plus several well-formed positive cases.

… fixes): xcif: align strict-mode enforcement with ucif Four related changes to make xcif behave like ucif when used behind iotbx.cif.reader — required so iotbx.cif tests that exercise syntax errors pass under engine="xcif". - xcif Parser rejects `global_` block headers in strict mode with CifError. `global_` is a STAR/DDL2 reserved word, not part of strict CIF 1.1; ucif treats it as a strict-only rejection. The cctbx monomer library (mon_lib_list.cif and friends) uses `global_` and loads with strict=false. - xcif Tokenizer raises CifError on unterminated single- or double-quoted strings instead of silently returning the partial content (see the grammar in the CIF 1.1 syntax spec at https://www.iucr.org/resources/cif/spec/version1.1/cifsyntax). Two existing regression tests (test_unterminated_{single,double}_quote_returns_rest and test_unterminated_quoted_string_graceful) are flipped to assert the raise. - iotbx/cif/__init__.py: - `_drive_builder_from_xcif` takes a strict= argument and passes it to xcif_ext.parse(strict=). reader(..., strict=False) now actually relaxes xcif, not just ucif. - The walker drops `global_` blocks entirely (matching ucif/cif.g: "global blocks are ignored"), so str(model) of a file with a leading `global_` plus a data_ block shows only the data_ block, same as ucif. - `_drive_builder_from_xcif` now returns a list of error messages rather than raising, so `reader(raise_if_errors=False)` gets a populated error_count() instead of an exception — matching ucif's error-accumulation contract. - Tests in xcif/regression/tst_non_strict_mode (C++ and Python) that previously asserted `global_` works in strict mode are updated to assert the opposite (strict rejects, strict=false accepts), plus a new test for the strict rejection.

Replace the invented "§2.2.7.3 / §2.2.7.4" references with the actual paragraph numbers from the IUCr CIF 1.1 syntax specification (https://www.iucr.org/resources/cif/spec/version1.1/cifsyntax): - Paragraph 15 — quoted strings may embed their delimiter provided the delimiter is not followed by whitespace; the closing `'`/`"` is only recognised when followed by whitespace or end-of-input. - Paragraph 17 — a semicolon text field is delimited by `;` appearing as the first character of a line.

xcif's parse_loop raised "loop value count is not a multiple of tag count" with only a source:line:col prefix. For large CIF files with many similar loops, line number alone is not enough to locate the bad loop — ucif's "Wrong number of data items for loop containing _TAG" names the first tag, which is greppable. Message now includes the first tag of the offending loop plus the actual value/tag counts, e.g.: loop containing tag '_atom_site.group_PDB' has 57 values, not a multiple of its 18 tags Existing regression (tst_error_handling.cpp) tightened from a bare "multiple" substring check to assert the tag name and both counts are present.

Three tests in iotbx/cif/tests/tst_model_builder.py (test_repeated_loop_1, test_repeated_loop_2, test_repeated_tags) and one in iotbx/pdb/tst_pdb.py (exercise_pdb_input_error_handling) asserted the exact error strings emitted by ucif's cif_model_builder at model-build time. xcif rejects the same malformed inputs at parse time with different wording, so these tests break when iotbx.cif.reader runs with engine="xcif". Replace the strict equality checks with explicit both-wording assertions — both messages listed verbatim in an `accepted` tuple, the xcif branch matched with endswith(": " + m) to allow for the "<source>:line:col: " prefix xcif error translators add. Both wordings are preserved in the source so a future reader can see exactly what each parser says, and either parser's wording can change deliberately with a loud test failure rather than silently regressing.

SHELX and several other tools append a Ctrl-Z / SUB byte (0x1A) to .hkl and .cif files they write. CIF 1.1's character set doesn't include it, but ucif silently treats it as end-of-file, so real files parse cleanly. xcif was tokenising it as a stray unquoted value, which pushed the enclosing loop's value count one past a multiple of its tag count and triggered a spurious loop containing tag '_refln_index_h' has N values, not a multiple of its M tags error on otherwise well-formed SHELX-style output. Tokenizer::next() now checks for 0x1A after skip_whitespace_and_comments, consumes the rest of the buffer, and returns TOKEN_EOF. Any content after 0x1A is ignored — matching DOS-era semantics and what ucif does in practice.

- Add "[skip cache]" to the commit message to tell GitHub Actions to not use the build cache in the quick tests jobs - A new cache will be created with the new build

bkpoon

Code Review

Verdict: Ready to merge as opt-in — a few items should be resolved before the follow-up PR that flips DEFAULT_ENGINE to "xcif".

Because DEFAULT_ENGINE = "ucif" is preserved, merging this PR today ships zero behavior change to the ~120 existing iotbx.cif.reader(...) call sites. The items below are mostly scoped at the default-flip, not this merge.

Strengths

Default preserved (iotbx/cif/__init__.py:41); opt-in via engine="xcif" is the right shipping path for a dual-parser change.
Walker docstring on _drive_builder_from_xcif (iotbx/cif/__init__.py:48-81) is load-bearing — explains strict/non-strict semantics, the data_/save_ prefix round-trip to match the builder's [find('_')+1:] logic, and the ordering limitation.
Real parity test at xcif/regression/tst_iotbx_cif_reader_parity.py drives engine="xcif" against ucif on 7 inline fixtures plus file fixtures, comparing block names, pair items, loop tag sets and column content, and save-frame structure.
Non-trivial correctness fixes bundled in:
- CIF 1.1-compliant embedded-delimiter handling (xcif/tokenizer.cpp:118-142) — fixes the N-METHYL-PYRIDOXAL-5'-PHOSPHATE monomer-library case.
- Unterminated-quote / semicolon now throws (previously returned partial values silently).
- global_ recognition for mon_lib files; DOS 0x1A EOF compat for SHELX output.
- Locale fix via strtod_l / _strtod_l behind a C-locale handle (xcif/numeric.cpp:12-27), thread-safe static init.
- Integer parsing via strtol with explicit overflow check (xcif/numeric.cpp:63-80); old hand-rolled loop silently overflowed at INT_MAX. Paired with INT_MAX/INT_MIN tests in tst_numeric.cpp.
- string_view bounds now assertable in debug (xcif/include/xcif/string_view.h).
GIL released during parse/parse_file in xcif/xcif_ext.cpp:83-100; Python walker runs after heavy C++ work with GIL reacquired.

Issues

Critical

None.

Important (should be resolved before flipping the default)

Builder contract fragility with save frames. iotbx/cif/__init__.py:102-103 synthesises the save-frame heading as "save_" + sf.name. This round-trips only because iotbx/cif/builders.py:68-72 strips it back with save_frame_heading[save_frame_heading.find('_')+1:]. Third-party builder objects passed via reader(builder=...) that string-compare the full save_<name> token will silently diverge between engines. At minimum, document the builder contract on the builder= parameter.
Pair/loop source order not preserved by xcif. The walker explicitly emits all pairs before all loops within a block (iotbx/cif/__init__.py:87-94) because xcif::Block stores pairs_ and loops_ in separate vectors. ucif preserves source order. The parity test's _compare_block_like (xcif/regression/tst_iotbx_cif_reader_parity.py:159-178) compares via dict(blk_u._items), so order is discarded and CI will not catch this. block.show() output will change once the default flips, which makes file diffs noisy and can break snapshot tests downstream.
- Fix (minimum): tighten the parity test to assert list-order equivalence. Then either store original order in xcif and honour it in the walker, or flag this as a known behavioral difference in release notes.
raise_if_errors=False semantics diverge. On parse failure, _drive_builder_from_xcif returns before the walker touches the builder at all (iotbx/cif/__init__.py:67-74) — xcif yields an empty self.model(). ucif accumulates errors while partially populating the model. Callers that only check error_count() == 0 are unaffected, but callers trying to salvage a partial model from a damaged file behave radically differently. Should at least be documented in the engine= docstring.
No docstring on reader.engine. The reader class has no class docstring at all; DEFAULT_ENGINE has a one-line comment but doesn't document runtime-override semantics. Nothing in sphinx/iotbx/iotbx.cif.rst is updated either. For a switch that affects ~120 call sites, this is inadequate.
Zero-copy parse_file path is not used. When file_path= is given, reader reads the file into a Python string (iotbx/cif/__init__.py:127-130) then calls xcif_ext.parse. xcif has a parse_file entry point that memory-maps and parses directly. This is plausibly why the headline speedup is 6× through iotbx.cif.reader but 21× for the raw parser on the same 34 MB file. Suggest dispatching to parse_file when file_path is given and engine is xcif; binary-detection only reads the first few KB so it remains cheap.
Engine validation happens after I/O. iotbx/cif/__init__.py:134-140 validates engine only after the file has been read and binary-detected. A typo'd engine forces needless I/O before ValueError. Easy to move the check to the top of __init__.
DEFAULT_ENGINE thread-safety undocumented. Reader reads DEFAULT_ENGINE at construction time (line 137), so a concurrent test mutating it during another test's parse races. Unlikely given Phenix's spawn/fork model but worth noting in the module docstring. Alternatively, freeze it to "ucif" and expose only the per-call kwarg until the flip — making the flip literally a one-line commit as described.

Minor

_VALID_ENGINES = ("ucif", "xcif"): fine for 2 items; a dict mapping engine → walker factory scales better when a third option lands.
Walker catches (ValueError, RuntimeError) at iotbx/cif/__init__.py:68. If a std::exception subclass ever escapes without a custom translator, it would bypass _xcif_errors and the raise_if_errors=False contract. Defensively broadening the except with a class filter is cheap insurance.
xcif error messages prefixed "<source>:line:col: " vs ucif's plain messages. Test updates already accommodate both wordings (iotbx/cif/tests/tst_model_builder.py:34-52, iotbx/pdb/tst_pdb.py:718-728) — worth calling out in release notes for anyone grepping logs for error patterns.
.gitignore adds CLAUDE.md / */CLAUDE.md. Tooling-specific patterns like this are better in ~/.gitignore_global or .git/info/exclude rather than the shared project .gitignore. Trivial nit.
Two-dot diff shows mmtbx/ligands/rdkit_utils.py and mmtbx/validation/validate_ligands.py as deletions; this is a rebase artifact (branch is behind master by 3077f4bc11 and f37c1b9f65). Not a real issue, but rebasing onto master would clean the diff view.
tags = list(xloop.tags) at walker — minor allocation, not worth fixing.
xcif/regression/example.cif was modified to add _chem_comp / _entity / _atom_type stubs with matching tst_example_tokenize.py golden-count updates. Verify no other cctbx modules reference this fixture.

Recommendations

Keep the merge two-step. Ship this PR as opt-in, then flip the default in a separate one-line PR with a bake-in window. Do not squash — makes any regression cleanly attributable to the flip and trivially revertible.
Add a CI matrix job that runs the cctbx test suite under DEFAULT_ENGINE="xcif". The "whole test suite passes" claim is currently author-local and will decay with each new PR that lands. An env-var-honoring path in iotbx/cif/__init__.py is enough.
Strengthen the parity test to fail on source-order mismatches (see item 2).
Document semantic differences in the engine= docstring: partial-model-on-error, error-wording prefixes, pair/loop source order, file-read path performance characteristics.
Consider extracting the walker to iotbx/cif/_xcif_walker.py. It's ~40 lines of non-trivial Python glue in a module that already carries a lot.

Assessment

Merging this PR today is low-risk because the default path is untouched. Items 1–4 and the CI-matrix recommendation are the blocking work for the subsequent default-flip PR; 5 is a sizeable perf win left on the table; 6–7 are small-effort hardening.

Review assisted by Claude Code.

Block now records a (kind, index) log of entries in the order they appeared in the source. The iotbx.cif walker iterates this log so that str(model) output through engine="xcif" matches the ucif engine when a block interleaves pair items, loops, and save frames. The parity regression fixture now interleaves pairs and loops and asserts list-order equality of block._set (the OrderedSet that drives block.show() / str(model) serialization), catching pair/loop interleave mismatches that a separate _items vs .loops comparison would miss.

Move the engine check to the top of reader.__init__. A typo'd engine name now raises ValueError immediately instead of after smart_open has already read the file (or raised IOError on a missing path). Adds tst_iotbx_cif_reader_engine.py to xcif/regression with two exercises: engine validation short-circuits before any file open, and both engines parse a trivial fixture.

When the reader is given a plain (uncompressed) file_path and engine="xcif", dispatch to xcif_ext.parse_file (memory-mapped in C++) instead of reading the file into a Python string and calling xcif_ext.parse. Avoids a whole-file str() allocation on the hot path. Binary detection runs via detect_binary_file.from_initial_block on the first 1000 bytes, preserving the "Binary file detected" abort. Compressed files (.gz/.Z/.bz2) and file_object= callers continue to use the existing read-into-string path; parse_file cannot mmap a compressed stream or an open file object. Walker logic factored out of _drive_builder_from_xcif into a shared _walk_xcif_doc helper; both parse/parse_file wrappers reuse it. Tests verify file_path + xcif invokes parse_file (not parse) and that the resulting model matches the input_string path.

Adds a class docstring to reader covering: - what ucif and xcif are, and why ucif is the default - engine= behavioral divergences: error-message prefixes, partial-model-on-error (ucif multi-error + partial state, xcif single-error + empty model), pair/loop source-order guarantee, and the xcif file_path fast-path / compression fallback - the builder= contract re: the "data_" / "save_" prefix — third-party builders comparing full heading tokens must handle the prefix, since the default cif_model_builder strips it - raise_if_errors / strict semantics Expands the DEFAULT_ENGINE module-level comment to match. No behavior change.

olegsobolev · 2026-04-17T15:22:02Z

Thanks for the thorough review — I've addressed all items 1–7 plus a few of the minor ones. Summary, with commit references:

Item 2 (source order) — Fixed in bf2e49e. Block now records an ordered (kind, index) log; the iotbx.cif walker iterates it. Parity test tightened to compare block._set as a list, so a regression is caught in CI.
Item 5 (zero-copy parse_file) — Fixed in 2a8df9d. engine="xcif" with a plain uncompressed file_path now dispatches to xcif_ext.parse_file (mmap). Compressed files (.gz/.Z/.bz2) and file_object= callers keep the read-into-string path. Binary detection runs via detect_binary_file.from_initial_block on the first 1000 bytes.
Item 6 (engine validation) — Fixed in b1528af. Validation moved to the top of reader.__init__; a typo'd engine raises ValueError before any file I/O.
Items 1, 4, 7, 10 (docstrings & semantics) — Done in fbe030e. Class docstring on reader covers the ucif/xcif divergences (error-message prefix, partial-model-on-error, source order, fast-path). The builder= doc pins the data_/save_ prefix contract. Module comment clarifies DEFAULT_ENGINE is read at construction time.
Item 12 (rebase artifact) — Rebased onto current master; the mmtbx deletion noise is gone.

On item 3 (partial-model-on-error) — I went with "document loudly" rather than unify. The unification route needs a parallel non-throwing C++ API (parse_lenient/parse_file_lenient plus Document::parse_errors_) and would be ~100 lines of new public surface. No current cctbx caller uses raise_if_errors=False to salvage partial state — failures are either raised or error-count-checked and surfaced, not mined. The divergence is now called out explicitly in the engine= docstring so anyone hitting it knows why. Happy to add the lenient variants in a follow-up if you disagree.

Deferred (happy to take them in follow-ups or here if you prefer):

Items 8, 9, 11, 14 — all minor.
CI matrix job running the suite under DEFAULT_ENGINE="xcif". Best done as its own PR since it touches .github/ infra rather than the library.

Ready for another pass.
Edit: Going to merge as it is now.

bkpoon

Follow-up Review (delta)

Verified the four new commits against the Important items from my previous review. Scope: only what changed since f5099ea.

Item-by-item verification

Item 1 — builder contract re: data_/save_ prefix (fbe030ed): ✅ Docstring under builder= now explicitly states that headings are passed with the prefix and that the default strips-before-first-underscore.
Item 2 — pair/loop source order (bf2e49ee): ✅ xcif::Block::source_order_ records insertion order; walker iterates it. The parity test fixture now interleaves pair-loop-pair-loop-pair and _compare_block_like asserts list(blk_u._set) == list(blk_x._set) before value comparison — the pre-fix bug would fail the new test.
Item 3 — raise_if_errors=False semantics (doc-only decision): ✅ with one caveat worth noting. The docstring explicitly directs salvage-semantics callers to engine="ucif", which makes the decision defensible. One in-tree caller actually does mine post-parse state: mmtbx/chemical_components/cif_parser.py:259 iterates cm.items() after cif.reader(filename, strict=False, raise_if_errors=False).model(). Unaffected as long as DEFAULT_ENGINE="ucif"; worth remembering when the default flips so that this caller is either (a) migrated to a lenient-xcif variant, or (b) pinned to engine="ucif". The only two other raise_if_errors=False callers in the repo are a unit test and cctbx/web/iotbx_cif_validate.py:28, so the exposure is narrow.
Item 4 — reader docstring (fbe030ed): ✅ Class now carries a full docstring covering all kwargs + engine divergences.
Item 5 — zero-copy parse_file (2a8df9da): ✅ Fast path gated on engine=="xcif" AND file_object is None AND uncompressed plain path, with binary detection preserved. Walker factored into shared _walk_xcif_doc so both parse and parse_file paths funnel through identical source-order / error-translation logic. Tests monkey-patch xcif_ext.parse / xcif_ext.parse_file to verify dispatch.
Item 6 — validate engine= before I/O (b1528af0): ✅ Check moved to right after the file-source assertion. New regression (tst_iotbx_cif_reader_engine.py::exercise_engine_validated_before_io) uses a guaranteed-nonexistent path to prove the check short-circuits before smart_open.
Item 7 — DEFAULT_ENGINE thread-safety: ⚠️ Read-at-construction is now documented in the module comment; an explicit "mutating concurrently is a data race" caveat is not. Low risk in practice (GIL + Phenix's fork-based multiprocessing); acceptable as is.

Minor observations (non-blocking)

Fast-path error messages for missing files now come from xcif rather than smart_open's Sorry("Cannot find..."). Callers grepping logs for the Sorry wording would see a different string under engine="xcif". Not covered by the docstring's error-prefix note.
Per-block list allocations in the walker (pair_tags, pair_values, loops, save_frames) — O(items-per-block), dwarfed by the parse_file win.

Verdict

Ready to merge. All Important items are addressed in substance; item 7 is partially addressed but low-risk for an opt-in engine. The new tests would catch regressions of items 2, 5, and 6. No new bugs introduced by the four follow-up commits.

The mmtbx/chemical_components/cif_parser.py caller noted above and a CI matrix job running the suite under DEFAULT_ENGINE="xcif" are the natural follow-up items alongside flipping the default.

Delta review assisted by Claude Code.

bkpoon · 2026-04-17T21:26:25Z

LGTM. One other comment is that when the switch is made to xcif, does the run function in mmtbx/chemical_components.py need any updates?

olegsobolev and others added 21 commits April 15, 2026 12:49

Ignore claude.md

611321f

Integer overflow fix

5634520

guard against empty view

b5f666a

fix cctbx#3

ba8c593

fix cctbx#4

8dedc1f

Adjust mmCIF example to conform dictionary

e655dac

xcif: update example.cif golden counts

661a7a0

example.cif was enriched in df37a0d ("Adjust mmCIF example to conform dictionary") — gaining extra loops and tags — but the tokenize/parse demo tests were left asserting the old counts.

Clean clutter

bc6edc6

CI: add [skip cache] check to quick tests

1e4eb5b

- Add "[skip cache]" to the commit message to tell GitHub Actions to not use the build cache in the quick tests jobs - A new cache will be created with the new build

bkpoon reviewed Apr 16, 2026

View reviewed changes

olegsobolev force-pushed the cif-reader branch from f5099ea to bfe2db0 Compare April 16, 2026 23:06

olegsobolev added 3 commits April 16, 2026 16:11

olegsobolev force-pushed the cif-reader branch from bfe2db0 to fbe030e Compare April 16, 2026 23:12

bkpoon reviewed Apr 17, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

xcif parser now wired to the iotbx.cif.reader #1149

xcif parser now wired to the iotbx.cif.reader #1149
olegsobolev wants to merge 25 commits intocctbx:masterfrom
olegsobolev:cif-reader

olegsobolev commented Apr 15, 2026

Uh oh!

bkpoon left a comment

Uh oh!

olegsobolev commented Apr 17, 2026 •

edited

Loading

Uh oh!

bkpoon left a comment

Uh oh!

bkpoon commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

olegsobolev commented Apr 15, 2026

Uh oh!

bkpoon left a comment

Choose a reason for hiding this comment

Code Review

Strengths

Issues

Critical

Important (should be resolved before flipping the default)

Minor

Recommendations

Assessment

Uh oh!

olegsobolev commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bkpoon left a comment

Choose a reason for hiding this comment

Follow-up Review (delta)

Item-by-item verification

Minor observations (non-blocking)

Verdict

Uh oh!

bkpoon commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

olegsobolev commented Apr 17, 2026 •

edited

Loading