Skip to content

Accumulated changes from Jules#315

Merged
jrfaeder merged 221 commits into
RuleWorld:masterfrom
akutuva21:master
Jun 2, 2026
Merged

Accumulated changes from Jules#315
jrfaeder merged 221 commits into
RuleWorld:masterfrom
akutuva21:master

Conversation

@akutuva21
Copy link
Copy Markdown
Member

Accumulated changes from Jules

This branch carries 82 commits ahead of upstream bionetgen:master, covering security hardening, correctness fixes, performance work, one new feature, expanded test coverage, and substantial dead-code cleanup. Net tree is clean — intermediate junk files added across the series are removed by the final two commits.

Security

  • Command-injection fix in BNGOutput.pm — eliminates unsafe shell call to dot.
  • Unsafe shell execution of dot in ContactMap replaced with a safe invocation.
  • Hardened system() calls in BNGModel.pm's SBML translator path: uses indirect-object form system { $cmd[0] } @cmd to bypass shell interpretation on all platforms, and checks $? for -1 (exec failure), & 127 (signal death), and >> 8 (non-zero exit).
  • Removed dynamic eval-of-string code execution.

Correctness fixes

  • SpeciesGraph::isomorphicTo corrected to compare graphs properly.
  • Per-component handling of add-bond / delete-bond operations.
  • Compartment handling: bond validation in species compartment assignment, component-level compartment verification, refactor of inferSpeciesCompartment, and a requirement that all compartments be specified.
  • SBML-Multi species-type component indexing fix; refactor of SBMLMultiAux.pm to build the mapF target hash correctly.
  • Edge canonicalization in RxnRule (using product instead of reactant), canonical form enforced for edges added to new molecules.
  • Removed redundant mol_index increment in RxnRule.
  • MultScale logic corrected for child rules in RefineRule.pm.
  • Pre-compilation of local functions restored.
  • Removed zeroing of observable weights in groups parsing.
  • Verification that actE is independent of deltaG.
  • Graceful TFUN evaluation outside NFsim contexts.
  • Species output filtered to fully-specified bonds and states.
  • Deprecated the Fixed reactant feature in RxnRule.

Performance ("Bolt" series)

  • Replaced copy.deepcopy with an explicit .copy() method on Trace (~17× faster in the TraceStack expansion hot loop) and across the Python AST parsers.
  • Wildcard-resolution optimizations in collapsedContactMap, extractAtomic.py, and ContactMap.solveWildcards.
  • Optimized component lookups and eliminated an unnecessary method call.
  • f-string conversions in Rule string formatting and several other hot paths.

New feature

  • Raw sensitivity coefficient output (.csc and .gsc) for the LinearParameterSensitivity action, implemented in both the Perl driver (BNGAction.pm) and the C++ engine (ActionDispatch.cpp):
    • Reads base and bumped simulation data, computes sensitivities, and transposes the matrices so species index runs down the columns.
    • Handles .cdat correctly when the file lacks a # header (C++ engine output), recovering the t=0 row that was previously dropped.
    • Fixes a std::to_string bug where atol/rtol formatted as "0.000000" and crashed CVODE — now uses std::scientific.
  • New mfile, mexfile, and mdl writers in BNGModel.pm.

Test additions

  • AST Model class, evaluateExpressionString (also surfaced and fixed several string-parsing bugs), ActionDispatch utilities, edge cases in solveWildcards, openGML in readGML.py, bngl2xml timeout error path, parseComponent in readBNGXML.py, bipartite error-handling path, getMaps in bpgMaps.py, parser correctness, and the C++ NetWriter (which also got a small API refactor).

Refactors and cleanup

  • Simplified newNumOrVar, removed unused local-variable check logic.
  • Removed dead and commented-out code across Rxn.pm, Expression.pm, SpeciesGraph.pm, rateHill.cpp, network.cpp, run_network.cpp, Param.pm, and RxnRule.pm.
  • Removed several outdated or out-of-scope TODOs (expression simplifier, FunctionProduct parsing, code-health item in RxnRule.pm, vague comment in Param.pm, etc.).
  • Cleaned up commented-out bounds checks and obsolete variables in rateHill.cpp.
  • .gitignore updated to exclude the .jules/ agent-log directory.

Pre-merge notes

A few things to be aware of before merging — none affect the final tree, but they will be visible in the commit list:

Duplicate commits from rebase / cherry-pick artifacts. Several logical changes appear multiple times in the history:

  • BNGModel.pm system() hardening — four copies (the second through fourth also remove validation XML outputs that were accidentally committed earlier).
  • Per-component add-bond / delete-bond fix — four copies.
  • Component compartment verification — five copies.
  • "Filter species output to fully specified bonds and states" — five copies.
  • SBML-Multi component indexing fix — two copies.
  • network.cpp dead-code removal — two copies.
  • f-string conversions — two copies.

Already-merged work. Many of these commits originated as individual upstream pull requests that may already be in master. A clean rebase against current upstream will drop anything already merged and shrink the branch substantially.

Recommendation. Either interactive-rebase to collapse the duplicates and drop the .jules/ artifacts, or squash-merge so upstream history doesn't carry the intermediate noise.

akutuva21 and others added 30 commits May 8, 2026 21:12
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
…on parsers (#199)

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
- Extracted stream output logic in `NetWriter::write` to a new `std::ostream` overload.
- Added test file `tests/test_net_writer.cpp` to programmatically build an `ast::Model` and `GeneratedNetwork` and assert the `.net` formatted output.
- Linked new test via `Catch2` in `tests/CMakeLists.txt`.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Removed an empty, un-actionable TODO block from `bng2/Perl2/Console.pm` that asked about inferrable items that were either already implemented or too ambiguous to act upon.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
The `toString` method in `bng2/Perl2/Param.pm` returns an empty string when
`$param->Expr` is undefined. This is the intended behavior and is consistent
with other serialization methods in the class (like `toXML`, `toCVodeString`,
etc.). The `# TODO: ??` comment was unnecessary and confusing, so it has been
removed. No logical changes were made.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Removed a stale `# TODO: missing increment here` comment from `bng2/Perl2/RxnRule.pm`. The increment `++$mol_index;` is already present on line 985 and perfectly matches the equivalent logic for the `$p == 0` case.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
- Removed outdated `# TODO: handling for type "FunctionProduct"` at `bng2/Perl2/RateLaw.pm` line 375, since `FunctionProduct` is already properly processed upstream in the string parser and fully implemented in `evaluate_local`.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
The `convertToDouble`, `convertToInt`, and the two `toString` methods in `bng2/Network3/src/util/conversion.cpp` all contained trailing commented-out code (`// throw MoMMA::HiveException(...)`).

These have been removed to improve code readability and health.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Replaced O(N^2) nested list comprehensions inside the outer loop of `solveWildcards`
with an inverted index `molecule_to_atomics` map. This flattens the time
complexity to O(N) by precomputing matches.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
💡 What: Replaced nested loops and redundant list comprehensions in `solveWildcards` with a reverse index (`name_to_atomics`) and direct lookups.
🎯 Why: The original nested loop implementation had an O(W * A * M) complexity, re-evaluating inner comprehensions on every iteration. This optimization reduces the complexity to O(W + A * M), avoiding unnecessary operations.
📊 Impact: Significant reduction in time complexity during wildcard extraction.
🔬 Measurement: Benchmarked against simulated `atomicArray` datasets. The optimized function executes in ~0.012s compared to ~1.00s for the original implementation on a test set (1000 wildcards, 2000 atomics), yielding an ~80x speedup.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
)

Replaced an O(N*M) nested loop pattern with an inverted index mapping molecule names to their corresponding atomics. This reduces wildcard resolution time complexity to O(N+M).

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
When `TFUN` or `tfun_placeholder` were evaluated in the Perl engine outside of the supported NFsim simulator, they previously crashed the process with an `exit 1;` call. This caused simulation failures when utilizing ODE or SSA engines on models containing these functions.

This change modifies both functions to log a standard warning message and gracefully return a fallback value (the first argument, mirroring the existing C++ engine's stub behavior), allowing the simulation to continue safely.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
…213)

Added validation in RateLaw.pm to ensure that the activation energy
parameter of an Arrhenius rate law does not depend on an energy pattern
parameter (deltaG). Refactored the conditional block to concisely
combine the parameter lookups.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
#220)

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Removes the temporary zeroing of `@{$obs->Weights}` when parsing `groups` blocks in `bng2/Perl2/BNGModel.pm`. Zeroing incorrectly wiped dynamically accumulated weights and is unnecessary since the arrays safely auto-extend, leaving skipped indices as `undef`. Downstream logic expects and safely handles `undef`.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
- Removed TODO related to MultScale handling
- Scaled RateLaw->Factor for the child rule by the ratio of the parent's MultScale to the child's MultScale.
- Added a safeguard to prevent division by zero by verifying `$child_rule->MultScale != 0`.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
This commit addresses the TODO in `ReactionAction.java` to perform bond addition and deletion logic on a per-component basis instead of defaulting to molecular-level checks that incorrectly process bond changes when multiple bonds are modified.
We've added an `extractBonds` helper to accurately extract individual bond identifiers from the component strings (stripping states, compartments, and wildcards) and then comparing reactant and product component bonds to deduce actual `AddBond` and `DeleteBond` actions.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
… in RefineRule (#216)

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
…#215)

* feat: implement raw sensitivity coefficient output for linearparametersensitivity action

- Implemented `.csc` and `.gsc` generation in the C++ engine (`src/actions/ActionDispatch.cpp`) by reading base and bumped simulation data, calculating sensitivities, and transposing the matrices (with species index going down the columns) as per instructions.
- Handled `.cdat` and `.gdat` parsing robustly, accounting for the fact that `.cdat` lacks a `#` header in C++ outputs.
- Fixed a bug where `std::to_string` evaluated `atol` and `rtol` to `"0.000000"`, causing `CVODE` failures, by formatting with `std::scientific`.
- Addressed a bug in the Perl engine (`bng2/Perl2/BNGAction.pm`) where `.cdat` lack of header was skipped as header instead of data, dropping `t=0`.

Co-authored-by: akutuva21 <44119804+akutuva21@users.noreply.github.com>

* Remove temp files from PR branch

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
…teFile (#219)

The generic `writeFile` subroutine in `BNGModel.pm` was missing support
for exporting MATLAB (mfile), MEX (mexfile), and CellBlender/MCell (mdl)
formats. The `TODO` comment requesting additional formats has been addressed
by adding these missing options to `%allowed_formats` and setting up their
corresponding file extensions and `elsif` delegation branches.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
* Fix SpeciesGraph.pm to only include fully specified bonds and states

This filters the `<multi:listOfSpeciesTypeComponentIndexes>` output to only
include molecules that have a defined component state or an explicit edge/bond,
reducing unnecessary noise in the SBML-Multi representation as identified in
the TODO comment.

Co-authored-by: akutuva21 <44119804+akutuva21@users.noreply.github.com>

* Remove temp/patch files from PR branch

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
…ns (#212)

This commit resolves a long-standing TODO in `SpeciesGraph::isomorphicTo` regarding pattern comparison. Previously, `Component::compare_local` and `SpeciesGraph::cmp_component` evaluated bond equivalency simply by comparing the total size of the `Edges` array. This inadvertently allowed components with differing wildcard bonds (e.g., `+` vs `?`) or components with wildcard bonds vs explicit bonds to be considered matched by `isomorphicTo`, breaking graph isomorphism symmetry.

The fix introduces explicit iteration and distinct counting for explicit bonds and each wildcard type (`+`, `?`, `*`) during component comparison and component sorting. This ensures that `isomorphicTo` correctly rejects patterns with unmatched wildcards or mismatched explicit-to-wildcard bonds. Validation suite confirms no regressions for existing test baselines.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
* Fix obsolete TODO comment and redundant mol_index increment in RxnRule.pm

Co-authored-by: akutuva21 <44119804+akutuva21@users.noreply.github.com>

* Restore accidentally deleted validation test files

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
This commit updates `inferSpeciesCompartment` in `bng2/Perl2/SpeciesGraph.pm` to loop through and gather compartment assignments at the component level in addition to the molecule level. This ensures that component compartments are correctly verified during species inference.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
This patch resolves the TODO in `inferSpeciesCompartment` by adding a check that ensures partial compartment specification is caught as an error. If some molecules in a `SpeciesGraph` have explicitly specified compartments, and others do not (and there is no fallback `@C:` compartment defined for the entire graph), it will return an error indicating which molecule is missing a compartment. This prevents unassigned molecules from silently being forced into inferred compartments when they should explicitly declare one.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
akutuva21 and others added 29 commits May 31, 2026 11:13
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Removed the comment "Bug 2 fix: Also check for user-defined function references OUTSIDE the rateExpr block" in src/engine/OdeIntegrator.cpp. This comment was acting as a non-actionable marker for a past bug fix rather than an actual TODO item, and has been removed to clean up the codebase.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Removed the comment `// Detect Sat/MM/Hill rate law types` from `src/engine/OdeIntegrator.cpp` as it was a marker for a past bug fix and not an actionable TODO.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
…2x speedup

Replaces legacy dict([(k, v) ...]) and dict([[k, v] ...]) patterns with native {k: v ...} dictionary comprehensions across parsers/BipartiteGraph/bpgMaps.py, bpgModel.py, and bipartiteGraph.py. This eliminates unnecessary intermediate list allocation and function call overhead, yielding a >2x performance improvement during dictionary construction.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Adds missing logical and unary operators (`~=`, `!`, `~`) to the MathML
string conversion hash (`%ophash`) in Expression.pm, addressing old
TODO comments and preventing uninitialized value warnings during formatting.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
This commit adds a test suite section to check that exceptions thrown by
`std::stod` inside `evaluateExpressionString` are properly caught and
cause the parser to fall back correctly to parameter resolution or
erroring. This increases test coverage for these edge cases.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Removed the outdated comment `// Check for user-defined function references`
from `src/engine/OdeIntegrator.cpp` as it was a historical marker for a
past bug fix, rather than an actionable TODO item.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
* 🧹 [code health] Re-enable Automorphism count in SpeciesGraph toXML method

Re-enabled the serialization of `Automorphisms` attribute in `SpeciesGraph::toXML` method. The code was previously commented out. Re-enabling it makes the XML serialization complete for `SpeciesGraph` objects that have `Automorphisms` computed and avoids having dead commented-out code in the codebase, thus improving maintainability. Also updated the test snapshots to match the new output.

Co-authored-by: akutuva21 <44119804+akutuva21@users.noreply.github.com>

* 🧹 [code health] Re-enable Automorphism count in SpeciesGraph toXML method

Re-enabled the serialization of `Automorphisms` attribute in `SpeciesGraph::toXML` method. The code was previously commented out. Re-enabling it makes the XML serialization complete for `SpeciesGraph` objects that have `Automorphisms` computed and avoids having dead commented-out code in the codebase, thus improving maintainability. Also updated the test snapshots to match the new output.

Co-authored-by: akutuva21 <44119804+akutuva21@users.noreply.github.com>

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Added a new Catch2 test file (`tests/test_pattern_graph.cpp`) to verify core operations, including node/edge addition/removal, splicing, reset indices, canonical checking, and embedding overlaps for the `PatternGraph` graph container. Updated CMakeLists.txt to register the new test suite.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Added Catch2 tests covering basic functionality, empty components,
multiple components with and without allowed states, and the
population flag behavior for the fundamental AST node MoleculeType.
Updated CMakeLists.txt to integrate the test into the build suite.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
This PR optimizes pattern parsing in the C++ export writers (`CppExportWriter.cpp`, `MatlabWriter.cpp`, `MexWriter.cpp`, `PythonExportWriter.cpp`, `RegulatoryGraphWriter.cpp`, `SbmlWriter.cpp`).

It introduces a `parsedObservableCache` (a `std::unordered_map`) to avoid re-parsing the exact same BNGL string patterns repeatedly for every species evaluated in the network export generation loops, resulting in a dramatic reduction of the O(N*M) redundant parsing overhead.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
… type

- Introduce an `rperr` flag in `network.cpp` to correctly validate the number of reactants and products based on the rate law type.
- Ensures Michaelis-Menten and Hill rate laws have exactly two reactants and two products.
- Checks that the Saturation rate law correctly has at least 1 reactant if there's a Km parameter (multiple tokens).
- Outputs an appropriate error message and increments the error counter if conditions are violated.

Co-authored-by: akutuva21 <44119804+akutuva21@users.noreply.github.com>
…s-11552922770666863453

Fix: Validate number of reactants and products for rate law types
@jrfaeder jrfaeder merged commit 2abaf75 into RuleWorld:master Jun 2, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants