Skip to content

Add fact parser validation, tests, and bulk fact loading with tests#100

Merged
ColtonPayne merged 29 commits intomainfrom
input-validation
Feb 10, 2026
Merged

Add fact parser validation, tests, and bulk fact loading with tests#100
ColtonPayne merged 29 commits intomainfrom
input-validation

Conversation

@ColtonPayne
Copy link
Copy Markdown
Collaborator

@ColtonPayne ColtonPayne commented Jan 21, 2026

Summary

This PR adds comprehensive input validation to the fact parser and implements bulk fact loading from CSV files with extensive test coverage.

Fact Parser Validation (fact_parser.py) (Issue #91 )

Bulk Fact Loading (pyreason.py)

  • Implemented add_fact_in_bulk() function to load facts from CSV files
  • Supports optional header row detection
  • Handles optional columns: name, start_time, end_time, static
  • Provides warnings for invalid data (malformed facts, invalid times, etc.) without crashing
  • Supports multiple boolean formats for static field (True/true/1/yes, False/false/0/no)

Test Coverage

  • 333 new lines in test_fact_parser.py:
    • Tests for valid fact parsing (node/edge facts, intervals, negation, etc.)
    • Tests for invalid inputs (missing parentheses, empty fields, invalid characters, etc.)
    • Edge cases and boundary conditions
  • 204 new lines in test_pyreason_file_loading.py:
    • Tests for bulk fact loading from CSV
    • Warning validation for invalid facts
    • Tests with/without headers, various static value formats
    • Error handling tests

Test Data

  • Created example_facts.csv with comprehensive test scenarios including both valid and invalid facts
  • Created example_facts_no_header.csv for testing CSV without headers

Implementation Notes

  • All validation errors raise ValueError with descriptive messages
  • Bulk loading continues processing even when individual rows fail (with warnings)
  • Predicate validation regex: ^[a-zA-Z_][a-zA-Z0-9_]*$ (follows Python identifier rules)

🤖 Generated with Claude Code

@ColtonPayne ColtonPayne added the AI PR contains AI Generated Code label Jan 21, 2026
@dyumanaditya dyumanaditya self-requested a review January 29, 2026 17:39
Copy link
Copy Markdown
Contributor

@dyumanaditya dyumanaditya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ColtonPayne Everything looks good except for my one comment on explicit bound inverses.

Comment thread pyreason/scripts/facts/fact.py Outdated
- `'pred@name(node)'` - invalid characters in predicate
- `'pred(node1,node2,node3)'` - more than 2 components
- `'pred(node):[1.5,2.0]'` - values out of range [0,1]
- `'~pred(node):[0.2,0.8]'` - negation with explicit bound
Copy link
Copy Markdown
Contributor

@dyumanaditya dyumanaditya Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

negation with explicit bound This should actually be allowed. The negation of an explicit bound is defined and has an explicit formula. Currently it is not supported but it needs to be.

The formula for the inverse of a bound [l, u] is: ~[l, u] = [1-u, 1-l]

Comment thread pyreason/pyreason.py
Comment thread pyreason/pyreason.py Outdated
Comment thread pyreason/pyreason.py Outdated
Comment thread tests/api_tests/test_files/example_facts.json
Copy link
Copy Markdown
Member

@kmukherji kmukherji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments are inline.

Comment thread pyreason/pyreason.py Outdated
Comment thread pyreason/scripts/utils/fact_parser.py
@ColtonPayne ColtonPayne added the Ready for Review Awaiting PR Review label Feb 4, 2026
ColtonPayne added a commit that referenced this pull request Feb 6, 2026
Implements add_rule_from_csv() and add_rule_from_json() for bulk rule
loading, following the same pattern as bulk fact loading from PR #100.
Updates add_rules_from_file() with raise_errors parameter for
backwards-compatible error handling. Adds comprehensive test coverage.

Resolves #117

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@kmukherji kmukherji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once these comments are addressed we can merge this branch.

Comment thread pyreason/scripts/facts/fact.py
Comment thread pyreason/scripts/utils/fact_parser.py
Comment thread pyreason/scripts/utils/fact_parser.py
Comment thread pyreason/pyreason.py Outdated
if raise_errors:
raise ValueError(f"{item_label} {idx}: Invalid end_time '{end_time_raw}'")
warnings.warn(f"{item_label} {idx}: Invalid end_time '{end_time_raw}', using default value")
end_time = 0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can resolve end_time errors by assigning end_time = start_time.
What do you think?

Copy link
Copy Markdown
Collaborator Author

@ColtonPayne ColtonPayne Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are currently handling invalid values for start_time and end_time by setting them to zero if the user passes a non-integer value. This is the default value provided by the Fact constructor.

 # Parse start_time
    try:
        start_time = int(start_time_raw) if start_time_raw is not None and str(start_time_raw).strip() else 0

If raise_errors is false, we handle this gracefully. If raise_errors is true, we will not accept an invalid json/csv and force the user to correct the bad input.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I am talking about the specific case where start_time is a valid non-zero integer. But end_time has an error. Should we give a warning and set it to start_time?

Comment thread pyreason/pyreason.py
Comment thread pyreason/pyreason.py
Comment thread pyreason/pyreason.py
@kmukherji kmukherji added Changes requested and removed Ready for Review Awaiting PR Review labels Feb 10, 2026
@ColtonPayne ColtonPayne merged commit 7594583 into main Feb 10, 2026
3 checks passed
@ColtonPayne ColtonPayne deleted the input-validation branch March 18, 2026 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI PR contains AI Generated Code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants