Commit b14ab0a

harmjeff and claude committed
feat(traceability): add aidlc-traceability matrix tool
Introduces the traceability matrix tool under scripts/aidlc-traceability/, including source, tests, docs, and requirements. Verified clean across Semgrep, Bandit, Checkov, gitleaks, Grype, and CodeQL (174 queries).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
1 parent bc75a08 commit b14ab0a

48 files changed

Lines changed: 9169 additions & 0 deletions

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

scripts/aidlc-traceability/LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 AIDLC Traceability Tool Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
<!--
SPDX-License-Identifier: MIT
Copyright (c) 2026 AIDLC Traceability Tool Contributors
-->

# AIDLC Traceability Matrix Tool

A Python CLI tool that generates comprehensive traceability matrices from AI-DLC (AI-Driven Development Life Cycle) project artifacts. It analyzes requirements, user stories, implementation units, components, and source code to produce detailed traceability reports.

## Features

- **Multi-Stage Pipeline Architecture**: 6-stage process from artifact discovery to report generation
- **AI-Powered Relationship Discovery**: Optional multi-agent system using Amazon Bedrock for semantic analysis
- **Smart Boilerplate Detection**: Language-independent detection of non-functional code (init files, test infrastructure, auto-generated code)
- **Multiple Output Formats**: Generate markdown and HTML reports with dark mode and interactive features
- **Gap Analysis**: Automatically detect orphaned artifacts and incomplete traces
- **Coverage Metrics**: Calculate traceability coverage across all artifact types

## What It Does

The tool analyzes AI-DLC project artifacts and produces traceability matrices showing:

- Which requirements map to which user stories
- Which stories are implemented by which units
- Which units correspond to which design components
- Which components are realized in which source files
- Coverage gaps and orphaned artifacts
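
The chain above (requirement to story to unit to component to source file) can be pictured as a set of directed links. The following is a minimal sketch of that idea, not the tool's own data model: the `Link` class, the `trace` helper, and the artifact IDs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One directed trace link between two artifacts (hypothetical model)."""
    source_id: str
    target_id: str


def trace(start_id, links):
    """Follow links downstream from start_id and return every reachable artifact ID."""
    downstream = {}
    for link in links:
        downstream.setdefault(link.source_id, []).append(link.target_id)
    seen, stack = [], [start_id]
    while stack:
        current = stack.pop()
        for target in downstream.get(current, []):
            if target not in seen:
                seen.append(target)
                stack.append(target)
    return seen


# Illustrative IDs only; real projects define their own.
links = [
    Link("REQ-1", "US-1"),
    Link("US-1", "UNIT-1"),
    Link("UNIT-1", "COMP-1"),
    Link("COMP-1", "src/app.py"),
]
print(trace("REQ-1", links))  # ['US-1', 'UNIT-1', 'COMP-1', 'src/app.py']
```

An artifact whose trace comes back empty, and which no other artifact links to, is the kind of item a gap analysis would report as orphaned.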

### Artifact Types Supported

- **Requirements**: Business requirements from `requirements.md`
- **User Stories**: From `stories.md`
- **Implementation Units**: From `units-breakdown.md`
- **Design Components**: From `application-components.md`
- **Code Plans**: From `code-plan.md`
- **Source Code**: Actual implementation files
- **Tests**: Test files (tracked separately)

## Installation

```bash
# Clone the repository
git clone <repository-url>
cd AIDLC-Traceability

# Install in development mode
uv sync
```

**Requirements**: Python 3.12 or higher

## Quick Start

### Basic Usage

```bash
# Generate traceability matrix with AI analysis (requires Amazon Bedrock access)
traceability generate --input /path/to/aidlc-project --format markdown

# Generate without AI (faster, rule-based only)
traceability generate --input /path/to/project --no-ai

# Generate both markdown and HTML reports
traceability generate --input /path/to/project --format both
```

### AWS Configuration (for AI Analysis)

The AI-powered analysis requires AWS credentials with Amazon Bedrock access. The minimum required IAM permissions are:

- `bedrock:InvokeModel`
- `bedrock:InvokeModelWithResponseStream`
- `sts:GetCallerIdentity` (for credential validation)

See [docs/bedrock-security.md](docs/bedrock-security.md) for a complete least-privilege IAM policy and credential management guidance.
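
As a rough illustration of how the three permissions listed above might be grouped into a policy document, here is a sketch that builds one as a Python dictionary and prints it as JSON. Only the action names come from this README. The statement IDs and the wildcard `Resource` values are placeholders, and the policy in `docs/bedrock-security.md` should be treated as the authoritative version.

```python
import json

# Action names are taken from the list above; Sid values and Resource
# entries are placeholders, not the project's actual policy.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeBedrockModels",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            # Assumption: narrow this to the model ARNs you actually use.
            "Resource": "*",
        },
        {
            "Sid": "ValidateCredentials",
            "Effect": "Allow",
            "Action": "sts:GetCallerIdentity",
            "Resource": "*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Building the document in code makes it easy to diff against the least-privilege policy in the security guide before attaching anything to a role.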

```bash
# Use specific AWS profile and region
traceability generate --input /path/to/project --profile my-profile --region us-east-1

# Or select the profile through the AWS_PROFILE environment variable
export AWS_PROFILE=your-profile
traceability generate --input /path/to/project
```

### Advanced Options

```bash
# Enable verbose logging
traceability generate --input /path/to/project --verbose

# Get help
traceability --help
traceability generate --help
```

## Architecture

### 6-Stage Pipeline

1. **Discovery**: Locate `aidlc-docs/` directory and categorize artifact files
2. **Parsing**: Extract structured data from markdown files and source code
3. **AI Analysis** (optional): Multi-agent semantic relationship discovery
4. **Graph Building**: Construct NetworkX directed graph of relationships
5. **Coverage Analysis**: Detect gaps and calculate metrics
6. **Report Generation**: Render markdown or HTML reports
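
To make stages 4 and 5 more concrete, the sketch below builds a small directed graph and derives orphans and a coverage percentage from it. It is an illustration under stated assumptions: a plain adjacency map stands in for the NetworkX graph the tool uses, and both the definition of "orphaned" (no edge touches the artifact) and the coverage formula are guesses, not the tool's documented behavior.

```python
from collections import defaultdict


def build_graph(edges):
    """Stage 4 stand-in: adjacency map of a directed graph (the tool itself uses NetworkX)."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
    return graph


def coverage(artifacts, graph):
    """Stage 5 stand-in: treat an artifact as orphaned if no edge touches it."""
    linked = set(graph)
    for targets in graph.values():
        linked |= targets
    orphans = sorted(a for a in artifacts if a not in linked)
    if not artifacts:
        return orphans, 0.0
    percent = 100.0 * (len(artifacts) - len(orphans)) / len(artifacts)
    return orphans, percent


# Illustrative IDs only.
artifacts = ["REQ-1", "REQ-2", "US-1", "UNIT-1"]
edges = [("REQ-1", "US-1"), ("US-1", "UNIT-1")]
orphans, percent = coverage(artifacts, build_graph(edges))
print(orphans, percent)  # ['REQ-2'] 75.0
```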

### Multi-Agent AI System

When AI analysis is enabled, the tool uses 4 specialized Strands agents:

- **Requirements → Stories Agent**: Maps business requirements to user stories
- **Stories → Units Agent**: Traces user stories to implementation units
- **Units → Components Agent**: Links units to design components
- **Components → Code Agent**: Connects components to source files

Each agent is focused on a specific artifact pair, preventing context pollution and enabling parallel analysis.
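
The isolation and parallelism described above can be sketched with the standard library alone. In this hypothetical example each "agent" is a stub function that receives only the two artifact types in its pair, and the four stubs run concurrently in a thread pool. The names `PAIRS`, `run_agent`, and `analyze` are illustrative and do not reflect the Strands API or the contents of `agent.py`.

```python
from concurrent.futures import ThreadPoolExecutor

# The four artifact pairs named above; the string labels are illustrative.
PAIRS = [
    ("requirements", "stories"),
    ("stories", "units"),
    ("units", "components"),
    ("components", "code"),
]


def run_agent(pair, artifacts):
    """Stub for one agent: it sees only the two artifact types in its pair."""
    source_type, target_type = pair
    context = {t: artifacts[t] for t in (source_type, target_type)}
    # A real agent would call a model here; this stub links every source to every target.
    return [(s, t) for s in context[source_type] for t in context[target_type]]


def analyze(artifacts):
    """Run one stub agent per pair in parallel and collect results by pair."""
    with ThreadPoolExecutor(max_workers=len(PAIRS)) as pool:
        results = pool.map(lambda pair: run_agent(pair, artifacts), PAIRS)
        return dict(zip(PAIRS, results))


artifacts = {
    "requirements": ["REQ-1"],
    "stories": ["US-1"],
    "units": ["UNIT-1"],
    "components": ["COMP-1"],
    "code": ["src/app.py"],
}
result = analyze(artifacts)
print(result[("requirements", "stories")])  # [('REQ-1', 'US-1')]
```

Because each stub builds its own `context` from just two keys, nothing from the other artifact types can leak into its input, which is the property the paragraph above calls preventing context pollution.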

## Project Structure

```
AIDLC-Traceability/
├── src/traceability/          # Main implementation
│   ├── cli.py                 # Click-based CLI
│   ├── pipeline.py            # Pipeline orchestration
│   ├── models.py              # Pydantic data models
│   ├── discovery.py           # Artifact discovery
│   ├── graph.py               # NetworkX graph builder
│   ├── analysis.py            # Coverage gap detection
│   ├── agent.py               # Strands AI integration
│   ├── parsers/               # Specialized parsers
│   │   ├── requirements.py
│   │   ├── stories.py
│   │   ├── units.py
│   │   ├── code_plans.py
│   │   ├── components.py
│   │   └── code.py            # Code parser with boilerplate detection
│   └── generators/            # Report generators
│       ├── markdown.py
│       └── html.py
├── input-docs/                # Original specifications
├── tests/                     # Test suite
└── pyproject.toml             # Project configuration
```

## Technical Stack

- **Language**: Python 3.12+
- **CLI Framework**: Click
- **Key Libraries**:
  - `pydantic` - Data validation and models
  - `networkx` - Graph construction and analysis
  - `strands-agents` - AI-powered relationship discovery
  - `boto3` - Amazon Bedrock integration
  - `jinja2` - HTML template rendering
  - `rich` - Terminal output formatting
- **Linter**: Ruff (120 char line length)
- **Test Framework**: pytest

## Development

```bash
# Run linter
ruff check src/

# Run tests (when implemented)
pytest

# Install in editable mode
uv sync
```

## Output Examples

### Markdown Report

The markdown report includes:
- Summary statistics (artifact counts, coverage percentages)
- Complete traceability matrix showing all relationships
- Gap analysis highlighting orphaned artifacts
- Detailed artifact listings by type

### HTML Report

The HTML report provides:
- Interactive dark mode toggle
- Resizable sidebar for navigation
- Collapsible sections
- Syntax-highlighted code snippets
- Visual coverage indicators

## AI-DLC Context

This tool is designed to analyze projects built using the AI-Driven Development Life Cycle (AI-DLC) methodology. AI-DLC projects typically maintain their artifacts in an `aidlc-docs/` directory with standardized markdown files.

## Disclaimer

This tool generates traceability documentation to support your development and compliance workflows. It does not provide legal, regulatory, or compliance advice, and does not guarantee compliance with any specific standard or regulation. Users are solely responsible for ensuring their projects meet applicable regulatory requirements. See [LEGAL_DISCLAIMER.md](LEGAL_DISCLAIMER.md) for full terms.

## License

This project is licensed under the [MIT License](LICENSE).
Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
<!--
SPDX-License-Identifier: MIT
Copyright (c) 2026 AIDLC Traceability Tool Contributors
-->

# AI Compliance Documentation

## GenAI Use Case Classification

| Attribute | Value |
|-----------|-------|
| **Use Case** | Development tooling — automated traceability analysis |
| **Risk Level** | LOW |
| **Domain** | Software engineering documentation |
| **Decision Impact** | Advisory only — generates reports for human review |
| **PII Processing** | None — tool processes code and documentation artifacts |
| **Safety-Critical** | No — tool does not make health, financial, legal, or safety decisions |

### Risk Justification

This is a **low-risk** GenAI use case because:
1. The AI generates suggested relationships between development artifacts (requirements, stories, code)
2. All AI output is validated against known artifact IDs before inclusion in reports
3. Reports are for informational and documentation purposes; no automated decisions are made
4. Users review the generated traceability matrix and make their own compliance determinations
5. The tool can operate without AI (`--no-ai`), making AI an optional enhancement

## Third-Party Model Usage

### Amazon Bedrock — Claude Sonnet

| Attribute | Value |
|-----------|-------|
| **Provider** | Anthropic (via Amazon Bedrock marketplace) |
| **Model** | Claude Sonnet 4 (`us.anthropic.claude-sonnet-4-20250514-v1:0`) |
| **Access Method** | Amazon Bedrock API (on-demand) |
| **Data Retention** | None — Amazon Bedrock does not retain customer prompt/completion data |
| **Training Data Usage** | None — customer data is not used for model training |

### Legal Approval and Right to Use

| Component | License/Terms | Approval Status |
|-----------|--------------|-----------------|
| **Claude Sonnet (via Amazon Bedrock)** | [AWS Service Terms](https://aws.amazon.com/service-terms/) — Amazon Bedrock section | Pre-approved: Amazon Bedrock marketplace models are available to all AWS customers with Amazon Bedrock access. No separate Anthropic license required. |
| **Strands Agents SDK** (`strands-agents`) | Apache License 2.0 ([source](https://github.com/strands-agents/strands-agents)) | Pre-approved: Open-source, permissive license compatible with MIT. No usage restrictions or distribution limitations. |
| **Strands Agents Tools** (`strands-agents-tools`) | Apache License 2.0 | Pre-approved: Same terms as strands-agents SDK. |
| **boto3** (AWS SDK) | Apache License 2.0 | Pre-approved: Official AWS SDK, open source. |

**Organizational approval**: Users deploying this tool should verify that their organization's policies permit the use of Amazon Bedrock and the Claude model family. Many organizations pre-approve all Amazon Bedrock marketplace models under their AWS Enterprise Agreement.

## Third-Party Framework Usage

### Strands Agents SDK

| Attribute | Value |
|-----------|-------|
| **Package** | `strands-agents` |
| **License** | Apache License 2.0 |
| **Source** | Open source |
| **Purpose** | Agent orchestration framework for Amazon Bedrock model invocation |
| **Data Handling** | SDK passes prompts to Amazon Bedrock API; no independent data collection |

## Implemented AI Security Controls

The following security controls are implemented in `src/traceability/agent.py` and the pipeline:

| Control | Implementation | File:Line |
|---------|---------------|-----------|
| **Input isolation** | Each of 4 agents receives only its relevant artifact pair; no cross-agent data leakage | `agent.py:86-170` |
| **Static system prompts** | System prompts are hardcoded strings; no user input is injected into system prompts | `agent.py:86-170` |
| **Output format enforcement** | Agents are instructed to respond in JSON only; non-JSON responses are discarded | `agent.py:173-228` |
| **Artifact ID validation** | All `source_id` and `target_id` values validated against known parsed artifact IDs | `agent.py:189-215` |
| **Invalid relationship filtering** | Relationships referencing non-existent artifacts are silently discarded and counted | `agent.py:205-215` |
| **Output sanitization** | AI-generated text is not rendered as raw content; only validated artifact IDs are used to create graph edges. Report generators escape all artifact content via `html.escape()` before rendering | `generators/html.py:116-117` |
| **Graceful degradation** | Amazon Bedrock failures are caught; pipeline falls back to heuristic-only analysis | `pipeline.py:229-234` |
| **Data volume limits** | Source code reading limited to 30 files, 200 lines each | `agent.py:50-65` |
| **No code execution** | No `eval()`, `exec()`, or dynamic code execution of AI responses | Verified by Bandit scan |
| **Configurable opt-out** | AI analysis is fully optional via `--no-ai` flag | `cli.py:26` |

For detailed technical documentation of these controls, see [docs/ai-security.md](ai-security.md).
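
Several of the controls in the table (output format enforcement, artifact ID validation, invalid relationship filtering, and output sanitization) follow a common pattern that can be sketched in a few lines. This is not the code in `agent.py`. It assumes, hypothetically, that a model response is a JSON list of objects carrying `source_id` and `target_id`, and the function name `validate_relationships` is invented for the example.

```python
import html
import json


def validate_relationships(raw_response, known_ids):
    """Return (valid relationships, number discarded) for one model response."""
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        return [], 0  # non-JSON responses are dropped entirely
    if not isinstance(payload, list):
        return [], 0  # assumed response shape: a JSON list of objects
    valid, discarded = [], 0
    for item in payload:
        source = item.get("source_id") if isinstance(item, dict) else None
        target = item.get("target_id") if isinstance(item, dict) else None
        if (
            isinstance(source, str)
            and isinstance(target, str)
            and source in known_ids
            and target in known_ids
        ):
            valid.append((source, target))
        else:
            discarded += 1  # unknown or malformed IDs are counted, not raised
    return valid, discarded


known_ids = {"REQ-1", "US-1"}
raw = json.dumps([
    {"source_id": "REQ-1", "target_id": "US-1"},
    {"source_id": "REQ-1", "target_id": "US-99"},  # US-99 was never parsed
])
print(validate_relationships(raw, known_ids))  # ([('REQ-1', 'US-1')], 1)

# Artifact text is escaped before it reaches an HTML report.
print(html.escape("<script>alert(1)</script>"))  # &lt;script&gt;alert(1)&lt;/script&gt;
```

Only the validated ID pairs leave the function, so free-form model text never becomes a graph edge or report content directly.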

## No Training Data Used

This tool does not:
- Train or fine-tune any AI models
- Create or manage training datasets
- Store AI interaction data for future training
- Use any third-party datasets beyond the user's own project artifacts

## Bias and Fairness Considerations

### Nature of AI Analysis

The AI agents perform **artifact relationship mapping** — connecting requirements to stories, stories to code, etc. This is a technical documentation task, not a decision-making task affecting individuals.

### Potential Bias Vectors

| Vector | Risk | Mitigation |
|--------|------|-----------|
| Naming bias | AI may favor artifacts with descriptive names over terse ones | Heuristic linker provides baseline; AI adds to it |
| Language bias | Non-English artifact names may produce fewer matches | Not applicable — tool targets English-language AI-DLC projects |
| Completeness bias | AI may over-connect well-documented artifacts, under-connect sparse ones | Gap analysis independently identifies unconnected artifacts |

### Fairness Assessment

The tool's AI analysis does not impact individuals, hiring, lending, healthcare, or other domains where fairness concerns typically arise. Its output is technical documentation reviewed by engineers.
