diff --git a/scripts/aidlc-traceability/LEGAL_DISCLAIMER.md b/scripts/aidlc-traceability/LEGAL_DISCLAIMER.md
diff --git a/scripts/aidlc-traceability/LICENSE b/scripts/aidlc-traceability/LICENSE
diff --git a/scripts/aidlc-traceability/README.md b/scripts/aidlc-traceability/README.md
diff --git a/scripts/aidlc-traceability/docs/ai-compliance.md b/scripts/aidlc-traceability/docs/ai-compliance.md
@@ -0,0 +1,6 @@
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
+COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
+IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 AIDLC Traceability Tool Contributors
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1,202 @@
+<!--
+SPDX-License-Identifier: MIT
+Copyright (c) 2026 AIDLC Traceability Tool Contributors
+-->
+
+# AIDLC Traceability Matrix Tool
+
+A Python CLI tool that generates comprehensive traceability matrices from AI-DLC (AI-Driven Development Life Cycle) project artifacts. It analyzes requirements, user stories, implementation units, components, and source code to produce detailed traceability reports.
+
+## Features
+
+- **Multi-Stage Pipeline Architecture**: 6-stage process from artifact discovery to report generation
+- **AI-Powered Relationship Discovery**: Optional multi-agent system using Amazon Bedrock for semantic analysis
+- **Smart Boilerplate Detection**: Language-independent detection of non-functional code (init files, test infrastructure, auto-generated code)
+- **Multiple Output Formats**: Generate markdown and HTML reports with dark mode and interactive features
+- **Gap Analysis**: Automatically detect orphaned artifacts and incomplete traces
+- **Coverage Metrics**: Calculate traceability coverage across all artifact types
+
+## What It Does
+
+The tool analyzes AI-DLC project artifacts and produces traceability matrices showing:
+
+- Which requirements map to which user stories
+- Which stories are implemented by which units
+- Which units correspond to which design components
+- Which components are realized in which source files
+- Coverage gaps and orphaned artifacts
+
+### Artifact Types Supported
+
+- **Requirements**: Business requirements from `requirements.md`
+- **User Stories**: From `stories.md`
+- **Implementation Units**: From `units-breakdown.md`
+- **Design Components**: From `application-components.md`
+- **Code Plans**: From `code-plan.md`
+- **Source Code**: Actual implementation files
+- **Tests**: Test files (tracked separately)
+
+## Installation
+
+```bash
+# Clone the repository
+git clone <repository-url>
+cd AIDLC-Traceability
+
+# Install in development mode
+uv sync
+```
+
+**Requirements**: Python 3.12 or higher, plus `uv` for the install step above
+
+## Quick Start
+
+### Basic Usage
+
+```bash
+# Generate traceability matrix with AI analysis (requires Amazon Bedrock access)
+traceability generate --input /path/to/aidlc-project --format markdown
+
+# Generate without AI (faster, rule-based only)
+traceability generate --input /path/to/project --no-ai
+
+# Generate both markdown and HTML reports
+traceability generate --input /path/to/project --format both
+```
+
+### AWS Configuration (for AI Analysis)
+
+The AI-powered analysis requires AWS credentials with Amazon Bedrock access. The minimum required IAM permissions are:
+
+- `bedrock:InvokeModel`
+- `bedrock:InvokeModelWithResponseStream`
+- `sts:GetCallerIdentity` (for credential validation)
+
+See [docs/bedrock-security.md](docs/bedrock-security.md) for a complete least-privilege IAM policy and credential management guidance.
+
+```bash
+# Use specific AWS profile and region
+traceability generate --input /path/to/project --profile my-profile --region us-east-1
+
+# Or use default AWS credentials
+export AWS_PROFILE=your-profile
+traceability generate --input /path/to/project
+```
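A minimal sketch of an identity policy that grants only the three actions listed above, built as a Python dict so it can be printed as JSON. The `Sid` and the `"Resource": "*"` value are placeholders chosen for illustration and do not come from this repository. The project's own least-privilege policy is the one referenced in docs/bedrock-security.md, which may scope resources more tightly.

```python
import json

# Hypothetical policy sketch: only the three actions the README lists.
# "Resource": "*" is a placeholder assumption; narrow it to specific
# model ARNs when adapting this to a real account.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "TraceabilityBedrockInvoke",
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
                "sts:GetCallerIdentity",
            ],
            "Resource": "*",
        }
    ],
}


def render_policy() -> str:
    """Serialize the policy dict as indented JSON text."""
    return json.dumps(POLICY, indent=2)
```

Printing `render_policy()` gives JSON text that can be pasted into an IAM policy editor and adjusted from there.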
+
+### Advanced Options
+
+```bash
+# Enable verbose logging
+traceability generate --input /path/to/project --verbose
+
+# Get help
+traceability --help
+traceability generate --help
+```
+
+## Architecture
+
+### 6-Stage Pipeline
+
+1. **Discovery**: Locate the `aidlc-docs/` directory and categorize artifact files
+2. **Parsing**: Extract structured data from markdown files and source code
+3. **AI Analysis** (optional): Multi-agent semantic relationship discovery
+4. **Graph Building**: Construct NetworkX directed graph of relationships
+5. **Coverage Analysis**: Detect gaps and calculate metrics
+6. **Report Generation**: Render markdown reports, HTML reports, or both
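To make stages 4 and 5 concrete, here is a minimal sketch of gap detection over a set of trace edges. The README says the tool builds a NetworkX directed graph; this sketch uses plain sets instead so it runs without dependencies. The artifact IDs, the `analyze_coverage` name, and the definition of an orphan (an artifact with no incoming or outgoing edge) are assumptions for illustration, not taken from the repository.

```python
# Hypothetical artifact IDs grouped by type; the real tool parses these
# from the files under aidlc-docs/.
ARTIFACTS = {
    "requirement": ["REQ-1", "REQ-2"],
    "story": ["US-1", "US-2"],
    "unit": ["UNIT-1"],
}

# Directed trace edges (source -> target), e.g. requirement -> story -> unit.
EDGES = [("REQ-1", "US-1"), ("US-1", "UNIT-1")]


def analyze_coverage(artifacts, edges):
    """Return (coverage ratio per artifact type, list of orphaned IDs)."""
    # An artifact counts as linked if it appears at either end of any edge.
    linked = set()
    for source, target in edges:
        linked.add(source)
        linked.add(target)

    coverage = {}
    orphans = []
    for kind, ids in artifacts.items():
        connected = [a for a in ids if a in linked]
        orphans.extend(a for a in ids if a not in linked)
        coverage[kind] = len(connected) / len(ids) if ids else 0.0
    return coverage, orphans
```

With the sample data, `analyze_coverage(ARTIFACTS, EDGES)` reports half of the requirements and stories as covered, the single unit as fully covered, and `REQ-2` and `US-2` as orphans.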
+
+### Multi-Agent AI System
+
+When AI analysis is enabled, the tool uses 4 specialized Strands agents:
+
+- **Requirements → Stories Agent**: Maps business requirements to user stories
+- **Stories → Units Agent**: Traces user stories to implementation units
+- **Units → Components Agent**: Links units to design components
+- **Components → Code Agent**: Connects components to source files
+
+Each agent is focused on a specific artifact pair, preventing context pollution and enabling parallel analysis.
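The isolation and parallelism described above can be sketched with the standard library. This is an illustration only: `run_agent` is a hypothetical stand-in that makes no model call, and whether the tool actually uses threads is not stated in the README. The point is that each worker is handed only the two artifact sets in its pair.

```python
from concurrent.futures import ThreadPoolExecutor

# The four artifact pairs named in the README, one per agent.
AGENT_PAIRS = [
    ("requirements", "stories"),
    ("stories", "units"),
    ("units", "components"),
    ("components", "code"),
]


def run_agent(pair, artifacts):
    """Stand-in for one agent: it sees only the two artifact sets in its pair.

    A real agent would call a model here; this placeholder just reports
    which inputs it was given.
    """
    source_kind, target_kind = pair
    scoped = {kind: artifacts[kind] for kind in pair}  # nothing else is passed on
    return {
        "pair": f"{source_kind}->{target_kind}",
        "inputs_seen": sorted(scoped),
    }


def run_all(artifacts):
    """Run the four pair-specific agents concurrently, preserving pair order."""
    with ThreadPoolExecutor(max_workers=len(AGENT_PAIRS)) as pool:
        return list(pool.map(lambda pair: run_agent(pair, artifacts), AGENT_PAIRS))
```

`pool.map` returns results in the order of `AGENT_PAIRS`, so downstream stages can rely on a stable ordering even though the calls overlap in time.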
+
+## Project Structure
+
+```
+AIDLC-Traceability/
+├── src/traceability/        # Main implementation
+│   ├── cli.py              # Click-based CLI
+│   ├── pipeline.py         # Pipeline orchestration
+│   ├── models.py           # Pydantic data models
+│   ├── discovery.py        # Artifact discovery
+│   ├── graph.py            # NetworkX graph builder
+│   ├── analysis.py         # Coverage gap detection
+│   ├── agent.py            # Strands AI integration
+│   ├── parsers/            # Specialized parsers
+│   │   ├── requirements.py
+│   │   ├── stories.py
+│   │   ├── units.py
+│   │   ├── code_plans.py
+│   │   ├── components.py
+│   │   └── code.py         # Code parser with boilerplate detection
+│   └── generators/         # Report generators
+│       ├── markdown.py
+│       └── html.py
+├── input-docs/             # Original specifications
+├── tests/                  # Test suite
+└── pyproject.toml         # Project configuration
+```
+
+## Technical Stack
+
+- **Language**: Python 3.12+
+- **CLI Framework**: Click
+- **Key Libraries**:
+  - `pydantic` - Data validation and models
+  - `networkx` - Graph construction and analysis
+  - `strands-agents` - AI-powered relationship discovery
+  - `boto3` - Amazon Bedrock integration
+  - `jinja2` - HTML template rendering
+  - `rich` - Terminal output formatting
+- **Linter**: Ruff (120 char line length)
+- **Test Framework**: pytest
+
+## Development
+
+```bash
+# Run linter
+ruff check src/
+
+# Run tests (when implemented)
+pytest
+
+# Install in editable mode
+uv sync
+```
+
+## Output Examples
+
+### Markdown Report
+
+The markdown report includes:
+- Summary statistics (artifact counts, coverage percentages)
+- Complete traceability matrix showing all relationships
+- Gap analysis highlighting orphaned artifacts
+- Detailed artifact listings by type
+
+### HTML Report
+
+The HTML report provides:
+- Interactive dark mode toggle
+- Resizable sidebar for navigation
+- Collapsible sections
+- Syntax-highlighted code snippets
+- Visual coverage indicators
+
+## AI-DLC Context
+
+This tool is designed to analyze projects built using the AI-Driven Development Life Cycle (AI-DLC) methodology. AI-DLC projects typically maintain their artifacts in an `aidlc-docs/` directory with standardized markdown files.
+
+## Disclaimer
+
+This tool generates traceability documentation to support your development and compliance workflows. It does not provide legal, regulatory, or compliance advice, and does not guarantee compliance with any specific standard or regulation. Users are solely responsible for ensuring their projects meet applicable regulatory requirements. See [LEGAL_DISCLAIMER.md](LEGAL_DISCLAIMER.md) for full terms.
+
+## License
+
+This project is licensed under the [MIT License](LICENSE).
@@ -0,0 +1,106 @@
+<!--
+SPDX-License-Identifier: MIT
+Copyright (c) 2026 AIDLC Traceability Tool Contributors
+-->
+
+# AI Compliance Documentation
+
+## GenAI Use Case Classification
+
+| Attribute | Value |
+|-----------|-------|
+| **Use Case** | Development tooling — automated traceability analysis |
+| **Risk Level** | LOW |
+| **Domain** | Software engineering documentation |
+| **Decision Impact** | Advisory only — generates reports for human review |
+| **PII Processing** | None — tool processes code and documentation artifacts |
+| **Safety-Critical** | No — tool does not make health, financial, legal, or safety decisions |
+
+### Risk Justification
+
+This is a **low-risk** GenAI use case because:
+1. The AI generates suggested relationships between development artifacts (requirements, stories, code)
+2. All AI output is validated against known artifact IDs before inclusion in reports
+3. Reports are for informational and documentation purposes; no automated decisions are made
+4. Users review the generated traceability matrix and make their own compliance determinations
+5. The tool can operate without AI (`--no-ai`), making AI an optional enhancement
+
+## Third-Party Model Usage
+
+### Amazon Bedrock — Claude Sonnet
+
+| Attribute | Value |
+|-----------|-------|
+| **Provider** | Anthropic (via Amazon Bedrock marketplace) |
+| **Model** | Claude Sonnet 4 (`us.anthropic.claude-sonnet-4-20250514-v1:0`) |
+| **Access Method** | Amazon Bedrock API (on-demand) |
+| **Data Retention** | None — Amazon Bedrock does not retain customer prompt/completion data |
+| **Training Data Usage** | None — customer data is not used for model training |
+
+### Legal Approval and Right to Use
+
+| Component | License/Terms | Approval Status |
+|-----------|--------------|-----------------|
+| **Claude Sonnet (via Amazon Bedrock)** | [AWS Service Terms](https://aws.amazon.com/service-terms/) — Amazon Bedrock section | Pre-approved: Amazon Bedrock marketplace models are available to all AWS customers with Amazon Bedrock access. No separate Anthropic license required. |
+| **Strands Agents SDK** (`strands-agents`) | Apache License 2.0 ([source](https://github.com/strands-agents/strands-agents)) | Pre-approved: Open-source, permissive license compatible with MIT. No usage restrictions or distribution limitations. |
+| **Strands Agents Tools** (`strands-agents-tools`) | Apache License 2.0 | Pre-approved: Same terms as strands-agents SDK. |
+| **boto3** (AWS SDK) | Apache License 2.0 | Pre-approved: Official AWS SDK, open source. |
+
+**Organizational approval**: Users deploying this tool should verify that their organization's policies permit the use of Amazon Bedrock and the Claude model family. Many organizations pre-approve all Amazon Bedrock marketplace models under their AWS Enterprise Agreement.
+
+## Third-Party Framework Usage
+
+### Strands Agents SDK
+
+| Attribute | Value |
+|-----------|-------|
+| **Package** | `strands-agents` |
+| **License** | Apache License 2.0 |
+| **Source** | Open source |
+| **Purpose** | Agent orchestration framework for Amazon Bedrock model invocation |
+| **Data Handling** | SDK passes prompts to Amazon Bedrock API; no independent data collection |
+
+## Implemented AI Security Controls
+
+The following security controls are implemented in `src/traceability/agent.py` and the pipeline:
+
+| Control | Implementation | File:Line |
+|---------|---------------|-----------|
+| **Input isolation** | Each of the 4 agents receives only its relevant artifact pair; no cross-agent data leakage | `agent.py:86-170` |
+| **Static system prompts** | System prompts are hardcoded strings; no user input is injected into system prompts | `agent.py:86-170` |
+| **Output format enforcement** | Agents are instructed to respond in JSON only; non-JSON responses are discarded | `agent.py:173-228` |
+| **Artifact ID validation** | All `source_id` and `target_id` values validated against known parsed artifact IDs | `agent.py:189-215` |
+| **Invalid relationship filtering** | Relationships referencing non-existent artifacts are discarded without raising an error, and the number discarded is counted | `agent.py:205-215` |
+| **Output sanitization** | AI-generated text is not rendered as raw content; only validated artifact IDs are used to create graph edges. Report generators escape all artifact content via `html.escape()` before rendering | `generators/html.py:116-117` |
+| **Graceful degradation** | Amazon Bedrock failures are caught; pipeline falls back to heuristic-only analysis | `pipeline.py:229-234` |
+| **Data volume limits** | Source code reading limited to 30 files, 200 lines each | `agent.py:50-65` |
+| **No code execution** | No `eval()`, `exec()`, or dynamic code execution of AI responses | Verified by Bandit scan |
+| **Configurable opt-out** | AI analysis is fully optional via `--no-ai` flag | `cli.py:26` |
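The output format enforcement, artifact ID validation, and invalid relationship filtering rows can be read together as one small filter. The sketch below is an assumption about how such a filter might look and is not the code in `agent.py`. In particular, the response shape (a JSON array of objects with `source_id` and `target_id`) and the function name `filter_relationships` are hypothetical.

```python
import json


def filter_relationships(raw_response, known_ids):
    """Parse an agent response and keep only edges between known artifact IDs.

    Returns (valid_edges, discarded_count). A response that is not a JSON
    array yields no edges at all.
    """
    try:
        payload = json.loads(raw_response)
    except (json.JSONDecodeError, TypeError):
        return [], 0  # non-JSON output is dropped entirely

    if not isinstance(payload, list):
        return [], 0

    valid, discarded = [], 0
    for item in payload:
        source = item.get("source_id") if isinstance(item, dict) else None
        target = item.get("target_id") if isinstance(item, dict) else None
        # Only string IDs that were actually parsed from the project count.
        if (
            isinstance(source, str)
            and isinstance(target, str)
            and source in known_ids
            and target in known_ids
        ):
            valid.append((source, target))
        else:
            discarded += 1
    return valid, discarded
```

Because only validated ID pairs leave this function, free-form model text never reaches the graph or the reports in this sketch.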
+
+For detailed technical documentation of these controls, see [docs/ai-security.md](ai-security.md).
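The data volume limit from the controls table (30 files, 200 lines each) can be illustrated as a truncation step applied before any source text is placed in a prompt. This is a sketch under assumptions: it works on text already held in memory, picks files in sorted name order, and uses the hypothetical name `truncate_sources`. How `agent.py` selects which files to keep is not stated in this document.

```python
from itertools import islice

MAX_FILES = 30   # limits quoted in the controls table
MAX_LINES = 200


def truncate_sources(sources, max_files=MAX_FILES, max_lines=MAX_LINES):
    """Keep at most max_files entries and the first max_lines lines of each.

    `sources` maps a file name to its full text.
    """
    limited = {}
    for name in islice(sorted(sources), max_files):
        lines = sources[name].splitlines(keepends=True)
        limited[name] = "".join(lines[:max_lines])
    return limited
```

Capping both dimensions bounds the prompt size regardless of how large the analyzed repository is.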
+
+## No Training Data Used
+
+This tool does not:
+- Train or fine-tune any AI models
+- Create or manage training datasets
+- Store AI interaction data for future training
+- Use any third-party datasets beyond the user's own project artifacts
+
+## Bias and Fairness Considerations
+
+### Nature of AI Analysis
+
+The AI agents perform **artifact relationship mapping** — connecting requirements to stories, stories to code, etc. This is a technical documentation task, not a decision-making task affecting individuals.
+
+### Potential Bias Vectors
+
+| Vector | Risk | Mitigation |
+|--------|------|-----------|
+| Naming bias | AI may favor artifacts with descriptive names over terse ones | Heuristic linker provides baseline; AI adds to it |
+| Language bias | Non-English artifact names may produce fewer matches | Not applicable — tool targets English-language AI-DLC projects |
+| Completeness bias | AI may over-connect well-documented artifacts, under-connect sparse ones | Gap analysis independently identifies unconnected artifacts |
+
+### Fairness Assessment
+
+The tool's AI analysis does not impact individuals, hiring, lending, healthcare, or other domains where fairness concerns typically arise. Its output is technical documentation reviewed by engineers.