- Legal Disclaimer
- Generative AI and Third-Party Models Disclaimer
- 1. Overview
- 2. Solution Design
- 2.1. Challenges
- 2.2. Large Language Models
- 2.3. Contract Types and Guidelines Management
- 2.4. Architecture
- 2.5. How a contract is analyzed
- 2.6. Optional: Legislation Compliance Checking
- 2.7. Contract Type and Guidelines Import Workflow
- 2.8. Configuration Management
- 2.9. Web Application
- 2.10. Multilanguage Support
- 2.11. DynamoDB tables schema
- 2.12. REST APIs
- 2.12.1. GET /jobs
- 2.12.2. GET /jobs/{job_id}
- 2.12.3. POST /jobs
- 2.12.4. GET /contract-types
- 2.12.5. GET /contract-types/{contract_type_id}
- 2.12.6. POST /contract-types
- 2.12.7. PUT /contract-types/{contract_type_id}
- 2.12.8. DELETE /contract-types/{contract_type_id}
- 2.12.9. POST /import/contract-types
- 2.12.10. GET /import/contract-types/{import_job_id}
- 2.12.11. GET /guidelines
- 2.12.12. GET /guidelines/{contract_type_id}/{clause_type_id}
- 2.12.13. POST /guidelines
- 2.12.14. PUT /guidelines/{contract_type_id}/{clause_type_id}
- 2.12.15. DELETE /guidelines/{contract_type_id}/{clause_type_id}
- 2.12.16. POST /guidelines/{contract_type_id}/{clause_type_id}/generate-questions
- 2.12.17. POST /guidelines/{contract_type_id}/{clause_type_id}/generate-examples
- 3. Cost Analysis and Pricing
- 4. Roadmap to Production
- 5. Setup Steps
This Contract Compliance Analysis prototype does not provide legal advice, nor does it serve as a substitute for professional legal counsel. Legal matters are often complex and fact-specific, requiring careful consideration of applicable laws, regulations, and individual circumstances. You should make your own independent assessment of all analysis results and seek advice from licensed legal professionals for any legal matters you encounter.
During the prototyping engagement/proof of concept, AWS may use third-party models ("Third-Party Models") that AWS does not own, and that AWS does not exercise control over. By using any prototype or proof of concept from AWS you acknowledge that the Third-Party Models are "Third-Party Content" under your agreement for services with AWS. You should perform your own independent assessment of the Third-Party Models. You should also take measures to ensure that your use of the Third-Party Models complies with your own specific quality control practices and standards, and the local rules, laws, regulations, licenses and terms of use that apply to you, your content, and the Third-Party Models. AWS does not make any representations or warranties regarding the Third-Party Models, including that use of the Third-Party Models and the associated outputs will result in a particular outcome or result. You also acknowledge that outputs generated by the Third-Party Models are Your Content/Customer Content, as defined in the AWS Customer Agreement or the agreement between you and AWS for AWS Services. You are responsible for your use of outputs from the Third-Party Models.
This document details a solution prototype developed to assist legal teams in analyzing contract compliance. The prototype leverages Generative Artificial Intelligence (GenAI) to evaluate contract clauses against predefined guidelines and provide feedback on their adherence to the required standards.
The prototype is powered by AWS Machine Learning services and a Large Language Model (LLM) capable of understanding legal terminology and concepts. This LLM performs advanced natural language tasks, such as clause classification and compliance evaluation, based on the provided guidelines.
The prototype operates entirely on text, and therefore on language, so a Machine Learning model trained to understand language is a key component.
A suitable Language model must meet these requirements:
- Legal understanding. Understanding the terminology and concepts common in contracts requires knowledge of the legal domain.
- Analytical capabilities. Advanced analytical skills are necessary to properly identify and process obligations, rights, and restrictions in a contract.
- Document structure handling. Contracts from different organizations can have varying layouts and structures, requiring the model to adapt without rigid assumptions about clause formatting.
The prototype uses foundation models from Amazon Bedrock for various AI-powered tasks. Models are configurable through a central configuration file.
Amazon Nova is a new generation of state-of-the-art foundation models introduced by Amazon, exclusively available in Amazon Bedrock. These models deliver frontier intelligence and industry-leading price performance. The Nova family includes:
- Amazon Nova Micro: Text-only model with the lowest latency, optimized for speed and cost
- Amazon Nova Lite: Very low-cost multimodal model (text, image, video inputs)
- Amazon Nova Pro: Highly capable multimodal model with the best combination of accuracy, speed, and cost
- Amazon Nova Premier: Most capable model for complex reasoning tasks (in training)
Amazon Nova models excel in Retrieval-Augmented Generation (RAG), function calling, and agentic applications. They can process lengthy documents up to 300K tokens and understand content in over 200 languages.
The prototype also supports Anthropic Claude models as an alternative model family available through Amazon Bedrock. Claude models offer advanced capabilities for complex reasoning tasks, with features including extended context windows and strong performance in coding, data analysis, and content synthesis.
The prototype supports configurable model selection, with model identifiers stored in AWS Systems Manager Parameter Store (see Configuration Management). This allows:
- Global configuration: Set a default model for all tasks
- Task-specific overrides: Use different models for specific workflows
- Easy model switching: Change models without code deployment
- Cost optimization: Balance performance vs. cost for different tasks
Available models include Amazon Nova (Micro, Lite, Pro, Premier) and Anthropic Claude (3.5 Haiku, 3.5 Sonnet, 3.7 Sonnet).
The prototype provides two approaches for managing contract types and their associated guidelines:
The prototype includes a web-based interface and REST APIs for managing contract types and guidelines. This approach allows users to:
- Create, update, and delete contract types through the UI
- Define and manage guidelines for each contract type
- Import guidelines from structured text using AI-powered extraction
- Generate evaluation questions using the LLM
This approach is recommended for production use as it provides a user-friendly interface and maintains data consistency through the application layer.
The prototype supports importing multiple contract types and their guidelines from a JSON file using the load_guidelines.py script. This approach is useful for:
- Initial system setup with predefined contract types and guidelines
- Migrating guidelines between environments
- Bulk updates to existing guidelines
- Version control of guideline definitions
The JSON file contains contract type definitions with their associated guidelines. Each contract type includes:
- `contract_type_id`: Unique identifier for the contract type
- `name`: Contract type name
- `description`: Description of the contract type
- `guidelines`: Array of clause type definitions
Each guideline includes:
- `clause_type_id`: Unique identifier for the clause type
- `name`: Clause type name
- `standard_wording`: Reference wording for the clause type
- `level`: Impact level (low, medium, or high) for non-compliance
- `evaluation_questions`: Array of binary questions for compliance evaluation
- `examples`: Array of example clauses for classification reference
Example JSON structure:
```json
{
  "contract_types": [
    {
      "contract_type_id": "service_agreement",
      "name": "Service Agreement",
      "description": "Standard service provider agreement",
      "guidelines": [
        {
          "clause_type_id": "payment_terms",
          "name": "Payment Terms",
          "standard_wording": "Payment shall be made within 30 days...",
          "level": "high",
          "evaluation_questions": [
            "Does the clause specify payment terms?",
            "Are payment deadlines clearly defined?"
          ],
          "examples": [
            "Payment is due within 30 days of invoice date.",
            "The client agrees to pay within net 30 terms."
          ]
        }
      ]
    }
  ]
}
```

To import guidelines from a JSON file:

```shell
python scripts/load_guidelines.py --json-file guidelines.json
```

Options:

- `--no-clear-existing`: Append to existing guidelines instead of replacing
- `--dry-run`: Validate the JSON file without importing
- `--backend-stack-name`: Specify a custom backend stack name (default: MainBackendStack)
The script validates the JSON structure, checks for required fields, and imports the data into the DynamoDB Guidelines and ContractTypes tables.
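The validation pass can be sketched in Python. The field names follow the JSON schema described above; the function name and error-message format are illustrative, not the script's actual implementation:

```python
REQUIRED_CONTRACT_FIELDS = {"contract_type_id", "name", "description", "guidelines"}
REQUIRED_GUIDELINE_FIELDS = {
    "clause_type_id", "name", "standard_wording",
    "level", "evaluation_questions", "examples",
}
VALID_LEVELS = {"low", "medium", "high"}

def validate_guidelines(data: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the file is valid."""
    errors = []
    for i, ct in enumerate(data.get("contract_types", [])):
        missing = REQUIRED_CONTRACT_FIELDS - ct.keys()
        if missing:
            errors.append(f"contract_types[{i}]: missing {sorted(missing)}")
            continue
        for j, g in enumerate(ct["guidelines"]):
            missing = REQUIRED_GUIDELINE_FIELDS - g.keys()
            if missing:
                errors.append(f"{ct['contract_type_id']}/guidelines[{j}]: missing {sorted(missing)}")
            elif g["level"] not in VALID_LEVELS:
                errors.append(f"{ct['contract_type_id']}/{g['clause_type_id']}: invalid level {g['level']!r}")
    return errors
```

A `--dry-run` style check would simply run this validation and stop before any DynamoDB writes.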
Evaluation questions are binary questions used to assess whether a contract clause is compliant with the standard wording for a clause type. The prototype provides two methods for generating these questions:
API-Based Generation (Recommended): When creating or updating guidelines through the web interface, the system can generate evaluation questions using the LLM, allowing immediate review and editing.
JSON File Definition: For bulk imports, evaluation questions can be predefined in the JSON file, allowing version control and consistent evaluation criteria across environments.
The question-based evaluation approach makes LLM behavior more deterministic and enables legal experts to customize the evaluation process without modifying prompts.
The architecture consists of:
- Contract Analysis Workflow: Orchestrated multi-step pipeline that processes contracts through validation, preprocessing, classification, evaluation, and risk assessment
- Guideline Management: Storage and management of contract types, clause types, and evaluation criteria
- AI-Powered Import: AI-assisted extraction of guidelines from reference contracts using LLMs
- Legislation Compliance Checking: AI-powered verification of contract clauses against legislation requirements using semantic search and LLM analysis
The prototype is built on a serverless, event-driven architecture that leverages AWS managed services for scalability, reliability, and cost-effectiveness:
- Serverless: No infrastructure management required, automatic scaling
- Event-Driven: Decoupled components communicate through events
- Configurable Models: LLM selection through configuration without code changes
- Secure by Default: Authentication, encryption, and access controls built in
The following diagrams illustrate the prototype architecture at different levels of detail:
The core contract analysis workflow processes contracts through validation, preprocessing, classification, evaluation, and risk assessment:
Guidelines Compliance Architecture
The AI-powered import workflow extracts contract types and guidelines from reference contracts:
The optional legislation compliance feature verifies contract clauses against legislation requirements:
The prototype leverages the following AWS managed services:
Amazon Bedrock is a fully managed service providing access to high-performing foundation models from leading AI companies through a single API. Bedrock is serverless, eliminating infrastructure management. Knowledge Bases for Amazon Bedrock enable Retrieval Augmented Generation (RAG) by converting documents into embeddings stored in vector databases for semantic search.
Usage in this prototype:
- Foundation models: Amazon Nova (Micro, Lite, Pro, Premier) and Anthropic Claude (3.5 Haiku, 3.5 Sonnet, 3.7 Sonnet)
- Tasks: Contract preprocessing, clause classification, compliance evaluation, guideline extraction
- Configuration: Model selection via SSM Parameter Store without code changes
- Knowledge Base: Vector knowledge base with Titan Embeddings for legislation document retrieval (optional feature)
Amazon Bedrock AgentCore is a platform for building, deploying, and operating secure and scalable AI agents using fully managed services. AgentCore Runtime provides serverless infrastructure with session isolation and support for both low-latency real-time interactions and long-running tasks.
Usage in this prototype:
- Optional feature: Used for legislation compliance checking
- Runtime deployment: Containerized Python agent deployed on serverless ARM64 infrastructure
- Session management: Automatic session handling with a 15-minute timeout for agent invocations
- Integration: The agent accesses the Bedrock Knowledge Base for legislation retrieval and DynamoDB for clause data
Amazon OpenSearch Serverless is an on-demand, auto-scaling configuration that automatically adjusts compute capacity based on application needs, eliminating infrastructure management.
Usage in this prototype:
- Optional feature: Used for legislation compliance checking
- Vector store: Stores legislation document embeddings for semantic search
- Embeddings: Legislation documents are chunked and embedded with the Titan Embeddings model
- Integration: Bedrock Knowledge Base uses OpenSearch Serverless for retrieval
AWS Lambda is a serverless compute service that runs code without provisioning servers. Lambda automatically scales and manages compute resources.
Usage in this prototype:
- Python runtimes: Functions use Python 3.12 and 3.13 (supported until October 2029)
- API endpoints: Jobs API, Contract Types API, Guidelines API (all Python 3.13)
- Workflow steps: Preprocessing, classification, evaluation, and risk calculation functions
- Observability: AWS Lambda Powertools for Python (v3) provides structured logging, metrics, and distributed tracing
- Code sharing: Lambda layers for common code and dependencies across functions
Amazon S3 provides object storage with automatic server-side encryption for all new uploads using Amazon S3 managed keys (SSE-S3).
Usage in this prototype:
- Contract Documents Bucket: Stores uploaded contracts with versioning enabled, SSE-S3 encryption, and 90-day lifecycle expiration
- Server Access Logs Bucket: Dedicated bucket for audit trails with block public access enabled
- Legislation Documents Bucket (optional): Stores legislation documents for Knowledge Base ingestion with no expiration policy
- Security: All buckets enforce SSL/TLS, block public access, and use CORS configuration for web application integration
Amazon DynamoDB is a serverless NoSQL database with on-demand capacity that automatically scales based on application traffic. Point-in-time recovery (PITR) enables restoration to any point in time up to 35 days.
Usage in this prototype:
- Five tables: Guidelines, Clauses, Jobs, ContractTypes, ImportJobs
- Billing: On-demand capacity mode (pay-per-request) for automatic scaling
- Data protection: Point-in-time recovery enabled, AWS-managed encryption
- Access patterns: Partition keys, sort keys, and a Global Secondary Index for efficient queries
See DynamoDB tables schema for detailed table structures.
Amazon API Gateway enables creation of REST APIs with built-in authorization using Amazon Cognito user pools. Clients must obtain identity or access tokens from Cognito and supply them in the Authorization header.
Usage in this prototype:
- REST API: Exposes endpoints for jobs, contract types, guidelines, documents, and legislation management
- Authorization: A `COGNITO_USER_POOLS` authorizer secures all endpoints
- Integration: Lambda functions for business logic, direct S3 integration for document operations
- Validation: Request validators for body and parameter validation
See REST APIs for detailed endpoint specifications.
AWS Step Functions uses state machines (workflows) to orchestrate multi-step applications. Workflows are defined using Amazon States Language (ASL) and can handle errors, pass data between states, and invoke AWS services.
Usage in this prototype:
- ContractAnalysisWorkflow: 6-step state machine for contract validation, preprocessing, classification, evaluation, and risk assessment
- ContractImportWorkflow: 4-step state machine for AI-powered guideline extraction from reference contracts
- Error handling: Retry logic, timeout configurations, and CloudWatch Logs integration
- Concurrency: Parallel processing with configurable max_concurrency (1 for classification, 10 for evaluation)
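The bounded-concurrency Map pattern mentioned above can be sketched in Amazon States Language. The state names, items path, and retry settings below are illustrative assumptions, not the prototype's actual state machine definition:

```json
{
  "EvaluateClauses": {
    "Type": "Map",
    "ItemsPath": "$.clauseNumbers",
    "MaxConcurrency": 10,
    "Iterator": {
      "StartAt": "EvaluateClause",
      "States": {
        "EvaluateClause": {
          "Type": "Task",
          "Resource": "arn:aws:states:::lambda:invoke",
          "Retry": [
            {
              "ErrorEquals": ["States.TaskFailed"],
              "IntervalSeconds": 2,
              "MaxAttempts": 3,
              "BackoffRate": 2.0
            }
          ],
          "End": true
        }
      }
    },
    "End": true
  }
}
```

Lowering `MaxConcurrency` to 1 (as in the classification step) serializes the Lambda invocations, which reduces Bedrock throttling at the cost of longer runtimes.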
See Contract Analysis Workflow for detailed workflow steps.
Amazon EventBridge is a serverless event bus that routes events from sources to targets. Rules evaluate incoming events against patterns and send matching events to specified targets.
Usage in this prototype:
- Event-driven integration: ContractAnalysisWorkflow emits `PreProcessedContract` events after preprocessing
- Conditional triggering: An EventBridge rule triggers the CheckLegislation workflow only when the `legislationCheck.legislationId` parameter is present
- Decoupled architecture: Enables optional features without modifying core workflows
AWS Systems Manager Parameter Store provides secure, scalable, centralized storage for configuration data. Parameters can be referenced in scripts, commands, and configuration workflows without code changes.
Usage in this prototype:
- Configuration storage: All parameters stored under the `/ContractAnalysis/*` hierarchy
- Model identifiers: Global and task-specific LLM model IDs (e.g., `LanguageModelId`, `ContractPreprocessing/LanguageModelId`)
- Runtime flexibility: Lambda functions and Bedrock AgentCore agents read parameters at runtime
- Hierarchical lookup: Task-specific parameters override global defaults
Amazon Cognito provides authentication and authorization for applications. When users sign in to a user pool, applications receive JSON web tokens (JWTs). Identity pools provide temporary AWS credentials for accessing AWS services.
Usage in this prototype:
- User Pool: Manages user authentication with password policy enforcement (8+ characters, mixed case, digits, symbols) and advanced security mode enabled
- User Pool Client: Configured for the web application with the user password authentication flow
- Identity Pool: Provides temporary AWS credentials for authenticated users to access S3 buckets for document upload/download
- API Gateway Integration: Cognito authorizers secure all REST API endpoints
The approach to analyze a contract comprises six different steps, in this order:
1. Validate the contract type exists and input parameters are correct.
2. Split the contract into clauses using LLM text understanding capabilities.
3. Emit a notification event to EventBridge (this triggers optional legislation checking if enabled; see Optional: Legislation Compliance Checking).
4. Determine the type(s) of each clause. In Machine Learning terms, this means performing Text Classification: determining a class/type for a given input. For the remainder of this document, that task is referred to as Clause Classification.
5. Evaluate each clause against the guidelines of its type(s). The standard wording of the types determined in the classification step serves as the reference for evaluating how compliant each clause is, through questions previously generated and curated from that standard wording.
6. Assess contract risk. This step calculates the overall contract risk level based on guideline compliance. It assigns risk levels (low, medium, high) to non-compliant and missing clause types according to their impact levels, then compares totals against configurable per-contract-type thresholds to determine whether the contract is high-risk or low-risk.
These six steps form the workflow orchestrated by AWS Step Functions. Below is a visual representation of a workflow run that completed successfully:
Contract Processing - Guidelines compliance Workflow
The following sections detail each one of those six steps.
The first step validates the input parameters before processing begins. This AWS Lambda function:
- Verifies the specified contract type exists in the ContractTypes DynamoDB table
- Validates required input parameters (contract type ID, document path, output language)
- Returns validation errors if any checks fail, preventing unnecessary processing
This validation step ensures data integrity and provides early failure detection before expensive LLM operations begin.
The second step of the workflow executes an AWS Lambda function to split the document content into clauses.
To handle the layout variability of contracts, this step leverages LLM text understanding capabilities (configurable via ContractPreprocessing/LanguageModelId parameter) to intelligently split the contract into clauses.
Each clause is stored as a separate record in the Clauses Table on Amazon DynamoDB.
This preprocessing step does not discard any text from the input contract. Removing text would require either a classification task performed by a Language Model or non-Machine-Learning heuristics to decide whether a paragraph (or group of paragraphs) is a clause.
The third step emits a PreProcessedContract event to Amazon EventBridge after successful clause extraction. This event contains:
- Job ID
- Contract Type ID
- Output Language
- List of clause numbers
- Additional checks configuration (including legislation check parameters)
This event-driven architecture enables optional features like legislation compliance checking to be triggered automatically without modifying the core workflow. If legislation checking is enabled (via AdditionalChecks.legislationCheck.legislationId parameter), EventBridge triggers the CheckLegislation workflow to verify clauses against legislation requirements.
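Constructing such an event entry can be sketched as follows. The detail-type matches the `PreProcessedContract` event named above, while the source name, bus name, and detail field names are illustrative assumptions:

```python
import json

def build_preprocessed_event(job_id, contract_type_id, output_language,
                             clause_numbers, additional_checks):
    """Build one EventBridge PutEvents entry for a preprocessed contract.

    The resulting dict is the shape accepted by boto3's
    events_client.put_events(Entries=[entry]).
    """
    return {
        "Source": "contract-analysis",        # illustrative source name
        "DetailType": "PreProcessedContract",
        "EventBusName": "default",
        "Detail": json.dumps({
            "jobId": job_id,
            "contractTypeId": contract_type_id,
            "outputLanguage": output_language,
            "clauseNumbers": clause_numbers,
            "additionalChecks": additional_checks,
        }),
    }
```

An EventBridge rule matching `DetailType == "PreProcessedContract"` with a `legislationCheck.legislationId` field present in the detail would then start the CheckLegislation workflow.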
The fourth step determines the type(s) of each contract clause. This is done in a loop, processing one clause at a time in order to reduce throttling from Amazon Bedrock; the concurrency level can be adjusted in the corresponding AWS Step Functions Map configuration (currently set to max_concurrency=1).
The core idea of this step is to use a single prompt that evaluates all possible clause types at once and determines the one(s) applicable to the given clause. The examples defined in the guidelines are included in the prompt to help the model perform the classification task. The LLM used is configurable via the `ContractClassification/LanguageModelId` parameter.
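Assembling one prompt that covers all candidate clause types, with guideline examples as few-shot hints, might look like the sketch below. The exact wording of the prototype's prompt is not documented here, so the instruction text is an illustrative assumption:

```python
def build_classification_prompt(clause_text: str, guidelines: list[dict]) -> str:
    """Assemble a single prompt asking the model to pick the applicable
    clause type(s) from the full catalog, using guideline examples as hints."""
    sections = []
    for g in guidelines:
        examples = "\n".join(f"  - {e}" for e in g.get("examples", []))
        sections.append(f"Type '{g['clause_type_id']}' ({g['name']}):\n{examples}")
    catalog = "\n\n".join(sections)
    return (
        "Classify the contract clause below into zero or more of the "
        "following clause types. Answer with a comma-separated list of "
        "clause type identifiers, or 'none'.\n\n"
        f"{catalog}\n\nClause:\n{clause_text}"
    )
```

Because every clause type appears in the same prompt, the model can return multiple applicable types in one inference call.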
Other strategies for Clause Classification: consult the corresponding section on Machine Learning improvements
This fifth step of the workflow executes an AWS Lambda function to evaluate a clause against a set of evaluation questions associated with its respective clause type. It runs a loop, orchestrated by Step Functions (max_concurrency=10), doing the following for each clause:
1. The AWS Lambda function retrieves from Amazon DynamoDB the clause text, its context (i.e., the preceding clauses), and the results from the classification step. For each classified type, the function retrieves the corresponding type record from the Amazon DynamoDB Guidelines table. That record contains a set of binary evaluation questions about the clause text, which are used in the evaluation.
2. The AWS Lambda function accesses Amazon Bedrock to execute a prompt on the configured LLM (via the `ContractEvaluation/LanguageModelId` parameter). The model is prompted to analyze the clause text and the clause context to answer each of the evaluation questions with 'Yes' or 'No'. If all questions are answered positively, the clause is considered compliant with the rule; otherwise, it is considered non-compliant.
3. After the clause is evaluated against the questions for each of its classified types, the clause record is updated in the Amazon DynamoDB Clauses table with the compliance status ('True' or 'False') and the evaluation analysis for each type. The evaluation analysis contains the text of each evaluation question and its respective answer.
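The compliance decision described above — compliant only when every evaluation question is answered 'Yes' — reduces to a few lines of logic. The record shape below is an illustrative sketch, not the table's exact attribute names:

```python
def is_compliant(answers: dict[str, str]) -> bool:
    """A clause is compliant with a clause type only when every
    evaluation question was answered 'Yes' by the model."""
    return bool(answers) and all(a.strip().lower() == "yes" for a in answers.values())

def evaluation_record(clause_type_id: str, answers: dict[str, str]) -> dict:
    """Illustrative shape of the per-type evaluation result stored on the clause."""
    return {
        "clause_type_id": clause_type_id,
        "compliant": is_compliant(answers),
        "analysis": [{"question": q, "answer": a} for q, a in answers.items()],
    }
```

A single 'No' answer flips the whole clause to non-compliant for that type, which keeps the evaluation criteria easy for legal experts to audit question by question.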
This sixth and final step assesses the overall risk classification of a contract based on the quantity and impact level of clause types that are non-compliant or not present in the contract. The process involves the following steps:
1. Group the clause types by impact level (low, medium, and high)
2. For each impact level group, count the total number of clause types that are compliant, non-compliant, and missing
3. Assign a risk level (low, medium, or high) to each clause type based on its impact level and compliance status
4. Count the total number of clause types at each risk level
5. Determine the overall contract risk classification based on the total number of clause types at each risk level
The compliance status for each clause type is determined as follows:
- If all clauses classified under a certain clause type are compliant, the clause type is considered compliant
- If one or more clauses classified under a certain clause type are non-compliant, the clause type as a whole is considered non-compliant
- If no clause was classified under a certain clause type, the clause type is considered missing
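These per-clause-type rules can be sketched as a small aggregation over the compliance flags of the matching clauses:

```python
def clause_type_status(clause_results: list[bool]) -> str:
    """Aggregate per-clause compliance flags for one clause type.

    clause_results holds the compliance flag of every clause classified
    under the clause type; an empty list means no clause matched.
    """
    if not clause_results:
        return "missing"
    return "compliant" if all(clause_results) else "non-compliant"
```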
The risk level for each clause type is determined according to the risk matrix displayed below.
| | Low Impact | Medium Impact | High Impact |
|---|---|---|---|
| Compliant | No Risk | No Risk | No Risk |
| Non-compliant | Low Risk | Medium Risk | High Risk |
| Missing | Medium Risk | High Risk | High Risk |
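The matrix translates directly into a lookup table:

```python
RISK_MATRIX = {
    # (compliance status, impact level) -> risk level
    ("compliant", "low"): "none",
    ("compliant", "medium"): "none",
    ("compliant", "high"): "none",
    ("non-compliant", "low"): "low",
    ("non-compliant", "medium"): "medium",
    ("non-compliant", "high"): "high",
    ("missing", "low"): "medium",
    ("missing", "medium"): "high",
    ("missing", "high"): "high",
}

def clause_type_risk(status: str, impact: str) -> str:
    """Map a clause type's compliance status and impact level to its risk level."""
    return RISK_MATRIX[(status, impact)]
```

Note that a missing clause type is treated as riskier than a non-compliant one at the same impact level, since there is no clause at all to rely on.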
With the risk level determined for each clause type, the overall contract risk is defined based on the total number of clause types at each risk level. Each contract type has configurable thresholds that define the maximum acceptable number of clause types at each risk level before a contract is classified as high-risk. When creating a contract type, these thresholds default to 0 (high risk), 1 (medium risk), and 3 (low risk), but can be customized. The thresholds are evaluated in the following order:
- If the number of clause types with high risk exceeds the `high_risk_threshold`, the contract is classified as high-risk
- If the number of clause types with medium risk exceeds the `medium_risk_threshold`, the contract is classified as high-risk
- If the number of clause types with low risk exceeds the `low_risk_threshold`, the contract is classified as high-risk
- Otherwise, the contract is classified as low-risk
These thresholds are stored in the ContractTypes DynamoDB table and can be customized per contract type through the web application or API to match the organization’s specific risk tolerance and compliance requirements.
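The threshold ordering can be sketched as follows, using the default thresholds of 0, 1, and 3 for high, medium, and low risk:

```python
def classify_contract(risk_counts: dict[str, int],
                      high_risk_threshold: int = 0,
                      medium_risk_threshold: int = 1,
                      low_risk_threshold: int = 3) -> str:
    """Classify a contract from the number of clause types at each risk level.

    Thresholds are checked from highest to lowest severity; exceeding any
    one of them marks the whole contract as high-risk.
    """
    if risk_counts.get("high", 0) > high_risk_threshold:
        return "high-risk"
    if risk_counts.get("medium", 0) > medium_risk_threshold:
        return "high-risk"
    if risk_counts.get("low", 0) > low_risk_threshold:
        return "high-risk"
    return "low-risk"
```

With the defaults, a single high-risk clause type is enough to flag the whole contract, while up to one medium-risk and three low-risk clause types are tolerated.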
The prototype includes an optional legislation compliance verification feature that can be deployed separately and integrates with the main contract analysis workflow through EventBridge events.
The legislation checking feature uses several advanced AWS services:
- Amazon Bedrock AgentCore: Serverless runtime for deploying containerized AI agents
- Amazon Bedrock Knowledge Base: Vector database for legislation documents using Titan Embeddings
- Amazon OpenSearch Serverless: Vector store for semantic search across legislation
- EventBridge: Event-driven integration with the contract analysis workflow
- Dedicated S3 Bucket: Storage for legislation documents (no lifecycle expiration)
When a contract analysis job is started with legislation checking enabled:
1. The contract analysis workflow completes preprocessing and emits a `PreProcessedContract` event to EventBridge
2. EventBridge triggers the CheckLegislation workflow if `legislationCheck.legislationId` is present
3. The workflow processes each clause sequentially:
   - Invokes the Bedrock AgentCore agent for each clause
   - The agent retrieves relevant legislation sections from the Knowledge Base
   - The LLM analyzes the clause against legislation requirements
   - Results (compliant: true/false plus detailed analysis) are stored in the Clauses DynamoDB table
4. After all clauses are processed, the workflow calculates overall contract compliance:
   - Queries all clause results from DynamoDB
   - If ANY clause is non-compliant with legislation, the entire contract is marked as non-compliant
   - If ALL clauses are compliant, the contract is marked as compliant
5. The overall compliance status is stored in the Jobs table as `legislation_compliant` (boolean)
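The all-or-nothing aggregation described above reduces to a single predicate over the per-clause results:

```python
def overall_legislation_compliance(clause_results: list[bool]) -> bool:
    """The contract is legislation-compliant only when every analyzed
    clause was found compliant; one failing clause fails the contract."""
    return all(clause_results)
```

This boolean is what gets written to the Jobs table as `legislation_compliant`.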
The stack provides REST API endpoints for managing legislation documents:
- POST /legislations: Upload legislation documents to S3 and trigger Knowledge Base ingestion
- GET /legislations: List available legislation documents
Legislation documents are automatically chunked, embedded using Titan Embeddings, and indexed in OpenSearch Serverless for semantic retrieval.
The agent is a containerized Python application (ARM64) that:
- Receives clause analysis requests from Step Functions
- Queries the Knowledge Base for relevant legislation sections
- Uses the LLM to perform compliance analysis
- Returns structured results with compliance status and reasoning
The agent has access to:
- Knowledge Base for legislation retrieval
- Clauses DynamoDB table for reading/writing results
- Bedrock for LLM inference
- SSM Parameter Store for configuration
The prototype provides an AI-powered workflow to extract contract type definitions and guidelines from reference contract text. This reduces manual effort in guideline creation and accelerates the setup of new contract types.
The ContractImportWorkflow is a Step Functions state machine with four steps:
1. Initialize Import: Creates an import job record in DynamoDB and stores the reference text in S3
2. Extract Contract Type Info: Uses the LLM to analyze the reference contract and extract:
   - Contract type name
   - Contract type description
   - Overall contract structure and purpose
3. Extract Clause Types: Uses the LLM to identify and extract all clause types from the reference contract, including:
   - Clause type name
   - Standard wording for each clause type
   - Impact level (low, medium, high)
   - Example clauses
   - AI-generated evaluation questions
4. Finalize Import: Persists the extracted contract type and all guidelines to DynamoDB tables
Through the web application:
1. Navigate to Contract Types management
2. Click "Import Contract Type"
3. Paste the reference contract text (this can be an existing contract that follows the desired structure)
4. Submit, and the AI workflow extracts:
   - Contract type definition
   - All clause types with standard wording
   - Evaluation questions for each clause type
   - Example clauses
5. Review and edit the extracted guidelines as needed
This feature significantly reduces the time to set up new contract types from hours to minutes.
The workflow uses:
- Foundation Models: Configurable LLMs from Amazon Bedrock
- Structured Prompts: Carefully crafted prompts ensure a consistent extraction format
- Step Functions: Orchestrates the multi-step extraction process
- DynamoDB: Stores import job status and final results
- S3: Temporary storage for reference contract text
The prototype uses a centralized configuration system that allows runtime settings to be changed without code deployment. Configuration is managed through a YAML file that is synchronized to AWS Systems Manager Parameter Store during deployment.
The configuration file is created from `app_properties.template.yaml` during initial setup. To initialize your configuration:

```shell
python scripts/init_app_properties.py
```

This creates `app_properties.yaml` from the template, which you can then customize for your environment. The template provides sensible defaults and examples of all available configuration options.
Key configuration options:
- Global Model Configuration: Set a default model for all tasks

  ```yaml
  LanguageModelId: "amazon.nova-pro-v1:0"
  ```

- Task-Specific Model Overrides: Use different models for specific workflows

  ```yaml
  ContractPreprocessing/LanguageModelId: "amazon.nova-lite-v1:0"
  ContractClassification/LanguageModelId: "amazon.nova-pro-v1:0"
  ContractEvaluation/LanguageModelId: "amazon.nova-pro-v1:0"
  ContractTypeExtraction/LanguageModelId: "amazon.nova-lite-v1:0"
  ExtractClauseTypes/LanguageModelId: "amazon.nova-pro-v1:0"
  LegislationCheck/LanguageModelId: "us.anthropic.claude-haiku-4-5-20251001-v1:0"
  ```

- Organization Settings: Customize for your organization

  ```yaml
  CompanyName: "AnyCompany"
  ```
The AppPropertiesManager class provides hierarchical parameter lookup with the following priority:
- Task-specific parameter: /ContractAnalysis/{TaskName}/{ParameterName}
- Global parameter: /ContractAnalysis/{ParameterName}
- Default value (if provided in code)
This hierarchical approach allows fine-grained control while maintaining sensible defaults. Parameters are cached for 30 seconds to optimize performance.
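The lookup order and cache behavior described above can be sketched as follows. This is an illustrative model, not the prototype's actual class: the parameter store is replaced by an in-memory dict instead of the real AWS Systems Manager client.

```python
import time

# Hypothetical sketch of the hierarchical lookup performed by
# AppPropertiesManager. In the prototype, values come from AWS Systems
# Manager Parameter Store; here a plain dict stands in for it.
class AppPropertiesManager:
    CACHE_TTL_SECONDS = 30  # parameters are cached for 30 seconds

    def __init__(self, store):
        self._store = store   # {parameter_path: value}
        self._cache = {}      # {parameter_path: (value, fetched_at)}

    def _fetch(self, path):
        now = time.monotonic()
        cached = self._cache.get(path)
        if cached and now - cached[1] < self.CACHE_TTL_SECONDS:
            return cached[0]
        value = self._store.get(path)
        self._cache[path] = (value, now)
        return value

    def get(self, parameter_name, task_name=None, default=None):
        # 1. Task-specific parameter takes priority
        if task_name:
            value = self._fetch(f"/ContractAnalysis/{task_name}/{parameter_name}")
            if value is not None:
                return value
        # 2. Fall back to the global parameter
        value = self._fetch(f"/ContractAnalysis/{parameter_name}")
        if value is not None:
            return value
        # 3. Finally, the in-code default
        return default

store = {
    "/ContractAnalysis/LanguageModelId": "amazon.nova-pro-v1:0",
    "/ContractAnalysis/ContractPreprocessing/LanguageModelId": "amazon.nova-lite-v1:0",
}
mgr = AppPropertiesManager(store)
print(mgr.get("LanguageModelId", task_name="ContractPreprocessing"))  # amazon.nova-lite-v1:0
print(mgr.get("LanguageModelId", task_name="ContractEvaluation"))     # amazon.nova-pro-v1:0
```

The task-specific override wins when present; tasks without an override inherit the global model ID.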
To update configuration:
- Edit app_properties.yaml in the backend directory
- Run the sync script: python scripts/apply_app_properties.py --yaml-file app_properties.yaml
- Changes take effect immediately (within the cache TTL) without redeploying Lambda functions
The script supports preview mode (--preview) to see what would change without applying updates.
A web application was created using React with TypeScript and Vite. The application provides a comprehensive interface for managing contract analysis workflows and contract type definitions.
- Contract Analysis Management: Browse, upload, and analyze contracts with status tracking
- Contract Type Management: Create, edit, and delete contract types with full CRUD operations
- Guidelines Management: Define and edit guidelines for each contract type with AI-assisted generation
- AI-Powered Import: Extract contract types and guidelines from reference contract text using an LLM
- Results Visualization: View detailed analysis results, risk assessments, and clause-by-clause compliance
- Multi-language Support: Interface available in English, Portuguese (Brazil), and Spanish
The application uses AWS Amplify UI with Amazon Cognito for authentication. User signup is disabled by default (hideSignUp={true}), requiring administrators to create user accounts through the Cognito User Pool. The authentication flow is managed through the Authenticator.Provider context, providing seamless integration with AWS services.
- Framework: React 19.1.1 with TypeScript 5.9.2
- Build Tool: Vite 7.1.2 for fast development and optimized production builds
- Routing: React Router 7.7.1 for client-side navigation
- UI Components: shadcn/ui components built on Radix UI primitives (@radix-ui/react-*)
- Styling: TailwindCSS 4.1.11 with custom theme and animations
- State Management: React hooks and context API
- API Integration:
  - AWS Amplify 6.15.4 for authentication and S3 operations
  - Axios 1.13.1 for REST API calls
  - AWS SDK for S3 (@aws-sdk/client-s3 3.919.0, @aws-sdk/lib-storage 3.919.0)
- Data Tables: TanStack React Table 8.21.3 for advanced table features
- Notifications: Sonner 2.0.7 for toast notifications
- Internationalization: i18next 25.3.4 with browser language detection
- Testing: Vitest 1.6.0 with React Testing Library
- / - Home page with job list, contract upload, and filtering by contract type
- /jobs/:jobId - Detailed view of contract analysis results with clause-by-clause breakdown
- /contract-types - Contract type management interface with create, edit, delete operations
- /contract-types/:contractTypeId/guidelines - Guidelines editor for a specific contract type with AI generation features
Home Page - Contract Analysis Job List
Once you open the web application in your browser, the home page displays all contract analysis jobs with filtering capabilities by contract type, status, and review requirements. Click the New Analysis button near the top right corner and select a contract file. Supported formats: PDF (.pdf), Word (.doc, .docx), or plain text (.txt).
For ready-to-use examples, refer to the backend/samples folder.
Once the file is selected, the contract analysis processing task starts and a new entry is added to the page.
Processing takes a couple of minutes. Clicking the refresh button displays the current status of all processing tasks.
Once processing is complete, click on the job to view detailed analysis results:
Contract Analysis Results - Overview
The contract analysis results page shows a comprehensive overview including compliance assessment, risk distribution, and classification statistics. Each clause is displayed with its compliance status and impact level.
Detailed clause analysis shows the original clause text, compliance evaluation against guidelines, and AI-generated recommendations. Each clause is color-coded based on its compliance status and risk level.
The risk assessment view provides visual analytics of contract risks, showing distribution across impact levels and highlighting areas requiring attention.
Contract Type Management Interface
The application provides a complete interface for managing contract types:
Creating a Contract Type:
- Navigate to "Contract Types" from the main menu
- Click "Create Contract Type"
- Enter contract type name and description
- Save - the contract type is created and ready for guidelines
The contract type creation form captures essential information including name, description, party types, risk thresholds, and language preferences.
Importing from Reference Contract (AI-Powered):
Import Contract Type from Reference
- Click "Import Contract Type"
- Paste a reference contract text (e.g., an existing contract that follows the desired structure)
- Submit - the AI extracts the contract type definition and its guidelines
- Review the extracted data (progress is shown during AI analysis)
- Edit any fields as needed
- Save - the contract type and all guidelines are created
Editing/Deleting Contract Types:
- Click the edit icon to modify name or description
- Click delete to remove the contract type (this also removes all associated guidelines)
Guidelines Management Interface
For each contract type, you can manage its guidelines. The guidelines management page displays all guidelines with search and filtering capabilities, showing guideline counts by impact level.
Creating a Guideline Manually:
- Navigate to a contract type's guidelines page
- Click "Add Guideline"
- Fill in:
  - Clause type name
  - Standard wording (the canonical text for this clause type)
  - Impact level (low, medium, high)
  - Examples (multiple example clauses)
- Click "Generate Questions" to use AI to create evaluation questions from the standard wording
- Review and edit the generated questions
- Save the guideline
The guideline editor allows users to modify clause types, standard wording, impact levels, examples, and evaluation questions. The AI-powered "Generate Questions" feature creates evaluation questions from the standard wording.
Editing Guidelines:
- Click on any guideline to edit it
- Modify any field (name, standard wording, impact, examples, questions)
- Regenerate questions if the standard wording changed
- Save changes
Filtering and Search:
- Use the search box to filter guidelines by name
- Filter by impact level (low, medium, high)
- View guideline count and statistics
Evaluation questions are binary (yes/no) questions used to assess clause compliance. The AI generates these questions based on the standard wording to check if a contract clause contains all required elements.
Example: If standard wording mentions "30 days notice period", a generated question might be "Does the clause specify a notice period of at least 30 days?"
Users can:
- Generate questions using AI
- Manually add, edit, or remove questions
- Ensure questions are specific and verifiable
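As an illustration of how binary answers could roll up into a per-clause verdict (the prototype's actual aggregation logic is not specified here, so this is an assumed rule), a clause might be treated as compliant only when every evaluation question is answered yes:

```python
# Hypothetical aggregation of yes/no evaluation answers into a
# compliance verdict. In the prototype the answers come from the LLM;
# here they are hard-coded for illustration.
def is_compliant(answers):
    # Assumed rule: compliant only if every question is satisfied
    return all(answers.values())

answers = {
    "Does the clause specify a notice period of at least 30 days?": True,
    "Does the clause require written notice?": False,
}
print(is_compliant(answers))  # False -- one requirement is missing
```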
The prototype provides comprehensive multilanguage support for both the user interface and AI-generated analysis outputs, enabling global teams to work in their preferred language.
The system supports three languages:
- English (en): Default language for international operations
- Spanish (es): Support for Spanish-speaking markets
- Portuguese - Brazil (pt_BR): Localized support for Brazilian operations
When creating a new contract analysis, users select their preferred report language (English, Spanish, or Portuguese) from a dropdown menu. This selection determines the language of all AI-generated content in the analysis results.
The interface language and report language operate independently. For example, a user can navigate the application in English while generating analysis reports in Portuguese for local compliance teams. This flexibility supports international teams working across different regions and regulatory requirements.
All AI-generated content respects the selected report language, including clause classifications, compliance evaluations, risk assessments, and optional legislation compliance analysis.
The web application implements full internationalization with the following features:
- Automatic Language Detection: The system automatically detects the user's browser language on first access
- Persistent Language Selection: User language preference is stored in browser localStorage and persists across sessions
- Dynamic Language Switching: Users can change the interface language at any time through the navigation menu without a page reload
- Complete UI Translation: All interface elements are translated, including navigation menus, form labels, buttons, status messages, error messages, and tooltips
Translation files are structured hierarchically by feature area (navigation, home, job details, contract types, guidelines) for maintainability.
The backend Lambda functions process the selected output language throughout the contract analysis workflow:
- Language Parameter Flow: The output_language parameter (ISO language code: en, es, or pt_BR) is captured during job creation and flows through the entire Step Functions workflow
- AI-Generated Content Localization: The LLM prompts include language-specific instructions to generate analysis results in the requested language
- Classification Results: Clause type classifications include reasons written in the specified language
- Evaluation Results: Compliance evaluation explanations are generated in the requested language
- Legislation Compliance Analysis: When the optional legislation checking feature is enabled, the AI agent analyzes contract clauses against legislation documents and generates detailed compliance analysis in the specified language
- Consistent Language Output: All AI-generated content maintains the same language throughout the analysis
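The "language-specific instructions" step can be sketched as a small prompt-building helper. This is an illustrative sketch only; the function name and wording are not the prototype's actual code:

```python
# Hypothetical helper mapping the output_language job parameter to a
# language instruction appended to the LLM prompts.
LANGUAGE_NAMES = {"en": "English", "es": "Spanish", "pt_BR": "Brazilian Portuguese"}

def build_language_instruction(output_language: str) -> str:
    # Assumed fallback: unknown codes default to English
    language = LANGUAGE_NAMES.get(output_language, "English")
    return (
        "Write all explanations, classifications, and recommendations "
        f"in {language}."
    )

print(build_language_instruction("pt_BR"))
```

The same instruction would be attached to the classification, evaluation, and legislation-check prompts so all generated content stays in one language.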
Each record denotes one clause type for a specific contract type.
| Column Name | Type | Comments |
|---|---|---|
| contract_type_id | String | Partition key. The contract type this guideline belongs to |
| clause_type_id | String | Sort key. Unique identifier for the clause type |
| name | String | Clause type name |
| level | String | Impact level: low, medium, or high |
| examples | List | Example clauses for this type. Used in the Classification step |
| evaluation_questions | List | Binary (yes/no) questions used in the Clause Compliance Evaluation step |
| standard_wording | String | The standard wording that defines what this clause type should contain |
| created_at | String | Timestamp when the guideline was created |
| updated_at | String | Timestamp when the guideline was last updated |
Each record represents one clause from a contract, as identified during the preprocessing step. The record is populated by subsequent Classification and Evaluation steps.
| Column Name | Type | Comments |
|---|---|---|
| job_id | String | Partition key. The Step Functions workflow execution ID |
| clause_number | Number | Sort key. Sequential number of the clause in the contract |
| text | String | The clause content |
| types | List | List of classified types for this clause (see Type object schema below) |
| additional_checks | Map | Optional additional compliance checks (e.g., legislation_check) |
The types column contains a list of objects, where each object represents a classified clause type:
| Name | Type | Comments |
|---|---|---|
| type_id | String | The clause type ID |
| type_name | String | The clause type name |
| level | String | The clause type impact level (low, medium, high) |
| classification_analysis | String | The reasoning why the clause matches this type, generated by the LLM |
| analysis | String | The compliance evaluation explanation, generated by the LLM |
| compliant | Boolean | Whether the clause is compliant with the guideline for this type |
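Putting the two schemas together, a hypothetical item in this table (all values invented for illustration) would look like:

```python
# Illustrative clauses-table item combining the column schema and the
# nested Type object schema above. Values are made up.
clause_item = {
    "job_id": "arn:aws:states:us-east-1:123456789012:execution:contract-analysis:abc123",
    "clause_number": 4,
    "text": "Either party may terminate this Agreement with 30 days written notice.",
    "types": [
        {
            "type_id": "termination",
            "type_name": "Termination",
            "level": "high",
            "classification_analysis": "The clause sets out conditions for ending the agreement.",
            "analysis": "The clause meets the 30-day notice requirement in the guideline.",
            "compliant": True,
        }
    ],
}
print(clause_item["types"][0]["compliant"])  # True
```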
Each record represents one contract analysis workflow execution.
| Column Name | Type | Comments |
|---|---|---|
| id | String | Partition key. Step Functions workflow execution ID |
| contract_type_id | String | The contract type being analyzed |
| document_s3_key | String | S3 key of the uploaded contract document |
| description | String | User-provided description of the contract analysis |
| legislation_check_execution_arn | String | Step Functions execution ARN for the legislation check workflow (when enabled) |
| guidelines_compliant | Boolean | Overall compliance status for guidelines checking |
| legislation_compliant | Boolean | Overall compliance status for legislation checking (when enabled) |
| total_compliance_by_impact | Map | Compliance metrics grouped by impact level (low, medium, high) |
| total_clause_types_by_risk | Map | Risk metrics grouped by risk level (low, medium, high, none) |
| unknown_total | Number | Number of guideline types that do not match any clause in the contract |
Each record represents a contract type definition.
| Column Name | Type | Comments |
|---|---|---|
| contract_type_id | String | Partition key. Unique identifier for the contract type |
| name | String | Display name of the contract type |
| description | String | Description of the contract type |
| company_party_type | String | Role of the company in the contract (e.g., "Contractor", "Client") |
| other_party_type | String | Role of the other party in the contract (e.g., "Supplier", "Vendor") |
| high_risk_threshold | Number | Maximum number of high-risk issues allowed (non-compliant high-impact clauses, or missing medium/high-impact clauses) |
| medium_risk_threshold | Number | Maximum number of medium-risk issues allowed (non-compliant medium-impact clauses, or missing low-impact clauses) |
| low_risk_threshold | Number | Maximum number of low-risk issues allowed (non-compliant low-impact clauses) |
| is_active | Boolean | Whether this contract type is active |
| default_language | String | Default language for this contract type |
| created_at | String | Timestamp of creation |
| updated_at | String | Timestamp of last update |
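The three thresholds can be read as a simple pass/fail rule: a contract stays acceptable while each issue count is at or below its threshold. This sketch is an interpretation of the column descriptions above, not the prototype's actual code:

```python
# Hypothetical threshold check derived from the contract type schema:
# each risk level's issue count must not exceed its configured maximum.
def passes_thresholds(issue_counts, contract_type):
    return (
        issue_counts["high"] <= contract_type["high_risk_threshold"]
        and issue_counts["medium"] <= contract_type["medium_risk_threshold"]
        and issue_counts["low"] <= contract_type["low_risk_threshold"]
    )

contract_type = {
    "high_risk_threshold": 0,   # no high-risk issues tolerated
    "medium_risk_threshold": 2,
    "low_risk_threshold": 5,
}
print(passes_thresholds({"high": 0, "medium": 1, "low": 3}, contract_type))  # True
print(passes_thresholds({"high": 1, "medium": 0, "low": 0}, contract_type))  # False
```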
Each record tracks the status of a contract type import workflow execution.
| Column Name | Type | Comments |
|---|---|---|
| import_job_id | String | Partition key. Unique identifier for the import job |
| execution_id | String | Step Functions workflow execution ID |
| document_s3_key | String | S3 key of the reference contract document used for extraction |
| contract_type_id | String | The contract type ID (set after the contract type is created) |
| status | String | Import workflow status (RUNNING, SUCCEEDED, FAILED, TIMED_OUT, ABORTED) |
| current_step | String | Current step in the state machine |
| progress | Number | Progress percentage (0-100) |
| error_message | String | Error message if the import failed |
| created_at | String | Timestamp of import job creation |
| updated_at | String | Timestamp of last update |
The frontend application is backed by REST APIs implemented using Amazon API Gateway. All API logic runs inside AWS Lambda functions written in Python. The following sections document all available endpoints using type signature notation.
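Calls to these endpoints follow a common shape: an authenticated HTTPS request with a JSON body. The sketch below builds such a request with the standard library; the base URL and token are placeholders, and the `Authorization: Bearer <jwt>` scheme is an assumption about how the Cognito token is passed, not confirmed by this document:

```python
import json
import urllib.request

API_BASE_URL = "https://example.execute-api.us-east-1.amazonaws.com/prod"  # placeholder

def build_request(method, path, token, body=None):
    # Serialize the JSON body when present (POST/PUT calls)
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(
        API_BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",  # assumed Cognito JWT placement
            "Content-Type": "application/json",
        },
    )

# Example: start a contract analysis (schema from POST /jobs above)
req = build_request("POST", "/jobs", "<jwt>", {
    "documentS3Key": "uploads/contract-001.pdf",
    "contractTypeId": "service-agreement",
    "outputLanguage": "en",
})
# urllib.request.urlopen(req) would send the call against a live deployment
print(req.get_method(), req.full_url)
```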
Lists all contract analysis jobs with status and metrics.
GET /jobs?contractType={id}
Response: Array<{
id: string
description: string
startDate: string
endDate?: string
checks: { guidelines: {...}, legislation?: {...} }
}>

Retrieves detailed analysis results for a specific job.
GET /jobs/{job_id}
Response: {
id: string
description: string
startDate: string
endDate?: string
checks: { guidelines: {...}, legislation?: {...} }
clauses: Array<{
clauseNumber: number
text: string
types: Array<{ typeId, typeName, level, analysis, compliant }>
}>
}

Starts a new contract analysis workflow.
POST /jobs
documentS3Key: string
contractTypeId: string
outputLanguage?: "en" | "es" | "pt_BR"
jobDescription?: string
additionalChecks?: {
legislationCheck?: { legislationId: string }
}
Response: {
id: string
documentS3Key: string
contractTypeId: string
startDate: string
}

Lists all contract type definitions.
GET /contract-types
Response: Array<{
contractTypeId: string
name: string
description: string
companyPartyType: string
otherPartyType: string
highRiskThreshold: number
mediumRiskThreshold: number
lowRiskThreshold: number
isActive: boolean
defaultLanguage: string
createdAt: string
updatedAt: string
isImported?: boolean
importSourceDocument?: string
}>

Retrieves details of a specific contract type.
GET /contract-types/{contract_type_id}
Response: {
contractTypeId: string
name: string
description: string
companyPartyType: string
otherPartyType: string
highRiskThreshold: number
mediumRiskThreshold: number
lowRiskThreshold: number
isActive: boolean
createdAt: string
updatedAt: string
}

Creates a new contract type.
POST /contract-types
name: string
description: string
companyPartyType: string
otherPartyType: string
highRiskThreshold?: number
mediumRiskThreshold?: number
lowRiskThreshold?: number
Response: {
contractTypeId: string
name: string
description: string
...
}

Updates an existing contract type.
PUT /contract-types/{contract_type_id}
name?: string
description?: string
companyPartyType?: string
otherPartyType?: string
highRiskThreshold?: number
mediumRiskThreshold?: number
lowRiskThreshold?: number
Response: {
contractTypeId: string
name: string
description: string
...
}

Deletes a contract type and all its associated guidelines.
DELETE /contract-types/{contract_type_id}
Response: 204 No Content

Starts an AI-powered workflow to extract a contract type and guidelines from a reference document.
POST /import/contract-types
documentS3Key: string
description?: string
Response: {
importJobId: string
status: "RUNNING" | "SUCCEEDED" | "FAILED" | "TIMED_OUT" | "ABORTED"
progress: number
contractTypeId: string
createdAt: string
updatedAt: string
}

Retrieves status and progress of an import job.
GET /import/contract-types/{import_job_id}
Response: {
importJobId: string
status: string
progress: number
contractTypeId?: string
currentStep?: string
errorMessage?: string
createdAt: string
updatedAt: string
}

Lists all guidelines across all contract types. Supports filtering and pagination.
GET /guidelines?contractTypeId={id}&limit={n}&lastEvaluatedKey={key}
Response: {
guidelines: Array<{
contractTypeId: string
clauseTypeId: string
name: string
standardWording: string
level: "low" | "medium" | "high"
examples: Array<string>
evaluationQuestions: Array<string>
createdAt: string
updatedAt: string
}>
lastEvaluatedKey?: string
}

Retrieves details of a specific guideline.
GET /guidelines/{contract_type_id}/{clause_type_id}
Response: {
contractTypeId: string
clauseTypeId: string
name: string
standardWording: string
level: "low" | "medium" | "high"
examples: Array<string>
evaluationQuestions: Array<string>
createdAt: string
updatedAt: string
}

Creates a new guideline for a contract type.
POST /guidelines
contractTypeId: string
name: string
standardWording: string
level: "low" | "medium" | "high"
examples: Array<string>
evaluationQuestions: Array<string>
Response: {
contractTypeId: string
clauseTypeId: string
name: string
...
}

Updates an existing guideline.
PUT /guidelines/{contract_type_id}/{clause_type_id}
name?: string
standardWording?: string
level?: "low" | "medium" | "high"
examples?: Array<string>
evaluationQuestions?: Array<string>
Response: {
contractTypeId: string
clauseTypeId: string
name: string
...
}

Deletes a guideline.
DELETE /guidelines/{contract_type_id}/{clause_type_id}
Response: 204 No Content

Generates evaluation questions for a guideline using AI, based on the standard wording.
POST /guidelines/{contract_type_id}/{clause_type_id}/generate-questions
Response: Array<string>

Generates example clauses for a guideline using AI, based on the standard wording.
POST /guidelines/{contract_type_id}/{clause_type_id}/generate-examples
Response: Array<string>

Note: Legislation compliance endpoints are only available when the CheckLegislationStack is deployed.
This section analyzes the primary cost drivers for this prototype and demonstrates significant cost savings by using Amazon Nova models compared to Claude models.
The prototype has two main cost components:
1. Amazon Bedrock

- LLM inference costs for contract analysis, classification, and evaluation
- Cost varies based on: contract size, guidelines complexity, selected model, and usage volume
- This is the dominant cost for active contract processing

2. Amazon OpenSearch Serverless (Optional)

- Required only if the legislation compliance feature is deployed
- Minimum cost: ~$350/month (2 OCUs minimum at ~$0.24/OCU-hour)
- This cost is continuous, even when not actively processing contracts
- Cost optimization alternatives:
  - Amazon S3 Vectors (Preview): Up to 90% cost reduction for vector storage with sub-second query performance - ideal for infrequent queries and long-term storage
  - Aurora PostgreSQL with pgvector: Lower cost for low-usage scenarios with Aurora Serverless v2 (0.5 ACU minimum, scales based on workload)
  - Note: Neither alternative currently supports the startsWith metadata filter (an OpenSearch Serverless-only feature). The prototype uses this filter for optional section-based filtering of legislation documents. Code changes would be required to use stringContains or other filter types instead.

3. Other AWS Services

- Step Functions, DynamoDB, Lambda, S3, and API Gateway use pay-per-use pricing
- These costs scale with usage but are typically much smaller than Bedrock and OpenSearch costs
Note: The following cost estimates are based on actual processing of the sample contract included with this prototype (English plain text) using the default guidelines. Token counts and costs were captured from real workflow executions. Your actual costs may vary depending on contract size, language, format complexity, guidelines detail, and usage patterns.
Some scenarios have Amazon Bedrock prompt caching enabled.
| Model | Input Tokens | Output Tokens | Cache Read Tokens | Cache Write Tokens | Total Cost |
|---|---|---|---|---|---|
| Claude 3.5 Haiku | 96,627 | 38,273 | 556,512 | 16,368 | $0.29 |
| Amazon Nova Lite | 88,024 | 35,208 | 567,691 | 15,343 | $0.02 |

Savings: 93%
| Model | Input Tokens | Output Tokens | Total Cost |
|---|---|---|---|
| Claude 3.5 Sonnet v2 | 706,407 | 58,021 | $5.98 |
| Amazon Nova Pro | 577,291 | 30,384 | $0.56 |

Savings: 91%
| Model | Input Tokens | Output Tokens | Cache Read Tokens | Cache Write Tokens | Total Cost |
|---|---|---|---|---|---|
| Claude 3.7 Sonnet | 94,944 | 80,574 | 572,880 | 16,368 | $1.67 |
| Amazon Nova Premier | 91,076 | 49,836 | 529,550 | 15,575 | $1.18 |

Savings: 29%
Actual costs depend on several factors:
- Guidelines complexity: More detailed guidelines require more tokens
- Contract size: Larger contracts consume more input tokens
- Selected LLM: Different models have different pricing structures
For the most up-to-date pricing information, refer to the Amazon Bedrock Pricing page.
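For a quick sanity check of any estimate, the token-based cost arithmetic is easy to reproduce. The per-1,000-token rates below are illustrative placeholders, not quoted prices; substitute current rates from the Amazon Bedrock pricing page:

```python
# Back-of-the-envelope Bedrock cost estimate from token counts.
def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k

# Token counts taken from the Amazon Nova Pro row in the table above;
# the 0.0008/0.0032 rates are placeholder per-1K-token prices.
cost = estimate_cost(577_291, 30_384, 0.0008, 0.0032)
print(round(cost, 2))  # 0.56
```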
Note: The cost analysis above is based on processing the sample contract included with this prototype using the default guidelines. Your actual costs may vary depending on your specific contract sizes, guidelines complexity, and usage patterns.
It is important to keep in mind that this project is a prototype: although the main functionalities and best-practice configurations were implemented, there are still pending features that we recommend implementing before moving this prototype to production.
The scope of this prototype comprises only the most critical requirements needed to validate the use case within the time available for development. Hence, the delivered implementation still requires a set of enhancements to be fully ready for production deployment. This section details some areas that should be improved to turn the prototype into a production solution. Several of the points are based on the AWS Well-Architected Framework, which provides best practices for operating reliable, secure, efficient, and cost-effective systems in the cloud.
The work involved a lot of experimentation within a restricted time window, so it was not possible to exercise every technical approach. The following subsections suggest topics to explore to further enhance the prototype.
Machine learning depends on data, and this prototype is no different. The guideline data is used heavily in the prompts sent to the LLM, so enhancing that data is key to optimizing accuracy.
- Curate evaluation questions. Use the web application's "Generate Questions" feature for each guideline, review the AI-generated questions, enhance them, and add additional questions as needed. Alternatively, use the API endpoint POST /guidelines/{contract_type_id}/{clause_type_id}/generate-questions to generate questions programmatically.
- Add more type examples. Use the web application's guideline editor to add more examples for each clause type, or use the "Generate Examples" feature to create them with AI. This is key to improving classification behavior.
- Review types. Semantic overlap between two types introduces ambiguity, in the sense that both types could be selected by the Classification step. Making the types more specific can remove such overlap. If, for example, there is a parent-child relationship between two types, where one is "included" in the other, one possible approach is to keep only the more general type.
Other techniques can be experimented on Clause Classification, to improve accuracy and/or reduce costs:
- Bring hierarchical context. Contracts can have varying layouts, and clauses can be structured in hierarchies, so that the context of a given clause complements that of a parent clause. The idea is to bring in a number of preceding clauses and add them to the prompt, so the LLM can consider the additional context and reason over the hierarchy if needed. Being a more complex operation, a more capable model (such as Amazon Nova Pro or Claude Sonnet) is recommended for those prompts.
- Use a different Large Language Model. This would require exploration and testing. Another option is exploring open-source models, with the extra consideration of the cost of hosting the model on a SageMaker endpoint and leveraging features such as Asynchronous Inference or Batch Inference to reduce costs.
- Train a text classification model. One idea is fine-tuning a text classification model trained on legal data, using many examples (tens or even hundreds) of clauses per clause type. Since clause classification requires understanding the hierarchical context of a clause (i.e., its parent clauses), such a switch of strategy would require either applying advanced text parsing algorithms or still relying on a Large Language Model to extract a contextualized (perhaps summarized) version of a clause together with its parent clauses.
- Experiment with other few-shot text classification ideas. Few-shot approaches can be useful when there are limited examples per clause type, as was the case during development. Here, a language model would extract embeddings from both the clause examples and the clause to be classified. There are different model options, such as Titan Embeddings (available on Amazon Bedrock), BERT (ideally pretrained on legal data), or RoBERTa. Like the previous item, this one also requires leveraging the understanding of the contract hierarchy. Two possible ways to leverage the embeddings:
  - Train a classifier over embeddings. Embeddings are essentially features. The idea is to train a simple classifier (logistic regression, SVM, or MLP) on top of these features using your few labeled examples.
  - Use embeddings for prototypical networks. Calculate a prototype vector for each class by averaging the embeddings of the few examples per type, then classify new examples by finding the closest prototype vector.
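The prototype-vector idea can be sketched in a few lines. The tiny hand-made vectors below stand in for real embeddings (which would come from a model such as Titan Embeddings), and cosine similarity stands in for whichever distance metric is chosen:

```python
import math

# Minimal sketch of prototype-based few-shot classification:
# one prototype vector per clause type, nearest prototype wins.
def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def classify(clause_embedding, examples_by_type):
    # Prototype = mean of the few example embeddings for each type
    prototypes = {t: mean_vector(vs) for t, vs in examples_by_type.items()}
    return max(prototypes, key=lambda t: cosine_similarity(clause_embedding, prototypes[t]))

examples = {
    "termination": [[0.9, 0.1], [0.8, 0.2]],        # toy 2-D "embeddings"
    "confidentiality": [[0.1, 0.9], [0.2, 0.8]],
}
print(classify([0.85, 0.15], examples))  # termination
```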
Amazon Bedrock supports fine-tuning Amazon Nova models using labeled proprietary data to enhance performance for specific use cases. Fine-tuning is particularly suitable for:
- Niche tasks in a particular domain (such as legal contract analysis)
- Aligning model outputs with specific requirements or workflows
- Improving results across a wide range of tasks
- Meeting tight latency requirements
For this prototype, fine-tuning could be applied to:
- Improve clause classification accuracy by training on curated clause-type pairs specific to the contract domain
- Enhance compliance evaluation by training on question-answer pairs from validated assessments
- Optimize preprocessing by training on contract structure patterns
Amazon Bedrock now offers on-demand deployment for customized models, providing flexibility through pay-as-you-go token-based pricing without requiring provisioned throughput commitments. This makes fine-tuning more accessible for production deployments with variable workloads.
Several approaches can reduce operational costs:
Prompt Caching
Amazon Bedrock offers prompt caching, which reduces costs by up to 90% and latency by up to 85% by caching frequently used prompts across multiple API calls. This is particularly useful for workloads with long and repeated contexts that are frequently reused for multiple queries.
The prototype already implements prompt caching for clause classification, where the system prompt (containing guideline examples and evaluation criteria) is cached across multiple clause analyses. This is automatically enabled for supported models (Amazon Nova and Claude models).
Additional opportunities for prompt caching optimization:
- Evaluation prompts that reuse the same evaluation questions for similar clause types
- Preprocessing prompts that use the same instructions for contract splitting
Model Selection
Choose the appropriate model for each task based on complexity:
- Use Amazon Nova Lite for simpler tasks (preprocessing, contract type extraction)
- Use Amazon Nova Pro for complex tasks requiring higher accuracy (classification, evaluation)
- Consider Amazon Nova Micro for very simple text processing tasks where speed is critical
Batch Processing
For non-real-time workloads, consider:
- Processing multiple contracts in batches during off-peak hours
- Using SageMaker Asynchronous Inference for custom models
- Implementing queue-based processing to optimize throughput
To improve response times:
Latency-Optimized Inference
Amazon Bedrock offers latency-optimized inference that delivers faster response times through purpose-built AI chips like AWS Trainium2 and advanced software optimizations, with some models achieving significant improvements in time to first token and output tokens per second.
Model Distillation
Amazon Bedrock’s model distillation feature allows distilling knowledge from larger, more capable models into smaller, faster models while maintaining accuracy. This can be particularly useful for:
- Creating faster versions of classification models
- Reducing inference costs while maintaining acceptable accuracy
- Deploying models with lower latency requirements
Parallel Processing
The prototype already implements parallel processing in Step Functions (max_concurrency=10 for evaluation). Consider:
- Tuning concurrency limits based on Bedrock quotas and performance requirements
- Implementing adaptive concurrency based on workload patterns
Amazon Bedrock has service quotas on allowed requests and tokens per minute, as detailed in the Amazon Bedrock quotas documentation. The contract analysis workflow makes many requests to Amazon Bedrock to perform language tasks with foundation models, so two or more workflows running in parallel compete for the throughput available in Amazon Bedrock's on-demand mode and start to receive throttling errors. The Step Functions workflow definition and the code include guardrails such as extended timeout limits and retries, but a running workflow might still fail if the throttling rate gets too high and a Lambda function repeatedly times out waiting for a request to be served. Depending on the daily demand for contract processing, one of the following actions might be needed:
-
Limit the number of concurrent workflows. This can be done by introducing a lock (or semaphore) that must be acquired before proceeding into the Clause Classification step and is released once the Clause Evaluation step completes, so that other workflows wait until they can acquire the lock and then proceed. There are different approaches for doing so; the following blog post shows one example: Controlling concurrency in distributed systems using AWS Step Functions
-
Provisioned Throughput feature. Provisioned Throughput mode is an Amazon Bedrock feature for large, consistent inference workloads that need guaranteed throughput; a discounted rate is offered depending on the monthly time commitment. See the Provisioned Throughput documentation for more details.
To prevent unexpected costs and maintain budget control, configure AWS Billing and Cost Management alerts:
AWS Budgets: Set up budget alerts to monitor spending thresholds:
-
Create a monthly budget for the prototype with alerts at 50%, 80%, and 100% of the budgeted amount
-
Configure separate budgets for high-cost services (Amazon Bedrock, Step Functions, DynamoDB)
-
Set up forecasted cost alerts to receive notifications when projected costs exceed budget
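The budget and alert thresholds above can also be created programmatically. A sketch using boto3 and the AWS Budgets API follows; the budget name, monthly amount, and notification email are illustrative placeholders.

```python
# Sketch: monthly cost budget with actual-spend alerts at 50%, 80%, 100%.
# Budget name, amount, and subscriber address are illustrative.

def build_budget(name: str, monthly_usd: str) -> dict:
    """Budget definition for the Budgets CreateBudget API."""
    return {
        "BudgetName": name,
        "BudgetLimit": {"Amount": monthly_usd, "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    }

def build_notifications(email: str) -> list:
    """One ACTUAL-spend notification per threshold percentage."""
    return [
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": pct,  # percentage of the budgeted amount
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }
        for pct in (50.0, 80.0, 100.0)
    ]

def create_prototype_budget(account_id: str) -> None:
    import boto3
    boto3.client("budgets").create_budget(
        AccountId=account_id,
        Budget=build_budget("contract-analysis-prototype", "500"),
        NotificationsWithSubscribers=build_notifications("ops@example.com"),
    )
```

Switching `NotificationType` to `FORECASTED` yields the forecasted-cost alerts mentioned above.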
Cost Anomaly Detection: Enable AWS Cost Anomaly Detection to automatically identify unusual spending patterns:
-
Monitors spending across all services and alerts on anomalies
-
Uses machine learning to detect cost spikes before they impact your budget
-
Particularly useful for detecting unexpected increases in Bedrock API calls or Step Functions executions
Most Expensive Cost Drivers:
Based on the prototype architecture, the primary cost contributors are:
-
Amazon Bedrock: LLM inference costs - this is the dominant cost driver (varies by model - Nova models offer significant savings vs Claude)
-
OpenSearch Serverless: OCU (OpenSearch Compute Units) for the legislation Knowledge Base - fixed cost of ~$350/month minimum if deployed (2 OCUs minimum)
Secondary costs (typically negligible compared to Bedrock):
-
Step Functions: State transitions ($0.025 per 1,000 transitions)
-
DynamoDB: Read/write capacity and storage (on-demand pricing)
-
S3: Storage and data transfer costs
-
Lambda: Invocation count and duration
Recommended Actions:
-
Set up AWS Budgets with email/SNS notifications before deploying to production
-
Enable Cost Anomaly Detection in the AWS Billing console
-
Review AWS Cost Explorer regularly to identify cost trends
-
Tag resources appropriately for cost allocation and tracking
-
Consider using AWS Cost Optimization Hub for recommendations
More information: AWS Budgets and AWS Cost Anomaly Detection
This prototype uses AWS Lambda Powertools for Python (v3) for structured logging throughout the Lambda functions. The Logger utility outputs structured JSON logs containing essential information like log level, message, timestamp, service name, Lambda function details (name, memory, ARN), request ID, and correlation ID.
Some of these logs could inadvertently expose sensitive data present in LLM prompts and responses, including:
-
Contract clause text and analysis results
-
Guideline evaluation questions and answers
-
User-specific contract type information
-
Legislation compliance details
The log_event parameter in Lambda Powertools Logger is disabled by default to prevent sensitive data exposure, but custom logging throughout the application may still capture sensitive information.
Recommendations for production:
-
Review log levels: Sensitive information is currently logged at INFO level for debugging. Move those log statements to DEBUG level and run production at INFO or above so they are filtered out
-
CloudWatch Logs data protection: Use CloudWatch Logs data protection policies to automatically mask sensitive data. CloudWatch Logs supports managed data identifiers for detecting and masking credentials, financial information, PII, and PHI. Sensitive data is detected and masked when ingested into the log group. Enable audit findings to track when sensitive data is detected
-
Data masking in code: Consider using Lambda Powertools Data Masking utility to erase or mask sensitive fields before logging
-
Log retention policies: Configure appropriate retention periods for CloudWatch Log Groups to balance compliance requirements with cost optimization
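For the in-code data masking recommendation, a minimal sketch of redacting sensitive fields before logging follows. The field names are illustrative assumptions about the prototype's payloads; Lambda Powertools also provides a dedicated Data Masking utility for this purpose.

```python
# Sketch: redact assumed-sensitive fields before a payload is logged.
# SENSITIVE_FIELDS is illustrative; align it with the real payload schema.

SENSITIVE_FIELDS = {"clause_text", "evaluation_answer", "guideline_question"}

def mask_for_logging(record, redaction: str = "*****"):
    """Return a copy of the payload with sensitive values replaced,
    recursing into nested dicts and lists; the original object handed
    to the LLM is left untouched."""
    if isinstance(record, dict):
        return {
            key: redaction if key in SENSITIVE_FIELDS else mask_for_logging(value, redaction)
            for key, value in record.items()
        }
    if isinstance(record, list):
        return [mask_for_logging(item, redaction) for item in record]
    return record
```

A Lambda function would then log `mask_for_logging(event)` rather than the raw event.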
For more details, see CloudWatch Logs Data Protection blog post.
Security and Compliance is a shared responsibility between AWS and the customer.
This shared model can help relieve the customer’s operational burden as AWS operates, manages and controls the components from the host operating system and virtualization layer down to the physical security of the facilities in which the service operates.
The customer assumes responsibility for and management of the guest operating system (including updates and security patches), other associated application software, and the configuration of the AWS-provided security group firewall. Customers should carefully consider the services they choose, as their responsibilities vary depending on the services used, the integration of those services into their IT environment, and applicable laws and regulations. The nature of this shared responsibility also provides the flexibility and customer control that permits the deployment of customer solutions. As shown in the chart below, this differentiation of responsibility is commonly referred to as Security "of" the Cloud versus Security "in" the Cloud.
LLM applications are subject to a novel class of security threats, such as those described in the OWASP Top 10 for LLM Applications.
The security considerations below are not comprehensive. As you move to production, we recommend diving deeper into both the security model of the Amazon Bedrock platform (see Bedrock Security documentation) and security models for LLMs in general.
Prompt injection is an application-level security concern where malicious prompts manipulate LLM behavior. According to the AWS Shared Responsibility Model, AWS secures the underlying infrastructure while customers must secure their applications against prompt injection and other vulnerabilities.
For this contract analysis prototype, prompt injection risks include:
-
Malicious content embedded in uploaded contract documents
-
Adversarial text in user-provided guidelines or evaluation questions
-
Attempts to manipulate clause classification or compliance evaluation results
Mitigation techniques:
-
Associate a guardrail with prompt attack detection to screen both user inputs and model responses for hidden adversarial instructions
-
Implement input validation for contract documents and guideline text before processing
-
Use secure prompt engineering techniques with clear instruction boundaries
-
Monitor and log unusual patterns in agent interactions
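The guardrail mitigation above can be wired in with the Bedrock Runtime ApplyGuardrail API, which screens text independently of a model invocation. The sketch below assumes a guardrail has already been created with the prompt-attack filter enabled; the guardrail identifier and version are placeholders.

```python
# Sketch: screen user-supplied text (e.g. an uploaded clause) with a
# pre-configured Bedrock guardrail. "gr-xxxx" / "1" are placeholders.

def build_guardrail_request(guardrail_id: str, version: str, text: str) -> dict:
    """Request shape for bedrock-runtime ApplyGuardrail, treating the
    text as model INPUT so input-side filters apply."""
    return {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": version,
        "source": "INPUT",
        "content": [{"text": {"text": text}}],
    }

def is_blocked(clause_text: str) -> bool:
    """True when the guardrail intervenes (e.g. a prompt attack is
    detected), so the workflow can reject the document early."""
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.apply_guardrail(
        **build_guardrail_request("gr-xxxx", "1", clause_text)
    )
    return response["action"] == "GUARDRAIL_INTERVENED"
```

The same guardrail can be attached directly to Converse or InvokeModel calls via their guardrail configuration parameters, which also covers model responses.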
Jailbreaking attempts to subvert safety filters built into the LLMs themselves. This is particularly relevant when processing contracts that may contain adversarial language designed to manipulate the analysis results.
Amazon Nova models include built-in safety controls. Additionally, Amazon Bedrock Guardrails can be configured to detect jailbreak attempts through content filters with configurable thresholds.
LLMs may inadvertently reveal confidential data in their responses, leading to unauthorized data access and privacy violations. For this contract analysis prototype:
-
Contract clauses may contain sensitive business terms, financial information, or proprietary details
-
Implement data sanitization for any logs or outputs that might be shared
-
Use Amazon Bedrock Guardrails sensitive information filters to detect and remove PII from user inputs and model responses
-
Review evaluation results before exposing them through the API
Placing additional Guardrails to validate both user input and LLM-generated content is strongly recommended before moving to production. Amazon Bedrock Guardrails provides configurable safeguards including:
-
Content filters: Block harmful text and images with configurable thresholds (NONE, LOW, MEDIUM, HIGH) across six categories including hate, insults, sexual, violence, misconduct, and prompt attacks
-
Prompt attack detection: Detect and filter prompt injections and jailbreak attempts
-
Denied topics: Define topics to avoid within the application context
-
Word filters: Configure custom words or phrases to detect and block
-
Sensitive information filters: Detect and remove PII like names, addresses, credit card numbers
-
Contextual grounding checks: Detect and filter hallucinated responses that are not grounded in the source information or are irrelevant to the user’s query
-
Automated Reasoning checks: Validate that model responses adhere to logical rules and policies
For this prototype, consider guardrails for:
-
Validating that clause classifications are grounded in the actual contract text
-
Ensuring evaluation responses stay within the scope of compliance assessment
-
Filtering out any inappropriate content in uploaded contracts
-
Detecting when the LLM attempts to generate content outside its intended purpose
Systems or people overly depending on LLMs without oversight may face misinformation, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.
This risk is inherent to all LLM applications. For contract compliance analysis:
-
Extensive testing and validation of the prototype is key
-
Display warnings to end users that outputs are AI-generated and should be reviewed by legal professionals
-
Implement human-in-the-loop review for high-risk contracts
-
Track and audit all AI-generated compliance assessments
-
Establish clear escalation procedures when the LLM confidence is low
The prototype uses five DynamoDB tables that contain sensitive contract and business data. To ensure security and privacy:
The prototype already has Point-in-time recovery (PITR) enabled for all DynamoDB tables. PITR is a fully managed service that provides up to 35 days of recovery points at per-second granularity, enabling restoration to any specific point in time within the recovery period. PITR helps protect against accidental writes or deletions and doesn’t affect performance or API latencies.
The five tables with PITR enabled:
-
Guidelines Table: Contains proprietary clause type definitions and evaluation criteria
-
Clauses Table: Contains analyzed contract clauses with compliance results
-
Jobs Table: Tracks all workflow executions and risk assessments
-
ContractTypes Table: Contains contract type metadata
-
ImportJobs Table: Tracks guideline import operations
For enhanced backup capabilities, consider using AWS Backup with DynamoDB for additional features including:
-
Scheduled backups using backup plans
-
Cross-account and cross-Region copying for disaster recovery
-
Cold storage tiering for cost optimization
-
Tagging for billing and cost allocation
-
Audit backups using AWS Backup Audit Manager
-
WORM (write-once-read-many) protection against inadvertent or malicious deletions
More details: Restoring a DynamoDB table to a point in time
All tables use AWS-managed encryption by default. For enhanced control:
-
Utilize AWS Key Management Service (KMS) with customer-managed keys for encryption of sensitive fields
-
Consider client-side encryption for highly sensitive data in the Guidelines and Clauses tables before storing in DynamoDB
-
Implement fine-grained access control using IAM policy conditions to restrict access to specific tables or items
-
Use VPC endpoints to restrict DynamoDB access from within the VPC where Lambda functions are deployed
-
Enable AWS CloudTrail to log all API calls to DynamoDB tables, including who accessed what data and when
-
Set up Amazon CloudWatch alarms to alert on suspicious activity:
-
Unusual number of read/write operations
-
Throttling events
-
Failed authentication attempts
Amazon Cognito supports Multi-Factor Authentication (MFA) to add an additional authentication factor beyond username and password, increasing security for users accessing the contract analysis system.
MFA options available in Cognito:
-
SMS text messages: Sends one-time codes via SMS using Amazon SNS
-
Email messages: Sends one-time codes via email using Amazon SES (requires Plus or Essentials feature plan)
-
TOTP (Time-based One-Time Password): Uses authenticator apps like Google Authenticator or Authy
MFA can be configured as:
-
Optional: Users can choose to enable MFA for their accounts
-
Required: All users must set up MFA before they can sign in
When MFA is required, new users are prompted to register an additional sign-in factor during their first sign-in. The MFA code is valid for the authentication flow session duration set for the app client.
To configure MFA for the user pool, use the SetUserPoolMfaConfig API operation or the Amazon Cognito console.
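The SetUserPoolMfaConfig operation mentioned above can be called via boto3. The sketch below enables TOTP as a required factor; the user pool ID is a placeholder.

```python
# Sketch: require TOTP MFA on a Cognito user pool.
# The user pool ID passed by the caller is a placeholder.

def build_mfa_config(user_pool_id: str, required: bool) -> dict:
    """Arguments for cognito-idp SetUserPoolMfaConfig enabling TOTP.
    'ON' makes MFA required for all users; 'OPTIONAL' lets users opt in."""
    return {
        "UserPoolId": user_pool_id,
        "SoftwareTokenMfaConfiguration": {"Enabled": True},
        "MfaConfiguration": "ON" if required else "OPTIONAL",
    }

def require_mfa(user_pool_id: str) -> None:
    import boto3
    boto3.client("cognito-idp").set_user_pool_mfa_config(
        **build_mfa_config(user_pool_id, required=True)
    )
```

SMS MFA would additionally require an `SmsMfaConfiguration` block and an SNS configuration on the pool.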
Additionally, consider enabling Cognito Advanced Security features for:
-
Adaptive authentication based on risk assessment
-
Compromised credentials detection
-
Custom authentication challenges
For more details: Adding MFA to a user pool
AWS has a series of best practices and guidelines around IAM: IAM Best Practices
We recommend exploring Amazon Macie capabilities. Amazon Macie gives you constant visibility into the data security and data privacy of your data stored in Amazon S3. Macie automatically and continually evaluates all of your S3 buckets and alerts you to any unencrypted buckets, publicly accessible buckets, or buckets shared with AWS accounts outside those you have defined in AWS Organizations.
More information: Amazon Macie. Be mindful of the additional costs to use the service.
Amazon S3 provides a number of security features to consider as you develop and implement your own security policies.
For an in-depth description of best practices around S3, please refer to S3 Security Best Practices.
At a minimum we recommend that you:
-
Ensure that your Amazon S3 buckets use the correct policies and are not publicly accessible;
-
Implement least privilege access;
-
Consider encryption at-rest (on disk);
-
Enforce encryption in-transit by restricting access using secure transport (TLS);
-
Enable object versioning when applicable; and
-
Enable cross-region replication as a disaster recovery strategy.
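The in-transit encryption recommendation above is typically enforced with a bucket policy that denies any request not made over TLS (the standard `aws:SecureTransport` condition). A sketch follows; the bucket name is illustrative.

```python
# Sketch: bucket policy denying non-TLS access, applied via PutBucketPolicy.
# The bucket name passed by the caller is illustrative.
import json

def secure_transport_policy(bucket: str) -> str:
    """Policy document that denies all S3 actions on the bucket and its
    objects whenever the request is not made over secure transport."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    })

def apply_policy(bucket: str) -> None:
    import boto3
    boto3.client("s3").put_bucket_policy(
        Bucket=bucket, Policy=secure_transport_policy(bucket)
    )
```

An explicit Deny cannot be overridden by any Allow, so this statement can safely coexist with the bucket's other policy statements.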
The prototype already has a 90-day lifecycle expiration policy configured for the contract documents bucket. This automatically deletes uploaded contract files after 90 days to manage storage costs. Organizations may want to adjust this retention period based on their compliance and audit requirements. More details at S3 Lifecycle Management
To prevent deletion of files from S3, MFA Delete can be enabled on the corresponding buckets. With MFA Delete, a delete operation succeeds only when an MFA code is provided in addition to the security credentials, adding a layer of protection against deletion by accident or with compromised credentials. More details at MFA Delete
-
Multi-AZ: Deploy across multiple Availability Zones for high availability.
-
Securing: Use security groups and network ACLs.
-
VPC Flow Logs: Use Amazon CloudWatch to monitor your VPC components and VPN connections.
-
Data Safety: When working with sensitive data it is recommended to access AWS service with VPC endpoint when available.
-
Resource Isolation: When possible, isolate the resources within a VPC different from the default and configure internet access to restrict the network to only known hosts and destinations.
The prototype frontend is designed for local development and does not include production hosting infrastructure. The backend API Gateway already has AWS WAF enabled with AWS Managed Rules Common Rule Set for protection against common web exploits.
Important: Before hosting the frontend application publicly, perform thorough security testing including penetration testing to identify and address vulnerabilities.
Frontend Hosting for Production:
-
Amazon S3 + CloudFront (recommended): Industry-standard pattern for production static websites
-
S3 bucket for static file storage with versioning
-
CloudFront distribution for global CDN delivery
-
Origin Access Control (OAC) to secure S3 bucket
-
Custom SSL/TLS certificates via AWS Certificate Manager
-
AWS Amplify Hosting (alternative): Simplified managed service
-
Easier setup with fewer configuration steps
-
Built-in CI/CD integration
Additional Security Services for Frontend:
-
AWS WAF for CloudFront: Extend WAF protection to the frontend distribution
-
AWS Shield: DDoS protection (Standard is automatic with CloudFront, Advanced available for enhanced protection)
-
AWS Firewall Manager: Centrally manage firewall rules across accounts
-
Exception handling: All application function code needs to be reviewed to ensure proper validation logic and exception handling. Not all cases were covered during prototype development.
-
Automated unit, integration and system tests: In the prototype, some unit tests and manual tests were executed but they are by no means comprehensive. Automated test scripts are required to achieve higher code coverage and to ensure quality and robustness of the prototype as it evolves from prototype to production.
-
Code repositories and pipelines: Code repositories allow teams to collaborate and CI/CD pipelines allow frequent small improvements to be deployed. Services such as AWS CodeCommit, AWS CodePipeline and AWS CodeBuild can be used to implement continuous integration and deployment workflows. See AWS Well-Architected Framework - Build and Deployment Management Systems
Proper operational practices are crucial for running the contract analysis system in production. The following recommendations will help ensure reliability, observability, and maintainability.
The ability to properly monitor the system in production is crucial. The prototype uses AWS Lambda Powertools for Python which provides structured logging, but additional monitoring is recommended:
CloudWatch Logs
-
All Lambda functions output structured JSON logs to CloudWatch Logs
-
Configure appropriate log retention periods (e.g., 30 days for operational logs, longer for compliance logs)
-
Use CloudWatch Logs Insights to query and analyze logs across all functions
-
Set up log metric filters to track specific events (errors, timeouts, throttling)
CloudWatch Metrics
-
Monitor key metrics for each service:
-
Lambda: Invocations, Duration, Errors, Throttles, ConcurrentExecutions
-
Step Functions: ExecutionsStarted, ExecutionsFailed, ExecutionTime
-
DynamoDB: ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits, UserErrors, SystemErrors
-
API Gateway: Count, Latency, 4XXError, 5XXError
-
Bedrock: Invocations, ModelInvocationLatency, ClientErrors, ServerErrors
-
Create CloudWatch dashboards for real-time visibility into system health
-
Set up CloudWatch alarms for critical metrics with appropriate thresholds
Lambda Powertools Metrics
The prototype uses Lambda Powertools which provides built-in metrics capabilities. Consider enabling:
-
Custom business metrics (contracts processed, clauses analyzed, compliance rates)
-
Cold start tracking
-
Function-specific performance metrics
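Powertools Metrics serializes custom metrics into the CloudWatch Embedded Metric Format (EMF) under the hood. As an illustration of what such a custom business metric amounts to, the sketch below builds an equivalent EMF log line directly; the namespace, dimension, and metric names are illustrative.

```python
# Sketch: a CloudWatch Embedded Metric Format (EMF) record for a custom
# business metric. Namespace, dimension, and metric names are illustrative.
import json
import time

def emf_record(namespace: str, service: str, name: str, value: int) -> str:
    """One EMF-formatted log line; emitting it from a Lambda function to
    stdout is enough for CloudWatch to extract the metric asynchronously."""
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["service"]],
                "Metrics": [{"Name": name, "Unit": "Count"}],
            }],
        },
        "service": service,
        name: value,
    })

# e.g. at the end of the evaluation Lambda:
print(emf_record("ContractAnalysis", "contract-analysis", "ContractsProcessed", 1))
```

With Powertools, the same result is a one-liner (`metrics.add_metric(...)`) plus the `@metrics.log_metrics` decorator, which is the preferable route in this codebase.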
Refer to CloudWatch Logs documentation and CloudWatch Metrics documentation for more details.
AWS CloudTrail can be used to implement detective controls. CloudTrail records AWS API calls for your account and delivers log files to you for auditing.
For this prototype, CloudTrail captures:
-
All API calls to DynamoDB tables (who accessed what data and when)
-
S3 bucket access (document uploads, downloads, deletions)
-
Lambda function configuration changes
-
Step Functions workflow executions
-
Bedrock model invocations (if data events are enabled)
-
IAM policy changes
-
Cognito user pool modifications
Recommendations:
-
Enable CloudTrail in all regions where the prototype is deployed
-
Configure CloudTrail to deliver logs to a dedicated S3 bucket with encryption
-
Enable log file validation to detect tampering
-
Set up CloudWatch Logs integration for real-time analysis
-
Use CloudTrail Lake for advanced querying across multiple accounts and regions
More details at CloudTrail User Guide
AWS X-Ray is a service that collects data about requests that your application serves, and provides tools to view, filter, and gain insights into that data to identify issues and opportunities for optimization.
The prototype uses AWS Lambda Powertools which includes built-in X-Ray tracing support through the @tracer.capture_lambda_handler() and @tracer.capture_method() decorators, though X-Ray tracing is not currently enabled in the prototype.
Benefits of X-Ray for this prototype:
-
Trace requests across the entire workflow (API Gateway → Lambda → Step Functions → Bedrock)
-
Identify performance bottlenecks in the contract analysis pipeline
-
Visualize service dependencies and call patterns
-
Detect errors and exceptions with detailed stack traces
-
Analyze latency distribution across different workflow steps
Implementation for production:
-
Enable X-Ray tracing for API Gateway stages
-
Enable active tracing for all Lambda functions
-
Enable X-Ray tracing for Step Functions state machines
-
Add Lambda Powertools Tracer decorators to automatically capture subsegments for Bedrock calls and DynamoDB operations
More details at X-Ray Developer Guide and Lambda Powertools Tracer documentation
VPC Flow Logs capture network flow information for a VPC, subnet, or network interface and stores it in Amazon CloudWatch Logs. This can help troubleshoot network issues and detect suspicious activity.
For this prototype, enable VPC Flow Logs if Lambda functions or Amazon Bedrock AgentCore runtimes are deployed in a VPC to:
-
Monitor network traffic patterns
-
Detect unauthorized access attempts
-
Troubleshoot connectivity issues
-
Analyze traffic for security and compliance
Note: This prototype deploys the legislation compliance agent using Amazon Bedrock AgentCore with PUBLIC network mode, which does not require VPC configuration. If you reconfigure the agent to use VPC mode (for accessing private resources like internal databases or APIs), enable VPC Flow Logs for the agent’s subnets to monitor network activity.
More information at VPC Flow Logs and AgentCore VPC Configuration
Set up CloudWatch alarms for critical operational metrics:
High Priority Alarms:
-
Step Functions workflow failures (ExecutionsFailed > 0)
-
Lambda function errors (Errors > threshold)
-
API Gateway 5XX errors (5XXError > threshold)
-
DynamoDB throttling (UserErrors > 0)
-
Bedrock throttling or errors
-
AgentCore runtime errors or invocation failures (bedrock-agentcore namespace)
Medium Priority Alarms:
-
Lambda function duration approaching timeout
-
DynamoDB consumed capacity approaching limits
-
S3 bucket size growth rate
-
API Gateway 4XX errors (potential client issues)
-
AgentCore session latency exceeding threshold
-
AgentCore token usage approaching limits
Low Priority Alarms:
-
Lambda cold starts exceeding threshold
-
Unusual traffic patterns
-
Cost anomalies
-
AgentCore session count anomalies
Configure alarm actions to send notifications via Amazon SNS to email, SMS, or integrate with incident management systems like PagerDuty or Opsgenie.
Note: For the legislation compliance agent using Amazon Bedrock AgentCore, monitor metrics in the bedrock-agentcore CloudWatch namespace including session count, latency, duration, token usage, and error rates. Enable CloudWatch Transaction Search to view detailed traces and spans for agent execution analysis.
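As a concrete example of the high-priority alarms above, the sketch below creates the Step Functions ExecutionsFailed > 0 alarm with an SNS notification action; the ARNs are placeholders supplied by the caller.

```python
# Sketch: CloudWatch alarm on Step Functions workflow failures,
# notifying an SNS topic. ARNs are placeholders passed in by the caller.

def build_alarm(topic_arn: str, state_machine_arn: str) -> dict:
    """Arguments for cloudwatch PutMetricAlarm: fire whenever the sum of
    ExecutionsFailed over a 5-minute period exceeds zero."""
    return {
        "AlarmName": "contract-analysis-workflow-failures",
        "Namespace": "AWS/States",
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }

def create_alarm(topic_arn: str, state_machine_arn: str) -> None:
    import boto3
    boto3.client("cloudwatch").put_metric_alarm(
        **build_alarm(topic_arn, state_machine_arn)
    )
```

The same pattern applies to the other alarms listed above by swapping the namespace, metric name, and dimensions.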
To keep certain resources when you delete a stack, use the DeletionPolicy attribute in your AWS CloudFormation template. More information is available in the AWS CloudFormation DeletionPolicy documentation.
Concurrency is the number of function instances available to serve requests at a given time. AWS Lambda offers two types of concurrency controls:
-
Reserved concurrency: We recommend properly configuring reserved concurrency for the Lambda functions used in production. Reserved concurrency guarantees the maximum number of concurrent instances for a function. This prevents your function from running concurrent executions in excess of the threshold you set, and also guarantees that other functions won’t prevent your function from scaling. For more details, refer to Lambda Concurrency
-
Provisioned concurrency: Ensures Lambda function instances have their execution environment initialized before serving requests, providing consistent latency for all requests. More details at Provisioned Concurrency
Our recommendation is to analyze each type of control against the throughput requirements of all the Lambda functions defined in each region.
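Both controls are set through the Lambda API; a sketch follows. The function name, alias, and limits are illustrative and should be derived from each function's measured throughput requirements.

```python
# Sketch: applying both Lambda concurrency controls.
# Function name, alias ("live"), and the limits are illustrative.

def reserved_concurrency_args(function_name: str, limit: int) -> dict:
    """Arguments for PutFunctionConcurrency: cap the function at `limit`
    concurrent executions and reserve that capacity for it."""
    return {"FunctionName": function_name, "ReservedConcurrentExecutions": limit}

def provisioned_concurrency_args(function_name: str, qualifier: str, count: int) -> dict:
    """Arguments for PutProvisionedConcurrencyConfig: keep `count`
    pre-initialized environments warm. Targets a published version or
    alias (the qualifier), never $LATEST."""
    return {
        "FunctionName": function_name,
        "Qualifier": qualifier,
        "ProvisionedConcurrentExecutions": count,
    }

def configure(function_name: str) -> None:
    import boto3
    client = boto3.client("lambda")
    client.put_function_concurrency(**reserved_concurrency_args(function_name, 10))
    client.put_provisioned_concurrency_config(
        **provisioned_concurrency_args(function_name, "live", 5)
    )
```

Provisioned concurrency must not exceed the reserved limit when both are set on the same function.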
There are at least two possible approaches to validate the input provided to a new workflow execution:
-
Choice state, to add conditional logic that checks the presence and format of each required input attribute. More details at Choice State
-
Add initial state that invokes a Lambda function to perform the input validation
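For the Lambda-based approach, a sketch of such a validation function follows. The required attribute names and the format check are illustrative assumptions about the workflow's execution input, not the prototype's actual schema.

```python
# Sketch of an initial validation state for the Step Functions workflow.
# REQUIRED_ATTRIBUTES and the format check are illustrative assumptions.

REQUIRED_ATTRIBUTES = {"job_id", "contract_type_id", "document_s3_key"}

def lambda_handler(event, context):
    """Fail fast with a descriptive error instead of letting a later
    state fail on a missing or malformed attribute."""
    missing = sorted(REQUIRED_ATTRIBUTES - set(event))
    if missing:
        raise ValueError(f"Missing required input attributes: {', '.join(missing)}")
    # Illustrative format check; adjust to the formats the workflow accepts.
    if not str(event["document_s3_key"]).strip():
        raise ValueError("document_s3_key must not be empty")
    return event  # pass the validated input through unchanged
```

Raising an exception fails the state, so the state machine's Catch configuration can route invalid inputs to a terminal Fail state with a clear cause.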
Along with this document, you can also find the artifacts, such as infrastructure-as-code definitions, scripts and source code at this url: GitHub Repository
AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to define your cloud application resources using familiar programming languages. AWS CDK provisions your resources in a safe, repeatable manner through AWS CloudFormation. It also allows you to compose and share your own custom constructs incorporating your organization’s requirements, helping you expedite new projects.
The AWS CDK Toolkit is a command line tool for interacting with CDK apps. It enables developers to synthesize artifacts such as AWS CloudFormation templates, deploy stacks to development AWS accounts, and diff against a deployed stack to understand the impact of a code change.
AWS CloudFormation is a service that helps you model and set up your AWS resources so that you can spend less time managing those resources and more time focusing on your applications that run in AWS. You create templates (YAML files) that describe the AWS resources that you want, and AWS CloudFormation takes care of provisioning these resources for you.
The following sections describe the necessary steps to deploy the full solution into an AWS account. To ensure proper installation of the overall solution, please follow the sections in the order they are listed. The prototype was developed in a macOS environment, so you might need to check for corresponding system commands if you plan to run the setup from a Windows environment.
The prototype comprises these folders:
-
backend - contains the code assets for the Contract Analysis workflow
-
frontend - contains the code assets for the web application
Please follow the remaining sections in the order they are listed, as they correspond to the recommended setup sequence.
Follow the instructions detailed in the README file located in the frontend folder.
Important: It is strongly recommended to perform thorough security testing, including penetration tests, before hosting this frontend application publicly. The work is provided "AS IS" without warranties or conditions of any kind, either express or implied, including warranties or conditions of merchantability. You bear the risks of using the package.
You can prevent stacks from being accidentally deleted by enabling termination protection on the stack. If a user attempts to delete a stack with termination protection enabled, the deletion fails and the stack, including its status, remains unchanged. For more details on how to enable the deletion protection, refer to termination_protection configuration, documented at AWS CDK Stack documentation.
CloudFormation automates the removal of AWS resources created in this prototype. To delete all resources:
-
Delete CheckLegislationStack first (if deployed): This stack must be deleted before MainBackendStack due to dependencies
-
Go to the CloudFormation console
-
Select the CheckLegislationStack stack
-
Click Delete and confirm
-
Delete MainBackendStack:
-
Select the MainBackendStack stack
-
Click Delete and confirm
Important notes:
-
S3 buckets and DynamoDB tables are configured with RemovalPolicy.DESTROY, so they will be automatically deleted along with their contents
-
If you want to preserve data before deletion, export DynamoDB tables or download S3 objects first
-
CloudWatch Log Groups may persist after stack deletion - delete them manually if needed
-
ECR repositories created for the AgentCore agent image may persist - delete them manually from the ECR console if needed
-
The deletion process may take several minutes to complete
Alternative: Using AWS CLI
You can also delete stacks using the AWS CLI:
# Delete CheckLegislationStack first (if deployed)
aws cloudformation delete-stack --stack-name CheckLegislationStack
# Wait for deletion to complete
aws cloudformation wait stack-delete-complete --stack-name CheckLegislationStack
# Delete MainBackendStack
aws cloudformation delete-stack --stack-name MainBackendStack