
Edge PDF RAG iOS

A native iOS application for document-based question answering using Retrieval Augmented Generation (RAG). The app combines Couchbase Lite vector search with AI-powered language models to enable intelligent conversations about your PDF documents.

Overview

This application lets users upload PDF documents, automatically processes and indexes them using vector embeddings, and then answers questions about their content using AI. The system retrieves relevant context from the documents using semantic search and generates natural language responses.

Key Features

  • PDF Document Processing - Upload and automatically process PDF documents into searchable chunks
  • Vector Search - Fast semantic search powered by Couchbase Lite's vector distance functions
  • Multiple AI Providers - Support for Google Gemini API and local on-device models
  • Secure Credential Storage - API keys stored securely in iOS Keychain
  • Background Downloads - Download large language models with progress tracking and notifications
  • Modern iOS Design - Built entirely with SwiftUI and Swift Concurrency

Architecture

The application follows the MVVM (Model-View-ViewModel) pattern with clear separation of concerns:

Edge-PDF-RAG-iOS/
├── Models/              # Data models
│   ├── Document.swift
│   ├── Chunk.swift
│   └── LocalModel.swift
├── Services/            # Business logic layer
│   ├── DatabaseManager.swift
│   ├── KeychainManager.swift
│   ├── PDFProcessor.swift
│   ├── SentenceEmbeddingProvider.swift
│   ├── LLMService.swift
│   └── BackgroundDownloadManager.swift
├── ViewModels/          # View state management
│   ├── ChatViewModel.swift
│   ├── DocsViewModel.swift
│   ├── CredentialsViewModel.swift
│   └── LocalModelsViewModel.swift
└── Views/               # SwiftUI interface
    ├── ChatView.swift
    ├── DocsView.swift
    ├── CredentialsView.swift
    ├── LocalModelsView.swift
    ├── LLMProviderView.swift
    └── DownloadProgressBanner.swift
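One benefit of this layering is that ViewModels can depend on the Services layer through protocols, which keeps them unit-testable. A minimal illustrative sketch (the protocol and type names below are hypothetical, not the repo's actual declarations in SentenceEmbeddingProvider.swift):

```swift
import Foundation

// Hypothetical protocol a ViewModel could depend on instead of a
// concrete embedding service.
protocol EmbeddingProviding {
    func embedding(for text: String) -> [Float]
}

// A stub provider that hashes words into a fixed-size vector,
// useful for testing ViewModels without loading a real model.
struct StubEmbeddingProvider: EmbeddingProviding {
    let dimensions: Int

    func embedding(for text: String) -> [Float] {
        var vector = [Float](repeating: 0, count: dimensions)
        for word in text.lowercased().split(whereSeparator: \.isWhitespace) {
            // Bucket each word into one of `dimensions` slots.
            let index = Int(word.hashValue.magnitude % UInt(dimensions))
            vector[index] += 1
        }
        return vector
    }
}
```

In tests, a stub like this can be injected in place of the real provider so ViewModel logic runs without any model files on disk.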

Technology Stack

  • Language: Swift 6.0
  • UI Framework: SwiftUI
  • Database: Couchbase Lite 3.2+
  • PDF Processing: PDFKit
  • AI Integration: Google Gemini API
  • Async Programming: Swift Concurrency (async/await)
  • Security: iOS Keychain Services

Requirements

  • iOS 17.0 or later
  • Xcode 16.4 or later
  • Swift 6.0
  • Google Gemini API key (free tier available)

Setup

  1. Clone the repository
  2. Open Edge-PDF-RAG-iOS.xcodeproj in Xcode
  3. Add the Couchbase Lite dependency:
    • File > Add Package Dependencies
    • Enter: https://github.com/couchbase/couchbase-lite-ios
    • Version: 3.2.0 or later
    • Select: CouchbaseLiteSwift
  4. Build and run the project
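If you consume the project as a Swift package rather than through the Xcode UI, the equivalent dependency declaration in a Package.swift manifest looks roughly like this (the target name here is illustrative; only the package URL, product name, and version come from the steps above):

```swift
// swift-tools-version:6.0
import PackageDescription

let package = Package(
    name: "Edge-PDF-RAG-iOS",
    platforms: [.iOS(.v17)],
    dependencies: [
        // Couchbase Lite 3.2.0 or later, as described above.
        .package(url: "https://github.com/couchbase/couchbase-lite-ios", from: "3.2.0"),
    ],
    targets: [
        .target(
            name: "EdgePDFRAG",  // hypothetical target name
            dependencies: [
                .product(name: "CouchbaseLiteSwift", package: "couchbase-lite-ios"),
            ]
        ),
    ]
)
```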

Getting Started

Configure API Access

  1. Launch the app
  2. Tap the menu button (three dots)
  3. Select "Edit Credentials"
  4. Enter your Gemini API key (get one at https://aistudio.google.com/apikey)
  5. Save the credentials

Upload Documents

  1. Navigate to the Documents tab
  2. Tap "Upload PDF"
  3. Select a PDF file from your device
  4. Wait for processing to complete

Ask Questions

  1. Go to the Chat tab
  2. Type your question about the uploaded documents
  3. The system will search for relevant content and generate an answer

How It Works

Document Processing Flow

  1. User uploads a PDF document
  2. PDFProcessor extracts text content
  3. Text is split into smaller chunks
  4. Sentence embeddings are generated for each chunk
  5. Chunks and embeddings are stored in Couchbase Lite
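The chunking step (3) is typically a sliding window over the extracted words, with some overlap so context isn't lost at chunk boundaries. A rough sketch of that idea, not the repo's actual PDFProcessor logic (chunk size and overlap values are illustrative):

```swift
import Foundation

// Split extracted text into overlapping word-window chunks.
// `chunkSize` and `overlap` are measured in words.
func chunkText(_ text: String, chunkSize: Int = 200, overlap: Int = 40) -> [String] {
    precondition(overlap < chunkSize, "overlap must be smaller than chunkSize")
    let words = text.split(whereSeparator: \.isWhitespace).map(String.init)
    guard words.count > chunkSize else {
        return words.isEmpty ? [] : [words.joined(separator: " ")]
    }

    var chunks: [String] = []
    var start = 0
    let step = chunkSize - overlap
    while start < words.count {
        let end = min(start + chunkSize, words.count)
        chunks.append(words[start..<end].joined(separator: " "))
        if end == words.count { break }
        start += step
    }
    return chunks
}
```

Each returned chunk is then embedded and stored alongside its vector, as in steps 4 and 5.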

Question Answering Flow

  1. User asks a question
  2. System generates an embedding for the question
  3. Vector similarity search finds the most relevant document chunks
  4. Retrieved chunks are sent as context to the LLM
  5. LLM generates a response based on the context
  6. Response is displayed to the user
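Steps 2 and 3 amount to a nearest-neighbor search over the stored chunk embeddings. In the app this is handled by Couchbase Lite's vector distance functions; the underlying idea can be sketched in memory with cosine similarity (all names here are illustrative):

```swift
import Foundation

// Cosine similarity between two equal-length vectors.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    let dot = zip(a, b).map(*).reduce(0, +)
    let magA = a.map { $0 * $0 }.reduce(0, +).squareRoot()
    let magB = b.map { $0 * $0 }.reduce(0, +).squareRoot()
    guard magA > 0, magB > 0 else { return 0 }
    return dot / (magA * magB)
}

// Return the texts of the `k` chunks closest to the query embedding.
// These become the context passed to the LLM in step 4.
func topChunks(query: [Float],
               chunks: [(text: String, embedding: [Float])],
               k: Int) -> [String] {
    chunks
        .sorted { cosineSimilarity(query, $0.embedding) > cosineSimilarity(query, $1.embedding) }
        .prefix(k)
        .map { $0.text }
}
```

A real vector index avoids the brute-force scan above, but the ranking it produces is the same in spirit.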

Local Models

The app supports downloading and using local language models for offline inference:

  • Qwen2.5 0.5B - Fast and lightweight (500MB)
  • Qwen2.5 1.5B - Balanced performance (1.5GB)
  • Qwen2.5 3B - Higher quality (3GB)
  • Phi 4 Mini - Microsoft's compact model (2.5GB)

Note: Local model inference requires MediaPipe LiteRT, which is expected to be available for iOS in early 2025. The download and model management features are fully functional.

API Keys

The application requires a Google Gemini API key for cloud-based inference. Keys are stored securely in the iOS Keychain and never leave your device except for authorized API calls.

You can obtain a free Gemini API key from Google AI Studio: https://aistudio.google.com/apikey

Development

Running Tests

xcodebuild test -scheme Edge-PDF-RAG-iOS \
  -destination 'platform=iOS Simulator,name=iPhone 16 Pro'

Building from Command Line

xcodebuild -scheme Edge-PDF-RAG-iOS \
  -destination 'platform=iOS Simulator,name=iPhone 16 Pro' \
  clean build

Known Limitations

  • Embeddings use a simplified implementation. Production deployments should use CoreML models
  • Local model inference awaits MediaPipe LiteRT iOS release
  • Large PDFs may take time to process depending on device capabilities

Future Enhancements

  • CoreML-based sentence embeddings for improved performance
  • On-device LLM inference when MediaPipe LiteRT becomes available
  • Support for additional document formats (DOCX, TXT, etc.)
  • Multi-document conversations
  • Export conversation history
