148 changes: 148 additions & 0 deletions content/docs/guides/paperless-ngx.mdx
@@ -0,0 +1,148 @@
---
title: Secure Paperless-ngx with Pomerium
sidebar_label: Paperless-ngx
lang: en-US
keywords:
  [
    pomerium,
    paperless-ngx,
    paperless,
    sso,
    document management,
    identity aware proxy,
    self-hosted,
    django,
  ]
description: Put self-hosted Paperless-ngx behind Pomerium so every request is authenticated and authorized at the front door before it reaches your documents.
# cSpell:ignore paperless ngx
---

import TabItem from '@theme/TabItem';
import Tabs from '@theme/Tabs';

import Config from '/content/examples/guides/paperless-ngx/config.yaml.md';
import Compose from '/content/examples/guides/paperless-ngx/docker-compose.yaml.md';

# Secure Paperless-ngx with Pomerium

## What this guide does

This guide puts a self-hosted [Paperless-ngx](https://docs.paperless-ngx.com/) instance behind Pomerium. Every request is authenticated against your identity provider (IdP) and checked against the route policy before it reaches Paperless-ngx, so unauthenticated requests are blocked at the front door. You get single sign-on (SSO), group-based policy, and an audit log of who reached the route. Paperless-ngx keeps its own login and per-user document permissions on top.

```mermaid
flowchart LR
Browser --> Pomerium["Pomerium<br/>SSO + route policy"]
Pomerium -.->|"sign in"| IdP[Identity provider]
Pomerium --> Paperless["Paperless-ngx<br/>own login + permissions"]
```

Paperless-ngx is a document management system that stores scanned and digitized records, often a household's or a company's most sensitive paperwork: tax filings, contracts, medical records, and IDs. That makes it a high-value target to keep off the open internet.

## When to use this guide

Use it when you run self-hosted Paperless-ngx and want only people from your organization to reach it. This guide layers Pomerium in front of Paperless-ngx's stock login; if you want Pomerium to sign users into Paperless-ngx directly, Paperless-ngx supports trusted-header SSO natively (see [Next steps](#next-steps)).

## Prerequisites

- [Docker](https://docs.docker.com/install/) and [Docker Compose](https://docs.docker.com/compose/install/)
- For the Pomerium Zero path: a [Pomerium Zero](https://console.pomerium.app) account with its Pomerium instance running locally via the [Quickstart](/docs/get-started/quickstart) Compose file; the route uses the starter domain that comes with it
- For the Pomerium Core path: a domain you control for the route (this guide uses `paperless.yourdomain.com`), with DNS pointed at the host running Pomerium and ports 80 and 443 reachable so `autocert` can provision certificates; the Compose file below runs Pomerium itself

This guide was last tested with Paperless-ngx 2.18.4 and Pomerium 0.32.7.

:::tip Prefer to self-host the identity provider?

This guide uses the hosted authenticate service so you don't have to run an IdP. To run your own instead, follow [Keycloak + Pomerium](/docs/integrations/user-identity/oidc) and swap the `authenticate_service_url` / `idp_*` settings into the config below.

:::

## Configure Pomerium

<Tabs queryString="type">
<TabItem value="zero" label="Pomerium Zero" default>

In the [Zero Console](https://console.pomerium.app):

1. Create a **Route**. In **From**, enter `https://paperless.<your-starter-domain>`; in **To**, enter `http://paperless:8000`.
2. On the route's settings, enable **Preserve Host Header**. Paperless-ngx is a Django application that validates the incoming `Host` against its `ALLOWED_HOSTS` (derived from `PAPERLESS_URL`) and uses it for cross-site request forgery (CSRF) checks, so the original host must reach Paperless-ngx unchanged.
3. Set the policy to scope access to who should reach Paperless-ngx (for example, **Any Authenticated User** or a specific group or domain).

</TabItem>
<TabItem value="core" label="Pomerium Core">

Create a `config.yaml`. It routes `paperless.yourdomain.com` to the Paperless-ngx container and preserves the host header so Django's `ALLOWED_HOSTS` and CSRF checks pass.

<Config />

Replace `paperless.yourdomain.com` with your domain and `you@example.com` with the email address that should be allowed through, or switch the policy to a group or domain match.

</TabItem>
</Tabs>

## Configure Paperless-ngx

Paperless-ngx runs as a Django application backed by PostgreSQL and Redis. Pomerium terminates TLS at the front door, so Paperless-ngx serves plain HTTP on the internal Docker network. The key settings in the Compose file below:

- `PAPERLESS_URL: https://paperless.yourdomain.com`: Paperless-ngx derives Django's `ALLOWED_HOSTS` and `CSRF_TRUSTED_ORIGINS` from this. It must equal the route's **From** URL, or Django answers `HTTP 400` to every request that arrives behind the proxy.
- `PAPERLESS_REDIS` and the `PAPERLESS_DB*` values: point Paperless-ngx at the Redis broker and PostgreSQL database that ship in the same Compose file.
- `PAPERLESS_SECRET_KEY`: Django's signing key. Generate your own with `openssl rand -base64 48`; never reuse the placeholder.
- `PAPERLESS_ADMIN_USER` / `PAPERLESS_ADMIN_PASSWORD`: bootstrap the first superuser on first startup.
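
As a minimal sketch of the `PAPERLESS_SECRET_KEY` step above, the command from that bullet can be captured in a shell variable and printed in `KEY=value` form, ready to paste over the placeholder. The variable name `secret_key` is illustrative only, and the sketch assumes `openssl` is installed on the host.

```bash
# Generate 48 random bytes and base64-encode them.
# 48 bytes encode to exactly 64 base64 characters, with no padding.
secret_key="$(openssl rand -base64 48)"

# Print the value in KEY=value form; copy it over the placeholder.
printf 'PAPERLESS_SECRET_KEY=%s\n' "$secret_key"
```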

The Compose file runs Pomerium Core alongside Paperless-ngx, PostgreSQL, and Redis. For Zero, drop the Core `pomerium` service, keep `paperless`, `db`, and `redis` on `paperless-internal`, and attach your Zero `pomerium` service (the [Quickstart](/docs/get-started/quickstart) Compose service with your `POMERIUM_ZERO_TOKEN`) to `paperless-internal` so it can resolve `paperless` by name. On the Zero path, also set `PAPERLESS_URL` to the route's **From** URL, `https://paperless.<your-starter-domain>`, so Django's host and CSRF checks accept the proxied requests.

<Compose />

## Run the stack

Start the stack:

```bash
docker compose up -d
```

Paperless-ngx runs database migrations and builds its search index on first boot, so the container can take a couple of minutes before it answers requests. Watch `docker compose logs -f paperless` until it reports that the web server is listening.
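
Rather than watching the logs by hand, the wait can be scripted. The sketch below is a generic retry helper, not something Docker or Paperless-ngx provides: `retry`, `RETRIES`, and `DELAY` are names made up for this example.

```bash
# retry CMD [ARGS...]: run CMD until it succeeds.
# RETRIES (default 30) caps the attempts; DELAY (default 5) is the pause in seconds.
retry() {
  attempts="${RETRIES:-30}"
  delay="${DELAY:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Stop as soon as the command exits with status 0.
    "$@" && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  # Every attempt failed.
  return 1
}
```

For example, `retry docker compose exec -T paperless curl -fsS --max-time 2 http://localhost:8000` would poll the web server from inside the container. That usage assumes `curl` is present in the Paperless-ngx image; if it is not, substitute any command that exits with status 0 once the server answers.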

## Verify the setup

1. **The route requires authentication.** In a fresh browser, open `https://paperless.yourdomain.com`. You should be redirected to sign in through Pomerium, not straight to Paperless-ngx.
2. **An allowed user reaches Paperless-ngx.** Sign in with a user your policy allows. Pomerium redirects you back and Paperless-ngx's own sign-in page loads.

![The Paperless-ngx sign-in page reached through Pomerium](./img/paperless-ngx/paperless-login.png)

3. **Sign in to Paperless-ngx.** Use the admin account you bootstrapped. Paperless-ngx authenticates you and lands you on its document dashboard, served through Pomerium.
4. **A request that bypasses Pomerium fails.** In the Compose file above, Paperless-ngx sits on an internal-only Docker network with no published host ports, so a direct probe of the upstream cannot resolve or connect; the only path in is through Pomerium.
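
The first check above can also be approximated from a terminal. The sketch below wraps a single `curl` call that prints the status code and redirect target of one unauthenticated request without following the redirect. `probe_route` is a hypothetical helper name, and the exact status and redirect URL you see depend on your deployment.

```bash
# probe_route URL: print "<status> <redirect target>" for one unauthenticated
# request. curl does not follow redirects here because -L is not passed.
probe_route() {
  curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}
```

Calling `probe_route https://paperless.yourdomain.com` would be expected to print a `3xx` status with a redirect URL on the authenticate service. A `200` with no redirect would suggest the request reached Paperless-ngx without passing through the sign-in flow.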

When you're done testing, stop the stack with `docker compose down`. Add `-v` only if you mean to delete the named volumes too: the PostgreSQL database, the Paperless-ngx data and media, Redis, and Pomerium's cache.

## What Pomerium protects — and what it doesn't

Everything in this guide lives on one host behind one route, so Pomerium's SSO and policy stand in front of every way into Paperless-ngx:

| Access channel | What gates it | Credential the client presents |
| --- | --- | --- |
| Web interface in a browser | Pomerium route policy, then Paperless-ngx's login | Pomerium SSO session, then a Paperless-ngx login |
| REST API | The same Pomerium route; API clients can't complete browser SSO, so the route blocks them | Paperless-ngx API token, on a path you deliberately provide |
| Mobile scanner apps | The same Pomerium route, with the same constraint as the API | Stored Paperless-ngx credentials or API token |

API clients and scanner apps authenticate to Paperless-ngx directly and can't complete browser SSO, so they don't work through this route. If you need them, the options are a separate [public access](/docs/reference/routes/public-access) route (not identity-protected, so Paperless-ngx's own auth becomes the only control), [a TCP tunnel](/docs/capabilities/non-http), or access over the private network. API clients that can send custom headers have one more option on Pomerium Zero or Enterprise: authenticate to this protected route with a [Pomerium service account](/docs/capabilities/service-accounts) token, with Paperless-ngx's API token authorizing the call as usual.

## Common failure modes

- **`HTTP 400 Bad Request` on every page.** `PAPERLESS_URL` doesn't match the route's **From** URL, so Django rejects the host. Set `PAPERLESS_URL` to exactly `https://paperless.yourdomain.com` and make sure `preserve_host_header` is enabled on the route.
- **Redirects or links point at the container name or the wrong host.** `preserve_host_header` isn't set, so Paperless-ngx sees `paperless:8000` instead of the public name. Enable it on the route.
- **`502` or `503` right after `docker compose up`.** Paperless-ngx hasn't finished its first-boot migrations and search-index build yet. Wait until `docker compose logs -f paperless` shows the web server listening; first boot routinely takes a couple of minutes.
- **CSRF verification failures when signing in or uploading.** The browser's `Origin` doesn't match Django's `CSRF_TRUSTED_ORIGINS`. This is the same root cause as the `400` above: keep `PAPERLESS_URL` and the route host identical, over HTTPS.
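
Because the `400` and CSRF failures above share one root cause, a quick comparison of the two URLs can rule it out. The sketch below compares only scheme, host, and port, ignoring any path. `origin_of` and `same_origin` are hypothetical helper names, and `sed -E` is assumed to be available (it is in both GNU and BSD `sed`).

```bash
# origin_of URL: print scheme://host[:port], dropping any path.
origin_of() {
  printf '%s\n' "$1" | sed -E 's#^([A-Za-z][A-Za-z0-9+.-]*://[^/]+).*$#\1#'
}

# same_origin A B: succeed only when both URLs share scheme, host, and port.
same_origin() {
  [ "$(origin_of "$1")" = "$(origin_of "$2")" ]
}
```

For example, `same_origin "$PAPERLESS_URL" "https://paperless.yourdomain.com" || echo "PAPERLESS_URL does not match the route"` flags a mismatch such as `http` versus `https`. The comparison is case-sensitive, so write both values in lowercase.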

## Security considerations

- **Don't expose Paperless-ngx directly**: only Pomerium should reach `paperless:8000`. The Compose file keeps Paperless-ngx (and its PostgreSQL and Redis) on an internal-only Docker network with no published host ports, so the only path in is through Pomerium and the policy can't be bypassed.
- Scope the route policy (group or domain) to who should have any access to Paperless-ngx at all. Paperless-ngx's per-user document permissions still apply on top of that.
- Paperless-ngx exposes an API and admin interface under the same host as the web interface. Because the whole host sits behind Pomerium, those surfaces inherit the same SSO and policy; don't add a second public route that bypasses them.
- Generate a unique `PAPERLESS_SECRET_KEY` and strong database and admin passwords. The placeholders in this guide are examples, not safe defaults.
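
To confirm the first point on a running stack, you can list the host port mappings per service. This is a sketch under two assumptions: it runs from the directory that holds the Compose file, and the service is already running (otherwise the inner command returns no container ID and `docker port` reports an error). `published_ports` is a hypothetical helper name.

```bash
# published_ports SERVICE: list host port mappings for a Compose service.
# Prints nothing when the service publishes no ports.
published_ports() {
  docker port "$(docker compose ps -q "$1")"
}
```

With the Compose file in this guide, `published_ports paperless` would be expected to print nothing, while `published_ports pomerium` should list the mappings for ports 80 and 443.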

## Next steps

- **Let Pomerium sign users in.** Paperless-ngx supports trusted-header SSO ([`PAPERLESS_ENABLE_HTTP_REMOTE_USER`](https://docs.paperless-ngx.com/configuration/)). Set `pass_identity_headers: true` on the route so Pomerium forwards the verified identity as an `X-Pomerium-Claim-*` header, then point `PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME` at that header so Paperless-ngx logs the user in directly instead of keeping a separate login. Only do this when Pomerium is the sole path in and strips any client-supplied copy of that header.
- [Build policies](/docs/get-started/fundamentals/zero/zero-build-policies)
- [Custom domains](/docs/capabilities/custom-domains)
- [Self-host the identity provider](/docs/integrations/user-identity/oidc)
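
For the trusted-header option in the first item above, note that Django exposes request headers through `HttpRequest.META`, where the header name is upper-cased, hyphens become underscores, and `HTTP_` is prefixed. Paperless-ngx's configuration reference describes `PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME` in those terms. The sketch below performs that conversion. `to_meta_name` is a made-up helper, and `X-Pomerium-Claim-Email` is only an example header, so confirm which header your route actually forwards before using the result.

```bash
# to_meta_name HEADER: convert an HTTP header name to Django's META key form
# (upper-case, hyphens to underscores, HTTP_ prefix).
to_meta_name() {
  printf 'HTTP_%s\n' "$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')"
}

to_meta_name X-Pomerium-Claim-Email
# prints HTTP_X_POMERIUM_CLAIM_EMAIL
```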
20 changes: 20 additions & 0 deletions content/examples/guides/paperless-ngx/config.yaml
@@ -0,0 +1,20 @@
# Pomerium Core configuration for Paperless-ngx. Uses the hosted authenticate
# service, so you don't run your own identity provider. To self-host the IdP, see
# the Keycloak guide: https://www.pomerium.com/docs/integrations/user-identity/oidc
authenticate_service_url: https://authenticate.pomerium.app

# Obtain TLS certificates automatically from Let's Encrypt.
autocert: true

routes:
  - from: https://paperless.yourdomain.com
    to: http://paperless:8000
    # Paperless-ngx is a Django app: it validates the Host header against
    # ALLOWED_HOSTS (derived from PAPERLESS_URL) and uses it for CSRF checks, so
    # forward the original Host unchanged or it answers HTTP 400.
    preserve_host_header: true
    policy:
      - allow:
          or:
            - email:
                is: you@example.com
22 changes: 22 additions & 0 deletions content/examples/guides/paperless-ngx/config.yaml.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
```yaml title="config.yaml"
# Pomerium Core configuration for Paperless-ngx. Uses the hosted authenticate
# service, so you don't run your own identity provider. To self-host the IdP, see
# the Keycloak guide: https://www.pomerium.com/docs/integrations/user-identity/oidc
authenticate_service_url: https://authenticate.pomerium.app

# Obtain TLS certificates automatically from Let's Encrypt.
autocert: true

routes:
  - from: https://paperless.yourdomain.com
    to: http://paperless:8000
    # Paperless-ngx is a Django app: it validates the Host header against
    # ALLOWED_HOSTS (derived from PAPERLESS_URL) and uses it for CSRF checks, so
    # forward the original Host unchanged or it answers HTTP 400.
    preserve_host_header: true
    policy:
      - allow:
          or:
            - email:
                is: you@example.com
```
75 changes: 75 additions & 0 deletions content/examples/guides/paperless-ngx/docker-compose.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
services:
  pomerium:
    image: pomerium/pomerium:v0.32.7@sha256:e10d1d267af24f581157f485d9b0bc08469e2428675b696a08e42ceb09b2279c
    volumes:
      - ./config.yaml:/pomerium/config.yaml:ro
      - pomerium-cache:/data
    ports:
      - 443:443
      - 80:80
    # Pomerium is the only service on both networks: the default network for public
    # traffic, and the internal-only network to reach Paperless. This bridge is the
    # single path in, so the policy can't be bypassed.
    networks:
      default: {}
      paperless-internal: {}
    restart: always

  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:2.18.4@sha256:3421ebe06ed27662d014046cf5089e612de853aae0c676a2bc72f73b38080e57
    depends_on:
      - db
      - redis
    environment:
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: db
      PAPERLESS_DBUSER: paperless
      PAPERLESS_DBPASS: change-this-database-password
      PAPERLESS_DBNAME: paperless
      # Generate your own: openssl rand -base64 48
      PAPERLESS_SECRET_KEY: change-this-to-a-long-random-string
      # Must equal the route's from URL in config.yaml, or Django answers HTTP 400
      # behind the proxy (ALLOWED_HOSTS / CSRF_TRUSTED_ORIGINS are derived from this).
      PAPERLESS_URL: https://paperless.yourdomain.com
      # Bootstraps the first superuser on initial startup only.
      PAPERLESS_ADMIN_USER: admin
      PAPERLESS_ADMIN_PASSWORD: change-this-admin-password
    volumes:
      - paperless-data:/usr/src/paperless/data
      - paperless-media:/usr/src/paperless/media
    networks:
      - paperless-internal
    restart: always

  db:
    image: postgres:16-alpine@sha256:16bc17c64a573ef34162af9298258d1aec548232985b33ed7b1eac33ba35c229
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: change-this-database-password
    volumes:
      - paperless-db:/var/lib/postgresql/data
    networks:
      - paperless-internal
    restart: always

  redis:
    image: redis:7-alpine@sha256:6ab0b6e7381779332f97b8ca76193e45b0756f38d4c0dcda72dbb3c32061ab99
    volumes:
      - paperless-redis:/data
    networks:
      - paperless-internal
    restart: always

networks:
  # Internal-only: no route to the outside, so Paperless, Postgres, and Redis are
  # reachable only via Pomerium, which is the lone service bridging it to default.
  paperless-internal:
    internal: true

volumes:
  pomerium-cache:
  paperless-data:
  paperless-media:
  paperless-db:
  paperless-redis:
77 changes: 77 additions & 0 deletions content/examples/guides/paperless-ngx/docker-compose.yaml.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
```yaml title="docker-compose.yaml"
services:
  pomerium:
    image: pomerium/pomerium:v0.32.7@sha256:e10d1d267af24f581157f485d9b0bc08469e2428675b696a08e42ceb09b2279c
    volumes:
      - ./config.yaml:/pomerium/config.yaml:ro
      - pomerium-cache:/data
    ports:
      - 443:443
      - 80:80
    # Pomerium is the only service on both networks: the default network for public
    # traffic, and the internal-only network to reach Paperless. This bridge is the
    # single path in, so the policy can't be bypassed.
    networks:
      default: {}
      paperless-internal: {}
    restart: always

  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:2.18.4@sha256:3421ebe06ed27662d014046cf5089e612de853aae0c676a2bc72f73b38080e57
    depends_on:
      - db
      - redis
    environment:
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: db
      PAPERLESS_DBUSER: paperless
      PAPERLESS_DBPASS: change-this-database-password
      PAPERLESS_DBNAME: paperless
      # Generate your own: openssl rand -base64 48
      PAPERLESS_SECRET_KEY: change-this-to-a-long-random-string
      # Must equal the route's from URL in config.yaml, or Django answers HTTP 400
      # behind the proxy (ALLOWED_HOSTS / CSRF_TRUSTED_ORIGINS are derived from this).
      PAPERLESS_URL: https://paperless.yourdomain.com
      # Bootstraps the first superuser on initial startup only.
      PAPERLESS_ADMIN_USER: admin
      PAPERLESS_ADMIN_PASSWORD: change-this-admin-password
    volumes:
      - paperless-data:/usr/src/paperless/data
      - paperless-media:/usr/src/paperless/media
    networks:
      - paperless-internal
    restart: always

  db:
    image: postgres:16-alpine@sha256:16bc17c64a573ef34162af9298258d1aec548232985b33ed7b1eac33ba35c229
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: change-this-database-password
    volumes:
      - paperless-db:/var/lib/postgresql/data
    networks:
      - paperless-internal
    restart: always

  redis:
    image: redis:7-alpine@sha256:6ab0b6e7381779332f97b8ca76193e45b0756f38d4c0dcda72dbb3c32061ab99
    volumes:
      - paperless-redis:/data
    networks:
      - paperless-internal
    restart: always

networks:
  # Internal-only: no route to the outside, so Paperless, Postgres, and Redis are
  # reachable only via Pomerium, which is the lone service bridging it to default.
  paperless-internal:
    internal: true

volumes:
  pomerium-cache:
  paperless-data:
  paperless-media:
  paperless-db:
  paperless-redis:
```