"
+```
+
+---
+
+## Self-Review notes
+
+- **Spec coverage:** backend reuse (Task 1 verifies), new component (Task 3), registry wiring (Task 4), page render (Task 5), label strings (Task 2), testing (Tasks 1 + 6). All spec sections covered.
+- **Type consistency:** component named `ProcessingStatusFilter` in Tasks 3 and 4; query field `has_detections` in Tasks 1, 4, 5; STRING keys `PROCESSED` / `NOT_PROCESSED` in Tasks 2 and 3.
+- **Out of scope (later PRs):** date range, site, device filters — see design doc.
diff --git a/docs/claude/reference/captures-processed-count-strategies.md b/docs/claude/reference/captures-processed-count-strategies.md
new file mode 100644
index 000000000..85cfe4e35
--- /dev/null
+++ b/docs/claude/reference/captures-processed-count-strategies.md
@@ -0,0 +1,132 @@
+# Captures list: `processed` / `has_detections` COUNT strategies
+
+**Created:** 2026-05-29 (PR #1326). **Status:** reference — records a strategy that
+was prototyped, benchmarked, and deliberately *not* shipped.
+
+## Context
+
+The captures list (`SourceImageViewSet`, `ami/main/api/views.py`) supports two
+existence filters:
+
+- `?processed=true|false` — capture has *any* `Detection` row, including the null
+ markers (`NULL_DETECTIONS_FILTER = Q(bbox__isnull=True) | Q(bbox=[])`) that record
+ a "processed, found nothing" result.
+- `?has_detections=true|false` — capture has a *real* detection (bounding box
+ present). Null markers excluded.
+
+Both translate to an `EXISTS` / `NOT EXISTS` subquery against `main_detection`. The
+**page of rows** is cheap (the `LIMIT` prunes early), but the **pagination COUNT**
+has no `LIMIT`, so `NOT EXISTS` becomes an anti-join over the whole source-image
+table.
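The difference between the two filters described above can be modelled in plain Python. This is an illustrative sketch only, not the Django implementation: the `Detection` dataclass and the helper names are hypothetical, and the null-marker test assumes the semantics of `NULL_DETECTIONS_FILTER` as stated (bbox is NULL or an empty list).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """Hypothetical stand-in for a row in main_detection."""

    source_image_id: int
    bbox: Optional[List[float]]  # None or [] marks "processed, found nothing"


def is_null_marker(det: Detection) -> bool:
    # Mirrors NULL_DETECTIONS_FILTER: bbox IS NULL or bbox = []
    return det.bbox is None or det.bbox == []


def is_processed(capture_id: int, detections: List[Detection]) -> bool:
    # ?processed=true: any Detection row counts, null markers included
    return any(d.source_image_id == capture_id for d in detections)


def has_detections(capture_id: int, detections: List[Detection]) -> bool:
    # ?has_detections=true: only rows with a real bounding box count
    return any(
        d.source_image_id == capture_id and not is_null_marker(d)
        for d in detections
    )


detections = [
    Detection(source_image_id=1, bbox=[0.1, 0.2, 0.5, 0.6]),  # real detection
    Detection(source_image_id=2, bbox=None),  # null marker
    Detection(source_image_id=3, bbox=[]),  # null marker (empty bbox)
]
```

In this toy data, captures 2 and 3 are processed but have no detections, and a capture with no rows at all (for example id 4) matches neither filter.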
+
+## What shipped (PR #1326)
+
+- `?processed=` / `?has_detections=` filters.
+- Sortable `last_processed` column (correlated subquery: most recent detection
+ `created_at`).
+- Index `det_srcimg_created_idx` on `Detection(source_image, -created_at)`
+ (migration `0088`) — supports the `last_processed` **sort**.
+- The pagination COUNT uses the **default DRF count** (the plain anti-join). No
+ custom count strategy.
+
+## The strategy that was NOT shipped: count by subtraction
+
+Prototype (reverted): a `SourceImagePagination` whose `get_count` computed the
+existence-filter count without the anti-join:
+
+```
+total = COUNT(*) over the project/event/deployment/collection-scoped captures
+processed_count = COUNT(DISTINCT source_image_id) off main_detection, scoped to the same captures
+ (has_detections: also .exclude(NULL_DETECTIONS_FILTER))
+
+processed=true -> processed_count
+processed=false -> total - processed_count
+```
+
+Both counts are exact. The cost scales with the number of *detection* rows rather
+than the processed/unprocessed ratio, so it is symmetric (fast in both directions).
+Implementation notes if it is ever revived:
+
+- DRF sets `paginator.request` *after* `get_count()` runs, so `paginate_queryset`
+ must stash `request` + `view` first.
+- The base queryset (scoped, but *without* the processed/has_detections predicate)
+ was rebuilt in the view by applying only `DjangoFilterBackend` — *not* the
+ ordering backend, which would reference the absent `last_processed` annotation.
+- The detection-side count used `Detection.objects.filter(source_image__in=base.values("pk"))`.
+ This is an `IN (subquery)` semi-join; its plan is less predictable than a direct
+ `source_image__project_id=` join (see cold-spike below).
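The count-by-subtraction idea above can be checked against the plain anti-join on a toy dataset. The sketch below uses the standard-library `sqlite3` module rather than PostgreSQL or the Django ORM; the `capture` / `detection` tables, their columns, and the function names are hypothetical stand-ins for `main_sourceimage` / `main_detection`. It also uses a direct join for the detection-side count instead of the `IN (subquery)` form the prototype used.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE capture (id INTEGER PRIMARY KEY, project_id INTEGER);
        CREATE TABLE detection (
            id INTEGER PRIMARY KEY,
            capture_id INTEGER REFERENCES capture(id),
            bbox TEXT  -- NULL plays the role of a null marker
        );
        """
    )
    conn.executemany(
        "INSERT INTO capture (id, project_id) VALUES (?, ?)",
        [(1, 18), (2, 18), (3, 18), (4, 18), (5, 20)],
    )
    conn.executemany(
        "INSERT INTO detection (capture_id, bbox) VALUES (?, ?)",
        [(1, "[0,0,1,1]"), (1, "[1,1,2,2]"), (2, None), (5, "[0,0,1,1]")],
    )
    return conn


def count_by_anti_join(conn: sqlite3.Connection, project_id: int, processed: bool) -> int:
    """Default approach: EXISTS / NOT EXISTS evaluated per capture."""
    op = "EXISTS" if processed else "NOT EXISTS"
    sql = f"""
        SELECT COUNT(*) FROM capture c
        WHERE c.project_id = ?
          AND {op} (SELECT 1 FROM detection d WHERE d.capture_id = c.id)
    """
    return conn.execute(sql, (project_id,)).fetchone()[0]


def count_by_subtraction(conn: sqlite3.Connection, project_id: int, processed: bool) -> int:
    """Two cheap counts; the 'false' case is total minus processed."""
    total = conn.execute(
        "SELECT COUNT(*) FROM capture WHERE project_id = ?", (project_id,)
    ).fetchone()[0]
    processed_count = conn.execute(
        """
        SELECT COUNT(DISTINCT d.capture_id)
        FROM detection d JOIN capture c ON c.id = d.capture_id
        WHERE c.project_id = ?
        """,
        (project_id,),
    ).fetchone()[0]
    return processed_count if processed else total - processed_count
```

Both functions return the same numbers for either value of `processed`; only the query shape differs. A `has_detections` variant would additionally exclude null markers from the detection-side count.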
+
+## Why it was reverted
+
+The original justification was a **12.8s** COUNT for `processed=false` on the
+929k-capture project. Deploy-time benchmarking on the Serbia dev box (hardware
+comparable to production) showed that number does not reproduce there.
+
+### Benchmarks
+
+Local dev box (cold, low RAM, 8 GB source-image table not cached) — the numbers
+that originally motivated subtraction:
+
+| project / captures | filter | default anti-join | subtraction |
+|---|---|---|---|
+| 18 / 929k (local) | processed=false | **12.8s** | ~1.7s |
+| 18 / 929k (local) | processed=true | 4.8s | ~1.7s |
+| 18 / 929k (local) | has_detections=false | 11.5s | ~1.9s |
+| 18 / 929k (local) | has_detections=true | 3.5s | 0.2s |
+
+Serbia dev box (cold), real data — the numbers that changed the decision:
+
+| project / captures | filter | default anti-join | subtraction |
+|---|---|---|---|
+| 18 / 929k | processed=false | **1.38s** | 0.58s |
+| 18 / 929k | processed=true | 1.52s | 0.58s |
+| 20 / 105k | processed=true | 0.44s | **7.71s cold** / 0.01s warm |
+| 20 / 105k | processed=false | 0.27s | 0.04s |
+
+Both approaches returned identical counts, so subtraction is correct (`true` /
+`false` per filter): project 18 → 17938 / 910996; project 20 → 8517 / 96574
+(processed), 8476 / 96615 (has_detections).
+
+### Findings
+
+1. **The 12.8s was environment-dependent, not algorithmic.** `EXPLAIN (ANALYZE)`
+ for `processed=false` on project 18 (Serbia):
+
+ ```
+ Finalize Aggregate (actual time=1541..1567)
+ -> Parallel Hash Right Anti Join (rows=303665)
+ -> Parallel Seq Scan on main_detection (rows=455239)
+ -> Parallel Hash
+ -> Parallel Seq Scan on main_sourceimage (Filter: project_id = 18)
+ Execution Time: 1609 ms
+ ```
+
+ The anti-join seq-scans the wide source-image table. Serbia's RAM / OS cache /
+ parallel workers do it in ~1.6s; the local box did it in 12.8s cold. Serbia ≈
+ production, so the real-world cost is far smaller than the local measurement.
+
+2. **`det_srcimg_created_idx` is not used by the COUNT** — the anti-join plan
+ ignores it. It only helps the `last_processed` sort. So the index already in the
+ PR does nothing for the count either way.
+
+3. **Subtraction has its own cold-plan risk.** On the *smaller* project 20 the
+ detection-side `IN (subquery)` distinct spiked to 7.71s on first disk touch
+ (cold seq scan of `main_detection`), settling to sub-second warm — *slower* than
+ the 0.44s default for that case. `EXPLAIN` (warm) = 634ms via a nested-loop +
+ pkey memoize + distinct.
+
+Net: subtraction is a modest, real win on the largest project (0.58s vs 1.38s) and
+would protect a cold / memory-pressured environment, but it adds a custom
+paginator, a base-queryset rebuild, a second query, and an unpredictable cold plan,
+for a benefit that is small on production-class hardware. Not worth it for this PR.
+
+## General direction
+
+The durable fix for "COUNT is slow on huge filtered lists" is not per-filter
+bespoke counting but an **estimated-count paginator** (ticket #1328): use the
+PostgreSQL planner's row estimate (`EXPLAIN (FORMAT JSON)` → `Plan["Plan Rows"]`,
+<15ms, within ~3% of the exact count where it matters) with an exact-count fallback
+below a threshold. That handles *any* filter, not just existence filters. Subtraction
+(exact, existence-filters-only) remains a possible fast path to layer underneath it
+if exactness is required. See also the annotation-strip count trick in
+`ProjectPagination.get_count` and PR #1317.
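The estimated-count direction described above can be sketched without a database by parsing the JSON that `EXPLAIN (FORMAT JSON)` returns. This is a minimal illustration under stated assumptions, not the design of ticket #1328: the function names and the `EXACT_COUNT_THRESHOLD` value are hypothetical, and the caller is assumed to have run `EXPLAIN` on the filtered `SELECT` itself rather than on a `COUNT(*)` query, since the top node of an aggregate plan reports a single row.

```python
import json
from typing import Callable, Tuple

# Hypothetical cut-off: below this, an exact COUNT is assumed to be cheap enough.
EXACT_COUNT_THRESHOLD = 10_000


def planner_row_estimate(explain_json: str) -> int:
    """Read the top-level row estimate from EXPLAIN (FORMAT JSON) output.

    PostgreSQL returns a JSON array with one object whose "Plan" key holds
    the root plan node; "Plan Rows" is the planner's estimate for that node.
    """
    plan = json.loads(explain_json)
    return int(plan[0]["Plan"]["Plan Rows"])


def count_with_estimate(
    explain_json: str,
    exact_count: Callable[[], int],
    threshold: int = EXACT_COUNT_THRESHOLD,
) -> Tuple[int, bool]:
    """Return (count, is_exact).

    Small result sets fall back to the exact count; large ones use the
    planner estimate and skip the expensive COUNT entirely.
    """
    estimate = planner_row_estimate(explain_json)
    if estimate < threshold:
        return exact_count(), True
    return estimate, False


# Hand-written plans standing in for real EXPLAIN output.
large_plan = json.dumps([{"Plan": {"Node Type": "Seq Scan", "Plan Rows": 910996}}])
small_plan = json.dumps([{"Plan": {"Node Type": "Index Scan", "Plan Rows": 120}}])
```

The `exact_count` callable is only invoked on the small-result path, so in a real paginator it could wrap the default queryset count without paying for it on large lists.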
diff --git a/ui/src/components/filtering/filter-control.tsx b/ui/src/components/filtering/filter-control.tsx
index 36bf832ae..0e0944919 100644
--- a/ui/src/components/filtering/filter-control.tsx
+++ b/ui/src/components/filtering/filter-control.tsx
@@ -16,6 +16,7 @@ import { TaxaListFilter } from './filters/taxa-list-filter'
import { TaxonFilter } from './filters/taxon-filter'
import { TypeFilter } from './filters/type-filter'
import { FilterProps } from './filters/types'
+import { ProcessingStatusFilter } from './filters/processing-status-filter'
import { VerificationStatusFilter } from './filters/verification-status-filter'
import { VerifiedByFilter } from './filters/verified-by-filter'
@@ -30,6 +31,7 @@ const ComponentMap: {
deployment: StationFilter,
detections__source_image: ImageFilter,
event: SessionFilter,
+ processed: ProcessingStatusFilter,
include_unobserved: BooleanFilter,
job_type_key: TypeFilter,
not_algorithm: NotAlgorithmFilter,
diff --git a/ui/src/components/filtering/filters/processing-status-filter.tsx b/ui/src/components/filtering/filters/processing-status-filter.tsx
new file mode 100644
index 000000000..edfe33af9
--- /dev/null
+++ b/ui/src/components/filtering/filters/processing-status-filter.tsx
@@ -0,0 +1,30 @@
+import { Select } from 'nova-ui-kit'
+import { STRING, translate } from 'utils/language'
+import { booleanToString, stringToBoolean } from '../utils'
+import { FilterProps } from './types'
+
+export const ProcessingStatusFilter = ({ value, onAdd }: FilterProps) => {
+ const booleanValue = stringToBoolean(value)
+ const options = [
+ { value: true, label: translate(STRING.PROCESSED) },
+ { value: false, label: translate(STRING.NOT_PROCESSED) },
+ ]
+
+ return (
+
+
+
+
+
+ {options.map((option) => (
+
+ {option.label}
+
+ ))}
+
+
+ )
+}
diff --git a/ui/src/data-services/models/capture.ts b/ui/src/data-services/models/capture.ts
index c2f80bdec..054c6b1e7 100644
--- a/ui/src/data-services/models/capture.ts
+++ b/ui/src/data-services/models/capture.ts
@@ -96,6 +96,12 @@ export class Capture {
})
}
+ get lastProcessed(): Date | undefined {
+ return this._capture.last_processed
+ ? new Date(this._capture.last_processed)
+ : undefined
+ }
+
get deploymentId(): string | undefined {
return this._capture.deployment
? `${this._capture.deployment.id}`
diff --git a/ui/src/pages/captures/capture-columns.tsx b/ui/src/pages/captures/capture-columns.tsx
index ca27ef4d1..efb60be38 100644
--- a/ui/src/pages/captures/capture-columns.tsx
+++ b/ui/src/pages/captures/capture-columns.tsx
@@ -3,6 +3,7 @@ import { Capture } from 'data-services/models/capture'
import {
BasicTableCell,
CellTheme,
+ DateTableCell,
ImageCellTheme,
ImageTableCell,
TableColumn,
@@ -151,6 +152,12 @@ export const columns = ({
sortField: 'path',
renderCell: (item: Capture) => ,
},
+ {
+ id: 'last-processed',
+ name: translate(STRING.FIELD_LABEL_LAST_PROCESSED),
+ sortField: 'last_processed',
+ renderCell: (item: Capture) => ,
+ },
{
id: 'occurrences',
name: translate(STRING.FIELD_LABEL_OCCURRENCES),
diff --git a/ui/src/pages/captures/captures.tsx b/ui/src/pages/captures/captures.tsx
index 00f4b2558..71f2009dc 100644
--- a/ui/src/pages/captures/captures.tsx
+++ b/ui/src/pages/captures/captures.tsx
@@ -37,6 +37,7 @@ export const Captures = () => {
dimensions: true,
filename: false,
path: false,
+ 'last-processed': true,
})
const { selectedView, setSelectedView } = useSelectedView('table')
const { filters } = useFilters()
@@ -65,6 +66,7 @@ export const Captures = () => {
+
diff --git a/ui/src/utils/language.ts b/ui/src/utils/language.ts
index 5d46189aa..c07f6038c 100644
--- a/ui/src/utils/language.ts
+++ b/ui/src/utils/language.ts
@@ -115,6 +115,7 @@ export enum STRING {
FIELD_LABEL_JOBS,
FIELD_LABEL_KEY,
FIELD_LABEL_LAST_DATE,
+ FIELD_LABEL_LAST_PROCESSED,
FIELD_LABEL_LAST_SEEN,
FIELD_LABEL_LAST_SYNCED,
FIELD_LABEL_LATEST_JOB_STATUS,
@@ -308,11 +309,13 @@ export enum STRING {
MOST_OBSERVED_TAXA,
NEW_ID,
NOT_CONNECTED,
+ NOT_PROCESSED,
NOT_VERIFIED,
OR,
OVERVIEW,
PIPELINES,
PROCESS,
+ PROCESSED,
RECENT,
REJECT_ID_SHORT,
REJECT_ID,
@@ -448,6 +451,7 @@ const ENGLISH_STRINGS: { [key in STRING]: string } = {
[STRING.FIELD_LABEL_JOBS]: 'Jobs',
[STRING.FIELD_LABEL_KEY]: 'Key',
[STRING.FIELD_LABEL_LAST_DATE]: 'Last date',
+ [STRING.FIELD_LABEL_LAST_PROCESSED]: 'Last processed',
[STRING.FIELD_LABEL_LAST_SEEN]: 'Last seen',
[STRING.FIELD_LABEL_LAST_SYNCED]: 'Last synced with data source',
[STRING.FIELD_LABEL_LATEST_JOB_STATUS]: 'Latest job status',
@@ -702,10 +706,12 @@ const ENGLISH_STRINGS: { [key in STRING]: string } = {
[STRING.MOST_OBSERVED_TAXA]: 'Most observed taxa',
[STRING.NEW_ID]: 'New ID',
[STRING.NOT_CONNECTED]: 'Not connected',
+ [STRING.NOT_PROCESSED]: 'Not processed',
[STRING.NOT_VERIFIED]: 'Not verified',
[STRING.OR]: 'Or',
[STRING.OVERVIEW]: 'Overview',
[STRING.PIPELINES]: 'Pipelines',
+ [STRING.PROCESSED]: 'Processed',
[STRING.RECENT]: 'Recent',
[STRING.REJECT_ID_SHORT]: 'Reject',
[STRING.REJECT_ID]: 'Reject ID',
diff --git a/ui/src/utils/useFilters.ts b/ui/src/utils/useFilters.ts
index 656e5c6f0..b028a2592 100644
--- a/ui/src/utils/useFilters.ts
+++ b/ui/src/utils/useFilters.ts
@@ -136,6 +136,13 @@ export const AVAILABLE_FILTERS = (projectId: string): FilterConfig[] => [
},
},
},
+ {
+ label: 'Processing status',
+ field: 'processed',
+ tooltip: {
+ text: 'Filter captures by whether they have been processed by a detection pipeline.',
+ },
+ },
{
label: 'Pipeline',
field: 'pipeline',