As in the title: the upsert action can potentially create duplicates:
POST /collections/{id}/bulk_items with body: { "method": "upsert", "items": {...} }
When exist_ok=True (UPSERT):
- No duplicate check is performed
- Target index is determined by the new datetime
- If the item already exists in an old index (under a different datetime), it stays there
- The item now exists in TWO indexes
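
The steps above can be reproduced with a small in-memory sketch. Everything in it is an assumption made for illustration: the `indexes` dict stands in for the real datetime-partitioned indexes, and `target_index`, `naive_upsert`, `locate` and the one-index-per-month naming are hypothetical, not the project's actual code.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for datetime-partitioned indexes:
# index name -> {(collection_id, item_id): item}
indexes: dict[str, dict[tuple[str, str], dict]] = {}


def target_index(collection_id: str, dt: datetime) -> str:
    # Assumed routing rule: one index per collection per month.
    return f"items_{collection_id}_{dt:%Y-%m}"


def naive_upsert(collection_id: str, item: dict) -> str:
    # Mirrors the behaviour described above: no duplicate check,
    # the target index depends only on the new datetime.
    name = target_index(collection_id, item["datetime"])
    indexes.setdefault(name, {})[(collection_id, item["id"])] = item
    return name


def locate(collection_id: str, item_id: str) -> list[str]:
    # Every index that currently holds the item.
    key = (collection_id, item_id)
    return sorted(name for name, docs in indexes.items() if key in docs)


naive_upsert("c1", {"id": "a", "datetime": datetime(2024, 1, 15)})
naive_upsert("c1", {"id": "a", "datetime": datetime(2024, 3, 2)})
print(locate("c1", "a"))  # ['items_c1_2024-01', 'items_c1_2024-03']
```

The second call changes only the datetime, yet the item ends up stored under both the January and the March index.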
A similar situation occurs for PUT /items/{id} (update_item)
Proposed solutions
A) Validate the conflict and return a 409 error if the item currently lives in a different index than the one it would be written to after the update
B) Create a smart upsert/put that checks the indexes for the unique key (item_id, collection_id) and, if it is present in another index, DELETEs the duplicate and inserts the new item in the right index
A is short and safe but breaks how upsert works (an upsert that would normally perform an update would now throw an error, but we don't create a duplicate)
B is complicated and risky: instead of a standard insert/update we would now have to call DELETE in the process, risking deletion on an upsert call
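
To make the trade-off concrete, here is a minimal sketch of both options against the same kind of in-memory stand-in. All names (`IndexConflict`, `target_index`, `find_indexes`, `upsert_strict`, `upsert_moving`) and the monthly index naming are hypothetical. For B, the sketch writes the new copy before deleting the stale one; that ordering is an assumption of this sketch (it leaves a duplicate rather than a lost item if the delete fails), not part of the proposal itself.

```python
from datetime import datetime

Key = tuple[str, str]  # (collection_id, item_id)
Indexes = dict[str, dict[Key, dict]]  # index name -> documents


class IndexConflict(Exception):
    """Hypothetical error; an API layer could map it to HTTP 409."""


def target_index(collection_id: str, dt: datetime) -> str:
    # Assumed routing rule: one index per collection per month.
    return f"items_{collection_id}_{dt:%Y-%m}"


def find_indexes(indexes: Indexes, key: Key) -> list[str]:
    # Every index that currently holds the item.
    return sorted(name for name, docs in indexes.items() if key in docs)


def upsert_strict(indexes: Indexes, collection_id: str, item: dict) -> str:
    # Option A: refuse the write if the item lives in another index.
    key = (collection_id, item["id"])
    target = target_index(collection_id, item["datetime"])
    stale = [name for name in find_indexes(indexes, key) if name != target]
    if stale:
        raise IndexConflict(f"{key} already stored in {stale}")
    indexes.setdefault(target, {})[key] = item
    return target


def upsert_moving(indexes: Indexes, collection_id: str, item: dict) -> str:
    # Option B: write to the right index, then delete stale copies.
    key = (collection_id, item["id"])
    target = target_index(collection_id, item["datetime"])
    stale = [name for name in find_indexes(indexes, key) if name != target]
    indexes.setdefault(target, {})[key] = item
    for name in stale:
        del indexes[name][key]
    return target


store: Indexes = {}
upsert_moving(store, "c1", {"id": "a", "datetime": datetime(2024, 1, 15)})
upsert_moving(store, "c1", {"id": "a", "datetime": datetime(2024, 3, 2)})
print(find_indexes(store, ("c1", "a")))  # ['items_c1_2024-03']
```

With `upsert_strict`, the second call above would raise `IndexConflict` instead of moving the item, which is the behaviour change described for A.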