
set -> remove -> re-add plus an unrelated delete corrupts entity rows: get() returns another entity's data, queries yield dead entities #320

@Pyseph


After a completely ordinary sequence — set a component, remove it, set it again, then delete a different entity — world:get starts returning another entity's component data, and world:query yields entities that were already deleted (world:contains is false for them). No error is raised; the world is silently corrupted from that point on, and the corruption compounds with further deletes.

Affects: current main (3087601). Found during a systematic audit (context: #317); reproduces identically with and without that PR's changes.

Root cause

When world_remove (src/jecs.luau:3435) strips an entity's last component, archetype_traverse_remove ends in archetype_ensure, which returns ROOT_ARCHETYPE for an empty type list (:1071-1074). inner_entity_move (:2833) then appends the entity to ROOT_ARCHETYPE.entities via archetype_append (:719) and points record.row at that root slot. So far so good — the root archetype now legitimately tracks the entity.

But when a component is added back, world_set (:2973-2980) and world_add (:3072-3080) treat src == ROOT_ARCHETYPE as "fresh entity":

if not src_is_root_archetype then
    inner_entity_move(entity, record, to)
else
    new_entity(entity, record, to)
end

new_entity (:729-738) only appends to the destination archetype; unlike inner_entity_move, it never swap-removes the entity's old slot from ROOT_ARCHETYPE.entities. The entity is now listed in two archetypes, and the root entry is stale.
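The difference between the two paths can be shown with a toy model. This is a minimal sketch, not the jecs implementation: `append`, `swap_remove`, `move`, `append_only`, and the `{ archetype, row }` record layout are all hypothetical stand-ins for the behaviour described above.

```luau
-- Toy model, NOT jecs code: an archetype is a list of entity ids and a
-- record stores which archetype and row an entity currently occupies.
local function append(archetype, entity)
	local row = #archetype.entities + 1
	archetype.entities[row] = entity
	return row
end

-- Swap-remove: fill the vacated row with the last entry and fix its record.
local function swap_remove(archetype, row, records)
	local entities = archetype.entities
	local last = #entities
	if row ~= last then
		local moved = entities[last]
		entities[row] = moved
		records[moved].row = row
	end
	entities[last] = nil
end

-- Hypothetical "move" path: leave the old archetype, then join the new one.
local function move(entity, records, to)
	local record = records[entity]
	swap_remove(record.archetype, record.row, records)
	record.archetype = to
	record.row = append(to, entity)
end

-- Hypothetical "fresh entity" path: join the new archetype and nothing else.
local function append_only(entity, records, to)
	local record = records[entity]
	record.archetype = to
	record.row = append(to, entity)
end

-- Counts how many times an entity id is listed in an archetype.
local function count(archetype, entity)
	local n = 0
	for _, e in ipairs(archetype.entities) do
		if e == entity then
			n = n + 1
		end
	end
	return n
end

-- Entity 1 sits component-less in root, then gets a component added back.
local function scenario(readd)
	local root, with_a = { entities = {} }, { entities = {} }
	local records = { [1] = { archetype = root, row = 0 } }
	records[1].row = append(root, 1)
	readd(1, records, with_a)
	return count(root, 1), count(with_a, 1)
end

local moved_root, moved_a = scenario(move)
local stale_root, stale_a = scenario(append_only)
print("move:        root entries =", moved_root, "A entries =", moved_a)
print("append only: root entries =", stale_root, "A entries =", stale_a)
```

In this model the move path leaves no root entry for the entity, while the append-only path leaves one behind, so the entity is listed in both archetypes.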

The stale entry detonates on the next swap-remove in the root archetype. world_delete (:3456) calls archetype_delete(world, record.archetype, record.row) (:3474) even when the record's archetype is ROOT. archetype_delete (:1199-1224) then moves the last entry of ROOT_ARCHETYPE.entities (the stale duplicate) into the freed row and writes record_to_move.row = row (:1209-1212). That overwrites the live entity's record.row with a root-archetype row number, even though the record still points at its real, non-root archetype, where that row belongs to a different entity. From then on two records share one row: get/query return the other entity's data, and deleting either of the aliased entities swap-removes the wrong slot, leaving a dead entity id inside the archetype's entities array that queries keep yielding.
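The chain above can be replayed in a few lines of standalone Luau. This is a sketch under stated assumptions, not jecs code: the archetype and record tables, `append`, `swap_remove`, and `get` are hypothetical, and the ids 1, 2, 3 stand in for e1, e2, e3. The starting state is the one the repro reaches just before delete(e2).

```luau
-- Toy model of the failure chain, NOT jecs code. Rows are 1-based; the [A]
-- archetype keeps a data column parallel to its entities list.
local root = { entities = {} }
local arch_a = { entities = {}, column = {} }
local records = {}

local function append(archetype, entity, value)
	local row = #archetype.entities + 1
	archetype.entities[row] = entity
	if archetype.column then
		archetype.column[row] = value
	end
	return row
end

local function swap_remove(archetype, row)
	local entities, column = archetype.entities, archetype.column
	local last = #entities
	if row ~= last then
		local moved = entities[last]
		entities[row] = moved
		if column then
			column[row] = column[last]
		end
		-- Trusts that `moved` really lives in this archetype.
		records[moved].row = row
	end
	entities[last] = nil
	if column then
		column[last] = nil
	end
end

local function get(entity)
	local record = records[entity]
	return record.archetype.column[record.row]
end

local e1, e2, e3 = 1, 2, 3
records[e2] = { archetype = root, row = append(root, e2) }
append(root, e1) -- stale duplicate left behind by the append-only re-add
records[e3] = { archetype = arch_a, row = append(arch_a, e3, "e3-data") }
records[e1] = { archetype = arch_a, row = append(arch_a, e1, "e1-data") }

local before = get(e1)

-- delete(e2): swap-remove root row 1; the stale e1 entry is the last one,
-- so e1's record gets a root row number while still pointing at [A].
swap_remove(records[e2].archetype, records[e2].row)
records[e2] = nil
local after = get(e1)

-- delete(e1): its record now says [A] row 1, which is really e3's slot.
swap_remove(records[e1].archetype, records[e1].row)
records[e1] = nil

print("get(e1) before delete(e2):", before)
print("get(e1) after delete(e2): ", after)
print("[A] entities[1] after delete(e1):", arch_a.entities[1], "| get(e3):", get(e3))
```

The model ends in the same state as the observed run: get(e1) flips to e3's data after the unrelated delete, and after delete(e1) the only id left in [A] is the dead e1 while e3 reads back e1's old data.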

Repro

Save as repro.luau in the repo root, then run luau repro.luau:

local jecs = require("@jecs")

local world = jecs.world()
local A = world:component()

-- e2: set then remove the last component -> e2 is appended to the root archetype's entities (row 1)
local e2 = world:entity()
world:set(e2, A, "e2-data")
world:remove(e2, A)

-- e1: same dance -> root archetype entities row 2
local e1 = world:entity()
world:set(e1, A, "e1-data")
world:remove(e1, A)

-- e3: occupies archetype [A] row 1
local e3 = world:entity()
world:set(e3, A, "e3-data")

-- re-add A to e1: takes the "fresh entity" path and leaves a stale e1 entry in the root archetype
world:set(e1, A, "e1-data")

print("before: get(e1) =", world:get(e1, A), "| get(e3) =", world:get(e3, A))

-- delete the unrelated, now-empty entity e2
world:delete(e2)

print("after delete(e2): get(e1) =", world:get(e1, A), "| get(e3) =", world:get(e3, A))

local seen = {}
for e, v in world:query(A) do
	table.insert(seen, `{e}={v}`)
end
print("query(A):", table.concat(seen, ", "), "| e1 id:", e1, "| e3 id:", e3)

-- compounding corruption: delete e1
world:delete(e1)
local seen2 = {}
for e, v in world:query(A) do
	table.insert(seen2, `{e}={v} alive={world:contains(e)}`)
end
print("after delete(e1): query(A):", table.concat(seen2, ", "))
print("get(e3) =", world:get(e3, A), "(expected e3-data)")

Observed

before: get(e1) =	e1-data	| get(e3) =	e3-data
after delete(e2): get(e1) =	e3-data	| get(e3) =	e3-data
query(A):	273=e1-data, 274=e3-data	| e1 id:	273	| e3 id:	274
after delete(e1): query(A):	273=e1-data alive=false
get(e3) =	e1-data	(expected e3-data)

Deleting e2 — an empty, unrelated entity — makes get(e1, A) return e3's data. After delete(e1), query(A) yields the dead entity 273 (contains = false) while the live e3 (274) has vanished from the query and now reads back e1's old data.

Expected

delete(e2) must not affect e1 or e3: get(e1, A) should keep returning "e1-data", get(e3, A) should always return "e3-data", and query(A) should only ever yield alive entities paired with their own data.

Impact

The trigger is everyday usage: toggle a component off and back on (set → remove → set), then delete any other entity that happens to be sitting component-less in the root archetype — no exotic API calls involved. The result is silent cross-entity data aliasing and dead entity ids flowing out of queries, so a running game reads another entity's state (e.g. one player's data attributed to another) or operates on despawned entities, with no error at the point of corruption to trace back from. Severity is critical because the world's core invariant (one record ↔ one row) is permanently broken once it happens.
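The one record ↔ one row invariant can be expressed as a check that walks both directions of the mapping. The sketch below runs on a toy layout (archetypes with `name` and `entities`, records with `archetype` and `row`). `check_rows` is a hypothetical helper, and the real jecs internals may be organised differently.

```luau
-- Hypothetical checker over a toy layout, NOT a jecs API.
local function check_rows(archetypes, records)
	local problems = {}
	-- Every listed entity must be alive and recorded at exactly this slot.
	for _, archetype in ipairs(archetypes) do
		for row, entity in ipairs(archetype.entities) do
			local record = records[entity]
			local where = archetype.name .. " row " .. row
			if record == nil then
				table.insert(problems, where .. ": dead entity " .. entity)
			elseif record.archetype ~= archetype or record.row ~= row then
				table.insert(problems, where .. ": stale entry for entity " .. entity)
			end
		end
	end
	-- Every record must point at a row that actually holds its entity.
	for entity, record in pairs(records) do
		if record.archetype.entities[record.row] ~= entity then
			table.insert(problems, "entity " .. entity .. ": record points at a row it does not own")
		end
	end
	return problems
end

-- Healthy state: entity 2 in root, entities 3 and 1 in [A].
local root = { name = "root", entities = { 2 } }
local arch_a = { name = "A", entities = { 3, 1 } }
local healthy = {
	[1] = { archetype = arch_a, row = 2 },
	[2] = { archetype = root, row = 1 },
	[3] = { archetype = arch_a, row = 1 },
}
local healthy_problems = check_rows({ root, arch_a }, healthy)

-- Corrupted state modelled on the repro right after delete(e2): a stale
-- root entry for entity 1, and entity 1's record aliasing entity 3's row.
local bad_root = { name = "root", entities = { 1 } }
local bad_a = { name = "A", entities = { 3, 1 } }
local corrupted = {
	[1] = { archetype = bad_a, row = 1 },
	[3] = { archetype = bad_a, row = 1 },
}
local corrupted_problems = check_rows({ bad_root, bad_a }, corrupted)

print("healthy problems:", #healthy_problems)
for _, problem in ipairs(corrupted_problems) do
	print("corrupted:", problem)
end
```

On the healthy state the checker reports nothing. On the corrupted state it reports three problems: the stale root entry, the mismatched [A] row 2, and the record for entity 1 that points at a row owned by entity 3. A check of this shape, run after each mutation in a test suite, would flag the corruption at the delete that causes it rather than at a later read.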


This bug was found, diagnosed, and reproduced by Claude (Fable 5, via Claude Code) under my direction, as part of the same audited setup as #317. The output above is from a real run against main.

Labels: bug
