Diff: architecture/multi-engine-session-coordination.es

From a8a019c to a8a019c

+0 / −0 lines
---
schema: foundry-doc-v1
title: "Multi-engine session coordination — session locks, boot_id, and role guards"
slug: multi-engine-session-coordination
language: en
category: architecture
type: topic
status: active
bcsc_class: public-disclosure-safe
last_edited: 2026-05-25
editor: pointsav-engineering
cites: []
paired_with: multi-engine-session-coordination.es.md
---

Totebox Orchestration supports multiple AI engines and human operators working concurrently on the same host. The coordination problem is not theoretical: when two sessions touch the same `.git/index`, the working tree corrupts in ways that are expensive to diagnose.

The protocol is intentionally minimal: each engine writes `.agent/engines/<engine-id>/session.lock` at startup. The lock carries the engine identifier, session role, parent PID, ISO-8601 start time, and the boot ID from `/proc/sys/kernel/random/boot_id`. The boot ID is the key: it lets a later session decide whether a lock is stale or potentially live. A different boot ID means the host rebooted between sessions, so the lock is definitively dead. The same boot ID means the holder may still be running, and `kill -0 <pid>` tests whether the process is alive.
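
The write-then-classify flow described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the JSON layout, the field names, and the helpers `write_session_lock`, `pid_alive`, and `classify_lock` are assumptions made for the example, since the source does not specify the on-disk format of `session.lock`.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path

BOOT_ID_PATH = Path("/proc/sys/kernel/random/boot_id")  # Linux only


def read_boot_id() -> str:
    """Return the kernel's per-boot random identifier."""
    return BOOT_ID_PATH.read_text().strip()


def write_session_lock(root: Path, engine_id: str, role: str, boot_id: str) -> Path:
    """Write <root>/.agent/engines/<engine-id>/session.lock (JSON is an assumed format)."""
    lock_path = root / ".agent" / "engines" / engine_id / "session.lock"
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "engine": engine_id,
        "role": role,
        "ppid": os.getppid(),
        "started": datetime.now(timezone.utc).isoformat(),  # ISO-8601
        "boot_id": boot_id,
    }
    lock_path.write_text(json.dumps(record, indent=2))
    return lock_path


def pid_alive(pid: int) -> bool:
    """Python equivalent of `kill -0 <pid>`: probe the process without signalling it."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def classify_lock(record: dict, current_boot_id: str) -> str:
    """Return 'stale-boot', 'dead-pid', or 'live' for a parsed lock record."""
    if record["boot_id"] != current_boot_id:
        return "stale-boot"  # host rebooted since the lock was written
    return "live" if pid_alive(record["ppid"]) else "dead-pid"
```

An engine would call `write_session_lock(root, engine_id, role, read_boot_id())` at startup. A later session would parse each lock it finds and pass it to `classify_lock` together with its own `read_boot_id()` value.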
The [[totebox-session]] model assigns exactly one hub session to the workspace root. That session writes its lock to `.agent/role.lock`; a second attempt to start a hub session errors out unless the operator manually clears the lock. Archive sessions are scoped to individual archives and write their locks under that archive's `.agent/engines/<engine-id>/session.lock`.
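
One way to get the "second attempt errors out" behaviour is exclusive file creation, sketched below. The source does not say how `role.lock` is created or what it contains, so the use of `O_CREAT | O_EXCL`, the file contents, and the names `claim_hub_role` and `HubAlreadyRunning` are all assumptions for illustration.

```python
import os
from pathlib import Path


class HubAlreadyRunning(RuntimeError):
    """Raised when .agent/role.lock already exists (hypothetical error type)."""


def claim_hub_role(root: Path, engine_id: str) -> Path:
    """Atomically create <root>/.agent/role.lock, failing if it is already present."""
    lock_path = root / ".agent" / "role.lock"
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # O_EXCL makes creation atomic: exactly one caller can succeed.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        holder = lock_path.read_text().strip() or "unknown"
        raise HubAlreadyRunning(
            f"hub role already held by {holder}; clear {lock_path} manually"
        ) from None
    with os.fdopen(fd, "w") as fh:
        fh.write(engine_id + "\n")
    return lock_path
```

With this approach the lock is never overwritten silently; the only way past the error is for an operator to delete the file, which matches the manual-clear rule above.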
What this does not solve: two engines opened in the same archive. The session-lock protocol detects the conflict and warns, but does not physically prevent it; `flock` on `.git/index` does that. A planned PreToolUse hook adds a check that refuses any write call in an archive whose `session.lock` shows a different live engine. The workspace health-check tool includes a cross-user `index.lock` detector that surfaces same-archive locks held by different operators.
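
The difference between warning and physically preventing can be illustrated with an exclusive, non-blocking `flock`. This is a sketch under assumptions: the helper name `exclusive_flock` is hypothetical, the path is passed in by the caller, and the source does not describe how the project actually wraps the lock. Note that `flock` is advisory, so it only protects against processes that also take the lock.

```python
import fcntl
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def exclusive_flock(path: Path):
    """Hold an exclusive advisory lock on an existing file for the duration of the block."""
    fd = os.open(path, os.O_RDWR)
    try:
        # LOCK_NB: raise BlockingIOError at once instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A writer would wrap its write in `with exclusive_flock(path): ...` and treat `BlockingIOError` as "another engine is active here, refuse the write".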
Stale-lock cleanup is automatic when boot IDs disagree, manual otherwise. A cleanup pass on 2026-05-18 removed 8 stale locks: 3 from a previous boot and 5 from dead PIDs in the current boot. Hub sessions should run the health-check tool early and clear stale locks before opening any archive.
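
The automatic/manual split can be sketched as a sweep that deletes only locks from a different boot and reports everything else for an operator to review. This is not the health-check tool itself: the function name `sweep_stale_locks` and the JSON lock format with a `boot_id` field are assumptions made for the example.

```python
import json
from pathlib import Path


def sweep_stale_locks(root: Path, current_boot_id: str) -> tuple[list[Path], list[Path]]:
    """Remove locks from other boots; return (removed, needs_review)."""
    removed: list[Path] = []
    needs_review: list[Path] = []
    for lock_path in sorted(root.rglob("session.lock")):
        engines_dir = lock_path.parent.parent
        # Only consider files laid out as .agent/engines/<engine-id>/session.lock.
        if engines_dir.name != "engines" or engines_dir.parent.name != ".agent":
            continue
        try:
            record = json.loads(lock_path.read_text())
        except (OSError, json.JSONDecodeError):
            needs_review.append(lock_path)  # unreadable: leave it for a human
            continue
        if not isinstance(record, dict) or "boot_id" not in record:
            needs_review.append(lock_path)  # cannot prove staleness, so do not delete
        elif record["boot_id"] != current_boot_id:
            lock_path.unlink()  # host rebooted, so the lock is definitively dead
            removed.append(lock_path)
        else:
            needs_review.append(lock_path)  # same boot: liveness check is manual
    return removed, needs_review
```

Locks in the second list share the current boot ID, so whether they are dead depends on the PID check, which this sketch deliberately leaves to the operator.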

## See also

- [[totebox-session]]: the session model whose concurrency guarantees this protocol protects
- [[mailbox-atomicity]]: the complementary atomic-write discipline for cross-session communication
- [[foundry-services-slice-model]]: the cgroup partition that isolates resource consumption in the same multi-developer environment
- [[totebox-orchestration-development]]: the orchestration layer these sessions operate within