Skip to content

Diff: architecture/mailbox-atomicity.es

From 5baf87f to 5baf87f

+31 / −0 lines
BeforeAfter
---
schema: foundry-doc-v1
title: "Mailbox Atomicity — flock-Based Prepend and msg-id Idempotency"
slug: mailbox-atomicity
language: en
category: architecture
type: topic
status: active
bcsc_class: public-disclosure-safe
last_edited: 2026-05-18
editor: pointsav-engineering
cites: []
paired_with: mailbox-atomicity.es.md
---
Sessions communicate by prepending messages to flat-file mailboxes at `.agent/inbox.md` and `.agent/outbox.md`. Messages carry a YAML envelope (from, to, re, created, status, optional msg-id) followed by a free-text body. Newest messages live at the top so a Command Session reading an outbox via `head` sees the most recent activity first.
The atomicity problem: two sessions prepending to the same mailbox without coordination produce the classic read-modify-write race. Session A reads the current file, prepends its message, and writes back. Session B does the same simultaneously. Whichever write lands last wins; the other message is lost. This is reproducible with parallel AI-coding sub-agents; cross-user concurrency makes it inevitable.
`bin/mailbox-prepend.sh` solves this with `flock`. It takes the target mailbox path, derives a lock path under `.agent/locks/` for archive-scoped mailboxes (or `/tmp/foundry-mailbox-<sha>.lock` for ad-hoc cases), and acquires an exclusive lock with a 30-second timeout before performing the read-modify-write. Two simultaneous calls serialise rather than collide.
A second safety: msg-id idempotency. If the message body contains a `msg-id:` field, the script scans the first 200 lines of the target for an existing entry with the same id and skips the prepend if found. This means a script that gets retried — a timer firing twice, an operator re-running, a hook misfiring — will not double-send. The 200-line window is generous enough to catch same-day duplicates without making the scan expensive on large archives.
This is two-line wrapping around `flock` and `grep`. The failure mode without it is silent — a lost message simply never arrives, and the sender has no indication. Building the discipline into a shared script that sessions and automation call is cheaper than building a more sophisticated queue.
## See also
- [[multi-engine-session-coordination]] — the session-lock protocol this mailbox discipline supports
- [[totebox-session]] — the session model whose inter-session communication this atomicity protects
- [[totebox-orchestration-development]] — the orchestration layer that routes messages between sessions
- [[learning-datagraph-architecture]] — the audit ledger whose capture hooks face the same concurrent-write race conditions