Diff: systems/os-infrastructure-ppn-node
From 17e01f7 to 17e01f7
+1 / −0 lines
| Before | After |
|---|---|
| --- | --- |
| schema: foundry-doc-v1 | schema: foundry-doc-v1 |
| title: "os-infrastructure — PPN Node Operating System" | title: "os-infrastructure — PPN Node Operating System" |
| slug: os-infrastructure-ppn-node | slug: os-infrastructure-ppn-node |
| short_description: "os-infrastructure is the operating system layer for PointSav Private Network nodes — its sole purpose is to set up, operate, and maintain a PPN node: managing WireGuard tunnels, hosting guest virtual machines, and exposing the operator control plane." | short_description: "os-infrastructure is the operating system layer for PointSav Private Network nodes — its sole purpose is to set up, operate, and maintain a PPN node: managing WireGuard tunnels, hosting guest virtual machines, and exposing the operator control plane." |
| category: systems | category: systems |
| type: topic | type: topic |
| content_type: topic | |
| status: active | status: active |
| bcsc_class: public-disclosure-safe | bcsc_class: public-disclosure-safe |
| language: en | language: en |
| paired_with: os-infrastructure-ppn-node.es.md | paired_with: os-infrastructure-ppn-node.es.md |
| last_edited: 2026-05-30 | last_edited: 2026-05-30 |
| editor: editorial | editor: editorial |
| --- | --- |
| `os-infrastructure` is the operating system layer for PointSav Private Network nodes. It is not a general-purpose OS. Its sole purpose is to set up, operate, and maintain a node in a PointSav Private Network: managing WireGuard tunnels, hosting guest virtual machines for other platform services, and exposing the operator control plane. | `os-infrastructure` is the operating system layer for PointSav Private Network nodes. It is not a general-purpose OS. Its sole purpose is to set up, operate, and maintain a node in a PointSav Private Network: managing WireGuard tunnels, hosting guest virtual machines for other platform services, and exposing the operator control plane. |
| --- | --- |
| ## What os-infrastructure is | ## What os-infrastructure is |
| An `os-infrastructure` node is the physical or virtual host that anchors a PointSav Private Network. Every node in the mesh runs `os-infrastructure` as its host OS. The three nodes in a typical deployment — a cloud instance and two on-premises machines — each run an independent instance of `os-infrastructure`. They communicate exclusively over WireGuard tunnels; there is no shared network fabric, no cluster control plane, and no VLAN dependency. | An `os-infrastructure` node is the physical or virtual host that anchors a PointSav Private Network. Every node in the mesh runs `os-infrastructure` as its host OS. The three nodes in a typical deployment — a cloud instance and two on-premises machines — each run an independent instance of `os-infrastructure`. They communicate exclusively over WireGuard tunnels; there is no shared network fabric, no cluster control plane, and no VLAN dependency. |
| `os-infrastructure` manages: | `os-infrastructure` manages: |
| - **WireGuard tunnels** to all mesh peers | - **WireGuard tunnels** to all mesh peers |
| - **Guest virtual machines** for the platform services that process data (VM-MediaKit, VM-Totebox, VM-Orchestration, VM-PrivateGit) | - **Guest virtual machines** for the platform services that process data (VM-MediaKit, VM-Totebox, VM-Orchestration, VM-PrivateGit) |
| - **The Genesis Protocol ceremony** for adding new nodes to the mesh | - **The Genesis Protocol ceremony** for adding new nodes to the mesh |
| - **The VM resource pool** — tracking available CPU and RAM across all nodes and dispatching VM creation requests to the node with sufficient headroom | - **The VM resource pool** — tracking available CPU and RAM across all nodes and dispatching VM creation requests to the node with sufficient headroom |
| `os-infrastructure` does not store user data. Archives, media files, and database records live inside guest virtual machines managed by `os-totebox`, `os-mediakit`, or `os-privategit`. | `os-infrastructure` does not store user data. Archives, media files, and database records live inside guest virtual machines managed by `os-totebox`, `os-mediakit`, or `os-privategit`. |
| --- | --- |
| ## Phase 1 — Ubuntu 24.04 | ## Phase 1 — Ubuntu 24.04 |
| The current operational deployment runs Ubuntu 24.04 as the host OS. QEMU provides the hypervisor; on hardware with Intel VT-x or AMD-V extensions, QEMU runs KVM-accelerated guests. On GCP virtual instances where nested virtualisation is not enabled, QEMU falls back to TCG software emulation. | The current operational deployment runs Ubuntu 24.04 as the host OS. QEMU provides the hypervisor; on hardware with Intel VT-x or AMD-V extensions, QEMU runs KVM-accelerated guests. On GCP virtual instances where nested virtualisation is not enabled, QEMU falls back to TCG software emulation. |
| WireGuard is in-kernel on Ubuntu 24.04 (kernel 5.6+). The three-node mesh uses a `10.8.0.0/24` address range with Laptop B as the routing hub. | WireGuard is in-kernel on Ubuntu 24.04 (kernel 5.6+). The three-node mesh uses a `10.8.0.0/24` address range with Laptop B as the routing hub. |
| Systemd manages all platform services. Each guest VM runs as a systemd unit wrapping a QEMU process with a UNIX monitor socket for control operations. | Systemd manages all platform services. Each guest VM runs as a systemd unit wrapping a QEMU process with a UNIX monitor socket for control operations. |
| The VM resource pool in Phase 1 consists of two services. `service-vm-host` runs on each node and sends a heartbeat to the fleet controller every ten seconds, reporting available RAM and CPU load. `service-vm-fleet` runs on the GCP node and receives those heartbeats; when an operator requests a new VM through the `app-network-admin` interface, the fleet controller selects the node with the most available RAM above a safety margin and dispatches the creation request. | The VM resource pool in Phase 1 consists of two services. `service-vm-host` runs on each node and sends a heartbeat to the fleet controller every ten seconds, reporting available RAM and CPU load. `service-vm-fleet` runs on the GCP node and receives those heartbeats; when an operator requests a new VM through the `app-network-admin` interface, the fleet controller selects the node with the most available RAM above a safety margin and dispatches the creation request. |
| --- | --- |
| ## Phase 2 — NetBSD 11.0 + NVMM (planned) | ## Phase 2 — NetBSD 11.0 + NVMM (planned) |
| Phase 2 is intended to replace the Ubuntu 24.04 host with NetBSD 11.0, a BSD-licensed operating system with a stronger security posture for production PPN node operation. | Phase 2 is intended to replace the Ubuntu 24.04 host with NetBSD 11.0, a BSD-licensed operating system with a stronger security posture for production PPN node operation. |
| NetBSD 11.0 ships NVMM (NetBSD Virtual Machine Monitor), a bare-metal hypervisor mainline since NetBSD 9.0 that uses Intel VT-x EPT for hardware isolation. QEMU runs guests using the `-accel nvmm` flag. A single Phase 2 node is intended to host 128 virtual machines across 256 vCPU capacity. | NetBSD 11.0 ships NVMM (NetBSD Virtual Machine Monitor), a bare-metal hypervisor mainline since NetBSD 9.0 that uses Intel VT-x EPT for hardware isolation. QEMU runs guests using the `-accel nvmm` flag. A single Phase 2 node is intended to host 128 virtual machines across 256 vCPU capacity. |
| NetBSD 11.0 also ships in-kernel `wg(4)` WireGuard, so tunnels stay in-kernel as they are in Phase 1. `securelevel=2` locks the running kernel against modification. VeriExec validates load-time binary integrity against a signed manifest. | NetBSD 11.0 also ships in-kernel `wg(4)` WireGuard, so tunnels stay in-kernel as they are in Phase 1. `securelevel=2` locks the running kernel against modification. VeriExec validates load-time binary integrity against a signed manifest. |
| Planned resource target: 120 MB disk, 48 MB RAM idle. | Planned resource target: 120 MB disk, 48 MB RAM idle. |
| --- | --- |
| ## Phase 3 — seL4 + Microkit 2.x (intended) | ## Phase 3 — seL4 + Microkit 2.x (intended) |
| Phase 3 is intended to replace the NetBSD host with a formally verified microkernel built on seL4 v15 and Microkit 2.x on AArch64 hardware. | Phase 3 is intended to replace the NetBSD host with a formally verified microkernel built on seL4 v15 and Microkit 2.x on AArch64 hardware. |
| The seL4 kernel itself is 162 KiB of machine-checked binary. Its formal proof establishes intransitive non-interference: a compromised guest cannot read or write the state of any other protection domain without an explicit capability grant. This is a stronger isolation claim than any hypervisor with an unverified TCB. | The seL4 kernel itself is 162 KiB of machine-checked binary. Its formal proof establishes intransitive non-interference: a compromised guest cannot read or write the state of any other protection domain without an explicit capability grant. This is a stronger isolation claim than any hypervisor with an unverified TCB. |
| The Phase 3 `os-infrastructure` is intended to run as seven seL4 protection domains: | The Phase 3 `os-infrastructure` is intended to run as seven seL4 protection domains: |
| | Protection domain | Role | | | Protection domain | Role | |
| |---|---| | |---|---| |
| | `pd-genesis` | CPace PAKE handshake; generates Crockford base32 short-code for operator verification; reaped after the pairing ceremony completes (cap revocation) | | | `pd-genesis` | CPace PAKE handshake; generates Crockford base32 short-code for operator verification; reaped after the pairing ceremony completes (cap revocation) | |
| | `pd-ledger` | Ed25519 WORM capability ledger; append-only; signs all capability grants | | | `pd-ledger` | Ed25519 WORM capability ledger; append-only; signs all capability grants | |
| | `pd-wireguard` | BoringTun `no_std` WireGuard implementation; runs inside seL4 with no libc dependency | | | `pd-wireguard` | BoringTun `no_std` WireGuard implementation; runs inside seL4 with no libc dependency | |
| | `pd-net-driver` | NIC MMIO and IRQ handling; provides the network capability to `pd-wireguard` | | | `pd-net-driver` | NIC MMIO and IRQ handling; provides the network capability to `pd-wireguard` | |
| | `pd-vmm` | Guest VM monitor using `libsel4vm`; manages VMs for other os-* types | | | `pd-vmm` | Guest VM monitor using `libsel4vm`; manages VMs for other os-* types | |
| | `pd-fleet` | Heartbeat client to the resource pool fleet controller | | | `pd-fleet` | Heartbeat client to the resource pool fleet controller | |
| | `pd-network-admin` | F8 TUI surface; receives UDP signed broadcasts; F12-gated configuration commits | | | `pd-network-admin` | F8 TUI surface; receives UDP signed broadcasts; F12-gated configuration commits | |
| `pd-genesis` is reaped after the node-join ceremony is complete. The capability it held during the ceremony is revoked and cannot be reconstructed — there is no back-door to the pairing flow after it closes. | `pd-genesis` is reaped after the node-join ceremony is complete. The capability it held during the ceremony is revoked and cannot be reconstructed — there is no back-door to the pairing flow after it closes. |
| Phase 3 requires AArch64 hardware. Microkit 2.x includes an `x86_64_generic_vtx` target, but x86-64 Microkit restricts each guest to one vCPU and requires Intel VT-x. For production deployments with multi-vCPU guests, AArch64 is the intended platform. | Phase 3 requires AArch64 hardware. Microkit 2.x includes an `x86_64_generic_vtx` target, but x86-64 Microkit restricts each guest to one vCPU and requires Intel VT-x. For production deployments with multi-vCPU guests, AArch64 is the intended platform. |
| Intended resource target: 8 MB disk, 12 MB RAM idle. | Intended resource target: 8 MB disk, 12 MB RAM idle. |
| --- | --- |
| ## Genesis Protocol | ## Genesis Protocol |
| The Genesis Protocol is the node-join ceremony that adds a new node to the mesh. | The Genesis Protocol is the node-join ceremony that adds a new node to the mesh. |
| An operator initiating a new node starts `service-ppn-pairing` on the node being added. The service performs a CPace PAKE handshake and presents a Crockford base32 short-code on the console — typically six to ten characters. The operator reads this code and enters it in the `app-network-admin` F11 approval panel on the administering machine. | An operator initiating a new node starts `service-ppn-pairing` on the node being added. The service performs a CPace PAKE handshake and presents a Crockford base32 short-code on the console — typically six to ten characters. The operator reads this code and enters it in the `app-network-admin` F11 approval panel on the administering machine. |
| Once the codes match, the pairing establishes mutual WireGuard peer records on both nodes, adds an entry to `nodes.jsonl` in the capability ledger, and terminates `service-ppn-pairing`. The ceremony window is 600 seconds; if the operator does not complete approval within that window, the code expires and the ceremony must restart from the beginning. | Once the codes match, the pairing establishes mutual WireGuard peer records on both nodes, adds an entry to `nodes.jsonl` in the capability ledger, and terminates `service-ppn-pairing`. The ceremony window is 600 seconds; if the operator does not complete approval within that window, the code expires and the ceremony must restart from the beginning. |
| No keys are transmitted over the network. The short-code comparison is the sole authentication mechanism — the operator is the root of trust, not a certificate authority. | No keys are transmitted over the network. The short-code comparison is the sole authentication mechanism — the operator is the root of trust, not a certificate authority. |
| --- | --- |
| ## Resource targets | ## Resource targets |
| | Phase | Disk | RAM idle | RAM loaded | | | Phase | Disk | RAM idle | RAM loaded | |
| |---|---|---|---| | |---|---|---|---| |
| | Phase 1 (Ubuntu 24.04, now) | ~1.5 GB (OS + services) | ~400 MB | ~800 MB | | | Phase 1 (Ubuntu 24.04, now) | ~1.5 GB (OS + services) | ~400 MB | ~800 MB | |
| | Phase 2 (NetBSD/NVMM, planned) | 120 MB | 48 MB | 200 MB | | | Phase 2 (NetBSD/NVMM, planned) | 120 MB | 48 MB | 200 MB | |
| | Phase 3 (seL4+Microkit, intended) | **8 MB** | **12 MB** | 48 MB | | | Phase 3 (seL4+Microkit, intended) | **8 MB** | **12 MB** | 48 MB | |
| Phase 3 targets are intended to let an operator's spare hardware run a node in less memory than the smallest cloud allocation: the AWS Lambda minimum memory setting is 128 MB, while the Phase 3 target is 12 MB at idle, roughly a tenth of that. | Phase 3 targets are intended to let an operator's spare hardware run a node in less memory than the smallest cloud allocation: the AWS Lambda minimum memory setting is 128 MB, while the Phase 3 target is 12 MB at idle, roughly a tenth of that. |
| ## See also | ## See also |
| - [[vm-architecture]] — the five VM types and how os-infrastructure hosts them | - [[vm-architecture]] — the five VM types and how os-infrastructure hosts them |
| - [[ppn-architecture-overview]] — four-layer PPN overview; os-infrastructure is the hypervisor layer | - [[ppn-architecture-overview]] — four-layer PPN overview; os-infrastructure is the hypervisor layer |
| - [[genesis-protocol]] — full description of the node-join ceremony | - [[genesis-protocol]] — full description of the node-join ceremony |
| - [[ppn-hypervisor-resource-pool]] — per-node virtio_balloon + vCPU resource pool management | - [[ppn-hypervisor-resource-pool]] — per-node virtio_balloon + vCPU resource pool management |
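
The Phase 1 section above says the fleet controller picks the node with the most available RAM above a safety margin. A minimal sketch of that rule follows; it is an illustration, not the `service-vm-fleet` implementation. The `Heartbeat` fields, the `select_node` name, the node names, and the 512 MB margin are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumed value; the source does not state the size of the safety margin.
SAFETY_MARGIN_MB = 512


@dataclass(frozen=True)
class Heartbeat:
    """Hypothetical shape of one service-vm-host heartbeat."""
    node: str
    free_ram_mb: int
    cpu_load: float


def select_node(
    heartbeats: Iterable[Heartbeat],
    requested_ram_mb: int,
    safety_margin_mb: int = SAFETY_MARGIN_MB,
) -> Optional[Heartbeat]:
    """Return the node with the most free RAM that can still hold the
    requested VM while keeping the safety margin, or None if no node fits."""
    eligible = [
        hb for hb in heartbeats
        if hb.free_ram_mb - requested_ram_mb >= safety_margin_mb
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda hb: hb.free_ram_mb)


# Illustrative fleet; node names and figures are made up.
fleet = [
    Heartbeat("gcp", 1024, 0.2),
    Heartbeat("laptop-a", 4096, 0.5),
    Heartbeat("laptop-b", 8192, 0.9),
]
```

Returning `None` when no node has enough headroom leaves it to the caller to refuse the request; the source does not say what the controller does in that case.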
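
The Genesis Protocol section above describes a Crockford base32 short-code that the operator reads from the console and types into the approval panel. The sketch below shows how such a code could be rendered and compared. How `service-ppn-pairing` derives the code from the CPace session is not described in the source, so the `material` input, the function names, and the default length of eight characters are assumptions.

```python
import hmac

# Crockford base32 alphabet: digits plus letters, omitting I, L, O and U.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def short_code(material: bytes, length: int = 8) -> str:
    """Render the low 5*length bits of `material` as Crockford base32."""
    value = int.from_bytes(material, "big")
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[value & 0x1F])  # take 5 bits per symbol
        value >>= 5
    return "".join(reversed(chars))


def normalize(typed: str) -> str:
    """Apply Crockford's decoding leniency: case-insensitive, hyphens
    ignored, I and L read as 1, O read as 0."""
    cleaned = typed.upper().replace("-", "").replace(" ", "")
    return cleaned.translate(str.maketrans({"I": "1", "L": "1", "O": "0"}))


def codes_match(displayed: str, typed: str) -> bool:
    """Constant-time comparison of the displayed and operator-typed codes."""
    return hmac.compare_digest(displayed.encode(), normalize(typed).encode())
```

The alphabet drops the letters most easily confused with digits, which suits a code that a person reads off one screen and types on another.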
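
The 600-second ceremony window described above can be modelled with a monotonic clock, so that wall-clock adjustments on the node do not lengthen or shorten the window. This is a sketch only: the `Ceremony` class is hypothetical, and treating exactly 600 seconds as expired is an assumption.

```python
import time
from typing import Optional

CEREMONY_WINDOW_S = 600  # window length stated in the Genesis Protocol section


class Ceremony:
    """Hypothetical tracker for one pairing attempt."""

    def __init__(self, now: Optional[float] = None) -> None:
        # `now` is injectable so the expiry logic can be tested without sleeping.
        self.started = time.monotonic() if now is None else now

    def expired(self, now: Optional[float] = None) -> bool:
        """True once the window has elapsed; the code must then be discarded
        and the ceremony restarted from the beginning."""
        current = time.monotonic() if now is None else now
        return current - self.started >= CEREMONY_WINDOW_S
```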