Skip to content

Diff: services/service-slm.es

From 38aa424 to 38aa424

+0 / −0 lines
BeforeAfter
--- ---
schema: foundry-doc-v1 schema: foundry-doc-v1
title: "service-slm: Linguistic Air-Lock" title: "service-slm: Linguistic Air-Lock"
slug: service-slm slug: service-slm
category: services category: services
type: topic type: topic
quality: complete quality: complete
short_description: "service-slm is the Doorman service — the single network boundary that holds API keys, routes AI inference across three compute tiers, and writes an immutable per-tenant audit record of every external call." short_description: "service-slm is the Doorman service — the single network boundary that holds API keys, routes AI inference across three compute tiers, and writes an immutable per-tenant audit record of every external call."
status: active status: active
bcsc_class: public-disclosure-safe bcsc_class: public-disclosure-safe
last_edited: 2026-04-30 last_edited: 2026-04-30
editor: pointsav-engineering editor: pointsav-engineering
cites: [] cites: []
paired_with: service-slm.es.md paired_with: service-slm.es.md
--- ---
Every AI inference call on the PointSav platform routes through a single service — **service-slm** (the Doorman) — which holds all provider API keys, selects the cheapest compute tier that meets the request's deadline, and writes an immutable audit record before returning the result. A request that resolves on the local model never leaves the customer's infrastructure and never appears on a cloud billing statement. The routing logic, tier thresholds, and audit log are all operator-controlled. service-slm is the Ring 3 optional-intelligence boundary: it holds API keys for external AI providers, enforces sanitise-outbound and rehydrate-inbound discipline so that customer-identifying details do not reach external providers in raw form, and writes a signed audit entry to the per-tenant ledger on every call. Every AI inference call on the PointSav platform routes through a single service — **service-slm** (the Doorman) — which holds all provider API keys, selects the cheapest compute tier that meets the request's deadline, and writes an immutable audit record before returning the result. A request that resolves on the local model never leaves the customer's infrastructure and never appears on a cloud billing statement. The routing logic, tier thresholds, and audit log are all operator-controlled. service-slm is the Ring 3 optional-intelligence boundary: it holds API keys for external AI providers, enforces sanitise-outbound and rehydrate-inbound discipline so that customer-identifying details do not reach external providers in raw form, and writes a signed audit entry to the per-tenant ledger on every call.
## Architectural Baseline ## Architectural Baseline
The Doorman is the platform's sole AI boundary — no inference call enters or exits the knowledge pipeline without passing through it. The Doorman acts as the air-lock for unstructured text entering the knowledge pipeline. Raw text arriving from Ring 1 services — emails, PDFs, form submissions — passes through service-slm before any structured facts are written to the knowledge graph. The service applies a Small Language Model to extract verifiable facts, formats them as clean Markdown, and closes the AI processing window before data continues downstream to Ring 2. This containment is the implementation of SYS-ADR-07: structured data never routes through AI. The Doorman is the platform's sole AI boundary — no inference call enters or exits the knowledge pipeline without passing through it. The Doorman acts as the air-lock for unstructured text entering the knowledge pipeline. Raw text arriving from Ring 1 services — emails, PDFs, form submissions — passes through service-slm before any structured facts are written to the knowledge graph. The service applies a Small Language Model to extract verifiable facts, formats them as clean Markdown, and closes the AI processing window before data continues downstream to Ring 2. This containment is the implementation of SYS-ADR-07: structured data never routes through AI.
## Ring and Role ## Ring and Role
service-slm occupies **Ring 3 — Optional Intelligence** in the three-ring compounding-substrate architecture. Ring 3 is structurally optional: Ring 1 boundary ingest and Ring 2 knowledge and processing operate without it. When Ring 3 is active, the Doorman mediates all AI calls on behalf of any ring, applying per-tenant configuration and routing policy. service-slm occupies **Ring 3 — Optional Intelligence** in the three-ring compounding-substrate architecture. Ring 3 is structurally optional: Ring 1 boundary ingest and Ring 2 knowledge and processing operate without it. When Ring 3 is active, the Doorman mediates all AI calls on behalf of any ring, applying per-tenant configuration and routing policy.
The three compute tiers the Doorman routes across: The three compute tiers the Doorman routes across:
| Tier | Compute | When used | | Tier | Compute | When used |
|---|---|---| |---|---|---|
| Tier A — Local | OLMo 3 7B Q4 on workspace VM (CPU) | High-volume, low-latency, budget-sensitive requests | | Tier A — Local | OLMo 3 7B Q4 on workspace VM (CPU) | High-volume, low-latency, budget-sensitive requests |
| Tier B — On-demand GPU | OLMo 3.1 32B Think on multi-cloud GPU burst (GCP Cloud Run / RunPod / Modal) | Requests that require larger model capacity | | Tier B — On-demand GPU | OLMo 3.1 32B Think on multi-cloud GPU burst (GCP Cloud Run / RunPod / Modal) | Requests that require larger model capacity |
| Tier C — External API | Anthropic Claude / Google Gemini / OpenAI | Narrow precision tasks: citation grounding, initial graph build | | Tier C — External API | Anthropic Claude / Google Gemini / OpenAI | Narrow precision tasks: citation grounding, initial graph build |
Customers do not choose the tier. Request shape and budget caps determine routing automatically. Customers do not choose the tier. Request shape and budget caps determine routing automatically.
## Structural Organization of Components ## Structural Organization of Components
The Doorman enforces three invariants at every call boundary: The Doorman enforces three invariants at every call boundary:
1. **Key custody.** No other service holds provider API keys. The Doorman is the exclusive key holder. A compromise of any Ring 1 or Ring 2 service does not expose provider keys. 1. **Key custody.** No other service holds provider API keys. The Doorman is the exclusive key holder. A compromise of any Ring 1 or Ring 2 service does not expose provider keys.
2. **Audit logging.** Every call — including Tier A local calls — writes a signed record to the per-tenant audit ledger at `~/Foundry/data/audit-ledger/<tenant>/<YYYY-MM>.jsonl`. The record captures the request shape, tier selected, token count, and response hash. 2. **Audit logging.** Every call — including Tier A local calls — writes a signed record to the per-tenant audit ledger at `~/Foundry/data/audit-ledger/<tenant>/<YYYY-MM>.jsonl`. The record captures the request shape, tier selected, token count, and response hash.
3. **Sanitise-outbound / rehydrate-inbound.** Before any request leaves the workspace, the Doorman strips customer-identifying details per the per-tenant sanitisation policy. The response is rehydrated with the stripped context before returning to the caller. 3. **Sanitise-outbound / rehydrate-inbound.** Before any request leaves the workspace, the Doorman strips customer-identifying details per the per-tenant sanitisation policy. The response is rehydrated with the stripped context before returning to the caller.
The service listens on `127.0.0.1:9080` and speaks an OpenAI-compatible HTTP API, allowing any Ring 2 service to address it without bespoke integration. The service listens on `127.0.0.1:9080` and speaks an OpenAI-compatible HTTP API, allowing any Ring 2 service to address it without bespoke integration.
## Configuration ## Configuration
The Doorman is deployed as a systemd unit (`infrastructure/local-doorman/`) on the workspace VM. Key configuration fields: The Doorman is deployed as a systemd unit (`infrastructure/local-doorman/`) on the workspace VM. Key configuration fields:
- Per-tenant budget caps (monthly token limits per tier) - Per-tenant budget caps (monthly token limits per tier)
- Sanitisation policy (field list stripped before outbound calls) - Sanitisation policy (field list stripped before outbound calls)
- Routing thresholds (conditions under which a request escalates from Tier A to Tier B or C) - Routing thresholds (conditions under which a request escalates from Tier A to Tier B or C)
- Audit ledger path and rotation schedule - Audit ledger path and rotation schedule
## See Also ## See Also
- [[service-extraction]] - [[service-extraction]]
- [[service-search]] - [[service-search]]
- [[apprenticeship-substrate]] - [[apprenticeship-substrate]]
- [[language-protocol-substrate]] - [[language-protocol-substrate]]
- [[trajectory-substrate]] - [[trajectory-substrate]]
## References ## References
- §XI — Three-ring architecture and three-tier compute routing - §XI — Three-ring architecture and three-tier compute routing
- `infrastructure/local-doorman/` — systemd unit (live since workspace v0.1.13) - `infrastructure/local-doorman/` — systemd unit (live since workspace v0.1.13)
- SYS-ADR-07 — structured data never routes through AI - SYS-ADR-07 — structured data never routes through AI