Diff: how-to/run-local-slm-inference

From 23d0958 to 23d0958

+0 / −0 lines

Before	After
---	---
schema: foundry-doc-v1	schema: foundry-doc-v1
title: "How to run local SLM inference"	title: "How to run local SLM inference"
slug: run-local-slm-inference	slug: run-local-slm-inference
category: how-to	category: how-to
content_type: how-to	content_type: how-to
type: how-to	type: how-to
status: stub	status: stub
last_edited: 2026-06-14	last_edited: 2026-06-14
editor: pointsav-engineering	editor: pointsav-engineering
paired_with: run-local-slm-inference.es.md	paired_with: run-local-slm-inference.es.md
---	---

The PointSav inference stack runs a small language model locally via the Doorman gateway. All inference stays on the operator's hardware — no prompt data is sent to an external provider. This guide covers starting the local SLM service, verifying the Doorman health endpoint, and submitting your first inference request from os-console.	The PointSav inference stack runs a small language model locally via the Doorman gateway. All inference stays on the operator's hardware — no prompt data is sent to an external provider. This guide covers starting the local SLM service, verifying the Doorman health endpoint, and submitting your first inference request from os-console.

For the inference stack architecture, see [[slm-stack-architecture]] and [[doorman-protocol]]. For the console cartridge that surfaces local inference in the TUI, see [[app-console-slm]].	For the inference stack architecture, see [[slm-stack-architecture]] and [[doorman-protocol]]. For the console cartridge that surfaces local inference in the TUI, see [[app-console-slm]].

## See also	## See also

- [[slm-stack-architecture]] — architecture of the local SLM stack and supported model tiers	- [[slm-stack-architecture]] — architecture of the local SLM stack and supported model tiers
- [[doorman-protocol]] — the Doorman gateway protocol; health, routing, and circuit-breaker behaviour	- [[doorman-protocol]] — the Doorman gateway protocol; health, routing, and circuit-breaker behaviour
- [[app-console-slm]] — the os-console SLM cartridge and the Doorman health dashboard	- [[app-console-slm]] — the os-console SLM cartridge and the Doorman health dashboard
- [[open-first-totebox-session]] — the operator context in which inference is available	- [[open-first-totebox-session]] — the operator context in which inference is available
- [[self-host-a-deployment]] — provision the instance that hosts the inference stack	- [[self-host-a-deployment]] — provision the instance that hosts the inference stack