Diff: applications/app-console-slm.es
From 1c02ec1 to 1c02ec1
+0 / −0 lines
| Before | After |
|---|---|
| --- | --- |
| schema: foundry-doc-v1 | schema: foundry-doc-v1 |
| title: "app-console-slm — inference infrastructure monitoring console" | title: "app-console-slm — inference infrastructure monitoring console" |
| slug: app-console-slm | slug: app-console-slm |
| category: applications | category: applications |
| type: app | type: app |
| content_type: topic | content_type: topic |
| quality: complete | quality: complete |
| short_description: "A terminal user interface cartridge for the operator console that displays the live state of the AI inference infrastructure — model health, remote GPU node status, priority queue depth, organizational graph entity count, and daily spend — with keyboard controls for routing policy and per-tier kill switches." | short_description: "A terminal user interface cartridge for the operator console that displays the live state of the AI inference infrastructure — model health, remote GPU node status, priority queue depth, organizational graph entity count, and daily spend — with keyboard controls for routing policy and per-tier kill switches." |
| status: active | status: active |
| bcsc_class: public-disclosure-safe | bcsc_class: public-disclosure-safe |
| last_edited: 2026-06-09 | last_edited: 2026-06-09 |
| editor: pointsav-engineering | editor: pointsav-engineering |
| cites: [] | cites: [] |
| references: [] | references: [] |
| paired_with: app-console-slm.es.md | paired_with: app-console-slm.es.md |
| --- | --- |
| # app-console-slm — inference infrastructure monitoring console | # app-console-slm — inference infrastructure monitoring console |
| app-console-slm is a terminal user interface (TUI) cartridge for the operator console | app-console-slm is a terminal user interface (TUI) cartridge for the operator console |
| that displays the live state of the AI inference infrastructure. It shows the health | that displays the live state of the AI inference infrastructure. It shows the health |
| of the local inference model, the status of remote GPU nodes, the depth of the | of the local inference model, the status of remote GPU nodes, the depth of the |
| priority queue, the organizational graph entity count, and the current day's spending. | priority queue, the organizational graph entity count, and the current day's spending. |
| It provides keyboard controls for adjusting the routing policy and toggling per-tier | It provides keyboard controls for adjusting the routing policy and toggling per-tier |
| kill switches. | kill switches. |
| The console runs in a terminal window on the same node as the inference gateway. It | The console runs in a terminal window on the same node as the inference gateway. It |
| requires no browser, no network connection to an external service, and no | requires no browser, no network connection to an external service, and no |
| authentication beyond local shell access. It is the operator's primary dashboard for | authentication beyond local shell access. It is the operator's primary dashboard for |
| understanding and controlling the inference layer. | understanding and controlling the inference layer. |
| ## Display panels | ## Display panels |
| The console organizes information into five panels that refresh automatically every | The console organizes information into five panels that refresh automatically every |
| ten seconds. The operator can trigger an immediate refresh at any time with the R key. | ten seconds. The operator can trigger an immediate refresh at any time with the R key. |
| ### Gateway panel | ### Gateway panel |
| The gateway panel shows the current state of the inference router: whether it is | The gateway panel shows the current state of the inference router: whether it is |
| running, the active routing policy (balanced, drain-batch, drain-express, or | running, the active routing policy (balanced, drain-batch, drain-express, or |
| local-only), and the availability of each tier. A green indicator marks a tier as | local-only), and the availability of each tier. A green indicator marks a tier as |
| available. A yellow indicator marks a tier as degraded — available but with recent | available. A yellow indicator marks a tier as degraded — available but with recent |
| failures. A grey indicator marks a tier as offline. | failures. A grey indicator marks a tier as offline. |
| If a tier's kill switch is closed, the gateway panel also shows an explicit | If a tier's kill switch is closed, the gateway panel also shows an explicit |
| "kill: CLOSED" label. | "kill: CLOSED" label. |
| ### GPU node fleet panel | ### GPU node fleet panel |
| The fleet panel shows each configured remote GPU node with its current state. States | The fleet panel shows each configured remote GPU node with its current state. States |
| are: stopped (VM is off, no billing), starting (VM is booting, billing has begun), | are: stopped (VM is off, no billing), starting (VM is booting, billing has begun), |
| available (VM is ready and healthy), failed (VM failed to start or become healthy), | available (VM is ready and healthy), failed (VM failed to start or become healthy), |
| and zombie (VM is running but unresponsive). For available nodes, the panel shows the | and zombie (VM is running but unresponsive). For available nodes, the panel shows the |
| most recent probe latency in milliseconds. | most recent probe latency in milliseconds. |
| Each node has an independent kill switch. The K key opens a dialog to toggle the | Each node has an independent kill switch. The K key opens a dialog to toggle the |
| kill switch for any node, or to close all switches globally. | kill switch for any node, or to close all switches globally. |
| ### Organizational graph panel | ### Organizational graph panel |
| The graph panel shows the total entity count in the organizational knowledge graph, | The graph panel shows the total entity count in the organizational knowledge graph, |
| the number of distinct edge types present, and the timestamp of the most recent | the number of distinct edge types present, and the timestamp of the most recent |
| successful extraction. It shows the circuit breaker state for the graph service: | successful extraction. It shows the circuit breaker state for the graph service: |
| if the inference router's graph query path has experienced repeated failures and | if the inference router's graph query path has experienced repeated failures and |
| opened its circuit, the panel displays the time elapsed since the circuit opened. | opened its circuit, the panel displays the time elapsed since the circuit opened. |
| ### Queue panel | ### Queue panel |
| The queue panel shows the current depth of each priority queue level. P0 holds | The queue panel shows the current depth of each priority queue level. P0 holds |
| background classification tasks. P1 holds extraction tasks awaiting a GPU node. | background classification tasks. P1 holds extraction tasks awaiting a GPU node. |
| P2 holds training corpus generation and apprenticeship work. The panel also shows | P2 holds training corpus generation and apprenticeship work. The panel also shows |
| the total number of completed tasks and the current poison count: tasks that have | the total number of completed tasks and the current poison count: tasks that have |
| exhausted the maximum number of retry attempts and require operator review. | exhausted the maximum number of retry attempts and require operator review. |
| ### Cost panel | ### Cost panel |
| The cost panel shows the current day's spending across all tiers in the deployment's | The cost panel shows the current day's spending across all tiers in the deployment's |
| configured currency. The panel breaks down spending by node label: the batch node, | configured currency. The panel breaks down spending by node label: the batch node, |
| the express node, and the external API (if configured). This gives the operator | the express node, and the external API (if configured). This gives the operator |
| immediate visibility into whether a scheduled nightly drain has concluded and at | immediate visibility into whether a scheduled nightly drain has concluded and at |
| what cost. | what cost. |
| ## Keyboard controls | ## Keyboard controls |
| | Key | Action | | | Key | Action | |
| |---|---| | |---|---| |
| | R | Immediate refresh — re-queries all status endpoints | | | R | Immediate refresh — re-queries all status endpoints | |
| | K | Kill switch dialog — toggle per-tier or global kill switch | | | K | Kill switch dialog — toggle per-tier or global kill switch | |
| | P | Policy dialog — select routing policy (balanced / drain-batch / drain-express / local-only) | | | P | Policy dialog — select routing policy (balanced / drain-batch / drain-express / local-only) | |
| | G | Graph detail — show entity type breakdown and recent extraction activity | | | G | Graph detail — show entity type breakdown and recent extraction activity | |
| | ? | Help overlay — show all keybindings | | | ? | Help overlay — show all keybindings | |
| | Q | Quit | | | Q | Quit | |
| ## Technical characteristics | ## Technical characteristics |
| The console is a library crate that implements the Cartridge trait for the operator | The console is a library crate that implements the Cartridge trait for the operator |
| console chassis. It loads at slot F9. Communication with the inference gateway uses | console chassis. It loads at slot F9. Communication with the inference gateway uses |
| standard HTTP against the gateway's monitoring endpoints; no special protocol is | standard HTTP against the gateway's monitoring endpoints; no special protocol is |
| required. The console performs only read operations by default; write operations | required. The console performs only read operations by default; write operations |
| (kill switch toggles, policy changes) require explicit keyboard confirmation. | (kill switch toggles, policy changes) require explicit keyboard confirmation. |
| The console uses a background polling task that fetches status data every ten seconds | The console uses a background polling task that fetches status data every ten seconds |
| and sends it to the rendering task via a channel. The rendering task does not block | and sends it to the rendering task via a channel. The rendering task does not block |
| on network requests; it displays whatever data arrived most recently. This design | on network requests; it displays whatever data arrived most recently. This design |
| ensures the console remains responsive even when the gateway is slow to respond. | ensures the console remains responsive even when the gateway is slow to respond. |
| The display degrades gracefully when individual status endpoints are unavailable. | The display degrades gracefully when individual status endpoints are unavailable. |
| Missing panels show a "— unavailable —" indicator rather than preventing the console | Missing panels show a "— unavailable —" indicator rather than preventing the console |
| from rendering. | from rendering. |
| Plain-text mode is available via the `--plain` flag for terminal environments without | Plain-text mode is available via the `--plain` flag for terminal environments without |
| Unicode support. Unicode status symbols are replaced with ASCII equivalents. | Unicode support. Unicode status symbols are replaced with ASCII equivalents. |
| ## Relationship to the inference gateway | ## Relationship to the inference gateway |
| The console is a read-mostly observer of the inference gateway. It does not participate | The console is a read-mostly observer of the inference gateway. It does not participate |
| in routing decisions. Kill switch and policy commands sent through the console take | in routing decisions. Kill switch and policy commands sent through the console take |
| effect immediately in the gateway, but the console does not verify the effect beyond | effect immediately in the gateway, but the console does not verify the effect beyond |
| showing the updated state on the next refresh cycle. | showing the updated state on the next refresh cycle. |
| The console is deployed alongside the inference gateway on the same node. It does not | The console is deployed alongside the inference gateway on the same node. It does not |
| require network connectivity to external services to function. If the inference gateway | require network connectivity to external services to function. If the inference gateway |
| is unreachable, the console continues running and shows all panels as unavailable. | is unreachable, the console continues running and shows all panels as unavailable. |
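
The polling design described under "Technical characteristics" (a background task that fetches status and hands it to a non-blocking renderer over a channel) can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's actual code: the `Snapshot` struct, its fields, and the `spawn_poller` and `latest_snapshot` functions are hypothetical names, the HTTP fetch is abstracted behind a caller-supplied closure, and a plain `std::sync::mpsc` channel with an OS thread stands in for whatever task and channel types the real console uses.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;
use std::time::Duration;

/// Hypothetical snapshot of gateway status; the fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Snapshot {
    pub policy: String,
    pub queue_depth: [u32; 3], // depths for P0, P1, P2
}

/// Spawn a background poller that calls `fetch` once per `interval` and
/// forwards each successful result to the renderer over a channel.
/// `fetch` stands in for the HTTP calls to the monitoring endpoints and
/// returns `None` when the gateway could not be reached.
pub fn spawn_poller<F>(interval: Duration, mut fetch: F) -> Receiver<Snapshot>
where
    F: FnMut() -> Option<Snapshot> + Send + 'static,
{
    let (tx, rx): (Sender<Snapshot>, Receiver<Snapshot>) = mpsc::channel();
    thread::spawn(move || loop {
        if let Some(snapshot) = fetch() {
            // Stop polling once the renderer has dropped its receiver.
            if tx.send(snapshot).is_err() {
                break;
            }
        }
        thread::sleep(interval);
    });
    rx
}

/// Drain the channel without blocking and return the newest snapshot,
/// falling back to `current` when nothing new has arrived.
pub fn latest_snapshot(rx: &Receiver<Snapshot>, current: Option<Snapshot>) -> Option<Snapshot> {
    rx.try_iter().last().or(current)
}
```

In this sketch the render loop would call `latest_snapshot` once per frame and draw whatever it returns, so a slow gateway delays only the poller thread. A return value of `None` corresponds to the case where no status has arrived yet and the panels are shown as unavailable.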