Skip to content

Diff: services/service-extraction.es

From e6d5f15 to e6d5f15

+0 / −0 lines
BeforeAfter
--- ---
schema: foundry-doc-v1 schema: foundry-doc-v1
title: "service-extraction: Deterministic Parser" title: "service-extraction: Deterministic Parser"
slug: service-extraction slug: service-extraction
category: services category: services
type: topic type: topic
quality: complete quality: complete
short_description: "service-extraction is the Ring 2 central traffic controller that strips proprietary formatting from raw payloads, constructs structured Entity Bundles, assigns transaction IDs, and routes data to deterministic services or to service-slm for AI-assisted extraction." short_description: "service-extraction is the Ring 2 central traffic controller that strips proprietary formatting from raw payloads, constructs structured Entity Bundles, assigns transaction IDs, and routes data to deterministic services or to service-slm for AI-assisted extraction."
status: active status: active
bcsc_class: public-disclosure-safe bcsc_class: public-disclosure-safe
last_edited: 2026-04-30 last_edited: 2026-04-30
editor: pointsav-engineering editor: pointsav-engineering
cites: [] cites: []
paired_with: service-extraction.es.md paired_with: service-extraction.es.md
--- ---
> service-extraction is the Ring 2 central traffic controller that strips proprietary formatting from raw payloads, constructs structured Entity Bundles, assigns transaction IDs, and routes data to deterministic services or to service-slm for AI-assisted extraction. > service-extraction is the Ring 2 central traffic controller that strips proprietary formatting from raw payloads, constructs structured Entity Bundles, assigns transaction IDs, and routes data to deterministic services or to service-slm for AI-assisted extraction.
**service-extraction** is a Ring 2 knowledge-and-processing service in the PointSav three-ring architecture. It receives raw payloads from Ring 1 ingest services, strips proprietary third-party formatting (JSON, MIME, Base64), and constructs machine-readable Entity Bundles — self-contained directory structures that hold both the text payload and any associated binary attachments. It is the canonical successor to the legacy working name `service-parser`. **service-extraction** is a Ring 2 knowledge-and-processing service in the PointSav three-ring architecture. It receives raw payloads from Ring 1 ingest services, strips proprietary third-party formatting (JSON, MIME, Base64), and constructs machine-readable Entity Bundles — self-contained directory structures that hold both the text payload and any associated binary attachments. It is the canonical successor to the legacy working name `service-parser`.
## Architectural Baseline ## Architectural Baseline
Every message that passes through Ring 1 arrives at service-extraction as an unprocessed, vendor-formatted payload. The service has one responsibility: transform that payload into a clean, traceable Entity Bundle and route it to the correct downstream service. It assigns a transaction ID to each bundle, providing a chain-of-custody reference that persists through every subsequent processing step. Every message that passes through Ring 1 arrives at service-extraction as an unprocessed, vendor-formatted payload. The service has one responsibility: transform that payload into a clean, traceable Entity Bundle and route it to the correct downstream service. It assigns a transaction ID to each bundle, providing a chain-of-custody reference that persists through every subsequent processing step.
## Ring and Role ## Ring and Role
service-extraction occupies **Ring 2 — Knowledge and Processing** in the three-ring architecture. Ring 2 is multi-tenant (via `moduleId` namespacing) and deterministic: it processes data without invoking AI inference unless the data shape requires it. When a payload contains unstructured text that cannot be classified by deterministic rules, service-extraction routes that text to Ring 3 (service-slm) for AI-assisted extraction. Structured and semi-structured payloads route entirely within Ring 2. service-extraction occupies **Ring 2 — Knowledge and Processing** in the three-ring architecture. Ring 2 is multi-tenant (via `moduleId` namespacing) and deterministic: it processes data without invoking AI inference unless the data shape requires it. When a payload contains unstructured text that cannot be classified by deterministic rules, service-extraction routes that text to Ring 3 (service-slm) for AI-assisted extraction. Structured and semi-structured payloads route entirely within Ring 2.
## Structural Organization of Components ## Structural Organization of Components
### The Entity Bundle ### The Entity Bundle
Each payload is isolated into a Unix directory named by its timestamp and routing ID. The bundle contains: Each payload is isolated into a Unix directory named by its timestamp and routing ID. The bundle contains:
- **`payload.txt`** — the core text payload reduced to plain-text frontmatter format. This file is the permanent, human-readable record of the communication. - **`payload.txt`** — the core text payload reduced to plain-text frontmatter format. This file is the permanent, human-readable record of the communication.
- **Binary attachments** — PDFs, images, and other attached files stored natively alongside the text payload. Binary decoupling eliminates split-brain metadata tracking between the text record and its attachments. - **Binary attachments** — PDFs, images, and other attached files stored natively alongside the text payload. Binary decoupling eliminates split-brain metadata tracking between the text record and its attachments.
### The Multi-Path Routing Matrix ### The Multi-Path Routing Matrix
After constructing the bundle, the service routes it based on the payload's origin tags: After constructing the bundle, the service routes it based on the payload's origin tags:
| Route | Destination | Condition | | Route | Destination | Condition |
|---|---|---| |---|---|---|
| Immutable ledger | Cold storage via service-fs | Standard assets (most messages) | | Immutable ledger | Cold storage via service-fs | Standard assets (most messages) |
| Identity ledger | service-people | Sender identity records for CRM downstream ingestion | | Identity ledger | service-people | Sender identity records for CRM downstream ingestion |
| AI synthesis | service-slm, then purge | Consumable media (newsletters, low-retention items) | | AI synthesis | service-slm, then purge | Consumable media (newsletters, low-retention items) |
The routing decision is deterministic and tag-driven. No AI inference is required for the routing step itself. The routing decision is deterministic and tag-driven. No AI inference is required for the routing step itself.
## Configuration ## Configuration
| Parameter | Purpose | | Parameter | Purpose |
|---|---| |---|---|
| Queue path | Input path for Ring 1 payloads (e.g., `assets/tmp-maildir/`) | | Queue path | Input path for Ring 1 payloads (e.g., `assets/tmp-maildir/`) |
| Bundle output path | Filesystem location for completed Entity Bundles | | Bundle output path | Filesystem location for completed Entity Bundles |
| Routing rules | TOML configuration file mapping origin tags to routing destinations | | Routing rules | TOML configuration file mapping origin tags to routing destinations |
| Transaction ID format | Timestamp + routing ID composition format | | Transaction ID format | Timestamp + routing ID composition format |
## See Also ## See Also
- [[service-email]] - [[service-email]]
- [[service-people]] - [[service-people]]
- [[service-slm]] - [[service-slm]]
- [[service-search]] - [[service-search]]
## References ## References
- §XI — Ring 2 knowledge-and-processing architecture - §XI — Ring 2 knowledge-and-processing architecture
- `pointsav-monorepo/service-extraction/` — implementation crate - `pointsav-monorepo/service-extraction/` — implementation crate
- SYS-ADR-07 — structured data never routes through AI (governs the boundary between Ring 2 deterministic routing and Ring 3 AI invocation) - SYS-ADR-07 — structured data never routes through AI (governs the boundary between Ring 2 deterministic routing and Ring 3 AI invocation)
--- ---
*Copyright © 2026 Woodfine Capital Projects Inc. Licensed under [Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/).* *Copyright © 2026 Woodfine Capital Projects Inc. Licensed under [Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0/).*
*Woodfine Capital Projects™, Woodfine Management Corp™, PointSav Digital Systems™, Totebox Orchestration™, and Totebox Archive™ are trademarks of Woodfine Capital Projects Inc., used in Canada, the United States, Latin America, and Europe. All other trademarks are the property of their respective owners.* *Woodfine Capital Projects™, Woodfine Management Corp™, PointSav Digital Systems™, Totebox Orchestration™, and Totebox Archive™ are trademarks of Woodfine Capital Projects Inc., used in Canada, the United States, Latin America, and Europe. All other trademarks are the property of their respective owners.*