Diff: services/service-slm-graph-store-migration.es

From b67c6ba to b67c6ba

+118 / −0 lines

Before	After
	---
	schema: foundry-doc-v1
	title: "service-slm graph store migration"
	slug: service-slm-graph-store-migration
	category: services
	type: concept
	quality: pre-build
	status: pre-build
	audience: vendor-public
	bcsc_class: current-fact
	language_protocol: PROSE-TOPIC
	last_edited: 2026-05-24
	editor: pointsav-engineering
	paired_with: service-slm-graph-store-migration.es.md
	short_description: "service-slm migrated its graph store from LadybugDB to SQLite for fleet nodes and integrates a nightly DataGraph rebuild that processes the operator data corpus through the Doorman into the property graph used for inference context injection."
	cites: []
	---

	Each night, `jennifer-datagraph-rebuild.sh` processes the operator data
	corpus and writes extracted named entities to a property graph stored in
	LadybugDB. This property graph — the deployment DataGraph — is the entity layer
	that service-content uses to inject structured business context into inference
	requests. The rebuild runs as Phase 1 of the Elastic Compute #1 nightly window, before
	the training phase claims the GPU. The deployment DataGraph is live, with an
	11 MB LadybugDB file currently active at service-content.

	## What the DataGraph contains

	The deployment DataGraph is a property graph of named business entities
	extracted from the operator deployment's data corpus. The graph holds five
	entity classifications: Person (staff, contacts, counterparties), Company
	(vendors, customers, partner organisations), Project (active and historical
	engagements), Account (financial accounts and ledger references), and
	Location (offices, sites, and operational addresses). These entities are
	extracted from three document streams: meeting transcript markdown files
	from the minutebook asset directory, research and background YAML and markdown
	files from the service-agents directory, and contact source JSON records from
	the service-people directory.

	## What the nightly rebuild does

	For each unprocessed document, the rebuild script calls
	`POST :9080/v1/chat/completions` through the Doorman endpoint, passing the
	document text with a JSON Schema grammar constraint. The language model —
	OLMo 3 32B Think running on Elastic Compute #1 via vLLM — returns a structured JSON
	array of entity objects. Each object carries the entity name, classification,
	confidence score, and optional role, location, and contact vectors. The script
	then calls `POST :9081/v1/graph/mutate` on service-content to write those
	entities into LadybugDB. The health probe at the end of the cycle queries
	service-content for the current entity count and writes a summary JSON file
	at `$FOUNDRY_ROOT/data/datagraph-health.json`.

	The script processes three document batches each run: the full minutebook
	asset tree, the full service-agents tree, and the 50 most recent unprocessed
	service-people JSON files. A randomised inter-document delay (0.3 to 1.5
	seconds) prevents the Doorman from receiving a burst of requests that could
	interfere with the training phase startup.

	## The routing parity principle

	The `jennifer-datagraph-rebuild.sh` script calls only the same two REST API
	endpoints that any operator or community member running service-slm and
	service-content would call from their own automation:

	- `POST :9080/v1/chat/completions` — entity extraction through Doorman
	- `POST :9081/v1/graph/mutate` — entity write through service-content

	There is no file-watcher shortcut, no internal gRPC bypass, and no direct
	database write. This is a deliberate design decision. If the rebuild script
	fails, the failure indicates a real defect in service-slm or service-content
	that would also affect any operator or customer running the same API surface.
	The nightly rebuild functions as a full-stack integration test that runs
	against production services on production data every night. Failures are
	explicit and immediately actionable rather than hidden in an internal path
	that real callers would never exercise.

	## Idempotency

	The script tracks processed documents using a local ledger at
	`$FOUNDRY_ROOT/data/datagraph-processed.txt`. Each document is identified by
	a hash of its file content, prefixed with a source tag (`mk-` for minutebook,
	`ag-` for service-agents, `sp-` for service-people). Before processing any
	document, the script checks whether its identifier appears in the ledger. If
	it does, the document is skipped. After a successful `graph/mutate` call, the
	identifier is appended to the ledger. This mechanism ensures that documents
	are not re-processed across multiple nightly runs, even if the same content
	is present in the source directories.

	The ledger is append-only and not pruned automatically. If service-content
	is restarted and the graph is rebuilt from scratch, the ledger can be cleared
	to force a full re-extraction on the next nightly run.

	## Graph context injection

	The deployment DataGraph is not a static reference store. service-content
	queries it before each inference request. When the Doorman receives a
	completion request from an operator or application, service-content retrieves
	entities relevant to the request context — based on module ID, entity
	classification, and confidence thresholds — and injects them into the system
	message as a structured entity context block. The language model receives
	structured business context (who the relevant people are, what projects are
	active, which companies are counterparties) without requiring that structured
	data to cross the external model boundary. The graph stays within the
	deployment boundary; only the injected prose context leaves it.

	## Current status and gate criterion

	The deployment DataGraph is live. Three consecutive nightly runs reporting
	HEALTHY status — defined as a non-negative entity count delta and a successful
	round trip on both the extraction and mutation endpoints — are the intended
	criterion before the DataGraph pattern is extended to larger operational
	contexts. That gate has not yet been met; the rebuild pipeline is in its
	initial operational period.

	## See also

	- [[elastic-compute-lora-training-pipeline]] — Phase 2 of the same nightly window (LoRA adapter training)
	- [[service-slm]] — the service that orchestrates the full nightly pipeline