Diff: substrate/ontological-datagraph

From 1c02ec1 to 1c02ec1

+0 / −0 lines

Before	After
---	---
schema: foundry-doc-v1	schema: foundry-doc-v1
title: "The organizational knowledge graph — ontological memory for business operations"	title: "The organizational knowledge graph — ontological memory for business operations"
slug: ontological-datagraph	slug: ontological-datagraph
category: substrate	category: substrate
type: topic	type: topic
content_type: topic	content_type: topic
quality: complete	quality: complete
short_description: "An organizational knowledge graph that stores structured representations of people, companies, projects, and relationships — providing the persistent semantic memory layer that enables AI inference engines to answer queries about business state without re-reading source documents."	short_description: "An organizational knowledge graph that stores structured representations of people, companies, projects, and relationships — providing the persistent semantic memory layer that enables AI inference engines to answer queries about business state without re-reading source documents."
status: active	status: active
bcsc_class: public-disclosure-safe	bcsc_class: public-disclosure-safe
last_edited: 2026-06-09	last_edited: 2026-06-09
editor: pointsav-engineering	editor: pointsav-engineering
cites: []	cites: []
references: []	references: []
paired_with: ontological-datagraph.es.md	paired_with: ontological-datagraph.es.md
---	---

# The organizational knowledge graph — ontological memory for business operations	# The organizational knowledge graph — ontological memory for business operations

An organizational knowledge graph stores what a business knows about itself: who its	An organizational knowledge graph stores what a business knows about itself: who its
people, companies, and projects are; how they relate to one another; what decisions	people, companies, and projects are; how they relate to one another; what decisions
have been made and by whom; which policies govern which activities. This structured	have been made and by whom; which policies govern which activities. This structured
memory is available to every AI inference request, injected as context before the	memory is available to every AI inference request, injected as context before the
model produces its response.	model produces its response.

The graph answers a class of question that a vector similarity search cannot: not just	The graph answers a class of question that a vector similarity search cannot: not just
"what documents mention ACME Corp?" but "what is ACME Corp, who do we know there,	"what documents mention ACME Corp?" but "what is ACME Corp, who do we know there,
what contract governs our relationship, and what decisions have we made about their	what contract governs our relationship, and what decisions have we made about their
invoices?" The traversal follows edges. The answer emerges from structure, not	invoices?" The traversal follows edges. The answer emerges from structure, not
keyword proximity.	keyword proximity.

## One graph per deployment node	## One graph per deployment node

A deployment node maintains exactly one organizational knowledge graph. All services	A deployment node maintains exactly one organizational knowledge graph. All services
running on that node contribute entities to this single store, scoped by a	running on that node contribute entities to this single store, scoped by a
module identifier that keeps each domain's data isolated within the same physical	module identifier that keeps each domain's data isolated within the same physical
database.	database.

This design supports cross-domain reasoning without duplication. When a bookkeeping	This design supports cross-domain reasoning without duplication. When a bookkeeping
service writes "ACME Corp is a vendor with net-30 payment terms" and a document	service writes "ACME Corp is a vendor with net-30 payment terms" and a document
extraction service writes "ACME Corp is headquartered in Toronto," both facts exist	extraction service writes "ACME Corp is headquartered in Toronto," both facts exist
in the same graph, attached to the same entity. A query about ACME Corp retrieves	in the same graph, attached to the same entity. A query about ACME Corp retrieves
both facts in a single traversal.	both facts in a single traversal.

A separate graph per service would require the inference router to query multiple	A separate graph per service would require the inference router to query multiple
sources, merge results, and resolve conflicts — complexity that produces worse	sources, merge results, and resolve conflicts — complexity that produces worse
answers at higher operational cost. Industry-scale knowledge graph systems tend to converge	answers at higher operational cost. Industry-scale knowledge graph systems tend to converge
on a unified semantic layer, however many services produce the underlying data.	on a unified semantic layer, however many services produce the underlying data.

## What belongs in the graph	## What belongs in the graph

The graph stores ontological facts: what entities exist, how they relate, and what	The graph stores ontological facts: what entities exist, how they relate, and what
is true about them at a given point in time. It does not store transactional records.	is true about them at a given point in time. It does not store transactional records.

In the graph:	In the graph:
- An organization is a vendor (entity with a relationship attribute).	- An organization is a vendor (entity with a relationship attribute).
- A contract exists between two parties with specific terms (entities with a relationship carrying properties).	- A contract exists between two parties with specific terms (entities with a relationship carrying properties).
- A decision was made by a named person under a named policy (entities with typed edges).	- A decision was made by a named person under a named policy (entities with typed edges).
- A property is owned by a company with a specific classification (entities with a property edge).	- A property is owned by a company with a specific classification (entities with a property edge).

Not in the graph:	Not in the graph:
- Individual invoice line items (these are transactional records; they belong in an immutable append-only ledger).	- Individual invoice line items (these are transactional records; they belong in an immutable append-only ledger).
- Journal entries (double-entry accounting records; belong in the bookkeeping ledger).	- Journal entries (double-entry accounting records; belong in the bookkeeping ledger).
- Raw document text (belongs in the document store; only the extracted entities from that text belong in the graph).	- Raw document text (belongs in the document store; only the extracted entities from that text belong in the graph).

This distinction matters for auditability. Transactional records are immutable and	This distinction matters for auditability. Transactional records are immutable and
must remain in a tamper-evident append-only store. The graph is mutable: facts	must remain in a tamper-evident append-only store. The graph is mutable: facts
are updated as the organization's reality changes.	are updated as the organization's reality changes.

## Entity types	## Entity types

The graph is configured through ontology files loaded at startup. Each deployment	The graph is configured through ontology files loaded at startup. Each deployment
node can define entity classifications appropriate for its business domain. The	node can define entity classifications appropriate for its business domain. The
base classifications present in every deployment cover the fundamental organizational	base classifications present in every deployment cover the fundamental organizational
primitives:	primitives:

- Person — named individuals; carries role, contact information, and organizational	- Person — named individuals; carries role, contact information, and organizational
affiliation.	affiliation.
- Company — registered organizations; carries classification (vendor, customer,	- Company — registered organizations; carries classification (vendor, customer,
partner, regulator), and relationship attributes such as contract terms.	partner, regulator), and relationship attributes such as contract terms.
- Project — named initiatives; carries status, participants, and governing policies.	- Project — named initiatives; carries status, participants, and governing policies.
- Account — financial and service accounts; carries balance class and relationship	- Account — financial and service accounts; carries balance class and relationship
to contract.	to contract.
- Location — geographic places and addresses; carries jurisdiction and physical	- Location — geographic places and addresses; carries jurisdiction and physical
attributes.	attributes.

Additional classifications are added through ontology CSV files. A legal practice	Additional classifications are added through ontology CSV files. A legal practice
might add Case, Regulation, and Judgment. A property manager might add Property,	might add Case, Regulation, and Judgment. A property manager might add Property,
Lease, and Tenant. A manufacturing business might add Equipment, WorkOrder, and	Lease, and Tenant. A manufacturing business might add Equipment, WorkOrder, and
Specification. Each addition extends the graph's reasoning capability for that	Specification. Each addition extends the graph's reasoning capability for that
domain without modifying code.	domain without modifying code.
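
A minimal sketch of how classifications might be loaded from an ontology CSV at startup. The column names `entity_type` and `description`, and the sample rows, are assumptions for illustration; the actual ontology file layout is not specified here.

```python
import csv
import io

# Base classifications present in every deployment.
BASE_TYPES = {"Person", "Company", "Project", "Account", "Location"}

# Hypothetical ontology file contents for a legal practice.
LEGAL_ONTOLOGY = (
    "entity_type,description\n"
    "Case,A legal matter handled by the practice\n"
    "Regulation,A rule issued by a regulator\n"
    "Judgment,A decision issued by a court\n"
)


def load_entity_types(csv_text: str) -> set:
    """Return the base classifications plus any declared in the CSV text."""
    types = set(BASE_TYPES)
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get("entity_type") or "").strip()
        if name:
            types.add(name)
    return types


entity_types = load_entity_types(LEGAL_ONTOLOGY)
```

Because the classifications are data rather than code, adding a domain only means adding rows to the file.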

## Temporal validity	## Temporal validity

Every fact in the graph carries a creation timestamp. Facts about entities that	Every fact in the graph carries a creation timestamp. Facts about entities that
change over time — such as who holds a role, or what terms a contract carries —	change over time — such as who holds a role, or what terms a contract carries —
can be superseded rather than overwritten. The graph retains the prior fact with	can be superseded rather than overwritten. The graph retains the prior fact with
its validity window. A query can ask either "what is true now?" or "what was	its validity window. A query can ask either "what is true now?" or "what was
true on a given date?"	true on a given date?"
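
The supersede-rather-than-overwrite behavior can be sketched as follows. This is an illustration under assumptions: the field names `valid_from` and `valid_to`, the use of calendar dates instead of full timestamps, and the `TemporalStore` class are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TemporalFact:
    subject: str
    predicate: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the fact is still current


class TemporalStore:
    def __init__(self) -> None:
        self._facts = []

    def assert_fact(self, subject: str, predicate: str, value: str, on: date) -> None:
        # Close the validity window of the current fact instead of deleting it.
        for fact in self._facts:
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.valid_to is None:
                fact.valid_to = on
        self._facts.append(TemporalFact(subject, predicate, value, on))

    def value_at(self, subject: str, predicate: str, on: date) -> Optional[str]:
        # A fact applies on dates within [valid_from, valid_to).
        for fact in self._facts:
            if (fact.subject, fact.predicate) != (subject, predicate):
                continue
            if fact.valid_from <= on and (fact.valid_to is None or on < fact.valid_to):
                return fact.value
        return None


store = TemporalStore()
store.assert_fact("ACME Corp", "payment_terms", "net-30", date(2024, 1, 1))
store.assert_fact("ACME Corp", "payment_terms", "net-45", date(2025, 1, 1))
```

Asking "what is true now?" is then the same query as "what was true on a given date?", with today's date as the argument.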

This temporal property is valuable for auditing and for training. A model trained	This temporal property is valuable for auditing and for training. A model trained
on facts that were accurate when the training data was written, but are no longer	on facts that were accurate when the training data was written, but are no longer
current, produces confident incorrect answers. Temporal validity allows the graph	current, produces confident incorrect answers. Temporal validity allows the graph
to serve accurate context even as the organization changes.	to serve accurate context even as the organization changes.

## Multi-hop traversal	## Multi-hop traversal

The graph is designed for traversal, not just lookup. A lookup retrieves one entity	The graph is designed for traversal, not just lookup. A lookup retrieves one entity
by name. A traversal follows edges from that entity to connected entities, and from	by name. A traversal follows edges from that entity to connected entities, and from
those to further connections.	those to further connections.

Example traversal:	Example traversal:

Query: "What policies govern procurement decisions on this project?"	Query: "What policies govern procurement decisions on this project?"

1. Start at the Project entity.	1. Start at the Project entity.
2. Follow `governed_by` edges to Policy entities.	2. Follow `governed_by` edges to Policy entities.
3. Follow `defines_exceptions` edges from each Policy to Decision entities.	3. Follow `defines_exceptions` edges from each Policy to Decision entities.
4. Follow `approved_by` edges from each Decision to Person entities.	4. Follow `approved_by` edges from each Decision to Person entities.

The result: the full governance chain, retrieved in a single structured query. A	The result: the full governance chain, retrieved in a single structured query. A
language model with this context in its prompt can answer questions about governance,	language model with this context in its prompt can answer questions about governance,
exceptions, and responsible parties without needing to synthesize that knowledge	exceptions, and responsible parties without needing to synthesize that knowledge
from unstructured text.	from unstructured text.
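
The four steps above can be sketched as a walk along a fixed sequence of edge labels. The adjacency representation and the entity identifiers (`Project:Alpha`, `Decision:D-1`, and so on) are assumptions for illustration; only the edge names come from the example.

```python
from typing import Dict, List, Tuple

# Hypothetical adjacency map: (source entity, edge label) -> target entities.
Edges = Dict[Tuple[str, str], List[str]]

EDGES: Edges = {
    ("Project:Alpha", "governed_by"): ["Policy:Procurement"],
    ("Policy:Procurement", "defines_exceptions"): ["Decision:D-1", "Decision:D-2"],
    ("Decision:D-1", "approved_by"): ["Person:Jane"],
    ("Decision:D-2", "approved_by"): ["Person:Jane"],
}


def traverse(edges: Edges, start: str, path: List[str]) -> List[List[str]]:
    """Follow each edge label in turn; return the entities reached at every hop."""
    frontier = [start]
    levels = [frontier]
    for label in path:
        reached = []
        for node in frontier:
            for target in edges.get((node, label), []):
                if target not in reached:  # avoid duplicates when paths converge
                    reached.append(target)
        frontier = reached
        levels.append(frontier)
    return levels


chain = traverse(EDGES, "Project:Alpha",
                 ["governed_by", "defines_exceptions", "approved_by"])
```

Each element of `chain` holds the entities reached at one hop, so the whole governance chain can be formatted into a prompt from a single result.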

A general-purpose AI service cannot replicate this advantage, because it has no	A general-purpose AI service cannot replicate this advantage, because it has no
knowledge of this organization's structure. The graph encodes that structure	knowledge of this organization's structure. The graph encodes that structure
explicitly and makes it available at inference time.	explicitly and makes it available at inference time.

## How entities enter the graph	## How entities enter the graph

Entities enter the graph through an extraction pipeline. Documents, emails, meeting	Entities enter the graph through an extraction pipeline. Documents, emails, meeting
notes, and other prose sources arrive in a watched input directory. The extraction	notes, and other prose sources arrive in a watched input directory. The extraction
service reads each source, sends the text to the inference router for structured	service reads each source, sends the text to the inference router for structured
entity extraction using a grammar-constrained schema, and writes the resulting	entity extraction using a grammar-constrained schema, and writes the resulting
entities to the graph through the router's mutation endpoint.	entities to the graph through the router's mutation endpoint.

The extraction quality depends on the inference tier. The local compact model	The extraction quality depends on the inference tier. The local compact model
(Tier A) extracts entities at lower confidence. The burst GPU node (Tier B) extracts	(Tier A) extracts entities at lower confidence. The burst GPU node (Tier B) extracts
at higher confidence using larger context windows and strict output constraints.	at higher confidence using larger context windows and strict output constraints.
Tier A extraction is useful for rapid coverage; Tier B extraction is used for the	Tier A extraction is useful for rapid coverage; Tier B extraction is used for the
canonical organizational record.	canonical organizational record.

Every extraction is logged with a source reference and a confidence score. Entities	Every extraction is logged with a source reference and a confidence score. Entities
extracted from authoritative sources (executed contracts, filed documents, official	extracted from authoritative sources (executed contracts, filed documents, official
registrations) carry higher confidence than those extracted from informal	registrations) carry higher confidence than those extracted from informal
correspondence.	correspondence.

## Context injection at inference time	## Context injection at inference time

Before the inference router dispatches any request, it queries the organizational	Before the inference router dispatches any request, it queries the organizational
graph for entities relevant to the current request. The query is a substring match	graph for entities relevant to the current request. The query is a substring match
against the last portion of the user's message. Matching entities are formatted as a	against the last portion of the user's message. Matching entities are formatted as a
structured context block and prepended to the system prompt.	structured context block and prepended to the system prompt.

The model receives this context transparently. It does not need to be prompted to	The model receives this context transparently. It does not need to be prompted to
"use the knowledge graph." The context is simply present, as if the model had been	"use the knowledge graph." The context is simply present, as if the model had been
briefed on the relevant organizational relationships before the conversation began.	briefed on the relevant organizational relationships before the conversation began.

This injection is non-fatal. If the graph service is unavailable or returns no	This injection is non-fatal. If the graph service is unavailable or returns no
matches, the request proceeds without additional context. A circuit breaker on the	matches, the request proceeds without additional context. A circuit breaker on the
graph query path prevents a slow graph service from blocking inference.	graph query path prevents a slow graph service from blocking inference.
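
A minimal sketch of non-fatal context injection. It assumes that "substring match" means an entity name appearing within the tail of the user's message, and it uses a hypothetical `fetch_entities` callable in place of the graph service; the `tail_chars` cutoff and the context block format are also assumptions. The circuit breaker is omitted for brevity.

```python
from typing import Callable, Dict, List


def inject_context(
    system_prompt: str,
    user_message: str,
    fetch_entities: Callable[[], Dict[str, List[str]]],
    tail_chars: int = 200,  # assumed size of "the last portion" of the message
) -> str:
    """Prepend matching entity facts to the system prompt; never raise."""
    tail = user_message[-tail_chars:].lower()
    try:
        entities = fetch_entities()
    except Exception:
        # Graph unavailable: proceed without additional context.
        return system_prompt

    lines = []
    for name, facts in entities.items():
        if name.lower() in tail:
            lines.append(f"- {name}: " + "; ".join(facts))
    if not lines:
        return system_prompt  # no matches: prompt is left unchanged

    block = "[organizational context]\n" + "\n".join(lines)
    return block + "\n\n" + system_prompt


def sample_graph() -> Dict[str, List[str]]:
    return {"ACME Corp": ["vendor", "net-30 terms"], "Globex": ["customer"]}


def unavailable_graph() -> Dict[str, List[str]]:
    raise TimeoutError("graph service unavailable")


prompt = inject_context(
    "You are a helpful assistant.",
    "What terms do we have with ACME Corp?",
    sample_graph,
)
```

In both failure cases, an unavailable graph or no matching entities, the function returns the original system prompt, so the request proceeds unchanged.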

## Privacy and sovereignty	## Privacy and sovereignty

The organizational knowledge graph contains sensitive business information: personnel	The organizational knowledge graph contains sensitive business information: personnel
relationships, contract terms, decision histories, financial account structures. This	relationships, contract terms, decision histories, financial account structures. This
data should not leave the organization's control.	data should not leave the organization's control.

The graph runs embedded within the deployment node. Its contents are never sent to	The graph runs embedded within the deployment node. Its contents are never sent to
an external inference provider as training data. Context injected into prompts for	an external inference provider as training data. Context injected into prompts for
external Tier C requests is subject to the organization's data classification policy;	external Tier C requests is subject to the organization's data classification policy;
the gateway enforces this through the structured-data boundary rule, which prevents	the gateway enforces this through the structured-data boundary rule, which prevents
raw business records from crossing the external inference boundary.	raw business records from crossing the external inference boundary.

The organization owns the graph database file, the ontology definitions, the entity	The organization owns the graph database file, the ontology definitions, the entity
data, and the extraction history. These assets are portable: they can be backed up,	data, and the extraction history. These assets are portable: they can be backed up,
migrated, and restored without dependency on any third-party service.	migrated, and restored without dependency on any third-party service.