Diff: services/yoyo-daily-enrichment-cycle

From 3f1e0da to 3f1e0da

+0 / −0 lines

Before	After
---	---
schema: foundry-doc-v1	schema: foundry-doc-v1
title: "Yo-Yo Daily Enrichment Cycle"	title: "Yo-Yo Daily Enrichment Cycle"
slug: yoyo-daily-enrichment-cycle	slug: yoyo-daily-enrichment-cycle
category: services	category: services
type: topic	type: topic
status: stable	status: stable
bcsc_class: no-disclosure-implication	bcsc_class: no-disclosure-implication
last_edited: 2026-06-11	last_edited: 2026-06-11
editor: pointsav-engineering	editor: pointsav-engineering
paired_with: yoyo-daily-enrichment-cycle.es.md	paired_with: yoyo-daily-enrichment-cycle.es.md
---	---

The Yo-Yo daily enrichment cycle is the automated batch window that runs a GPU-accelerated	The Yo-Yo daily enrichment cycle is the automated batch window that runs a GPU-accelerated
inference VM once per day to enrich the DataGraph and accumulate training data for the	inference VM once per day to enrich the DataGraph and accumulate training data for the
local language model. The cycle runs at a fixed time, enforces a hard cost cap, and	local language model. The cycle runs at a fixed time, enforces a hard cost cap, and
terminates the VM whether the work finishes early or reaches the cap.	terminates the VM whether the work finishes early or reaches the cap.

## Purpose	## Purpose

The workspace VM runs a 7-billion-parameter language model (OLMo 2 7B) on CPU for	The workspace VM runs a 7-billion-parameter language model (OLMo 2 7B) on CPU for
interactive use. This model performs adequately for short prompts but extracts entities	interactive use. This model performs adequately for short prompts but extracts entities
from documents with lower accuracy than a larger GPU-resident model. The daily cycle	from documents with lower accuracy than a larger GPU-resident model. The daily cycle
addresses this gap by starting a separate GPU VM — the Yo-Yo batch node — that loads	addresses this gap by starting a separate GPU VM — the Yo-Yo batch node — that loads
a 32-billion-parameter model and processes a queue of documents that accumulated during	a 32-billion-parameter model and processes a queue of documents that accumulated during
the day.	the day.

The products of each cycle are:	The products of each cycle are:
- Additional named entities added to the DataGraph (graph store)	- Additional named entities added to the DataGraph (graph store)
- Direct Preference Optimisation (DPO) training pairs written to the enrichment corpus	- Direct Preference Optimisation (DPO) training pairs written to the enrichment corpus

Each DPO pair records what the 32B model extracted as the preferred output and what the	Each DPO pair records what the 32B model extracted as the preferred output and what the
7B model extracted as the baseline, enabling the 7B model to be fine-tuned toward the	7B model extracted as the baseline, enabling the 7B model to be fine-tuned toward the
larger model's extraction quality over successive training runs.	larger model's extraction quality over successive training runs.

## The eight phases	## The eight phases

The cycle is a single Bash script (`yoyo-daily-cycle.sh`) that executes eight sequential	The cycle is a single Bash script (`yoyo-daily-cycle.sh`) that executes eight sequential
phases. The script writes a timestamped log file for each run.	phases. The script writes a timestamped log file for each run.

Phase 1 — VM start. If the batch VM is not already running, a `gcloud instances start`	Phase 1 — VM start. If the batch VM is not already running, a `gcloud instances start`
command is issued. The VM boots from a persistent disk that retains the model weights and	command is issued. The VM boots from a persistent disk that retains the model weights and
the inference server configuration from the previous cycle.	the inference server configuration from the previous cycle.

Phase 2 — Inference server health. The script polls the llama-server health endpoint	Phase 2 — Inference server health. The script polls the llama-server health endpoint
(`/health`) at ten-second intervals until it returns `{"status":"ok"}`. Startup consistently	(`/health`) at ten-second intervals until it returns `{"status":"ok"}`. Startup consistently
takes approximately 170 seconds from power-on to first healthy response. If the server	takes approximately 170 seconds from power-on to first healthy response. If the server
does not respond within ten minutes, the cycle aborts and stops the VM.	does not respond within ten minutes, the cycle aborts and stops the VM.

Phase 3 — Tier B circuit. The local inference gateway maintains a circuit breaker for	Phase 3 — Tier B circuit. The local inference gateway maintains a circuit breaker for
the Yo-Yo node. The script waits up to two minutes for the circuit to close, confirming	the Yo-Yo node. The script waits up to two minutes for the circuit to close, confirming
the gateway has registered the VM as reachable. If the circuit does not close, the cycle	the gateway has registered the VM as reachable. If the circuit does not close, the cycle
continues with a Tier A fallback warning logged.	continues with a Tier A fallback warning logged.

Phase 4 — Enrichment drain. For 40 percent of the cycle budget (18 minutes at the	Phase 4 — Enrichment drain. For 40 percent of the cycle budget (18 minutes at the
45-minute cap), the script waits while the gateway processes the pending enrichment queue.	45-minute cap), the script waits while the gateway processes the pending enrichment queue.
During this window, the content service sends document chunks to the Yo-Yo node for	During this window, the content service sends document chunks to the Yo-Yo node for
entity extraction and writes DPO pairs to the enrichment corpus. Progress is logged every	entity extraction and writes DPO pairs to the enrichment corpus. Progress is logged every
60 seconds with entity counts, enrichment pair counts, GPU utilisation, and VRAM usage.	60 seconds with entity counts, enrichment pair counts, GPU utilisation, and VRAM usage.

Phase 5 — Corpus threshold check. After enrichment, `corpus-threshold.py` runs to	Phase 5 — Corpus threshold check. After enrichment, `corpus-threshold.py` runs to
count accumulated training-ready data. If counts exceed the configured threshold, the	count accumulated training-ready data. If counts exceed the configured threshold, the
script writes dated training marker files to `data/training-pending/`. These markers	script writes dated training marker files to `data/training-pending/`. These markers
are the input to Phase 6.	are the input to Phase 6.

Phase 6 — LoRA training trigger. Three gates must all pass for training to run:	Phase 6 — LoRA training trigger. Three gates must all pass for training to run:
training markers must be present, the ML libraries must be installed in the training	training markers must be present, the ML libraries must be installed in the training
virtual environment on the batch VM, and an operator-authored approval tag must exist	virtual environment on the batch VM, and an operator-authored approval tag must exist
for the current date. If all three pass, the script stops the inference server to free	for the current date. If all three pass, the script stops the inference server to free
approximately 16 gigabytes of VRAM, then invokes `run-dpo-training.py` over SSH with a	approximately 16 gigabytes of VRAM, then invokes `run-dpo-training.py` over SSH with a
45-percent budget (20 minutes at the 45-minute cap). The `--resume` flag accumulates	45-percent budget (20 minutes at the 45-minute cap). The `--resume` flag accumulates
daily checkpoints so each run extends the previous day's training rather than starting	daily checkpoints so each run extends the previous day's training rather than starting
from scratch.	from scratch.

Phase 7 — GCS sync. If the `SLM_YOYO_WEIGHTS_GCS_BUCKET` environment variable is	Phase 7 — GCS sync. If the `SLM_YOYO_WEIGHTS_GCS_BUCKET` environment variable is
set and training markers are present, the enrichment corpus is synchronised to the	set and training markers are present, the enrichment corpus is synchronised to the
configured Cloud Storage bucket. This step is currently disabled pending a future session	configured Cloud Storage bucket. This step is currently disabled pending a future session
that configures the bucket.	that configures the bucket.

Phase 8 — Hard stop. The inference server is stopped via SSH, the VM is stopped via	Phase 8 — Hard stop. The inference server is stopped via SSH, the VM is stopped via
`gcloud instances stop`, and the script waits up to three minutes for the VM to reach	`gcloud instances stop`, and the script waits up to three minutes for the VM to reach
`TERMINATED` status. A summary line records total elapsed time, entity delta, DPO pair	`TERMINATED` status. A summary line records total elapsed time, entity delta, DPO pair
delta, and VM final status.	delta, and VM final status.

## Budget and cost	## Budget and cost

The daily cycle operates under a 45-minute hard cap. The VM is stopped unconditionally	The daily cycle operates under a 45-minute hard cap. The VM is stopped unconditionally
at the end of Phase 8 regardless of whether phases completed normally.	at the end of Phase 8 regardless of whether phases completed normally.

\| Item \| Value \|	\| Item \| Value \|
\|---\|---\|	\|---\|---\|
\| VM type \| g2-standard-4 with NVIDIA L4 24 GB \|	\| VM type \| g2-standard-4 with NVIDIA L4 24 GB \|
\| Zone \| us-central1-a \|	\| Zone \| us-central1-a \|
\| Running cost \| approximately $0.71 per hour \|	\| Running cost \| approximately $0.71 per hour \|
\| Cycle cost at 45-minute cap \| approximately $0.53 per cycle \|	\| Cycle cost at 45-minute cap \| approximately $0.53 per cycle \|
\| TERMINATED cost \| $0.00 \|	\| TERMINATED cost \| $0.00 \|
\| Monthly cost (daily cycles) \| approximately $16 per month \|	\| Monthly cost (daily cycles) \| approximately $16 per month \|

A kill switch file (`/srv/foundry/data/yoyo-disabled`) suppresses all VM lifecycle	A kill switch file (`/srv/foundry/data/yoyo-disabled`) suppresses all VM lifecycle
operations immediately. Creating the file prevents Phase 1 from issuing a start command.	operations immediately. Creating the file prevents Phase 1 from issuing a start command.
Removing the file resumes normal operation on the next scheduled cycle.	Removing the file resumes normal operation on the next scheduled cycle.

An idle monitor timer checks every five minutes whether the VM has been running idle for	An idle monitor timer checks every five minutes whether the VM has been running idle for
more than 30 minutes. If the daily cycle fails to stop the VM, the idle monitor will stop	more than 30 minutes. If the daily cycle fails to stop the VM, the idle monitor will stop
it as a safety backstop, preventing uncapped cost accumulation.	it as a safety backstop, preventing uncapped cost accumulation.

## DPO pair format	## DPO pair format

Each enrichment DPO pair is a JSON file written to the feedback directory. The format	Each enrichment DPO pair is a JSON file written to the feedback directory. The format
is compatible with the TRL DPOTrainer:	is compatible with the TRL DPOTrainer:

```json	```json
{	{
"prompt": "<document chunk text>",	"prompt": "<document chunk text>",
"chosen": "[{\"classification\":\"Person\",\"entity_name\":\"...\"}]",	"chosen": "[{\"classification\":\"Person\",\"entity_name\":\"...\"}]",
"rejected": "[{\"classification\":\"Person\",\"entity_name\":\"...\"}]",	"rejected": "[{\"classification\":\"Person\",\"entity_name\":\"...\"}]",
"source_type": "datagraph-enrichment",	"source_type": "datagraph-enrichment",
"worm_id": "<document identifier>",	"worm_id": "<document identifier>",
"timestamp": "<ISO 8601>"	"timestamp": "<ISO 8601>"
}	}
```	```

`chosen` is the 32B model's extraction. `rejected` is the 7B model's extraction. A pair	`chosen` is the 32B model's extraction. `rejected` is the 7B model's extraction. A pair
is only written when both models found at least one entity and the results differ after	is only written when both models found at least one entity and the results differ after
normalisation. Pairs where the 7B model found nothing are discarded — they contain no	normalisation. Pairs where the 7B model found nothing are discarded — they contain no
genuine preference signal.	genuine preference signal.

## Verified test results (2026-06-09)	## Verified test results (2026-06-09)

Three 10-minute test cycles confirmed the pipeline operates correctly end-to-end.	Three 10-minute test cycles confirmed the pipeline operates correctly end-to-end.

\| Cycle \| Duration \| Entity delta \| DPO pairs added \| VM final status \|	\| Cycle \| Duration \| Entity delta \| DPO pairs added \| VM final status \|
\|---\|---\|---\|---\|---\|	\|---\|---\|---\|---\|---\|
\| 1 \| 10 min 43 s \| +7 \| +6 \| TERMINATED \|	\| 1 \| 10 min 43 s \| +7 \| +6 \| TERMINATED \|
\| 2 \| 9 min 12 s \| +8 \| +4 \| TERMINATED \|	\| 2 \| 9 min 12 s \| +8 \| +4 \| TERMINATED \|
\| 3 \| 10 min 38 s \| +22 \| +8 \| TERMINATED \|	\| 3 \| 10 min 38 s \| +22 \| +8 \| TERMINATED \|

GPU diagnostics in cycle 3: 99% utilisation, 16,151 of 23,034 MB VRAM in use, 73°C.	GPU diagnostics in cycle 3: 99% utilisation, 16,151 of 23,034 MB VRAM in use, 73°C.