Diff: substrate/location-intelligence-substrate
From 51e7724 to 51e7724
+0 / −0 lines
| Before | After |
|---|---|
| --- | --- |
| schema: foundry-doc-v1 | schema: foundry-doc-v1 |
| title: "Location intelligence substrate" | title: "Location intelligence substrate" |
| slug: location-intelligence-substrate | slug: location-intelligence-substrate |
| short_description: "A flat-file, open-GIS architecture enabling customers to own geographic datasets end-to-end using Apache-licensed open data and a Rust-aligned open-source rendering stack, with retail co-location analysis as the first deployed surface." | short_description: "A flat-file, open-GIS architecture enabling customers to own geographic datasets end-to-end using Apache-licensed open data and a Rust-aligned open-source rendering stack, with retail co-location analysis as the first deployed surface." |
| category: substrate | category: substrate |
| type: topic | type: topic |
| quality: complete | quality: complete |
| status: active | status: active |
| bcsc_class: public-disclosure-safe | bcsc_class: public-disclosure-safe |
| last_edited: 2026-05-09 | last_edited: 2026-05-09 |
| editor: pointsav-engineering | editor: pointsav-engineering |
| paired_with: location-intelligence-substrate.es.md | paired_with: location-intelligence-substrate.es.md |
| references: | references: |
| - id: 1 | - id: 1 |
| text: "Overture Maps Foundation — GeoParquet places schema. overturemaps.org" | text: "Overture Maps Foundation — GeoParquet places schema. overturemaps.org" |
| - id: 2 | - id: 2 |
| text: "Foursquare Open Source Places — 100M+ POIs, Apache 2.0. huggingface.co/datasets/foursquare" | text: "Foursquare Open Source Places — 100M+ POIs, Apache 2.0. huggingface.co/datasets/foursquare" |
| - id: 3 | - id: 3 |
| text: "GeoParquet specification — OGC incubating standard. geoparquet.org" | text: "GeoParquet specification — OGC incubating standard. geoparquet.org" |
| - id: 4 | - id: 4 |
| text: "FlatGeobuf — Hilbert R-tree packed flat-file format. flatgeobuf.org" | text: "FlatGeobuf — Hilbert R-tree packed flat-file format. flatgeobuf.org" |
| - id: 5 | - id: 5 |
| text: "MapLibre GL JS — Community-driven vector-tile renderer. maplibre.org" | text: "MapLibre GL JS — Community-driven vector-tile renderer. maplibre.org" |
| - id: 6 | - id: 6 |
| text: "Martin tile server — Rust tile server (MapLibre Foundation). maplibre.org/martin" | text: "Martin tile server — Rust tile server (MapLibre Foundation). maplibre.org/martin" |
| - id: 7 | - id: 7 |
| text: "PMTiles — Single-file tile archive with HTTP range requests. protomaps.com/pmtiles" | text: "PMTiles — Single-file tile archive with HTTP range requests. protomaps.com/pmtiles" |
| - id: 8 | - id: 8 |
| text: "NI 51-102 Continuous Disclosure Obligations — BCSC" | text: "NI 51-102 Continuous Disclosure Obligations — BCSC" |
| - id: 9 | - id: 9 |
| text: "OSC Staff Notice 51-721 Forward-Looking Information Disclosure" | text: "OSC Staff Notice 51-721 Forward-Looking Information Disclosure" |
| --- | --- |
| The Location Intelligence Substrate is a flat-file, open-GIS architecture that lets customers own their geographic datasets end-to-end — no tile API billing, no warehouse licensing, no cloud-vendor lock-in. The substrate is built on Apache-licensed open-data foundations (Overture Maps Foundation, Foursquare Open Source Places) and rendered via a Rust-aligned open-source stack (MapLibre GL JS, Martin tile server, PMTiles).[^1][^2] | The Location Intelligence Substrate is a flat-file, open-GIS architecture that lets customers own their geographic datasets end-to-end — no tile API billing, no warehouse licensing, no cloud-vendor lock-in. The substrate is built on Apache-licensed open-data foundations (Overture Maps Foundation, Foursquare Open Source Places) and rendered via a Rust-aligned open-source stack (MapLibre GL JS, Martin tile server, PMTiles).[^1][^2] |
| The first deployed surface is `gis.woodfinegroup.com` — a co-location map showing retail anchor co-presence across the United States, Canada, Mexico, and Spain. | The first deployed surface is `gis.woodfinegroup.com` — a co-location map showing retail anchor co-presence across the United States, Canada, Mexico, and Spain. |
| ## Architecture — flat-file vs database | ## Architecture — flat-file vs database |
| Three options exist for canonical storage of geographic records: | Three options exist for canonical storage of geographic records: |
| **Flat-file canonical** — JSONL for human-diffable change tracking; GeoParquet for performant analytic reads; FlatGeobuf for browser-side spatial-bbox streaming. GeoParquet is an OGC incubating standard that adds Point/Line/Polygon types to columnar Parquet.[^3] FlatGeobuf carries a packed Hilbert R-tree at the file header that enables a browser to stream only the features inside the current viewport over HTTP range requests.[^4] Advantages: sovereign by construction, version-controllable, customer-portable, zero infrastructure to operate. Limitation: writes are single-author-at-a-time; concurrent online edits would race. | **Flat-file canonical** — JSONL for human-diffable change tracking; GeoParquet for performant analytic reads; FlatGeobuf for browser-side spatial-bbox streaming. GeoParquet is an OGC incubating standard that adds Point/Line/Polygon types to columnar Parquet.[^3] FlatGeobuf carries a packed Hilbert R-tree at the file header that enables a browser to stream only the features inside the current viewport over HTTP range requests.[^4] Advantages: sovereign by construction, version-controllable, customer-portable, zero infrastructure to operate. Limitation: writes are single-author-at-a-time; concurrent online edits would race. |
| **Database canonical** — PostgreSQL plus a spatial extension, the industry default for this class of system. Advantages: rich spatial SQL, multi-writer concurrency, production-tested operations. Limitation: the customer's data lives in a running daemon they must operate; portability requires a dump-and-restore rather than a directory move. | **Database canonical** — PostgreSQL plus a spatial extension, the industry default for this class of system. Advantages: rich spatial SQL, multi-writer concurrency, production-tested operations. Limitation: the customer's data lives in a running daemon they must operate; portability requires a dump-and-restore rather than a directory move. |
| **Hybrid** — flat-file canonical, ephemeral database materialised from the flat-file as a query cache. Matches the vault-as-canonical, derived-tables-as-cache approach the platform uses for bookkeeping. | **Hybrid** — flat-file canonical, ephemeral database materialised from the flat-file as a query cache. Matches the vault-as-canonical, derived-tables-as-cache approach the platform uses for bookkeeping. |
| For workloads of tens of thousands of POI records across a small number of countries, with infrequent batch writes and read-mostly queries, flat-file is sufficient and is the architecture both Foursquare and Overture Maps Foundation chose for their substrate releases.[^1][^2] The recommendation: GeoParquet as the canonical at-rest format (one file per country per service, rolled monthly), JSONL siblings for git-tracked human-diffable history, FlatGeobuf as the browser-streamable derivative. | For workloads of tens of thousands of POI records across a small number of countries, with infrequent batch writes and read-mostly queries, flat-file is sufficient and is the architecture both Foursquare and Overture Maps Foundation chose for their substrate releases.[^1][^2] The recommendation: GeoParquet as the canonical at-rest format (one file per country per service, rolled monthly), JSONL siblings for git-tracked human-diffable history, FlatGeobuf as the browser-streamable derivative. |
| ## Map tile and layer delivery | ## Map tile and layer delivery |
| The rendering stack uses MapLibre GL JS in the browser — a community-driven open-source vector-tile renderer that supports WebGL, dynamic styling, smooth animation, and 3D without per-traffic licence cost.[^5] | The rendering stack uses MapLibre GL JS in the browser — a community-driven open-source vector-tile renderer that supports WebGL, dynamic styling, smooth animation, and 3D without per-traffic licence cost.[^5] |
| Tile generation uses Tippecanoe (Felt's actively maintained fork) to convert GeoJSON to MBTiles or PMTiles, reducing file size by 85–95% relative to raw GeoJSON at POI dataset scale. Tile serving uses Martin, the MapLibre Foundation's Rust tile server, which supports PostGIS, MBTiles, and PMTiles sources.[^6] | Tile generation uses Tippecanoe (Felt's actively maintained fork) to convert GeoJSON to MBTiles or PMTiles, reducing file size by 85–95% relative to raw GeoJSON at POI dataset scale. Tile serving uses Martin, the MapLibre Foundation's Rust tile server, which supports PostGIS, MBTiles, and PMTiles sources.[^6] |
| The tile archive format is PMTiles — a single-file archive with HTTP range-request support, enabling tile serving directly from nginx without running Martin when the tiles are pre-baked.[^7] Martin is used when dynamic tile generation is needed (viewport-filtered density layers that respond to real-time filter state). | The tile archive format is PMTiles — a single-file archive with HTTP range-request support, enabling tile serving directly from nginx without running Martin when the tiles are pre-baked.[^7] Martin is used when dynamic tile generation is needed (viewport-filtered density layers that respond to real-time filter state). |
| For data-visualisation overlays (scatter, heatmap, arc, polygon-extrusion layers), deck.gl composes naturally with MapLibre. | For data-visualisation overlays (scatter, heatmap, arc, polygon-extrusion layers), deck.gl composes naturally with MapLibre. |
| ## Service schema — service-business, service-places, service-parking | ## Service schema — service-business, service-places, service-parking |
| A single record shape covers all three Ring 1 location services with discriminator fields: | A single record shape covers all three Ring 1 location services with discriminator fields: |
| ```jsonc | ```jsonc |
| { | { |
| "id": "01HZ...", // ULID | "id": "01HZ...", // ULID |
| "service": "business" | "places" | "parking", | "service": "business" | "places" | "parking", |
| "operator": "walmart", // brand slug | "operator": "walmart", // brand slug |
| "operator_brand_family": "walmart", // unifies regional equivalents | "operator_brand_family": "walmart", // unifies regional equivalents |
| "name": "Walmart Supercenter Burnaby", | "name": "Walmart Supercenter Burnaby", |
| "country_code": "US" | "CA" | "MX" | "ES", | "country_code": "US" | "CA" | "MX" | "ES", |
| "address": "...", | "address": "...", |
| "lat": 49.2827, | "lat": 49.2827, |
| "lng": -123.1207, | "lng": -123.1207, |
| "geometry": { "type": "Point", ... }, | "geometry": { "type": "Point", ... }, |
| "store_type": "supercenter" | "warehouse" | "diy" | "warehouse-club", | "store_type": "supercenter" | "warehouse" | "diy" | "warehouse-club", |
| "data_source": "official-store-locator" | "openstreetmap" | "overture" | "foursquare-os" | "manual", | "data_source": "official-store-locator" | "openstreetmap" | "overture" | "foursquare-os" | "manual", |
| "captured_at": "2026-04-30T00:00:00Z" | "captured_at": "2026-04-30T00:00:00Z" |
| } | } |
| ``` | ``` |
| Brand-family normalisation lets co-location queries treat regional equivalents as one logical operator across countries. `service-places` carries a `place_type` field (hospital, higher-education, airport). `service-parking` carries a Polygon `geometry` (the lot geofence) rather than a Point, plus an `associated_business_id` linking the lot to its anchor business when known. | Brand-family normalisation lets co-location queries treat regional equivalents as one logical operator across countries. `service-places` carries a `place_type` field (hospital, higher-education, airport). `service-parking` carries a Polygon `geometry` (the lot geofence) rather than a Point, plus an `associated_business_id` linking the lot to its anchor business when known. |
| ## Co-location analysis algorithm | ## Co-location analysis algorithm |
| The co-location query identifies locations from brand family A within 1 km, 2 km, and 3 km of locations from brand family B (and optionally a third family). Algorithm: | The co-location query identifies locations from brand family A within 1 km, 2 km, and 3 km of locations from brand family B (and optionally a third family). Algorithm: |
| 1. Iterate every record in brand family A. | 1. Iterate every record in brand family A. |
| 2. For each, find the nearest record in each other brand family by haversine distance, using an in-memory R-tree index to narrow the candidates. At tens of thousands of records, each lookup runs in microseconds. | 2. For each, find the nearest record in each other brand family by haversine distance, using an in-memory R-tree index to narrow the candidates. At tens of thousands of records, each lookup runs in microseconds. |
| 3. Bucket each multi-brand tuple by the maximum pairwise distance: `<1 km`, `1–2 km`, `2–3 km`, `>3 km`. | 3. Bucket each multi-brand tuple by the maximum pairwise distance: `<1 km`, `1–2 km`, `2–3 km`, `>3 km`. |
| 4. Emit a GeoJSON FeatureCollection: tuple centroid, triangle polyline connecting the locations, radius circles, and a `cluster_grade` property. | 4. Emit a GeoJSON FeatureCollection: tuple centroid, triangle polyline connecting the locations, radius circles, and a `cluster_grade` property. |
| Browser visualisation layers: POIs as circles coloured by brand family (Layer 1); co-location tuples with their radius haloes, toggled by grade (Layer 2); country boundaries and filter chips (Layer 3). Hover popovers show brand, format, year opened, distance to nearest co-located neighbours, and cluster grade. | Browser visualisation layers: POIs as circles coloured by brand family (Layer 1); co-location tuples with their radius haloes, toggled by grade (Layer 2); country boundaries and filter chips (Layer 3). Hover popovers show brand, format, year opened, distance to nearest co-located neighbours, and cluster grade. |
| At 15,000 POI records (combined coverage across four countries and three brand families), client-side rendering in MapLibre is well within comfortable operating range. Supercluster client-side clustering becomes relevant at approximately 50,000 records; server-side vector tile generation at 500,000+. | At 15,000 POI records (combined coverage across four countries and three brand families), client-side rendering in MapLibre is well within comfortable operating range. Supercluster client-side clustering becomes relevant at approximately 50,000 records; server-side vector tile generation at 500,000+. |
| ## Retail co-location research basis | ## Retail co-location research basis |
| Retail co-location clustering is a documented phenomenon with academic precedent: major retail anchor categories exhibit strong tendencies toward mutual proximity. Costco entry effects on neighbouring retailers have been studied formally. The co-location analysis the substrate produces maps directly to established methodology (average distance to nearest neighbour, tested against a permutation null distribution). | Retail co-location clustering is a documented phenomenon with academic precedent: major retail anchor categories exhibit strong tendencies toward mutual proximity. Costco entry effects on neighbouring retailers have been studied formally. The co-location analysis the substrate produces maps directly to established methodology (average distance to nearest neighbour, tested against a permutation null distribution). |
| ## Composition with the rest of the platform | ## Composition with the rest of the platform |
| The co-location tuples produced by the location intelligence substrate compose with the rest of the platform substrate: a retail catchment polygon from the GIS layer and a building envelope from the BIM layer can share the same coordinate frame, the same per-element YAML sidecars, and the same WORM ledger anchoring. Two clusters; one substrate. | The co-location tuples produced by the location intelligence substrate compose with the rest of the platform substrate: a retail catchment polygon from the GIS layer and a building envelope from the BIM layer can share the same coordinate frame, the same per-element YAML sidecars, and the same WORM ledger anchoring. Two clusters; one substrate. |
| `service-slm` is available for routine annotation work (suggesting categories for newly-ingested POIs, summarising dataset deltas, labelling anomalies) but the platform is fully functional with the Doorman shut down — the Optional Intelligence principle applied to geographic data. This is by design: GIS analysis does not require AI; AI is additive. | `service-slm` is available for routine annotation work (suggesting categories for newly-ingested POIs, summarising dataset deltas, labelling anomalies) but the platform is fully functional with the Doorman shut down — the Optional Intelligence principle applied to geographic data. This is by design: GIS analysis does not require AI; AI is additive. |
| ## Data sourcing | ## Data sourcing |
| Apache 2.0-licensed open datasets are the primary substrate: | Apache 2.0-licensed open datasets are the primary substrate: |
| - **Foursquare Open Source Places** — 100M+ POIs, monthly Parquet drops.[^2] | - **Foursquare Open Source Places** — 100M+ POIs, monthly Parquet drops.[^2] |
| - **Overture Maps Foundation** — places, buildings, transportation, and addresses as GeoParquet.[^1] | - **Overture Maps Foundation** — places, buildings, transportation, and addresses as GeoParquet.[^1] |
| - **OpenStreetMap** (via Nominatim or Photon geocoder) — secondary source for coverage gaps. | - **OpenStreetMap** (via Nominatim or Photon geocoder) — secondary source for coverage gaps. |
| Direct scraping of retailer websites is not used where terms of service prohibit data mining. The open-data foundations have already accumulated the POI records that would otherwise require scraping. | Direct scraping of retailer websites is not used where terms of service prohibit data mining. The open-data foundations have already accumulated the POI records that would otherwise require scraping. |
| ## Forward-looking information | ## Forward-looking information |
| Statements regarding deployment schedule, customer outcomes, and feature roadmap for the Location Intelligence Substrate are intended targets subject to change. Actual timelines depend on operator review at each stage, open-data coverage accuracy, and development velocity. These statements carry "planned"/"intended"/"may" framing per the workspace's continuous-disclosure posture.[^8][^9] | Statements regarding deployment schedule, customer outcomes, and feature roadmap for the Location Intelligence Substrate are intended targets subject to change. Actual timelines depend on operator review at each stage, open-data coverage accuracy, and development velocity. These statements carry "planned"/"intended"/"may" framing per the workspace's continuous-disclosure posture.[^8][^9] |
| ## See also | ## See also |
| - [[three-ring-architecture]] — `service-business`, `service-places`, and `service-parking` are Ring 1 services | - [[three-ring-architecture]] — `service-business`, `service-places`, and `service-parking` are Ring 1 services |
| - [[substrate-without-inference-base-case]] — GIS substrate functions fully without the AI ring | - [[substrate-without-inference-base-case]] — GIS substrate functions fully without the AI ring |
| - [[customer-owned-graph-ip]] — geographic datasets owned by the customer, not the vendor | - [[customer-owned-graph-ip]] — geographic datasets owned by the customer, not the vendor |
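
The four steps listed under "Co-location analysis algorithm" in the diffed document can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the substrate's implementation. The function names `haversine_km`, `nearest`, `grade`, and `colocate` are hypothetical. Points are bare `(lat, lng)` tuples rather than full service records. A brute-force scan stands in for the in-memory R-tree index. The bucket boundaries are assumed to be inclusive at the lower edge, which the document does not specify. GeoJSON emission (step 4) is omitted.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lng) pairs given in degrees."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlng = lng2 - lng1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlng / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest(point, candidates):
    """Brute-force nearest neighbour (assumption: replaces the R-tree lookup)."""
    return min(candidates, key=lambda c: haversine_km(point, c))


def grade(max_km):
    """Map a maximum pairwise distance to one of the document's four buckets."""
    if max_km < 1:
        return "<1 km"
    if max_km < 2:
        return "1–2 km"
    if max_km < 3:
        return "2–3 km"
    return ">3 km"


def colocate(family_a, other_families):
    """For each point in family A, build a tuple with the nearest point from
    every other family and grade it by the maximum pairwise distance."""
    results = []
    for a in family_a:
        members = [a] + [nearest(a, fam) for fam in other_families]
        max_km = max(haversine_km(p, q) for p, q in combinations(members, 2))
        results.append({
            "members": members,
            "max_km": max_km,
            "cluster_grade": grade(max_km),
        })
    return results
```

Calling `colocate(points_a, [points_b])` grades two-family pairs; passing a second list in `other_families` covers the optional third family. For datasets larger than the tens of thousands of records the document describes, the brute-force `nearest` would be the first thing to replace with a spatial index.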