Data Lineage

Source tracking on every version — cli, api, mcp. SHA-256 for integrity.

Every row knows where it came from

Each push is stored as a VersionEntry — an immutable record that captures who pushed, when, how many rows, and which code path created it.

{
  "v":         42,
  "timestamp": "2026-05-07T09:14:22.000Z",
  "rows":      12408,
  "r2_key":    "grids/raihan/sales-q2/v42.jsonl",
  "source":    "cli",
  "sha256":    "e3b0c44298fc1c14…"
}
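
For client code that consumes these records, the shape above can be mirrored as an immutable type. What follows is a minimal Python sketch, not part of instadash itself: the class name is borrowed from the text, while the from_json helper and the field types are assumptions inferred from the example record.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # frozen mirrors the "immutable record" described above
class VersionEntry:
    v: int
    timestamp: datetime
    rows: int
    r2_key: str
    source: str
    sha256: str

    @classmethod
    def from_json(cls, text: str) -> "VersionEntry":
        """Parse one entry; field names follow the example record (assumed stable)."""
        raw = json.loads(text)
        # Python < 3.11 cannot parse a trailing "Z", so normalise it to +00:00.
        ts = datetime.fromisoformat(raw["timestamp"].replace("Z", "+00:00"))
        return cls(raw["v"], ts, raw["rows"], raw["r2_key"], raw["source"], raw["sha256"])


# The digest stays truncated here exactly as it is in the example record.
entry = VersionEntry.from_json(
    '{"v": 42, "timestamp": "2026-05-07T09:14:22.000Z", "rows": 12408,'
    ' "r2_key": "grids/raihan/sales-q2/v42.jsonl", "source": "cli",'
    ' "sha256": "e3b0c44298fc1c14…"}'
)
print(entry.v, entry.source, entry.rows)
```

Attempting to assign to a field of entry raises an error, which keeps client-side copies as read-only as the stored record.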

Source values

  • cli: instadash push from a terminal or CI script
  • api: Direct HTTP POST to /ingest
  • mcp: instadash_push tool call via MCP
  • webhook: Ingest triggered by an inbound webhook (planned)
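
When filtering or validating entries by source, the four values above can be treated as a closed set. The sketch below is one way to do that in Python; the Source enum, ACTIVE_SOURCES, and parse_source are hypothetical names, and excluding webhook from the active set is an assumption based on it being marked as planned.

```python
from enum import Enum


class Source(str, Enum):
    CLI = "cli"
    API = "api"
    MCP = "mcp"
    WEBHOOK = "webhook"  # listed as planned above


# Assumption: only these three code paths currently produce versions.
ACTIVE_SOURCES = frozenset({Source.CLI, Source.API, Source.MCP})


def parse_source(value: str) -> Source:
    """Return the matching Source; raises ValueError for unknown strings."""
    return Source(value)


print(parse_source("mcp") in ACTIVE_SOURCES)      # True
print(parse_source("webhook") in ACTIVE_SOURCES)  # False
```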

SHA-256 integrity

The sha256 field is the hex digest of the raw JSONL blob before it was stored. You can verify data integrity at any point — re-hash the R2 object and compare.

# List versions and note the sha256 recorded for the one you want to verify
curl "https://instadash.io/<handle>/sales-q2/versions"
# → [..., { "v": 42, "sha256": "e3b0c44…", … }]

# Hash your local copy of that version's JSONL and compare the two digests
shasum -a 256 ./sales-q2-v42.jsonl
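
The same comparison can be scripted. The following is a minimal Python sketch that assumes the JSONL blob has already been saved locally and that the expected digest was copied from the versions listing; sha256_of_file and matches are hypothetical helper names, and the sample row is made-up placeholder data.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large JSONL blobs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the full hex digest from the version entry."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo with a throwaway file standing in for a downloaded version.
payload = b'{"id": 1, "region": "example", "amount": 10}\n'
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
sample_path = tmp.name

expected = hashlib.sha256(payload).hexdigest()  # stands in for the recorded sha256
ok = matches(sample_path, expected)
tampered = matches(sample_path, "0" * 64)
os.remove(sample_path)
print(ok, tampered)
```

A truncated digest such as the one shown in the listing above will never match, so the comparison needs the full 64-character value.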

Lineage in LLM surfaces

The /llms.md endpoint includes an attribution block at the bottom with version, source, timestamp, and sha256, so any LLM reading that surface can tell how fresh the data is and how it was pushed.

curl https://instadash.io/raihan/sales-q2/llms.md
# → | id | region | amount |
# → | …  | …      | …      |
# → ---
# → attribution: { "v": 42, "source": "cli", "timestamp": "2026-05-07…", "sha256": "e3b0c44…" }
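
A consumer that wants the attribution as structured data can pull it out of the response text. This Python sketch assumes the block is a single line of the form "attribution: {json}", as in the example output above; the real layout may differ, and parse_attribution is a hypothetical helper.

```python
import json


def parse_attribution(markdown: str) -> dict:
    """Return the JSON object from the last 'attribution:' line in the text."""
    prefix = "attribution:"
    for line in reversed(markdown.splitlines()):
        stripped = line.strip()
        if stripped.startswith(prefix):
            return json.loads(stripped[len(prefix):])
    raise ValueError("no attribution line found")


# Sample text copied from the example output, elisions included.
sample = """| id | region | amount |
| …  | …      | …      |
---
attribution: { "v": 42, "source": "cli", "timestamp": "2026-05-07…", "sha256": "e3b0c44…" }
"""

info = parse_attribution(sample)
print(info["v"], info["source"])
```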