API reference

Base URL: https://api.northcontext.dev · All requests use Authorization: Bearer <key>.
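
As a rough sketch of the header convention above, the helper below builds (but does not send) an authenticated request using only the Python standard library. The name build_request, the placeholder key, and the Content-Type: application/json header are assumptions for illustration, not part of the documented API.

```python
import json
import urllib.request

BASE_URL = "https://api.northcontext.dev"


def build_request(method, path, api_key, body=None):
    """Build an authenticated request object without sending it."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # Documented: every request carries a bearer token.
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        # Assumption: JSON bodies are sent as application/json.
        req.add_header("Content-Type", "application/json")
    return req


# Placeholder key; substitute your own.
req = build_request("POST", "/v1/extract", "YOUR_API_KEY", {"text": "..."})
```

Passing the result to urllib.request.urlopen would perform the call; keeping construction separate from sending makes the header logic easy to check in isolation.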

Concepts

workspace
your tenant. all data scoped here. created on signup.
end_user_id
your customer's user id. opaque string. memories are usually scoped per end-user.
memory
a single typed fact: preference, fact, goal, event, or relationship.
recall
semantic search over memories. returns ranked matches, ready to inject into your prompt.

POST /v1/extract

Run extraction without storing. Useful for previews.

POST /v1/extract
{
  "text": "Just got off a call with Priya..."
}

→ 200 OK
{
  "extracted": [
    { "type": "fact",       "content": "Priya is a senior PM at Acme." },
    { "type": "preference", "content": "Prefers Slack over email." },
    { "type": "goal",       "content": "Shipping new billing system by Q3." }
  ]
}
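
For a preview UI, it can help to group the extracted items by type before showing them. The sketch below parses the sample response above; group_by_type is a hypothetical helper, not something the API provides.

```python
import json
from collections import defaultdict

# The sample response body from above, as a JSON string.
response_body = """
{
  "extracted": [
    { "type": "fact",       "content": "Priya is a senior PM at Acme." },
    { "type": "preference", "content": "Prefers Slack over email." },
    { "type": "goal",       "content": "Shipping new billing system by Q3." }
  ]
}
"""


def group_by_type(payload):
    """Map each memory type to the list of extracted contents of that type."""
    grouped = defaultdict(list)
    for item in payload["extracted"]:
        grouped[item["type"]].append(item["content"])
    return dict(grouped)


grouped = group_by_type(json.loads(response_body))
```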

POST /v1/memories

Extract, embed, and store. The default end-to-end write.

POST /v1/memories
{
  "text": "...",                       // raw text → extracted into multiple memories, OR
  "content": "...",                    // pre-formed memory
  "type": "fact",                      // required with "content", optional with "text"
  "end_user_id": "user_42",
  "metadata": { "source": "support" }  // optional, free-form
}
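
A client-side check can catch malformed write bodies before they reach the API. The sketch below assumes that "text" and "content" are mutually exclusive (the reference only says "OR") and that "type" must be one of the five memory types listed under Concepts; memory_payload is a hypothetical helper.

```python
MEMORY_TYPES = {"preference", "fact", "goal", "event", "relationship"}


def memory_payload(end_user_id, text=None, content=None,
                   memory_type=None, metadata=None):
    """Build a POST /v1/memories body, validating the documented rules."""
    # Assumption: exactly one of "text" / "content" may be sent.
    if (text is None) == (content is None):
        raise ValueError('pass exactly one of "text" or "content"')
    # Documented: "type" is required when sending a pre-formed memory.
    if content is not None and memory_type is None:
        raise ValueError('"type" is required when using "content"')
    if memory_type is not None and memory_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory type: {memory_type!r}")

    payload = {"end_user_id": end_user_id}
    if text is not None:
        payload["text"] = text
    else:
        payload["content"] = content
    if memory_type is not None:
        payload["type"] = memory_type
    if metadata is not None:
        payload["metadata"] = metadata
    return payload


payload = memory_payload(
    "user_42",
    content="Prefers Slack over email.",
    memory_type="preference",
    metadata={"source": "support"},
)
```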

GET /v1/memories

List memories. Supports filters.

GET /v1/memories?end_user_id=user_42&type=preference&limit=50
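
The filter URL above can be assembled safely with the standard library's urlencode, which also escapes ids containing reserved characters. list_memories_url is a hypothetical helper; only the three query parameters shown above are used.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.northcontext.dev"


def list_memories_url(end_user_id=None, memory_type=None, limit=None):
    """Build the GET /v1/memories URL, skipping filters that are not set."""
    params = {}
    if end_user_id is not None:
        params["end_user_id"] = end_user_id
    if memory_type is not None:
        params["type"] = memory_type
    if limit is not None:
        params["limit"] = limit
    query = urlencode(params)
    url = f"{BASE_URL}/v1/memories"
    return f"{url}?{query}" if query else url


url = list_memories_url("user_42", "preference", 50)
```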

DELETE /v1/memories/:id

Delete a single memory by id.

POST /v1/recall

Semantic search. Returns memories ranked by cosine similarity.

POST /v1/recall
{
  "query": "what does the user think about email",
  "end_user_id": "user_42",   // optional but strongly recommended
  "top_k": 5,
  "types": ["preference", "fact"]   // optional filter
}

→ 200 OK
{
  "matches": [
    { "id": "mem_...", "content": "...", "type": "preference", "score": 0.84 },
    ...
  ]
}
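
One way to inject recall results into a prompt is to render the matches as a short bulleted context block. The sketch below is illustrative only: format_context, the min_score cutoff, the header wording, and the sample matches (ids and contents) are all made up for the example.

```python
def format_context(matches, min_score=0.0):
    """Render recall matches as a prompt snippet, dropping weak matches."""
    lines = [
        f"- ({m['type']}) {m['content']}"
        for m in matches
        if m["score"] >= min_score
    ]
    if not lines:
        return ""
    return "Known about this user:\n" + "\n".join(lines)


# Hypothetical matches in the response shape shown above.
matches = [
    {"id": "mem_1", "content": "Prefers Slack over email.",
     "type": "preference", "score": 0.84},
    {"id": "mem_2", "content": "Priya is a senior PM at Acme.",
     "type": "fact", "score": 0.41},
]

context = format_context(matches, min_score=0.5)
```

A score cutoff is a judgment call; a sensible value depends on your data and is not specified by this reference.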

Embeddings & storage

All memories are embedded with Cloudflare's @cf/baai/bge-base-en-v1.5 (768 dim, cosine). Vectors live in Vectorize, namespaced per workspace. Structured rows (type, content, metadata, audit fields) live in D1. Raw source text goes to R2 for replay and audit.

Multi-tenancy

Every API key belongs to one workspace. All queries are filtered by workspace_id at the storage layer. You cannot see another workspace's data even by id, even by mistake.
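
The idea of filtering at the storage layer can be illustrated with a query that always binds workspace_id, so a lookup by id alone can never cross tenants. This is a sketch using Python's built-in sqlite3 with a made-up table and rows; the service's real D1 schema is not documented here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
db.execute(
    "CREATE TABLE memories ("
    "id TEXT PRIMARY KEY, workspace_id TEXT NOT NULL, content TEXT NOT NULL)"
)
db.executemany(
    "INSERT INTO memories VALUES (?, ?, ?)",
    [
        ("mem_1", "ws_a", "Prefers Slack over email."),
        ("mem_2", "ws_b", "Shipping new billing system by Q3."),
    ],
)


def get_memory(conn, workspace_id, memory_id):
    """Fetch a memory by id, but only inside the caller's workspace."""
    return conn.execute(
        "SELECT id, content FROM memories WHERE id = ? AND workspace_id = ?",
        (memory_id, workspace_id),
    ).fetchone()


own = get_memory(db, "ws_a", "mem_1")      # row from the caller's workspace
foreign = get_memory(db, "ws_a", "mem_2")  # belongs to ws_b, so no row
```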

Consolidation

A nightly worker scans recent memories, finds candidate duplicates and contradictions via vector similarity, and asks an LLM to merge or supersede. Each merge writes an audit row.
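
The candidate-finding step can be pictured as a pairwise cosine-similarity scan with a cutoff. The sketch below is a simplified stand-in, not the worker's actual code: the 0.9 threshold, the function names, and the tiny 2-dimensional vectors are assumptions, and the LLM merge/supersede step is left out.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def candidate_pairs(memories, threshold=0.9):
    """Return (id_a, id_b, score) for pairs at or above the threshold.

    memories is a list of (id, vector) tuples. The threshold is an
    assumed value; tune it against real data.
    """
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(memories, 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy vectors; real embeddings here would have 768 dimensions.
memories = [
    ("mem_a", [1.0, 0.0]),
    ("mem_b", [0.99, 0.1]),
    ("mem_c", [0.0, 1.0]),
]
pairs = candidate_pairs(memories)
```

A pairwise scan is quadratic in the number of memories; at scale, a nearest-neighbour query against the vector index per recent memory would be the more likely approach.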

Privacy

Email, phone, and card-number patterns are flagged on write. POST /v1/users/:id/forget removes everything for an end-user across D1, Vectorize, and R2 in a single API call. The call is synchronous on the active path; deletion in R2 may lag by a few seconds.
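
If you want to pre-screen text on the client before writing, pattern flagging might look roughly like this. The regular expressions below are deliberately loose illustrations and are assumptions; the patterns the service itself uses are not documented here.

```python
import re

# Illustrative patterns only; not the service's actual rules.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def flag_pii(text):
    """Return the set of pattern names that match anywhere in the text."""
    return {name for name, pattern in PII_PATTERNS.items() if pattern.search(text)}


flags = flag_pii("mail me at priya@example.com")
```

The phone and card patterns overlap (a long digit run can trigger both), which is acceptable for flagging but not for classification.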

Errors

401
missing or invalid Authorization header.
403
key revoked, or attempted cross-tenant access.
429
rate limited. see X-RateLimit-Remaining.
5xx
server error; we're crying. retries are safe: all writes are idempotent on idempotency_key, so resend the same key.
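
A retry loop with exponential backoff that reuses one idempotency_key across attempts could be sketched as follows. Several things here are assumptions: that idempotency_key travels as a body field (it might equally be a header), that 429 is worth retrying after a delay, and the delay schedule itself. The send callable is injected so the loop can be exercised without a network.

```python
import time
import uuid

# Assumption: retry on rate limiting and on any 5xx.
RETRYABLE = {429} | set(range(500, 600))


def send_with_retries(send, payload, max_attempts=4, base_delay=0.5,
                      sleep=time.sleep):
    """Call send(body) until it returns a non-retryable status.

    send takes the request body and returns (status_code, response).
    The same idempotency_key is used for every attempt.
    """
    body = dict(payload)
    # Assumption: the key is sent as a body field.
    body.setdefault("idempotency_key", str(uuid.uuid4()))
    status, response = None, None
    for attempt in range(max_attempts):
        status, response = send(body)
        if status not in RETRYABLE:
            break
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return status, response


# Stub sender for demonstration: fails twice, then succeeds.
seen_keys = []


def flaky_send(body):
    seen_keys.append(body["idempotency_key"])
    return (503, None) if len(seen_keys) < 3 else (200, {"ok": True})


status, response = send_with_retries(
    flaky_send, {"text": "..."}, sleep=lambda seconds: None
)
```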

Self-hosting? See the README in the repo for Cloudflare deployment in < 10 minutes.