Base URL: https://api.northcontext.dev · All requests use Authorization: Bearer <key>.
Memory types: preference, fact, goal, event, or relationship.
Run extraction without storing. Useful for previews.
POST /v1/extract
{
  "text": "Just got off a call with Priya..."
}
→ 200 OK
{
  "extracted": [
    { "type": "fact", "content": "Priya is a senior PM at Acme." },
    { "type": "preference", "content": "Prefers Slack over email." },
    { "type": "goal", "content": "Shipping new billing system by Q3." }
  ]
}
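A minimal Python sketch of the call above, using only the standard library. The helper name build_extract_request is illustrative and not part of the API, and the Content-Type header is an assumption since this page does not state it. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.northcontext.dev"  # base URL from this page


def build_extract_request(api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/extract request."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/extract",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed; not stated on this page
        },
    )


req = build_extract_request("YOUR_API_KEY", "Just got off a call with Priya...")
# To send it and read the extracted items:
# with urllib.request.urlopen(req) as resp:
#     extracted = json.load(resp)["extracted"]
```

Nothing is stored by this endpoint, so the same text can be previewed repeatedly before committing it with POST /v1/memories.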
Extract, embed, and store. The default end-to-end write.
POST /v1/memories
{
  "text": "...",                      // raw text → extracted into multiple memories, OR
  "content": "...",                   // pre-formed memory
  "type": "fact",                     // required only when using "content"
  "end_user_id": "user_42",
  "metadata": { "source": "support" } // optional, free-form
}
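The two ways of writing a memory can be sketched as a small payload builder. This is a hypothetical client-side helper, not part of the API. It assumes "text" and "content" are mutually exclusive, which is one reading of the "OR" in the body above. The server may validate differently.

```python
from typing import Any, Optional


def build_memory_payload(
    end_user_id: str,
    text: Optional[str] = None,
    content: Optional[str] = None,
    memory_type: Optional[str] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Build the JSON body for POST /v1/memories (hypothetical helper)."""
    # Assumption: "text" and "content" are alternatives, so exactly one is set.
    if (text is None) == (content is None):
        raise ValueError('pass exactly one of "text" or "content"')
    # A pre-formed memory must carry its own type.
    if content is not None and memory_type is None:
        raise ValueError('"type" is required when using "content"')

    payload: dict[str, Any] = {"end_user_id": end_user_id}
    if text is not None:
        payload["text"] = text
    else:
        payload["content"] = content
    if memory_type is not None:
        payload["type"] = memory_type
    if metadata is not None:
        payload["metadata"] = metadata
    return payload


preformed = build_memory_payload(
    "user_42",
    content="Prefers Slack over email.",
    memory_type="preference",
    metadata={"source": "support"},
)
```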
List memories. Supports filters.
GET /v1/memories?end_user_id=user_42&type=preference&limit=50
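A short sketch of building that filtered URL in Python. The function name list_memories_url is illustrative; only the three query parameters shown above are used, and omitted filters are simply left out of the query string.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.northcontext.dev"  # base URL from this page


def list_memories_url(
    end_user_id: str,
    memory_type: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Return the GET /v1/memories URL with only the filters that were given."""
    params = {"end_user_id": end_user_id}
    if memory_type is not None:
        params["type"] = memory_type
    if limit is not None:
        params["limit"] = str(limit)
    # urlencode percent-escapes values and keeps insertion order.
    return f"{BASE_URL}/v1/memories?{urlencode(params)}"


url = list_memories_url("user_42", memory_type="preference", limit=50)
```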
Semantic search. Returns memories ranked by cosine similarity.
POST /v1/recall
{
  "query": "what does the user think about email",
  "end_user_id": "user_42",        // optional but strongly recommended
  "top_k": 5,
  "types": ["preference", "fact"]  // optional filter
}
→ 200 OK
{
  "matches": [
    { "id": "mem_...", "content": "...", "type": "preference", "score": 0.84 },
    ...
  ]
}
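To make the "score" field concrete, here is a toy illustration of ranking by cosine similarity. It uses 2-dimensional vectors in place of real embeddings, and the function names are made up for this sketch. The service does this ranking server-side, so this is not its implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_matches(query_vec, memories, top_k=5):
    """Score every memory against the query and keep the top_k highest."""
    scored = [
        {"id": m["id"], "score": cosine_similarity(query_vec, m["vector"])}
        for m in memories
    ]
    scored.sort(key=lambda m: m["score"], reverse=True)
    return scored[:top_k]


# Toy data: ids and vectors are invented for illustration.
toy_memories = [
    {"id": "mem_a", "vector": [1.0, 0.0]},
    {"id": "mem_b", "vector": [0.0, 1.0]},
    {"id": "mem_c", "vector": [1.0, 1.0]},
]
matches = rank_matches([1.0, 0.0], toy_memories, top_k=2)
```

A score near 1 means the memory points in almost the same direction as the query embedding; a score near 0 means the two are unrelated.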
All memories are embedded with Cloudflare's @cf/baai/bge-base-en-v1.5
(768 dim, cosine). Vectors live in Vectorize, namespaced per workspace.
Structured rows (type, content, metadata, audit fields) live in D1.
Raw source text goes to R2 for replay and audit.
Every API key belongs to one workspace. All queries are filtered by
workspace_id at the storage layer. You cannot see another
workspace's data even by id, even by mistake.
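The isolation rule above can be illustrated with an in-memory SQLite database (D1 is SQLite-based). The table layout and the get_memory helper are hypothetical and chosen only for this sketch. The point is that every lookup carries the workspace_id predicate, so a valid id from another workspace returns nothing.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real D1 tables are not documented on this page.
conn.execute(
    "CREATE TABLE memories ("
    "id TEXT PRIMARY KEY, workspace_id TEXT NOT NULL, content TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?)",
    [
        ("mem_1", "ws_a", "Prefers Slack over email."),
        ("mem_2", "ws_b", "Priya is a senior PM at Acme."),
    ],
)


def get_memory(db: sqlite3.Connection, workspace_id: str, memory_id: str) -> Optional[tuple]:
    """Fetch a memory by id, always scoped to the caller's workspace."""
    return db.execute(
        "SELECT id, content FROM memories WHERE id = ? AND workspace_id = ?",
        (memory_id, workspace_id),
    ).fetchone()


own = get_memory(conn, "ws_a", "mem_1")      # row belongs to ws_a
foreign = get_memory(conn, "ws_a", "mem_2")  # exists, but in ws_b
```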
A nightly worker scans recent memories, finds candidate duplicates and contradictions via vector similarity, and asks an LLM to merge or supersede them. Each merge writes an audit row.
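The candidate-finding step can be sketched as a pairwise similarity check. The 0.9 threshold, the function names, and the brute-force loop over all pairs are assumptions for illustration. The LLM merge step and the audit write are left out.

```python
import math
from itertools import combinations

SIMILARITY_THRESHOLD = 0.9  # assumed cutoff; the real value is not documented here


def _cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def candidate_pairs(vectors_by_id, threshold=SIMILARITY_THRESHOLD):
    """Return id pairs whose embeddings are similar enough to review."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors_by_id.items(), 2):
        if _cosine(vec_a, vec_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Toy 2-dimensional vectors; ids are invented for illustration.
recent = {
    "mem_1": [1.0, 0.0],
    "mem_2": [0.99, 0.1],  # nearly parallel to mem_1
    "mem_3": [0.0, 1.0],
}
to_review = candidate_pairs(recent)
```

High similarity only nominates a pair. Deciding whether two memories are duplicates or contradict each other is the part delegated to the LLM.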
Email / phone / card patterns are flagged on write. POST /v1/users/:id/forget
removes everything for an end-user across D1, Vectorize, and R2 within a
single API call (synchronous on the active path; deletion in R2 may lag a few seconds).
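The flag-on-write check could look something like the following. The regular expressions are deliberately simple illustrations, not the patterns the service uses, and flag_pii is a hypothetical name. Real detectors usually add checks such as Luhn validation for card numbers.

```python
import re

# Illustrative patterns only; they will produce false positives and negatives.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def flag_pii(text: str) -> set:
    """Return the labels of every pattern that matches somewhere in text."""
    return {label for label, pattern in PII_PATTERNS.items() if pattern.search(text)}


flags = flag_pii("Reach me at priya@example.com")
```

A long digit run such as a card number also satisfies the loose phone pattern, so the result is best treated as a set of hints for review.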
Authorization header.
X-RateLimit-Remaining.
idempotency_key.
Self-hosting? See the README in the repo for Cloudflare deployment in under 10 minutes.