Cortex · Part 3

No vector DB. No embeddings. No RAG.

Tell anyone you're building memory for an AI agent and they reach for a vector database. I didn't. Markdown files, a git repo, and grep — and here's why.

4 min read read on linkedin
No vector DB. No embeddings. No RAG.

Part 3 of a series on giving AI coding agents shared organizational memory. Previously: Part 1 — Your AI coding agent has amnesia · Part 2 — I was measuring the wrong thing.

Tell someone you’re building memory for an AI coding agent and watch what they say. “So, embeddings, right? What vector DB?” Say memory, everyone reaches for RAG.

I didn’t. Markdown files, a git repo, and grep. Nothing you don’t already have installed. I called it Cortex. Half expected it to bite me, because it sounds insufficient. It doesn’t.

The interesting question is why. Walking the landscape first.

Static project files

CLAUDE.md. Or the auto-generated MEMORY.md if you’re on a newer Claude Code. Per-project, capped around 200 lines before adherence drops. Touch three repos in a day, get three disconnected context files. Three separate Post-its is not org memory.

Where the serious competition lives. claude-mem is the headliner. Automatic session capture, AI compression, local SQLite and vector retrieval. Real engineering, deserved popularity.

Two things make it the wrong shape for what I wanted.

Philosophical first. It captures what happened, not what is. Part 2 showed the correctness gap isn’t missing sessions. It’s missing decisions. A distilled note that says “we use httpx against a running container, not TestClient, because we need container startup assertions” is denser, more durable, and more retrievable than the tool-call trace that led someone there. Compression can reconstruct events. It can’t reconstruct consensus.

Operational second. claude-mem has a folder-context feature (opt-in) that writes a CLAUDE.md into every project subdirectory it observes. Recent Activity sections, observation IDs, token counts. Which leaves you two choices: commit those files and pollute every repo with machine-generated logs, or gitignore them and your memory is per-machine instead of team-shared. The opt-in status doesn’t change the tradeoff, just who opted in.

Supermemory and similar. Solve the hosting problem in exchange for a subscription and a cloud dependency. Different failure mode, same outcome: your org’s memory lives somewhere that isn’t your org.

Self-hosted memory infrastructure

Mem0, Obsidian plus vector DB. Technically the most sophisticated. Operationally a non-starter. A tool 20% better that requires Postgres, Neo4j, and an embedding pipeline gets skipped. Adoption is a one-way ratchet. You don’t get a second install.

Memory system landscape: static files, session loggers, paid SaaS, self-hosted infra, Cortex

None of these gave me what I wanted: team-shareable, low-friction install, durability over compression. So Cortex went simpler.

What Cortex is, instead

Markdown files, YAML frontmatter, git, grep. That’s the stack.

Three things make it work.

Grep on frontmatter is precise. Every note carries multi-dimensional tags: service, tool, pattern, team, domain. One ripgrep call returns every decision involving Datadog. Pipe it to another and you get the intersection with a monitoring pattern. Structured queries, not similarity guesses. Good metadata does what embeddings would, deterministically, and the query is something anyone on the team can read.

A human decides what to remember, at write-time. /memorize distills a session into a structured note. Most of the session doesn’t survive the distill, and that’s the feature. claude-mem inverts this: capture everything, trust the retriever. Different problems, different answers. For architectural memory, curation beats compression.

Git gives you the rest for free. Authorship, versioning, PRs, review, blame. The curator agent (later post) reviews memory PRs like any other code change. Because that’s exactly what they are. When scale eventually breaks grep, SQLite FTS5 sits next to the markdown. No rewrite.

Postscript

Karpathy posted about flat-markdown LLM knowledge bases the day before our hackathon demo. No RAG at small scale, agent-maintained wiki. Same architecture, different angle. Interesting part isn’t the validation. It’s that two people coming at this from opposite directions landed in the same place within 24 hours of each other. Suggests the place isn’t accidental.

Sticks and stones beat infrastructure at this scale. Two files run everything. Next post: what’s in them, and why they’re the constitution the rest answers to.