Eric Turner 4c6b7609a1 docs: reframe as extensions + replace Signal & Noise artifact

Two changes, one commit:

1. Reframe "weaknesses" as "extensions memex adds":
   Karpathy's gist is a concept pitch, not an implementation. Reframe
   the seven places memex extends the pattern as engineering-layer
   additions rather than problems to fix. Cleaner narrative — memex
   builds on Karpathy's work instead of critiquing it.

   Touches README.md (Why each part exists + Credits) and
   DESIGN-RATIONALE.md (section titles, trade-off framing, biggest
   layer section, scope note at the end).

2. Replace docs/artifacts/signal-and-noise.html with the full
   upstream version:
   The earlier abbreviated copy dropped the MemPalace integration tab,
   the detailed mitigation steps with effort pips, the impact
   before/after cards, and the qmd vs ChromaDB comparison. This
   restores all of that. Also swaps self-references from "LLM Wiki"
   to "memex" while leaving external "LLM Wiki v2" community
   citations alone (those refer to a separate pattern and aren't ours
   to rename).

The live hosted copy at eric-turner.com/memex/signal-and-noise.html
has already been updated via scp — Hugo picks up static changes with
--poll 1s so the public URL reflects this file immediately.

2026-04-12 22:01:31 -06:00

memex — Compounding Knowledge for AI Agents

A persistent, LLM-maintained knowledge base that sits between you and the sources it was compiled from. Unlike RAG — which re-discovers the same answers on every query — memex gets richer over time. Facts get cross-referenced, contradictions get flagged, stale advice ages out and gets archived, and new knowledge discovered during a session gets written back so it's there next time.

The agent reads the wiki at the start of every session and updates it as new things are learned. The wiki is the long-term memory; the session is the working memory.

Why "memex"? In 1945, Vannevar Bush wrote "As We May Think", describing a hypothetical machine called the memex (a name often read as a blend of "memory" and "index", though Bush said only that he coined it at random) that would store and cross-reference a person's entire library of books, records, and communications, with "associative trails" linking related ideas. He imagined that someone using it would build up a personal knowledge web over a lifetime, and that the trails themselves, the network of learned associations, were more valuable than any individual document.

Eighty years later, LLMs finally make the memex buildable. The "associative trails" Bush imagined are the `related:` frontmatter fields and wikilinks the agent maintains. This repo is one attempt at that.

Inspiration: memex combines Andrej Karpathy's persistent-wiki gist and milla-jovovich/mempalace, and adds an automation layer on top so the wiki maintains itself.

The problem with stateless RAG

Most people's experience with LLMs and documents looks like RAG: you upload files, the LLM retrieves chunks at query time, generates an answer, done. This works — but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation.

Ask the same subtle question twice and the LLM does all the same work twice. Ask something that requires synthesizing five documents and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

Worse, raw sources go stale. URLs rot. Documentation lags. Blog posts get retracted. If your knowledge base is "the original documents," stale advice keeps showing up alongside current advice and there's no way to know which is which.

The core idea — a compounding wiki

Instead of retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources.

When a new source shows up (a doc page, a blog post, a CLI --help, a conversation transcript), the LLM doesn't just index it. It reads it, extracts what's load-bearing, and integrates it into the existing wiki — updating topic pages, revising summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

This is the key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything the LLM has read. The wiki gets richer with every source added and every question asked.

You never (or rarely) write the wiki yourself. The LLM writes and maintains all of it. You're in charge of sourcing, exploration, and asking the right questions. The LLM does the summarizing, cross-referencing, filing, and bookkeeping that make a knowledge base actually useful over time.

What this adds beyond Karpathy's gist

Karpathy's gist describes the idea — a wiki the agent maintains. This repo is a working implementation with an automation layer that handles the lifecycle of knowledge, not just its creation:

| Layer | What it does |
| --- | --- |
| Conversation mining | Extracts Claude Code session transcripts into searchable markdown. Summarizes them via `claude -p` with model routing (haiku for short sessions, sonnet for long ones). Links summaries to wiki pages by topic. |
| URL harvesting | Scans summarized conversations for external reference URLs. Fetches them via `trafilatura` → `crawl4ai` → stealth mode cascade. Compiles clean markdown into pending wiki pages. |
| Human-in-the-loop staging | Automated content lands in `staging/` with `status: pending`. You review via CLI, interactive prompts, or an in-session Claude review. Nothing automated goes live without approval. |
| Staleness decay | Every page tracks `last_verified`. After 6 months without a refresh signal, confidence decays `high → medium`; 9 months → `low`; 12 months → `stale` → auto-archived. |
| Auto-restoration | Archived pages that get referenced again in new conversations or wiki updates are automatically restored. |
| Hygiene | Daily structural checks (orphans, broken cross-refs, index drift, frontmatter repair). Weekly LLM-powered checks (duplicates, contradictions, missing cross-references). |
| Orchestrator | One script chains all of the above into a daily cron-able pipeline. |

The result: you don't have to maintain the wiki. You just use it. The automation handles harvesting new knowledge, retiring old knowledge, keeping cross-references intact, and flagging ambiguity for review.
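The staleness decay row above can be sketched in a few lines. This is a minimal illustration, not the code in `wiki-hygiene.py`: the function names are hypothetical, and it assumes that age acts as a ceiling on confidence (a page already at `low` is not raised back to `medium`) and that "months" means whole calendar months since `last_verified`.

```python
from datetime import date

# Ordered from least to most trusted.
LEVELS = ["stale", "low", "medium", "high"]


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end (never negative)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not complete yet
    return max(months, 0)


def decayed_confidence(current: str, last_verified: date, today: date) -> str:
    """Cap `current` at the ceiling allowed by the page's age (hypothetical helper)."""
    age = months_between(last_verified, today)
    if age >= 12:
        ceiling = "stale"
    elif age >= 9:
        ceiling = "low"
    elif age >= 6:
        ceiling = "medium"
    else:
        ceiling = "high"
    # Assumption: decay only ever lowers confidence, never raises it.
    return min(current, ceiling, key=LEVELS.index)


if __name__ == "__main__":
    today = date(2026, 4, 12)
    for verified in (date(2026, 1, 1), date(2025, 10, 1), date(2025, 7, 1), date(2025, 4, 1)):
        print(verified, decayed_confidence("high", verified, today))
```

Treating age as a ceiling keeps the rule idempotent: running the check every day gives the same answer until the next threshold is crossed or `last_verified` is refreshed.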

How memex extends Karpathy's pattern

Before implementing anything, the design was worked out interactively with Claude as a structured Signal & Noise analysis. Karpathy's original gist is a concept pitch, not an implementation — he was explicit that he was sharing an "idea file" for others to build on. memex is one attempt at that build-out. The analysis identified seven places where the core idea needed an engineering layer to become practical day-to-day, and every automation component in this repo maps to one of those extensions:

| What memex adds | How it works |
| --- | --- |
| Time-decaying confidence: pages earn trust through reinforcement and fade without it | `confidence` field + `last_verified`, 6/9/12 month decay thresholds, auto-archive. Full-mode hygiene also adds LLM contradiction detection across pages. |
| Scalable search beyond the context window | `qmd` (BM25 + vector + LLM re-ranking) from day one, with three collections (`wiki` / `wiki-archive` / `wiki-conversations`) so queries can route to the right surface. |
| Traceable sources for every claim | Every compiled page traces back to an immutable `raw/harvested/*.md` file with a SHA-256 content hash. Staging review is the built-in cross-check, and `compilation_notes` makes review fast. |
| Continuous feed without manual discipline | Daily + weekly cron chains extract → summarize → harvest → hygiene → reindex. `last_verified` auto-refreshes from new conversation references; decayed pages auto-archive and auto-restore when referenced again. |
| Human-in-the-loop staging for automated content | Every automated page lands in `staging/` first with `origin: automated`, `status: pending`. Nothing bypasses human review: after one promotion step the page is in the live wiki with `last_verified` set. |
| Hybrid retrieval: structural navigation + semantic search | Wings/rooms/halls (borrowed from mempalace) give structural filtering that narrows the search space before qmd's hybrid BM25 + vector pass runs. Full-mode hygiene also auto-adds missing cross-references. |
| Cross-machine git sync for collaborative knowledge bases | `.gitattributes` with `merge=union` on markdown so concurrent writes on different machines merge additively. Harvest and hygiene state files sync across machines so both agree on what's been processed. |

The short version: Karpathy shared the idea, milla-jovovich's mempalace added the structural memory taxonomy, and memex is the automation layer that lets the whole thing run day-to-day without constant human maintenance. See docs/DESIGN-RATIONALE.md for the longer rationale on each extension, plus honest notes on what memex doesn't cover.
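To make the "traceable sources" row concrete, here is a rough sketch of how a SHA-256 content hash can detect that a harvested source file has changed since a page was compiled. The helper names are hypothetical, and where memex actually stores the recorded hash is not shown here; only the hashing itself uses real standard-library calls.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_unchanged(path: Path, recorded_hash: str) -> bool:
    """True if the file still matches the hash recorded at compile time."""
    return content_hash(path) == recorded_hash


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "example.md"  # stand-in for a raw/harvested/*.md file
        source.write_bytes(b"# Example\n")
        recorded = content_hash(source)
        print(source_unchanged(source, recorded))  # file untouched since hashing
        source.write_bytes(b"# Example (edited)\n")
        print(source_unchanged(source, recorded))  # file modified after hashing
```

Because `raw/` is meant to be immutable, a mismatch is a signal that something modified a source in place and the pages compiled from it deserve another look.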

Compounding loop

┌─────────────────────┐
│  Claude Code        │
│  sessions (.jsonl)  │
└──────────┬──────────┘
           │ extract-sessions.py (hourly, no LLM)
           ▼
┌─────────────────────┐
│  conversations/     │  markdown transcripts
│  <project>/*.md     │  (status: extracted)
└──────────┬──────────┘
           │ summarize-conversations.py --claude (daily)
           ▼
┌─────────────────────┐
│  conversations/     │  summaries with related: wiki links
│  <project>/*.md     │  (status: summarized)
└──────────┬──────────┘
           │ wiki-harvest.py (daily)
           ▼
┌─────────────────────┐
│  raw/harvested/     │  fetched URL content
│  *.md               │  (immutable source material)
└──────────┬──────────┘
           │ claude -p compile step
           ▼
┌─────────────────────┐
│  staging/<type>/    │  pending pages
│  *.md               │  (status: pending, origin: automated)
└──────────┬──────────┘
           │ human review (wiki-staging.py --review)
           ▼
┌─────────────────────┐
│  patterns/          │  LIVE wiki
│  decisions/         │  (origin: manual or promoted-from-automated)
│  concepts/          │
│  environments/      │
└──────────┬──────────┘
           │ wiki-hygiene.py (daily quick / weekly full)
           │ - refresh last_verified from new conversations
           │ - decay confidence on idle pages
           │ - auto-restore archived pages referenced again
           │ - fuzzy-fix broken cross-references
           ▼
┌─────────────────────┐
│  archive/<type>/    │  stale/superseded content
│  *.md               │  (excluded from default search)
└─────────────────────┘

Every arrow except staging review is automated. That one human step is quick because the AI compilation step has already written the page; you just approve or reject.
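The staging-to-live arrow in the diagram amounts to a path change plus a frontmatter update. The sketch below is an illustration under assumptions, not what `wiki-staging.py` does: the function names are hypothetical, frontmatter is modeled as a plain dict, and dropping the `status` key on promotion is a guess (the source only says that promotion sets `last_verified`).

```python
from datetime import date
from pathlib import PurePosixPath


def live_path(staging_path: str) -> str:
    """Map staging/<type>/<page>.md to <type>/<page>.md."""
    parts = PurePosixPath(staging_path).parts
    if len(parts) < 3 or parts[0] != "staging":
        raise ValueError(f"not a staged page: {staging_path}")
    return str(PurePosixPath(*parts[1:]))


def promote(frontmatter: dict, today: date) -> dict:
    """Return a copy of the frontmatter as it might look after approval."""
    if frontmatter.get("status") != "pending":
        raise ValueError("only pending pages can be promoted")
    promoted = dict(frontmatter)
    promoted.pop("status")  # assumption: live pages carry no pending marker
    promoted["last_verified"] = today.isoformat()  # approval counts as verification
    return promoted


if __name__ == "__main__":
    staged = {"title": "Example", "origin": "automated", "status": "pending"}
    print(live_path("staging/patterns/example.md"))
    print(promote(staged, date(2026, 4, 12)))
```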

Quick start — two paths

Path A: just the idea (Karpathy-style)

Open a Claude Code session in an empty directory and tell it:

I want you to start maintaining a persistent knowledge wiki for me.
Create a directory structure with patterns/, decisions/, concepts/, and
environments/ subdirectories. Each page should have YAML frontmatter with
title, type, confidence, sources, related, last_compiled, and last_verified
fields. Create an index.md at the root that catalogs every page.

From now on, when I share a source (a doc page, a CLI --help, a conversation
I had), read it, extract what's load-bearing, and integrate it into the
wiki. Update existing pages when new knowledge refines them. Flag
contradictions between pages. Create new pages when topics aren't
covered yet. Update index.md every time you create or remove a page.

When I ask a question, read the relevant wiki pages first, then answer.
If you rely on a wiki page with `confidence: low`, flag that to me.

That's the whole idea. The agent will build you a growing markdown tree that compounds over time. This is the minimum viable version.
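If you want a quick sanity check on what the agent produces in Path A, something like the following can confirm that a page carries the frontmatter fields named in the prompt. It is a deliberately small sketch: the function names and the sample page values are hypothetical, and it only looks at top-level keys rather than parsing YAML properly.

```python
REQUIRED_FIELDS = (
    "title", "type", "confidence", "sources",
    "related", "last_compiled", "last_verified",
)


def frontmatter_keys(text: str) -> set[str]:
    """Collect top-level keys from a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Skip blank lines, comments, indented values, and list items.
        if line[:1] not in ("", " ", "\t", "-", "#") and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing delimiter: treat the page as having no frontmatter


def missing_fields(text: str) -> list[str]:
    """List required fields that the page's frontmatter does not define."""
    keys = frontmatter_keys(text)
    return [field for field in REQUIRED_FIELDS if field not in keys]


SAMPLE_PAGE = """---
title: Example pattern
type: pattern
confidence: medium
sources:
  - raw/harvested/example.md
related: []
last_compiled: 2026-04-12
---
Body text.
"""

if __name__ == "__main__":
    print(missing_fields(SAMPLE_PAGE))
```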

Path B: the full automation (this repo)

git clone <this-repo> ~/projects/wiki
cd ~/projects/wiki

# Install the Python extraction tools
pipx install trafilatura
pipx install crawl4ai && crawl4ai-setup

# Install qmd for full-text + vector search
npm install -g @tobilu/qmd

# Configure qmd (3 collections — see docs/SETUP.md for the YAML)
# Edit scripts/extract-sessions.py with your project codes
# Edit scripts/update-conversation-index.py with matching display names

# Copy the example CLAUDE.md files (wiki schema + global instructions)
cp docs/examples/wiki-CLAUDE.md CLAUDE.md
cat docs/examples/global-CLAUDE.md >> ~/.claude/CLAUDE.md
# edit both for your conventions

# Run the full pipeline once, manually
bash scripts/mine-conversations.sh --extract-only     # Fast, no LLM
python3 scripts/summarize-conversations.py --claude   # Classify + summarize
python3 scripts/update-conversation-index.py --reindex

# Then maintain
bash scripts/wiki-maintain.sh                         # Daily hygiene
bash scripts/wiki-maintain.sh --hygiene-only --full   # Weekly deep pass

See docs/SETUP.md for complete setup including qmd configuration (three collections: wiki, wiki-archive, wiki-conversations), optional cron schedules, git sync, and the post-merge hook. See docs/examples/ for starter CLAUDE.md files (wiki schema + global instructions) with explicit guidance on using the three qmd collections.

Directory layout after setup

wiki/
├── CLAUDE.md                  ← Schema + instructions the agent reads every session
├── index.md                   ← Content catalog (the agent reads this first)
├── patterns/                  ← HOW things should be built (LIVE)
├── decisions/                 ← WHY we chose this approach (LIVE)
├── concepts/                  ← WHAT the foundational ideas are (LIVE)
├── environments/              ← WHERE implementations differ (LIVE)
├── staging/                   ← PENDING automated content awaiting review
│   ├── index.md
│   └── <type>/
├── archive/                   ← STALE / superseded (excluded from search)
│   ├── index.md
│   └── <type>/
├── raw/                       ← Immutable source material (never modified)
│   ├── <topic>/
│   └── harvested/             ← URL harvester output
├── conversations/             ← Mined Claude Code session transcripts
│   ├── index.md
│   └── <project>/
├── context/                   ← Auto-updated AI session briefing
│   ├── wake-up.md             ← Loaded at the start of every session
│   └── active-concerns.md     ← Current blockers and focus areas
├── reports/                   ← Hygiene operation logs
├── scripts/                   ← The automation pipeline
├── tests/                     ← Pytest suite (171 tests)
├── .harvest-state.json        ← URL dedup state (committed, synced)
├── .hygiene-state.json        ← Content hashes, deferred issues (committed, synced)
└── .mine-state.json           ← Conversation extraction offsets (gitignored, per-machine)

What's Claude-specific (and what isn't)

This repo is built around Claude Code as the agent. Specifically:

1. Session mining expects `~/.claude/projects/<hashed-path>/*.jsonl` files written by the Claude Code CLI. Other agents won't produce these.
2. Summarization uses `claude -p` (the Claude Code CLI's one-shot mode) with haiku/sonnet routing by conversation length. Other LLM CLIs would need a different wrapper.
3. URL compilation uses `claude -p` to turn raw harvested content into a wiki page with proper frontmatter.
4. The agent itself (the thing that reads `CLAUDE.md` and maintains the wiki conversationally) is Claude Code. Any agent that reads markdown and can write files could do this job; `CLAUDE.md` is just a text file telling the agent what the wiki's conventions are.

What's NOT Claude-specific:

The wiki schema (frontmatter, directory layout, lifecycle states)
The staleness decay model and archive/restore semantics
The human-in-the-loop staging workflow
The hygiene checks (orphans, broken cross-refs, duplicates)
The trafilatura + crawl4ai URL fetching
The qmd search integration
The git-based cross-machine sync

If you use a different agent, you replace parts 1-4 above with equivalents for your agent. The other 80% of the repo is agent-agnostic. See docs/CUSTOMIZE.md for concrete adaptation recipes.
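For a sense of how small the Claude-specific summarization wrapper can be, here is a sketch of length-based model routing. Everything specific in it is an assumption: the function names, the character threshold, and the `--model` flag with `haiku`/`sonnet` aliases should all be checked against your installed CLI and against `summarize-conversations.py`. The command is only built, never executed.

```python
def pick_model(transcript: str, long_threshold_chars: int = 20_000) -> str:
    """Route short transcripts to the cheaper model (threshold is hypothetical)."""
    return "sonnet" if len(transcript) > long_threshold_chars else "haiku"


def build_command(prompt: str, model: str) -> list[str]:
    """Build an argv list for a one-shot call.

    Assumption: the CLI accepts `--model <alias>` alongside `-p`; verify
    against `claude --help` before relying on it.
    """
    return ["claude", "-p", prompt, "--model", model]


if __name__ == "__main__":
    transcript = "user: how do I rebase?\nassistant: ..."
    model = pick_model(transcript)
    print(build_command("Summarize this session:\n" + transcript, model))
```

Swapping agents then mostly means replacing `build_command` with the equivalent invocation for another CLI while leaving the routing decision alone.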

Architecture at a glance

Ten scripts organized in three layers:

Mining layer (ingests conversations):

extract-sessions.py — Parse Claude Code JSONL → markdown transcripts
summarize-conversations.py — Classify + summarize via claude -p
update-conversation-index.py — Regenerate conversation index + wake-up context

Automation layer (maintains the wiki):

wiki_lib.py — Shared frontmatter parser, WikiPage dataclass, constants
wiki-harvest.py — URL classification + fetch cascade + compile to staging
wiki-staging.py — Human review (list/promote/reject/review/sync)
wiki-hygiene.py — Quick + full hygiene checks, archival, auto-restore
wiki-maintain.sh — Top-level orchestrator chaining harvest + hygiene

Sync layer:

wiki-sync.sh — Git commit/pull/push with merge-union markdown handling
mine-conversations.sh — Mining orchestrator

See docs/ARCHITECTURE.md for a deeper tour.
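As one example of what a structural hygiene check involves, the sketch below finds wikilinks that point at pages that do not exist and proposes a fuzzy fix. It is an illustration only: it assumes `[[page-name]]` style links and an in-memory mapping of page names to bodies, the function names are hypothetical, and the real `wiki-hygiene.py` may work quite differently.

```python
import difflib
import re

# Capture the link target, stopping at a closing bracket, alias pipe, or anchor.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def broken_links(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each page name to the link targets in its body that match no page."""
    known = set(pages)
    broken = {}
    for name, body in pages.items():
        missing = [t.strip() for t in WIKILINK.findall(body) if t.strip() not in known]
        if missing:
            broken[name] = missing
    return broken


def suggest_fix(target: str, known: list[str], cutoff: float = 0.8):
    """Return the closest existing page name, or None if nothing is close enough."""
    matches = difflib.get_close_matches(target, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    pages = {
        "docker-networking": "Bridge vs host modes. See [[git-sync]].",
        "git-sync": "Related: [[docker-netwroking]] and [[unknown-topic]].",
    }
    for page, targets in broken_links(pages).items():
        for target in targets:
            print(page, target, suggest_fix(target, sorted(pages)))
```

A fairly high cutoff keeps the fixer conservative: near-miss typos get repaired, while links to pages that genuinely do not exist are left for review.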

Why markdown, not a real database?

Markdown files are:

Human-readable without any tooling — you can browse in Obsidian, VS Code, or cat
Git-native — full history, branching, rollback, cross-machine sync for free
Agent-friendly — every LLM was trained on markdown, so reading and writing it is free
Durable — no schema migrations, no database corruption, no vendor lock-in
Interoperable — Obsidian graph view, grep, qmd, ripgrep, any editor

A SQLite file with the same content would be faster to query but harder to browse, harder to merge, harder to audit, and fundamentally less collaborative between you and the agent. Markdown is to knowledge management what Postgres is to transactions.

Testing

Full pytest suite in tests/ — 171 tests across all scripts, runs in ~1.3 seconds, no network or LLM calls needed, works on macOS and Linux/WSL.

cd tests && python3 -m pytest
# or
bash tests/run.sh

The test suite uses a disposable tmp_wiki fixture so no test ever touches your real wiki.

Credits and inspiration

This repo is a synthesis of two existing ideas with an automation layer on top. It would not exist without either of them.

Core pattern: Andrej Karpathy's "Agent-Maintained Persistent Wiki" gist. The foundational idea of a compounding LLM-maintained wiki that moves synthesis from query-time (RAG) to ingest-time. memex is an implementation of Karpathy's pattern with the engineering layer that turns the concept into something practical to run day-to-day.

Structural memory taxonomy: milla-jovovich/mempalace. The wing/room/hall/closet/drawer/tunnel concepts that turn a flat corpus into something you can navigate without reading everything. See ARCHITECTURE.md#borrowed-concepts for the explicit mapping of MemPalace terms to this repo's implementation.

Search layer — qmd by Tobi Lütke (Shopify CEO). Local BM25 + vector + LLM re-ranking on markdown files. Chosen over ChromaDB because it uses the same storage format as the wiki — one index to maintain, not two. Explicitly recommended by Karpathy as well.

URL extraction stack — trafilatura for fast static-page extraction and crawl4ai for JS-rendered and anti-bot cases. The two-tool cascade handles essentially any web content without needing a full browser stack for simple pages.
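The cascade idea is simple enough to sketch without touching either library: try the cheapest fetcher first and fall through when it fails or returns too little text. The code below is a generic illustration with stand-in fetchers; it does not call trafilatura or crawl4ai, and the names, the `min_chars` threshold, and the "too little text means failure" rule are all assumptions rather than what `wiki-harvest.py` does.

```python
from typing import Callable, Iterable, Optional

Fetcher = Callable[[str], Optional[str]]


def fetch_with_cascade(
    url: str,
    fetchers: Iterable[tuple[str, Fetcher]],
    min_chars: int = 200,
) -> Optional[tuple[str, str]]:
    """Return (fetcher_name, text) from the first fetcher that yields enough text."""
    for name, fetch in fetchers:
        try:
            text = fetch(url)
        except Exception:
            continue  # a crashing fetcher just hands off to the next tier
        if text and len(text.strip()) >= min_chars:
            return name, text
    return None  # every tier failed; the caller decides what to record


# Stand-in fetchers for the demo. Real ones would wrap the extraction tools.
def static_fetch(url: str) -> Optional[str]:
    raise RuntimeError("simulated failure on a JS-heavy page")


def rendered_fetch(url: str) -> Optional[str]:
    return "Rendered page text. " * 20


if __name__ == "__main__":
    result = fetch_with_cascade(
        "https://example.com/docs",
        [("static", static_fetch), ("rendered", rendered_fetch)],
    )
    print(result[0] if result else "all fetchers failed")
```

Keeping the tiers as an ordered list of callables makes it easy to add a third tier, such as a stealth mode, without changing the loop.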

The agent: Claude Code by Anthropic. Parts of the repo are Claude-specific (see "What's Claude-specific (and what isn't)" above for which parts and how to adapt them for other agents).

Design process — memex was designed interactively with Claude as a structured Signal & Noise analysis before any code was written. The analysis walks through the seven real strengths of Karpathy's pattern and seven places where it needs an engineering layer to be practical, and works through the concrete extension for each. Every component in this repo maps back to a specific extension identified there.

Live interactive version: eric-turner.com/memex/signal-and-noise.html — click tabs to explore pros/cons, vs RAG, use-case fits, signal breakdown, and mitigations
Self-contained archive in this repo: docs/artifacts/signal-and-noise.html — download and open locally; works offline
Condensed written version: docs/DESIGN-RATIONALE.md — every tradeoff and mitigation rendered as prose

License

MIT — see LICENSE.

Contributing

This is a personal project that I'm making public in case the pattern is useful to others. Issues and PRs welcome, but I make no promises about response time. If you fork and make it your own, I'd love to hear how you adapted it.
