Design Rationale — Signal & Noise
Why each part of this repo exists. This is the "why" document; the other docs are the "what" and "how."
Before implementing anything, the design was worked out interactively with Claude as a structured Signal & Noise analysis of Andrej Karpathy's original persistent-wiki pattern:
- Interactive version: eric-turner.com/memex/signal-and-noise.html (tabs for pros/cons, vs RAG, use-case fits, signal breakdown, mitigations)
- Self-contained archive: `artifacts/signal-and-noise.html` (same content, works offline)
The analysis walks through the pattern's seven genuine strengths, seven places where it needs an engineering layer to be practical, and the concrete extension for each. memex is the implementation of those extensions. If you want to understand why a component exists, the interactive version has the longer-form argument; this document is the condensed written version.
Where the pattern is genuinely strong
The analysis found seven strengths that hold up under scrutiny. This repo preserves all of them:
| Strength | How this repo keeps it |
|---|---|
| Knowledge compounds over time | Every ingest adds to the existing wiki rather than restarting; conversation mining and URL harvesting continuously feed new material in |
| Zero maintenance burden on humans | Cron-driven harvest + hygiene; the only manual step is staging review, and that's fast because the AI already compiled the page |
| Token-efficient at personal scale | index.md fits in context; qmd kicks in only at 50+ articles; the wake-up briefing is ~200 tokens |
| Human-readable & auditable | Plain markdown everywhere; every cross-reference is visible; git history shows every change |
| Future-proof & portable | No vendor lock-in; you can point any agent at the same tree tomorrow |
| Self-healing via lint passes | wiki-hygiene.py runs quick checks daily and full (LLM) checks weekly |
| Path to fine-tuning | Wiki pages are high-quality synthetic training data once purified through hygiene |
Where memex extends the pattern
Karpathy's gist is a concept pitch. He was explicit that he was sharing an "idea file" for others to build on, not publishing a working implementation. The analysis identified seven places where the core idea needs an engineering layer to become practical day-to-day — five have first-class answers in memex, and two remain scoped-out trade-offs that the architecture cleanly acknowledges.
1. Claim freshness and reversibility
The gap: Unlike RAG — where a hallucination is ephemeral and the next query starts clean — an LLM-maintained wiki is stateful. If a claim is wrong at ingest time, it stays wrong until something corrects it. For the pattern to work long-term, claims need a way to earn trust over time and lose it when unused.
How memex extends it:
- `confidence` field: every page carries `high`/`medium`/`low` with decay based on `last_verified`. Wrong claims aren't treated as permanent; they age out visibly.
- Archive + restore: decayed pages get moved to `archive/`, where they're excluded from default search. If they get referenced again they're auto-restored with `confidence: medium` (never straight to `high`; they have to re-earn trust).
- Raw harvested material is immutable: `raw/harvested/*.md` files are the ground truth. Every compiled wiki page can be traced back to its source via the `sources:` frontmatter field.
- Full-mode contradiction detection: `wiki-hygiene.py --full` uses sonnet to find conflicting claims across pages. Report-only (humans decide which side wins).
- Staging review: automated content goes to `staging/` first. Nothing enters the live wiki without human approval, so errors have two chances to get caught (AI compile + human review) before they become persistent.
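The decay-and-restore rules above could be sketched roughly as follows. This is an illustration, not the code in `wiki-hygiene.py`: the 6/9/12-month thresholds come from this document, but the mapping of each threshold to a specific level, the `"archive"` marker, and the function names are all assumptions made for the example.

```python
from datetime import date

# Ordered from least to most trusted. "archive" is a hypothetical marker
# meaning "move the page to archive/"; it is not a real confidence value.
LEVELS = ["archive", "low", "medium", "high"]

def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return max(months, 0)

def decay(confidence: str, last_verified: date, today: date) -> str:
    """Cap confidence by age (assumed: 6 -> medium, 9 -> low, 12 -> archive)."""
    age = months_between(last_verified, today)
    if age >= 12:
        cap = "archive"
    elif age >= 9:
        cap = "low"
    elif age >= 6:
        cap = "medium"
    else:
        cap = "high"
    # Decay only ever lowers trust; it never raises it.
    return min(confidence, cap, key=LEVELS.index)

def restore(_archived_confidence: str) -> str:
    """A re-referenced archived page comes back at medium, never high."""
    return "medium"
```

The point of the cap-based design is that a page already marked `low` is never lifted by the decay pass; only a fresh verification signal or an explicit restore can raise trust.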
2. Scalable search beyond the context window
The gap: The pattern works beautifully up to ~100 articles, where
index.md still fits in context. Karpathy's own wiki was right at the
ceiling. Past that point, the agent needs a real search layer — loading
the full index stops being practical.
How memex extends it:
- `qmd` from day one: `qmd` (BM25 + vector + LLM re-ranking) is set up in the default configuration so the agent never has to load the full index. At 50+ pages, `qmd search` replaces `cat index.md`.
- Wing/room structural filtering: conversations are partitioned by project code (wing) and topic (room, via the `topics:` frontmatter). Retrieval is pre-narrowed to the relevant wing before search runs. This extends the effective ceiling because `qmd` works on a relevant subset, not the whole corpus.
- Hygiene full mode flags redundancy: duplicate detection auto-merges weaker pages into stronger ones, keeping the corpus lean.
- Archive excludes stale content: the `wiki-archive` collection has `includeByDefault: false`, so archived pages don't eat context until explicitly queried.
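As a rough illustration of the wing/room pre-narrowing described above, the sketch below filters on structural metadata first and ranks only the survivors. The `Conversation` class, the function names, and the keyword scorer are hypothetical; the scorer is a deliberately crude stand-in for `qmd`, whose actual ranking (BM25 + vector + re-ranking) is not reproduced here.

```python
from dataclasses import dataclass, field

@dataclass
class Conversation:
    path: str
    wing: str                                         # project code, e.g. "wiki"
    topics: list[str] = field(default_factory=list)   # rooms within the wing
    text: str = ""

def narrow(corpus, wing, room=None):
    """Keep only conversations in the given wing (and room, if one is given)."""
    return [c for c in corpus
            if c.wing == wing and (room is None or room in c.topics)]

def keyword_score(doc, query):
    """Stand-in ranking: total occurrences of the query words in the text."""
    words = doc.text.lower().split()
    return sum(words.count(w) for w in query.lower().split())

def search(corpus, query, wing, room=None):
    """Filter structurally first, then rank only the remaining candidates."""
    candidates = narrow(corpus, wing, room)
    ranked = sorted(candidates, key=lambda c: keyword_score(c, query), reverse=True)
    return [c.path for c in ranked if keyword_score(c, query) > 0]
```

Because the filter runs before scoring, a heavily matching document in an unrelated wing never competes for a slot in the results.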
3. Traceable sources for every claim
The gap: In precision-sensitive domains (API specs, version constraints, legal records, medical protocols), LLM-generated content needs to be verifiable against a source. For the pattern to work in those contexts, every claim needs to trace back to something immutable.
How memex extends it:
- Staging workflow: every automated page goes through human review. For precision-critical content, that review IS the cross-check. The AI does the drafting; you verify.
- `compilation_notes` field: staging pages include the AI's own explanation of what it did and why. This makes review faster, because you can spot-check the reasoning rather than re-reading the whole page.
- Immutable raw sources: every wiki claim traces back to a specific file in `raw/harvested/` with a SHA-256 `content_hash`. Verification means comparing the claim to the source, not "trust the LLM."
- `confidence: low` for precision domains: the agent's instructions (via `CLAUDE.md`) tell it to flag low-confidence content when citing. Humans see the warning before acting.
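A minimal sketch of what checking a raw source against its recorded `content_hash` might look like, using Python's standard `hashlib`. Where the hash is stored and how the pipeline reads it are not specified in this document, so the function names and the bytes-in, string-in interface are assumptions.

```python
import hashlib

def content_hash(raw_bytes: bytes) -> str:
    """SHA-256 hex digest of a raw source file's bytes."""
    return hashlib.sha256(raw_bytes).hexdigest()

def source_unchanged(raw_bytes: bytes, recorded_hash: str) -> bool:
    """True if the raw file still matches the hash recorded at harvest time."""
    # Normalise whitespace and case so a hand-edited frontmatter value still compares.
    return content_hash(raw_bytes) == recorded_hash.strip().lower()
```

In practice the bytes would come from something like `Path(...).read_bytes()` on a file under `raw/harvested/`; a mismatch means the "immutable" source was modified after harvest and any page citing it deserves a second look.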
Residual trade-off: For truly mission-critical data (legal, medical, compliance), no amount of automation replaces domain-expert review. If that's your use case, treat this repo as a drafting tool, not a canonical source.
4. Continuous feed without manual discipline
The gap: Community analysis of 120+ comments on Karpathy's gist converged on one clear finding: this is the #1 friction point. Most people who try the pattern get the folder structure right and still end up with a wiki that slowly becomes unreliable because they stop feeding it. Six-week half-life is typical.
How memex extends it (this is the biggest layer):
- Automation replaces human discipline: daily cron runs `wiki-maintain.sh` (harvest + hygiene + qmd reindex); weekly cron runs `--full` mode. You don't need to remember anything.
- Conversation mining is the feed: you don't need to curate sources manually. Every Claude Code session becomes potential ingest. The feed is automatic and continuous, as long as you're doing work.
- `last_verified` refreshes from conversation references: when the summarizer links a conversation to a wiki page via `related:`, the hygiene script picks that up and bumps `last_verified`. Pages stay fresh as long as they're still being discussed.
- Decay thresholds force attention: pages without refresh signals for 6/9/12 months get downgraded and eventually archived. The wiki self-trims.
- Hygiene reports: `reports/hygiene-YYYY-MM-DD-needs-review.md` flags the things that do need human judgment. Everything else is auto-fixed.
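The `last_verified` refresh described above can be pictured with a small sketch like the one below. The data shapes (a dict of page dates, a list of dated `related:` references) and the function name are invented for illustration; the real hygiene script presumably reads these from frontmatter.

```python
from datetime import date

def refresh_last_verified(pages, conversations):
    """Bump each page's last_verified to the newest conversation that references it.

    pages: {page_path: last_verified date}
    conversations: iterable of (conversation_date, [related page paths])
    Returns a new dict; the input mapping is left untouched.
    """
    refreshed = dict(pages)
    for when, related in conversations:
        for page in related:
            # Only known pages are touched, and dates never move backwards.
            if page in refreshed and when > refreshed[page]:
                refreshed[page] = when
    return refreshed
```

The "never move backwards" guard matters when old conversations are summarized late: mining a months-old session should not make a recently verified page look stale.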
This is the single biggest layer memex adds. Nothing about it is exotic — it's a cron-scheduled pipeline that runs the scripts you'd otherwise have to remember to run. That's the whole trick.
5. Keeping the human engaged with their own knowledge
The gap: Hacker News critics pointed out that the bookkeeping Karpathy outsources — filing, cross-referencing, summarizing — is precisely where genuine understanding forms. If the LLM does all of it, you can end up with a comprehensive wiki you haven't internalized. For the pattern to be an actual memory aid and not a false one, the human needs touchpoints that keep them engaged.
How memex extends it:
- Staging review is a forcing function: you see every automated page before it lands. Even skimming forces engagement with the material.
- `qmd query "..."` for exploration: searching the wiki is an active process, not passive retrieval. You're asking questions, not pulling a file.
- The wake-up briefing: `context/wake-up.md` is a 200-token digest the agent reads at session start. You read it too (or the agent reads it to you), which gives you ongoing re-exposure to your own knowledge base.
Caveat: memex is designed as augmentation, not replacement. It's most valuable when you engage with it actively — reading your own wake-up briefing, spot-checking promoted pages, noticing decay flags. If you only consult the wiki through the agent and never look at it yourself, you've outsourced the learning. That's a usage pattern choice, not an architecture problem.
6. Hybrid retrieval — structure and semantics
The gap: Explicit wikilinks catch direct topic references but miss semantic neighbors that use different wording. At scale, the pattern benefits from vector similarity to find cross-topic connections the human (or the LLM at ingest time) didn't think to link manually.
How memex extends it:
- `qmd` is hybrid (BM25 + vector), not just keyword search. Vector similarity is built into the retrieval pipeline from day one.
- Structural navigation complements semantic search: project codes (wings) and topic frontmatter narrow the search space before the hybrid search runs. Structure + semantics is stronger than either alone.
- Missing cross-reference detection: full-mode hygiene asks the LLM to find pages that should link to each other but don't, then auto-adds them. This is the explicit-linking approach catching up to semantic retrieval over time.
Residual trade-off: At enterprise scale (millions of documents), a proper vector DB with specialized retrieval wins. This repo is for personal / small-team scale where the hybrid approach is sufficient.
7. Cross-machine collaboration
The gap: Karpathy's gist describes a single-user, single-machine setup. In practice, people work from multiple machines (laptop, workstation, server) and sometimes collaborate with small teams. The pattern needs a sync story that handles concurrent writes gracefully.
How memex extends it:
- Git-based sync with merge-union: concurrent writes on different machines auto-resolve because markdown is set to `merge=union` in `.gitattributes`. Both sides win.
- State file sync: `.harvest-state.json` and `.hygiene-state.json` are committed, so two machines running the same pipeline agree on what's already been processed instead of re-doing the work.
- Network boundary as access gate: the suggested deployment is over Tailscale or a VPN, so the network enforces who can reach the wiki at all. Simple and sufficient for personal/family/small-team use.
Explicit scope: memex is deliberately not enterprise knowledge management. No audit trails, no fine-grained permissions, no compliance story. If you need any of that, you need a different architecture. This is for the personal and small-team case where git + Tailscale is the right amount of rigor.
The biggest layer — active upkeep
The other six extensions are important, but this is the one that makes or breaks the pattern in practice. The community data is unambiguous:
- People who automate the lint schedule → wikis healthy at 6+ months
- People who rely on "I'll remember to lint" → wikis abandoned at 6 weeks
The entire automation layer of this repo exists to remove upkeep as a thing the human has to think about:
| Cadence | Job | Purpose |
|---|---|---|
| Every 15 min | `wiki-sync.sh` | Commit/pull/push: cross-machine sync |
| Every 2 hours | `wiki-sync.sh full` | Full sync + qmd reindex |
| Every hour | `mine-conversations.sh --extract-only` | Capture new Claude Code sessions (no LLM) |
| Daily 2am | `summarize-conversations.py --claude` + index | Classify + summarize (LLM) |
| Daily 3am | `wiki-maintain.sh` | Harvest + quick hygiene + reindex |
| Weekly Sun 4am | `wiki-maintain.sh --hygiene-only --full` | LLM-powered duplicate/contradiction/cross-ref detection |
If you disable all of these, you get the same outcome as every abandoned wiki: six-week half-life. The scripts aren't optional convenience — they're the load-bearing automation that lets the pattern actually compound over months and years instead of requiring a disciplined human to keep it alive.
What was borrowed from where
This repo is a synthesis of two ideas with an automation layer on top:
From Karpathy
- The core pattern: LLM-maintained persistent wiki, compile at ingest time instead of retrieve at query time
- Separation of `raw/` (immutable sources) from `wiki/` (compiled pages)
- `CLAUDE.md` as the schema that disciplines the agent
- Periodic "lint" passes to catch orphans, contradictions, missing refs
- The idea that the wiki becomes fine-tuning material over time
From mempalace
- Wings = per-person or per-project namespaces → this repo uses project codes (`mc`, `wiki`, `web`, etc.) as the same thing in `conversations/<project>/`
- Rooms = topics within a wing → the `topics:` frontmatter on conversation files
- Halls = memory-type corridors (fact / event / discovery / preference / advice / tooling) → the `halls:` frontmatter field classified by the summarizer
- Closets = summary layer → the summary body of each summarized conversation
- Drawers = verbatim archive, never lost → the extracted conversation transcripts under `conversations/<project>/*.md`
- Tunnels = cross-wing connections → the `related:` frontmatter linking conversations to wiki pages
- Wing + room structural filtering gives a documented +34% retrieval boost over flat search
The MemPalace taxonomy solved a problem Karpathy's pattern doesn't address: how do you navigate a growing corpus without reading everything? The answer is to give the corpus structural metadata at ingest time, then filter on that metadata before doing semantic search. This repo borrows that wholesale.
What this repo adds
- Automation layer tying the pieces together with cron-friendly orchestration
- Staging pipeline as a human-in-the-loop checkpoint for automated content
- Confidence decay + auto-archive + auto-restore as the "retention curve" that community analysis identified as critical for long-term wiki health
- `qmd` integration as the scalable search layer (chosen over ChromaDB because it uses the same markdown storage as the wiki: one index to maintain, not two)
- Hygiene reports with fixed vs needs-review separation so automation handles mechanical fixes and humans handle ambiguity
- Cross-machine sync via git with markdown merge-union so the same wiki lives on multiple machines without merge hell
What memex deliberately doesn't try to do
Five things memex is explicitly scoped around — not because they're unsolvable, but because solving them well requires a different kind of architecture than a personal/small-team wiki. If any of these are dealbreakers for your use case, memex is probably not the right fit:
- Enterprise scale: millions of documents, hundreds of users, RBAC, and compliance need real enterprise knowledge management infrastructure. memex is tuned for personal and small-team use.
- True semantic retrieval at massive scale: `qmd` hybrid search works great up to thousands of pages. At millions, a dedicated vector database with specialized retrieval wins.
- Replacing your own learning: memex is an augmentation layer, not a substitute for reading. Used well, it's a memory aid; used as a bypass, it just lets you forget more.
- Precision-critical source of truth: for legal, medical, or regulatory data, memex is a drafting tool. Human domain-expert review still owns the final call.
- Access control: the network boundary (Tailscale) is the fastest path to "only authorized people can reach it." memex itself doesn't enforce permissions inside that boundary.
These are scope decisions, not unfinished work. memex is the best personal/small-team answer to Karpathy's pattern I could build; it's not trying to be every answer.
Further reading
- The original Karpathy gist: the concept
- mempalace: the structural memory layer
- Signal & Noise interactive analysis: the design rationale this document summarizes (live interactive version)
- `artifacts/signal-and-noise.html`: self-contained archive of the same analysis, works offline
- README: the concept pitch
- ARCHITECTURE.md: component deep-dive
- SETUP.md: installation
- CUSTOMIZE.md: adapting for non-Claude-Code setups