docs: reframe as extensions + replace Signal & Noise artifact

Two changes, one commit:

1. Reframe "weaknesses" as "extensions memex adds":
   Karpathy's gist is a concept pitch, not an implementation. Reframe
   the seven places memex extends the pattern as engineering-layer
   additions rather than problems to fix. Cleaner narrative — memex
   builds on Karpathy's work instead of critiquing it.

   Touches README.md (Why each part exists + Credits) and
   DESIGN-RATIONALE.md (section titles, trade-off framing, biggest
   layer section, scope note at the end).

2. Replace docs/artifacts/signal-and-noise.html with the full
   upstream version:
   The earlier abbreviated copy dropped the MemPalace integration tab,
   the detailed mitigation steps with effort pips, the impact
   before/after cards, and the qmd vs ChromaDB comparison. This
   restores all of that. Also swaps self-references from "LLM Wiki"
   to "memex" while leaving external "LLM Wiki v2" community
   citations alone (those refer to a separate pattern and aren't ours
   to rename).

The live hosted copy at eric-turner.com/memex/signal-and-noise.html
has already been updated via scp — Hugo picks up static changes with
--poll 1s so the public URL reflects this file immediately.
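The deploy step above amounts to a one-line copy into the web root plus an
already-running Hugo watcher. A sketch, where the remote host and the path
under the Hugo site are assumptions (the real layout isn't shown here):

```shell
# Hypothetical deploy step. The server is assumed to be running something
# like `hugo server --poll 1s`, so it re-reads static files on a 1-second
# polling interval and republishes without a manual rebuild.
scp docs/artifacts/signal-and-noise.html \
    webhost:/srv/hugo/static/memex/signal-and-noise.html
```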
Eric Turner
2026-04-12 22:01:31 -06:00
parent 2a37e33fd6
commit 4c6b7609a1
3 changed files with 1191 additions and 238 deletions

DESIGN-RATIONALE.md

@@ -14,10 +14,11 @@ original persistent-wiki pattern:
 > — same content, works offline
 
 The analysis walks through the pattern's seven genuine strengths, seven
-real weaknesses, and concrete mitigations for each weakness. This repo
-is the implementation of those mitigations. If you want to understand
-*why* a component exists, the interactive version has the longer-form
-argument; this document is the condensed written version.
+places where it needs an engineering layer to be practical, and the
+concrete extension for each. memex is the implementation of those
+extensions. If you want to understand *why* a component exists, the
+interactive version has the longer-form argument; this document is the
+condensed written version.
 
 ---
@@ -38,20 +39,24 @@ repo preserves all of them:
 ---
 
-## Where the pattern is genuinely weak — and how this repo answers
+## Where memex extends the pattern
 
-The analysis identified seven real weaknesses. Five have direct
-mitigations in this repo; two remain open trade-offs you should be aware
-of.
+Karpathy's gist is a concept pitch. He was explicit that he was sharing
+an "idea file" for others to build on, not publishing a working
+implementation. The analysis identified seven places where the core idea
+needs an engineering layer to become practical day-to-day — five have
+first-class answers in memex, and two remain scoped-out trade-offs that
+the architecture cleanly acknowledges.
 
-### 1. Errors persist and compound
+### 1. Claim freshness and reversibility
 
-**The problem**: Unlike RAG — where a hallucination is ephemeral and the
-next query starts clean — an LLM wiki persists its mistakes. If the LLM
-incorrectly links two concepts at ingest time, future ingests build on
-that wrong prior.
+**The gap**: Unlike RAG — where a hallucination is ephemeral and the
+next query starts clean — an LLM-maintained wiki is stateful. If a
+claim is wrong at ingest time, it stays wrong until something corrects
+it. For the pattern to work long-term, claims need a way to earn trust
+over time and lose it when unused.
 
-**How this repo mitigates**:
+**How memex extends it**:
 
 - **`confidence` field** — every page carries `high`/`medium`/`low` with
   decay based on `last_verified`. Wrong claims aren't treated as
@@ -71,13 +76,14 @@ that wrong prior.
   two chances to get caught (AI compile + human review) before they
   become persistent.
 
-### 2. Hard scale ceiling at ~50K tokens
+### 2. Scalable search beyond the context window
 
-**The problem**: The wiki approach stops working when `index.md` no
-longer fits in context. Karpathy's own wiki was ~100 articles / 400K
-words — already near the ceiling.
+**The gap**: The pattern works beautifully up to ~100 articles, where
+`index.md` still fits in context. Karpathy's own wiki was right at the
+ceiling. Past that point, the agent needs a real search layer — loading
+the full index stops being practical.
 
-**How this repo mitigates**:
+**How memex extends it**:
 
 - **`qmd` from day one** — `qmd` (BM25 + vector + LLM re-ranking) is set
   up in the default configuration so the agent never has to load the
@@ -93,14 +99,14 @@ words — already near the ceiling.
   `includeByDefault: false`, so archived pages don't eat context until
   explicitly queried.
 
-### 3. Manual cross-checking burden returns in precision-critical domains
+### 3. Traceable sources for every claim
 
-**The problem**: For API specs, version constraints, legal records, and
-medical protocols, LLM-generated content needs human verification. The
-maintenance burden you thought you'd eliminated comes back as
-verification overhead.
+**The gap**: In precision-sensitive domains (API specs, version
+constraints, legal records, medical protocols), LLM-generated content
+needs to be verifiable against a source. For the pattern to work in
+those contexts, every claim needs to trace back to something immutable.
 
-**How this repo mitigates**:
+**How memex extends it**:
 
 - **Staging workflow** — every automated page goes through human review.
   For precision-critical content, that review IS the cross-check. The
@@ -120,15 +126,16 @@ medical, compliance), no amount of automation replaces domain-expert
 review. If that's your use case, treat this repo as a *drafting* tool,
 not a canonical source.
 
-### 4. Knowledge staleness without active upkeep
+### 4. Continuous feed without manual discipline
 
-**The problem**: Community analysis of 120+ comments on Karpathy's gist
-found this is the #1 failure mode. Most people who try the pattern get
+**The gap**: Community analysis of 120+ comments on Karpathy's gist
+converged on one clear finding: this is the #1 friction point. Most
+people who try the pattern get
 the folder structure right and still end up with a wiki that slowly
 becomes unreliable because they stop feeding it. Six-week half-life is
 typical.
 
-**How this repo mitigates** (this is the biggest thing):
+**How memex extends it** (this is the biggest layer):
 
 - **Automation replaces human discipline** — daily cron runs
   `wiki-maintain.sh` (harvest + hygiene + qmd reindex); weekly cron runs
@@ -147,17 +154,20 @@ typical.
   flags the things that *do* need human judgment. Everything else is
   auto-fixed.
 
-This is the single biggest reason this repo exists. The automation
-layer is entirely about removing "I forgot to lint" as a failure mode.
+This is the single biggest layer memex adds. Nothing about it is
+exotic — it's a cron-scheduled pipeline that runs the scripts you'd
+otherwise have to remember to run. That's the whole trick.
 
-### 5. Cognitive outsourcing risk
+### 5. Keeping the human engaged with their own knowledge
 
-**The problem**: Hacker News critics argued that the bookkeeping
+**The gap**: Hacker News critics pointed out that the bookkeeping
 Karpathy outsources — filing, cross-referencing, summarizing — is
-precisely where genuine understanding forms. Outsource it and you end up
-with a comprehensive wiki you haven't internalized.
+precisely where genuine understanding forms. If the LLM does all of
+it, you can end up with a comprehensive wiki you haven't internalized.
+For the pattern to be an actual memory aid and not a false one, the
+human needs touchpoints that keep them engaged.
 
-**How this repo mitigates**:
+**How memex extends it**:
 
 - **Staging review is a forcing function** — you see every automated
   page before it lands. Even skimming forces engagement with the
@@ -169,19 +179,21 @@ with a comprehensive wiki you haven't internalized.
   the agent reads at session start. You read it too (or the agent reads
   it to you) — ongoing re-exposure to your own knowledge base.
 
-**Residual trade-off**: This is a real concern even with mitigations.
-The wiki is designed as *augmentation*, not *replacement*. If you
-never read your own wiki and only consult it through the agent, you're
-in the outsourcing failure mode. The fix is discipline, not
-architecture.
+**Caveat**: memex is designed as *augmentation*, not *replacement*.
+It's most valuable when you engage with it actively — reading your own
+wake-up briefing, spot-checking promoted pages, noticing decay flags.
+If you only consult the wiki through the agent and never look at it
+yourself, you've outsourced the learning. That's a usage pattern
+choice, not an architecture problem.
 
-### 6. Weaker semantic retrieval than RAG at scale
+### 6. Hybrid retrieval — structure and semantics
 
-**The problem**: At large corpora, vector embeddings find semantically
-related content across different wording in ways explicit wikilinks
-can't match.
+**The gap**: Explicit wikilinks catch direct topic references but miss
+semantic neighbors that use different wording. At scale, the pattern
+benefits from vector similarity to find cross-topic connections the
+human (or the LLM at ingest time) didn't think to link manually.
 
-**How this repo mitigates**:
+**How memex extends it**:
 
 - **`qmd` is hybrid (BM25 + vector)** — not just keyword search. Vector
   similarity is built into the retrieval pipeline from day one.
@@ -198,33 +210,38 @@ can't match.
   proper vector DB with specialized retrieval wins. This repo is for
   personal / small-team scale where the hybrid approach is sufficient.
 
-### 7. No access control or multi-user support
+### 7. Cross-machine collaboration
 
-**The problem**: It's a folder of markdown files. No RBAC, no audit
-logging, no concurrency handling, no permissions model.
+**The gap**: Karpathy's gist describes a single-user, single-machine
+setup. In practice, people work from multiple machines (laptop,
+workstation, server) and sometimes collaborate with small teams. The
+pattern needs a sync story that handles concurrent writes gracefully.
 
-**How this repo mitigates**:
+**How memex extends it**:
 
 - **Git-based sync with merge-union** — concurrent writes on different
   machines auto-resolve because markdown is set to `merge=union` in
   `.gitattributes`. Both sides win.
-- **Network boundary as soft access control** — the suggested
-  deployment is over Tailscale or a VPN, so the network does the work a
-  RBAC layer would otherwise do. Not enterprise-grade, but sufficient
-  for personal/family/small-team use.
+- **State file sync** — `.harvest-state.json` and `.hygiene-state.json`
+  are committed, so two machines running the same pipeline agree on
+  what's already been processed instead of re-doing the work.
+- **Network boundary as access gate** — the suggested deployment is
+  over Tailscale or a VPN, so the network enforces who can reach the
+  wiki at all. Simple and sufficient for personal/family/small-team
+  use.
 
-**Residual trade-off**: **This is the big one.** The repo is not a
-replacement for enterprise knowledge management. No audit trails, no
-fine-grained permissions, no compliance story. If you need any of
-that, you need a different architecture. This repo is explicitly
-scoped to the personal/small-team use case.
+**Explicit scope**: memex is **deliberately not** enterprise knowledge
+management. No audit trails, no fine-grained permissions, no compliance
+story. If you need any of that, you need a different architecture.
+This is for the personal and small-team case where git + Tailscale is
+the right amount of rigor.
 
 ---
 
-## The #1 failure mode — active upkeep
+## The biggest layer — active upkeep
 
-Every other weakness has a mitigation. *Active upkeep is the one that
-kills wikis in the wild.* The community data is unambiguous:
+The other six extensions are important, but this is the one that makes
+or breaks the pattern in practice. The community data is unambiguous:
 
 - People who automate the lint schedule → wikis healthy at 6+ months
 - People who rely on "I'll remember to lint" → wikis abandoned at 6 weeks
@@ -243,8 +260,9 @@ thing the human has to think about:
 If you disable all of these, you get the same outcome as every
 abandoned wiki: six-week half-life. The scripts aren't optional
-convenience — they're the load-bearing answer to the pattern's primary
-failure mode.
+convenience — they're the load-bearing automation that lets the pattern
+actually compound over months and years instead of requiring a
+disciplined human to keep it alive.
 
 ---
@@ -305,26 +323,32 @@ This repo borrows that wholesale.
 ---
 
-## Honest residual trade-offs
+## What memex deliberately doesn't try to do
 
-Five items from the analysis that this repo doesn't fully solve and
-where you should know the limits:
+Five things memex is explicitly scoped around — not because they're
+unsolvable, but because solving them well requires a different kind of
+architecture than a personal/small-team wiki. If any of these are
+dealbreakers for your use case, memex is probably not the right fit:
 
-1. **Enterprise scale** — this is a personal/small-team tool. Millions
-   of documents, hundreds of users, RBAC, compliance: wrong
-   architecture.
+1. **Enterprise scale** — millions of documents, hundreds of users,
+   RBAC, compliance: these need real enterprise knowledge management
+   infrastructure. memex is tuned for personal and small-team use.
 2. **True semantic retrieval at massive scale** — `qmd` hybrid search
-   is great for thousands of pages, not millions.
-3. **Cognitive outsourcing** — no architecture fix. Discipline
-   yourself to read your own wiki, not just query it through the agent.
-4. **Precision-critical domains** — for legal/medical/regulatory data,
-   use this as a drafting tool, not a source of truth. Human
-   domain-expert review is not replaceable.
-5. **Access control** — network boundary (Tailscale) is the fastest
-   path; nothing in the repo itself enforces permissions.
+   works great up to thousands of pages. At millions, a dedicated
+   vector database with specialized retrieval wins.
+3. **Replacing your own learning** — memex is an augmentation layer,
+   not a substitute for reading. Used well, it's a memory aid; used as
+   a bypass, it just lets you forget more.
+4. **Precision-critical source of truth** — for legal, medical, or
+   regulatory data, memex is a drafting tool. Human domain-expert
+   review still owns the final call.
+5. **Access control** — the network boundary (Tailscale) is the
+   fastest path to "only authorized people can reach it." memex itself
+   doesn't enforce permissions inside that boundary.
 
-If any of these are dealbreakers for your use case, a different
-architecture is probably what you need.
+These are scope decisions, not unfinished work. memex is the best
+personal/small-team answer to Karpathy's pattern I could build; it's
+not trying to be every answer.
 
 ---

docs/artifacts/signal-and-noise.html: file diff suppressed because it is too large
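The `merge=union` behavior the DESIGN-RATIONALE diff above leans on can be
demonstrated end to end in a throwaway repo. Everything here (branch names,
file contents, paths) is illustrative, not taken from memex:

```shell
# Demo: git's built-in union merge driver. Two "machines" append different
# lines to the same markdown page; the merge keeps both sides' lines
# instead of raising a conflict.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q wiki && cd wiki
git config user.email demo@example.com
git config user.name demo
main=$(git symbolic-ref --short HEAD)      # main or master, version-dependent
echo '*.md merge=union' > .gitattributes   # the one-line setup from the diff
printf 'shared line\n' > page.md
git add -A && git commit -qm base
git checkout -qb laptop                    # "machine B" edits the page
printf 'shared line\nedit from laptop\n' > page.md
git commit -qam laptop-edit
git checkout -q "$main"                    # "machine A" edits concurrently
printf 'shared line\nedit from workstation\n' > page.md
git commit -qam workstation-edit
git merge -q -m sync laptop                # union driver: no conflict markers
cat page.md                                # contains both machines' lines
```

The trade-off is worth knowing: union merge never conflicts, so it can also
silently keep a line one side meant to delete; that's acceptable for
append-heavy wiki pages, less so for code.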
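And the "automation replaces human discipline" layer reduces to crontab
entries along these lines. The times, the install path, and the weekly
script's name are placeholders (the diff above shows only that daily cron
runs `wiki-maintain.sh` and that a weekly cron exists; its target is
truncated):

```shell
# Hypothetical crontab sketch of the automation layer.
#
#   # daily: harvest + hygiene + qmd reindex
#   15 5 * * *  $HOME/memex/wiki-maintain.sh
#
#   # weekly pass (script name is a placeholder; the real name is
#   # truncated out of the diff above)
#   30 5 * * 0  $HOME/memex/weekly-job.sh
```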