A compounding LLM-maintained knowledge wiki. Synthesis of Andrej Karpathy's persistent-wiki gist and milla-jovovich's mempalace, with an automation layer on top for conversation mining, URL harvesting, human-in-the-loop staging, staleness decay, and hygiene. Includes:

- 11 pipeline scripts (extract, summarize, index, harvest, stage, hygiene, maintain, sync, + shared library)
- Full docs: README, SETUP, ARCHITECTURE, DESIGN-RATIONALE, CUSTOMIZE
- Example CLAUDE.md files (wiki schema + global instructions) tuned for the three-collection qmd setup
- 171-test pytest suite (cross-platform, runs in ~1.3s)
- MIT licensed
# Wiki Pipeline Test Suite
Pytest-based test suite covering all 11 scripts in `scripts/`. Runs on both
macOS and Linux/WSL, and uses only the Python standard library + pytest.
## Running

```bash
# Full suite (from wiki root)
bash tests/run.sh

# Single test file
bash tests/run.sh test_wiki_lib.py

# Single test class or function
bash tests/run.sh test_wiki_hygiene.py::TestArchiveRestore
bash tests/run.sh test_wiki_hygiene.py::TestArchiveRestore::test_restore_reverses_archive

# Pattern matching
bash tests/run.sh -k "archive"

# Verbose
bash tests/run.sh -v

# Stop on first failure
bash tests/run.sh -x

# Or invoke pytest directly from the tests dir
cd tests && python3 -m pytest -v
```
## What's tested

| File | Coverage |
|---|---|
| `test_wiki_lib.py` | YAML parser, frontmatter round-trip, page iterators, date parsing, content hashing, `WIKI_DIR` env override |
| `test_wiki_hygiene.py` | Backfill, confidence decay math, frontmatter repair, archive/restore round-trip, orphan detection, broken-xref fuzzy matching, index drift, empty stubs, conversation refresh signals, auto-restore, staging/archive sync, state drift, hygiene state file, full quick-run idempotency |
| `test_wiki_staging.py` | List, promote, reject, promote-with-modifies, dry-run, staging index regeneration, path resolution |
| `test_wiki_harvest.py` | URL classification (harvest/check/skip), private IP detection, URL extraction + filtering, filename derivation, content validation, state management, raw file writing, dry-run CLI smoke test |
| `test_conversation_pipeline.py` | CLI smoke tests for extract-sessions, summarize-conversations, update-conversation-index; dry-run behavior; help flags; integration test with fake conversation files |
| `test_shell_scripts.py` | `wiki-maintain.sh` / `mine-conversations.sh` / `wiki-sync.sh`: help, dry-run, mutex flags, bash syntax check, strict-mode check, shebang check, `py_compile` for all `.py` scripts |
## How it works
**Isolation:** Every test runs against a disposable `tmp_wiki` fixture
(pytest `tmp_path`). The fixture sets the `WIKI_DIR` environment variable
so all scripts resolve paths against the tmp directory instead of the real
wiki. No test ever touches `~/projects/wiki`.
**Hyphenated filenames:** Scripts like `wiki-harvest.py` use hyphens, which
Python's import system can't handle directly. `conftest.py` has a
`_load_script_module` helper that loads a script file by path and exposes
it as a module object.
**Clean module state:** Each test that loads a module clears any cached
import first, so `WIKI_DIR` env overrides take effect correctly between
tests.
**Subprocess tests:** For CLI smoke tests, `conftest.py` provides a
`run_script` fixture that invokes a script via `python3` or `bash` with
`WIKI_DIR` set to the tmp wiki. It uses `subprocess.run` with
`capture_output` and a timeout.
## Cross-platform

- `#!/usr/bin/env bash` shebangs (tested explicitly)
- `set -euo pipefail` in all shell scripts (tested explicitly)
- `bash -n` syntax check on all shell scripts
- `py_compile` on all Python scripts
- Uses `pathlib` everywhere (no hardcoded path separators)
- Uses the Python stdlib only (except pytest itself)
## Requirements

- Python 3.11+
- `pytest`: install with `pip install --user pytest` or your distro's package manager
- `bash` (any version; the scripts use only portable features)
The tests do NOT require:

- The `claude` CLI (mocked / skipped)
- `trafilatura` or `crawl4ai` (only dry-run / classification paths are tested)
- `qmd` (the reindex phase is skipped in tests)
- Network access
- The real `~/projects/wiki` or `~/.claude/projects` directories
## Speed

Full suite runs in ~1 second on a modern laptop. All tests are isolated and independent, so they can run in any order and in parallel.
## What's NOT tested

- Real LLM calls (`claude -p`): too expensive, non-deterministic. Tested instead: CLI parsing, dry-run paths, mocked error handling.
- Real web fetches (trafilatura/crawl4ai): too slow, non-deterministic. Tested instead: URL classification, filter logic, fetch-result validation.
- Real git operations (`wiki-sync.sh`): would require a git repo fixture. Tested instead: the script loads, handles a non-git dir gracefully, and `--status` exits clean.
- Real qmd indexing: tested elsewhere via `qmd collection list` in the setup verification step.
- Real Claude Code session JSONL parsing with actual sessions: would require fixture JSONL files. Tested instead: CLI parsing, empty-dir behavior, `CLAUDE_PROJECTS_DIR` env override.
These are smoke-tested end-to-end via the integration tests in
`test_conversation_pipeline.py` and the dry-run paths in
`test_shell_scripts.py::TestWikiMaintainSh`.