memex/tests/test_shell_scripts.py
Eric Turner 997aa837de feat(distill): close the MemPalace loop — conversations → wiki pages
Add wiki-distill.py as Phase 1a of the maintenance pipeline. This is
the 8th extension memex adds to Karpathy's pattern and the one that
makes the MemPalace integration a real ingest pipeline instead of
just a searchable archive beside the wiki.

## The gap distill closes

The mining layer was extracting Claude Code sessions, classifying
bullets into halls (fact/discovery/preference/advice/event/tooling),
and tagging topics. The URL harvester scanned conversations for cited
links. Hygiene refreshed last_verified on wiki pages referenced in
related: fields. But none of those steps compiled the knowledge
*inside* the conversations themselves into wiki pages. Decisions,
root causes, and patterns stayed in the summaries forever — findable
via qmd but never synthesized into canonical pages.

## What distill does

Narrow today-filter with historical rollup:

  1. Find all summarized conversations dated TODAY
  2. Extract their topics: — this is the "topics of today" set
  3. For each topic in that set, pull ALL summarized conversations
     across history that share that topic (full historical context)
  4. Extract hall_facts + hall_discoveries + hall_advice bullets
     (the high-signal hall types — skips event/preference/tooling)
  5. Send topic group + wiki index.md to claude -p
  6. Model emits JSON actions[]: new_page / update_page / skip
  7. Write each action to staging/<type>/ with distill provenance
     frontmatter (staged_by: wiki-distill, distill_topic,
     distill_source_conversations, compilation_notes)
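
Steps 1 through 4 above can be pictured with a small sketch. This is not the code in wiki-distill.py; the `Conversation` dataclass, the function names, and the hall keys (`fact`, `discovery`, `advice`) are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date

# Hall types treated as high-signal, per step 4 (key names are assumed).
HIGH_SIGNAL_HALLS = ("fact", "discovery", "advice")


@dataclass
class Conversation:
    """Hypothetical stand-in for one summarized conversation."""
    path: str
    day: date
    topics: list[str]
    halls: dict[str, list[str]] = field(default_factory=dict)


def topics_of_today(convs: list[Conversation], today: date) -> set[str]:
    """Steps 1-2: collect the topics of conversations dated today."""
    return {t for c in convs if c.day == today for t in c.topics}


def rollup(convs: list[Conversation], today: date) -> dict[str, list[Conversation]]:
    """Step 3: for each of today's topics, gather every conversation in history."""
    groups = {}
    for topic in sorted(topics_of_today(convs, today)):
        matching = (c for c in convs if topic in c.topics)
        groups[topic] = sorted(matching, key=lambda c: c.day)
    return groups


def high_signal_bullets(group: list[Conversation]) -> list[str]:
    """Step 4: keep fact/discovery/advice bullets, drop the other halls."""
    return [
        bullet
        for conv in group
        for hall in HIGH_SIGNAL_HALLS
        for bullet in conv.halls.get(hall, [])
    ]
```

A topic that appears only in old conversations never enters `rollup`; it is pulled in the first time a conversation dated today mentions it again.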

First-run bootstrap: uses 7-day lookback instead of today-only so
the state file gets seeded reasonably. After that, daily runs stay
narrow.
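
A minimal sketch of the window choice, assuming the state file exposes the date of the previous run (or nothing on a first run). The function name and the `last_run` parameter are hypothetical, and whether the real lookback is inclusive of both ends is not stated here.

```python
from datetime import date, timedelta
from typing import Optional

FIRST_RUN_LOOKBACK_DAYS = 7  # the 7-day bootstrap window described above


def distill_window_start(today: date, last_run: Optional[date]) -> date:
    """Return the earliest conversation date that counts as "today" input.

    Assumption: last_run is None when the state file has no prior run.
    """
    if last_run is None:
        return today - timedelta(days=FIRST_RUN_LOOKBACK_DAYS)
    return today
```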

Self-triggering: dormant topics that resurface in a new conversation
automatically pull in all historical conversations on that topic via
the rollup. Old knowledge gets distilled when it becomes relevant
again without manual intervention.

## Orchestration — distill BEFORE harvest

wiki-maintain.sh now has Phase 1a (distill) + Phase 1b (harvest):

  1a. wiki-distill.py    — conversations → staging (PRIORITY)
  1b. wiki-harvest.py    — URLs → raw/harvested → staging (supplement)
  2.  wiki-hygiene.py    — decay, archive, repair, checks
  3.  qmd reindex

Conversation content drives the page shape; URL harvesting fills
gaps for external references conversations don't cover. New flags:
--distill-only, --no-distill, --distill-first-run.
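
The phase selection can be modeled roughly as follows. This is a Python sketch of the idea, not the bash in wiki-maintain.sh. It assumes that each `--*-only` flag selects a single phase, that `--distill-only` is mutually exclusive with the other two in the same way `--harvest-only` and `--hygiene-only` are with each other, and that reindex is dropped under `--dry-run` or `--no-reindex`.

```python
PHASE_ORDER = ["1a distill", "1b harvest", "2 hygiene", "3 reindex"]

# Assumed mapping: each "only" flag keeps exactly one of the first three phases.
ONLY_FLAGS = {
    "--distill-only": "1a distill",
    "--harvest-only": "1b harvest",
    "--hygiene-only": "2 hygiene",
}


def plan_phases(flags: set[str]) -> list[str]:
    """Return the phases that would run, in pipeline order."""
    only = sorted(f for f in ONLY_FLAGS if f in flags)
    if len(only) > 1:
        raise ValueError(f"{' and '.join(only)} are mutually exclusive")
    selected = set(PHASE_ORDER)
    if only:
        selected = {ONLY_FLAGS[only[0]], "3 reindex"}
    if "--no-distill" in flags:
        selected.discard("1a distill")
    if flags & {"--dry-run", "--no-reindex"}:
        selected.discard("3 reindex")
    return [p for p in PHASE_ORDER if p in selected]
```

For example, `plan_phases({"--hygiene-only", "--dry-run"})` leaves only the hygiene phase, which matches the skipped-phase messages the smoke tests look for.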

## Verified on real wiki

Tested end-to-end on the production wiki with 611 summarized
conversations across 14 wings. First-run dry-run found 116 topic
groups worth distilling (+ 3 too-thin). Tested single-topic compile
with --topic zoho-api: the LLM rolled up 2 conversations (34
bullets), synthesized a proper pattern page with "What / Why /
Known Limitations" structure, linked it to existing wiki pages,
and landed it in staging with full distill provenance. LLM
correctly rejected claude-code-statusline (already well-covered
by an existing live page) — so the "skip" path works.
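
The new_page / update_page / skip handling exercised in this run can be sketched as below. The JSON keys (`action`, `type`, `filename`, `body`, `notes`) and the function name are assumptions made for illustration; only the frontmatter field names and the staging/<type>/ layout come from the description above.

```python
import json
from pathlib import Path


def stage_actions(
    raw_json: str, staging_root: Path, topic: str, sources: list[str]
) -> list[Path]:
    """Write new_page/update_page actions under staging/<type>/; skip writes nothing."""
    written = []
    for action in json.loads(raw_json).get("actions", []):
        if action.get("action") not in ("new_page", "update_page"):
            continue  # "skip" (or anything unrecognized) is a no-op
        frontmatter = "\n".join([
            "---",
            "staged_by: wiki-distill",
            f"distill_topic: {topic}",
            "distill_source_conversations:",
            *[f"  - {s}" for s in sources],
            # json.dumps yields a double-quoted string, which is also valid YAML
            f"compilation_notes: {json.dumps(action.get('notes', ''))}",
            "---",
        ])
        target = staging_root / action["type"] / action["filename"]
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(frontmatter + "\n\n" + action["body"], encoding="utf-8")
        written.append(target)
    return written
```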

## Code additions

- scripts/wiki-distill.py (new, ~530 lines)
- scripts/wiki_lib.py: HIGH_SIGNAL_HALLS + parse_conversation_halls
  + high_signal_halls + _flatten_bullet helpers
- scripts/wiki-maintain.sh: Phase 1a distill, new flags
- tests/test_wiki_distill.py (21 new tests — hall parsing, rollup,
  state management, CLI smoke tests)
- tests/test_shell_scripts.py: updated phase-name assertion for
  the Phase 1a/1b split

## Docs additions

- README.md: 8th row in extensions table, updated compounding-loop
  diagram, new wiki-distill.py reference in architecture overview
- docs/DESIGN-RATIONALE.md: new section 8 "Closing the MemPalace
  loop" with full mempalace taxonomy mapping
- docs/ARCHITECTURE.md: wiki-distill.py section, updated phase
  order, updated state file table, updated dep graph
- docs/SETUP.md: updated cron comment, first-run distill guidance,
  verify section test count
- .gitignore: note distill-state.json is committed (sync across
  machines), not gitignored
- docs/artifacts/signal-and-noise.html: new "Distill ⬣" top-level
  tab with flow diagram, hall filter table, narrow-today/wide-
  history explanation, staging provenance example

## Tests

192 tests total (+21 new, +1 regression fix), all green in ~1.5s.
2026-04-12 22:34:33 -06:00


"""Smoke tests for the bash scripts.
Bash scripts are harder to unit-test in isolation — these tests verify
CLI parsing, help text, and dry-run/safe flags work correctly and that
scripts exit cleanly in all the no-op paths.
Cross-platform note: tests invoke scripts via `bash` explicitly, so they
work on both macOS (default /bin/bash) and Linux/WSL. They avoid anything
that requires external state (network, git, LLM).
"""

from __future__ import annotations

import os
import subprocess
from pathlib import Path
from typing import Any

import pytest

from conftest import make_conversation, make_page, make_staging_page


# ---------------------------------------------------------------------------
# wiki-maintain.sh
# ---------------------------------------------------------------------------


class TestWikiMaintainSh:
    def test_help_flag(self, run_script) -> None:
        result = run_script("wiki-maintain.sh", "--help")
        assert result.returncode == 0
        assert "Usage:" in result.stdout or "usage:" in result.stdout.lower()
        assert "--full" in result.stdout
        assert "--harvest-only" in result.stdout
        assert "--hygiene-only" in result.stdout

    def test_rejects_unknown_flag(self, run_script) -> None:
        result = run_script("wiki-maintain.sh", "--bogus")
        assert result.returncode != 0
        assert "Unknown option" in result.stderr

    def test_harvest_only_and_hygiene_only_conflict(self, run_script) -> None:
        result = run_script(
            "wiki-maintain.sh", "--harvest-only", "--hygiene-only"
        )
        assert result.returncode != 0
        assert "mutually exclusive" in result.stderr

    def test_hygiene_only_dry_run_completes(
        self, run_script, tmp_wiki: Path
    ) -> None:
        make_page(tmp_wiki, "patterns/one.md")
        result = run_script(
            "wiki-maintain.sh", "--hygiene-only", "--dry-run", "--no-reindex"
        )
        assert result.returncode == 0
        assert "Phase 2: Hygiene checks" in result.stdout
        assert "finished" in result.stdout

    def test_phase_1_skipped_in_hygiene_only(
        self, run_script, tmp_wiki: Path
    ) -> None:
        result = run_script(
            "wiki-maintain.sh", "--hygiene-only", "--dry-run", "--no-reindex"
        )
        assert result.returncode == 0
        # Phase 1a (distill) and Phase 1b (harvest) both skipped in --hygiene-only
        assert "Phase 1a: Conversation distillation (skipped)" in result.stdout
        assert "Phase 1b: URL harvesting (skipped)" in result.stdout

    def test_phase_3_skipped_in_dry_run(
        self, run_script, tmp_wiki: Path
    ) -> None:
        make_page(tmp_wiki, "patterns/one.md")
        result = run_script(
            "wiki-maintain.sh", "--hygiene-only", "--dry-run"
        )
        assert "Phase 3: qmd reindex (skipped)" in result.stdout

    def test_harvest_only_dry_run_completes(
        self, run_script, tmp_wiki: Path
    ) -> None:
        # Add a summarized conversation so harvest has something to scan
        make_conversation(
            tmp_wiki,
            "test",
            "2026-04-10-test.md",
            status="summarized",
            body="See https://docs.python.org/3/library/os.html for details.\n",
        )
        result = run_script(
            "wiki-maintain.sh",
            "--harvest-only",
            "--dry-run",
            "--no-compile",
            "--no-reindex",
        )
        assert result.returncode == 0
        assert "Phase 2: Hygiene checks (skipped)" in result.stdout


# ---------------------------------------------------------------------------
# wiki-sync.sh
# ---------------------------------------------------------------------------


class TestWikiSyncSh:
    def test_status_on_non_git_dir_exits_cleanly(self, run_script) -> None:
        """wiki-sync.sh --status against a non-git dir should fail gracefully.

        The tmp_wiki fixture is not a git repo, so git commands will fail.
        The script should report the problem without hanging or leaking stack
        traces. Any exit code is acceptable as long as it exits in reasonable
        time and prints something useful to stdout/stderr.
        """
        result = run_script("wiki-sync.sh", "--status", timeout=30)
        # Should have produced some output and exited (not hung)
        assert result.stdout or result.stderr
        assert "Wiki Sync Status" in result.stdout or "not a git" in result.stderr.lower()


# ---------------------------------------------------------------------------
# mine-conversations.sh
# ---------------------------------------------------------------------------


class TestMineConversationsSh:
    def test_extract_only_dry_run(self, run_script, tmp_wiki: Path) -> None:
        """mine-conversations.sh --extract-only --dry-run should complete without LLM."""
        result = run_script(
            "mine-conversations.sh", "--extract-only", "--dry-run", timeout=30
        )
        assert result.returncode == 0

    def test_rejects_unknown_flag(self, run_script) -> None:
        result = run_script("mine-conversations.sh", "--bogus-flag")
        assert result.returncode != 0


# ---------------------------------------------------------------------------
# Cross-platform sanity: scripts use portable bash syntax
# ---------------------------------------------------------------------------


class TestBashPortability:
    """Verify scripts don't use bashisms that break on macOS /bin/bash 3.2."""

    @pytest.mark.parametrize(
        "script",
        ["wiki-maintain.sh", "mine-conversations.sh", "wiki-sync.sh"],
    )
    def test_shebang_is_env_bash(self, script: str) -> None:
        """All shell scripts should use `#!/usr/bin/env bash` for portability."""
        path = Path(__file__).parent.parent / "scripts" / script
        first_line = path.read_text().splitlines()[0]
        assert first_line == "#!/usr/bin/env bash", (
            f"{script} has shebang {first_line!r}, expected #!/usr/bin/env bash"
        )

    @pytest.mark.parametrize(
        "script",
        ["wiki-maintain.sh", "mine-conversations.sh", "wiki-sync.sh"],
    )
    def test_uses_strict_mode(self, script: str) -> None:
        """All shell scripts should use `set -euo pipefail` for safe defaults."""
        path = Path(__file__).parent.parent / "scripts" / script
        text = path.read_text()
        assert "set -euo pipefail" in text, f"{script} missing strict mode"

    @pytest.mark.parametrize(
        "script",
        ["wiki-maintain.sh", "mine-conversations.sh", "wiki-sync.sh"],
    )
    def test_bash_syntax_check(self, script: str) -> None:
        """bash -n does a syntax-only parse and catches obvious errors."""
        path = Path(__file__).parent.parent / "scripts" / script
        result = subprocess.run(
            ["bash", "-n", str(path)],
            capture_output=True,
            text=True,
            timeout=10,
        )
        assert result.returncode == 0, f"{script} has bash syntax errors: {result.stderr}"


# ---------------------------------------------------------------------------
# Python script syntax check (smoke)
# ---------------------------------------------------------------------------


class TestPythonSyntax:
    @pytest.mark.parametrize(
        "script",
        [
            "wiki_lib.py",
            "wiki-harvest.py",
            "wiki-staging.py",
            "wiki-hygiene.py",
            "extract-sessions.py",
            "summarize-conversations.py",
            "update-conversation-index.py",
        ],
    )
    def test_py_compile(self, script: str) -> None:
        """py_compile catches syntax errors without executing the module."""
        import py_compile

        path = Path(__file__).parent.parent / "scripts" / script
        # py_compile.compile raises on error; success returns the .pyc path
        py_compile.compile(str(path), doraise=True)