Wintersalmon | Blog

GraphKnow pivot: from RAG-chat I never used to a wiki my LLM workflow needs

GraphKnow exists so the rest of the LLM-powered dev workflow has memory: I was burning task-logs and PR notes faster than I could re-read them, and the chat surface I built first turned out to be the wrong primitive. The pivot, in one sentence: same Mongo / MinIO / Qdrant / Neo4j stack, four new first-class entities (raw_sources, canonical_markdown, wiki_pages, wiki_log), Slice 0 deployed at https://graphknow.wintersalmon.com on 2026-04-22.

  • The Open WebUI clone phase shipped 215 unit tests, 0 failures, biome clean — pointed at the wrong product.
  • The new shape: raw inputs immutable, canonical markdown 1:1 derived, wiki pages compounding, wiki_log capped collection as audit trail.
  • Slice 0 acceptance: 27 Go unit tests pass; RETURN apoc.version() returns 5.26.23; bearer middleware returns 401 without token, 404 with token.
  • Lesson: build the data plane first, the product surface last. The data plane survived the pivot untouched. The chat surface didn’t.

The wiki is supporting infrastructure for the workflow, not a standalone product

The vehicle for the LLM dev-flow experiment is building games on a daily / weekly cadence. Each cycle drops task-logs, ADRs, blog drafts, and code reviews into the tree, and without a compounding store the next cycle starts cold. Obsidian got me partway, but I was hand-maintaining [[link]]s and resenting it. GraphKnow is the memory layer — same monorepo, same FluxCD pipeline as the games, different shape of artifact.

The Open WebUI clone phase was high-quality and pointed at the wrong product

Before the pivot I cloned Open WebUI’s chat surface into apps/graphknow-client/ over 11 slices: tree-based message history, SSE via fetch + AbortController, Zustand + localStorage, secure markdown (marked → DOMPurify → highlight.js → DOMPurify), unified resizable sidebar, mobile overlay. Final tally: 215 tests, 0 failures, biome clean.

The full duplication log is at docs/task-log/archive/2026/graphknow-openwebui-duplication.md. Reading it back is uncomfortable: the agents did what I asked and the reviewers caught real CRITICAL bugs, but none of it mattered, because the PM (me) had pointed the team at messages when the primitive I cared about was pages.

The new shape: four first-class entities, not “documents”

raw_sources         // immutable user uploads (PDF, MD, CSV, DOCX, TXT)
canonical_markdown  // 1:1 derived plain MD, normalized + diffable
wiki_pages          // LLM-generated wiki entries — the output product
wiki_log            // append-only audit trail, capped collection
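
To make "append-only, capped" concrete, here is a minimal in-memory sketch of the semantics I want from wiki_log: entries are only ever appended, and once the cap is hit the oldest ones fall off in insertion order. This is not the MongoDB API. CappedLog, LogEntry, and the entry-count cap are hypothetical names and choices for illustration; a real capped collection is bounded by size in bytes, with an optional document-count limit.

```go
package main

import "fmt"

// LogEntry is a hypothetical shape for one wiki_log record.
type LogEntry struct {
	Seq   int    // monotonically increasing sequence number
	Event string // free-form description of what happened
}

// CappedLog keeps at most max entries in insertion order.
// Appending past the cap silently drops the oldest entries.
type CappedLog struct {
	max     int
	nextSeq int
	entries []LogEntry
}

func NewCappedLog(max int) *CappedLog { return &CappedLog{max: max} }

// Append records an event and evicts the oldest entries if the cap is exceeded.
func (l *CappedLog) Append(event string) LogEntry {
	e := LogEntry{Seq: l.nextSeq, Event: event}
	l.nextSeq++
	l.entries = append(l.entries, e)
	if len(l.entries) > l.max {
		l.entries = l.entries[len(l.entries)-l.max:]
	}
	return e
}

// Entries returns a copy so callers cannot rewrite history.
func (l *CappedLog) Entries() []LogEntry {
	return append([]LogEntry(nil), l.entries...)
}

func main() {
	log := NewCappedLog(2)
	log.Append("source uploaded")
	log.Append("canonical markdown written")
	log.Append("wiki page generated")
	for _, e := range log.Entries() {
		fmt.Println(e.Seq, e.Event)
	}
}
```

With a cap of 2, the run above keeps only entries 1 and 2; entry 0 has been evicted.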

A wiki_pages document carries slug, title, markdownText, tags[], derivedFromSources[], outboundLinks[], and a denormalized inboundLinks[] that is recomputed on every Stage B run. Reads dominate and there is exactly one writer process, so denormalizing is the cheap choice.
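
A sketch of what "recompute inboundLinks" could look like, assuming the whole page set fits in memory. The field names come from the list above; the Go types, the RecomputeInboundLinks name, and the decision to ignore self-links and links to unknown slugs are my assumptions, not the actual Stage B code.

```go
package main

import (
	"fmt"
	"sort"
)

// WikiPage mirrors the fields named in the post; the Go types are assumptions.
type WikiPage struct {
	Slug               string
	Title              string
	MarkdownText       string
	Tags               []string
	DerivedFromSources []string
	OutboundLinks      []string
	InboundLinks       []string // denormalized, rebuilt from OutboundLinks
}

// RecomputeInboundLinks rebuilds InboundLinks for every page from the
// OutboundLinks of all pages. Self-links and links to unknown slugs are
// ignored, and duplicates collapse to a single entry.
func RecomputeInboundLinks(pages map[string]*WikiPage) {
	inbound := make(map[string]map[string]struct{}, len(pages))
	for slug, p := range pages {
		for _, target := range p.OutboundLinks {
			if _, ok := pages[target]; !ok || target == slug {
				continue
			}
			if inbound[target] == nil {
				inbound[target] = map[string]struct{}{}
			}
			inbound[target][slug] = struct{}{}
		}
	}
	for slug, p := range pages {
		links := make([]string, 0, len(inbound[slug]))
		for from := range inbound[slug] {
			links = append(links, from)
		}
		sort.Strings(links) // stable order keeps stored documents diffable
		p.InboundLinks = links
	}
}

func main() {
	pages := map[string]*WikiPage{
		"flux":   {Slug: "flux", OutboundLinks: []string{"gitops"}},
		"argo":   {Slug: "argo", OutboundLinks: []string{"gitops", "missing"}},
		"gitops": {Slug: "gitops"},
	}
	RecomputeInboundLinks(pages)
	fmt.Println(pages["gitops"].InboundLinks) // [argo flux]
}
```

Because the function overwrites InboundLinks from scratch, running it twice gives the same result, which is what lets a single writer rerun it on every pass without drift.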

Same data plane, repurposed in place

The four stores in the graphknow namespace did not move; their meaning did. Ollama is listed with them because its job changed in the same way:

| Store | Old role | New role |
|---|---|---|
| MongoDB (gk-rs0) | conversation + document metadata | source/page metadata + wiki_log capped collection |
| MinIO bucket graphknow | uploaded documents | raw bytes + canonical/{sourceId}.md |
| Qdrant | document chunk embeddings | wiki_pages collection, 768-dim cosine, nomic-embed-text |
| Neo4j + APOC | doc-to-topic graph | (WikiPage)-[:LINKS_TO]->(WikiPage), :TAGGED, :DERIVED_FROM |
| Ollama at 192.168.10.3:11434 | RAG generation | wikify (gemma4:26b) + embed (nomic-embed-text), called directly, not via the cluster llm-gateway |

Three pipeline stages in apps/graphknow/internal/pipeline/, each idempotent and resumable: A (raw → canonical via per-format adapters + LLM cleanup), B (canonical → wiki page via gemma4:26b), C (chunk → embed → upsert Qdrant + Neo4j).
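
One way to get "idempotent and resumable" is a ledger keyed by stage and source that remembers the hash of the input it last processed. The sketch below shows that idea only. Stage, Ledger, RunOnce, RunPipeline, and countingStage are hypothetical names, the ledger is an in-memory map rather than a Mongo collection, and the real stages in internal/pipeline/ may work differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// Stage is a hypothetical interface for one pipeline step (A, B, or C).
type Stage interface {
	Name() string
	Run(input string) (string, error)
}

type record struct{ inputHash, output string }

// Ledger remembers, per stage and source, the last input hash and its output.
type Ledger map[string]record

func hashOf(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// RunOnce skips the stage when the same input was already processed
// (idempotent) and leaves the ledger untouched on failure (resumable).
func RunOnce(l Ledger, st Stage, sourceID, input string) (string, bool, error) {
	k, h := st.Name()+"/"+sourceID, hashOf(input)
	if rec, ok := l[k]; ok && rec.inputHash == h {
		return rec.output, false, nil
	}
	out, err := st.Run(input)
	if err != nil {
		return "", false, err
	}
	l[k] = record{h, out}
	return out, true, nil
}

// RunPipeline feeds each stage's output into the next and reports how many
// stages actually ran.
func RunPipeline(l Ledger, stages []Stage, sourceID, input string) (string, int, error) {
	ran, cur := 0, input
	for _, st := range stages {
		out, didRun, err := RunOnce(l, st, sourceID, cur)
		if err != nil {
			return "", ran, fmt.Errorf("stage %s: %w", st.Name(), err)
		}
		if didRun {
			ran++
		}
		cur = out
	}
	return cur, ran, nil
}

// countingStage is a toy stage that can be told to fail exactly once.
type countingStage struct {
	name     string
	calls    int
	failOnce bool
}

func (s *countingStage) Name() string { return s.name }
func (s *countingStage) Run(in string) (string, error) {
	s.calls++
	if s.failOnce {
		s.failOnce = false
		return "", errors.New("transient failure")
	}
	return s.name + "(" + in + ")", nil
}

func main() {
	stages := []Stage{&countingStage{name: "A"}, &countingStage{name: "B", failOnce: true}}
	ledger := Ledger{}
	for i := 0; i < 3; i++ {
		out, ran, err := RunPipeline(ledger, stages, "src-1", "raw")
		fmt.Printf("out=%q ran=%d err=%v\n", out, ran, err)
	}
}
```

The first pass runs A and fails in B, the second pass skips A and finishes B, and the third pass runs nothing. Hashing the input rather than storing a bare "done" flag means an edited source reruns the stage automatically.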

Slice 0 proved the bones before any pipeline code

The point of Slice 0 was to deploy the new shape before writing pipeline logic: an empty 4-tab shell behind auth, all four stores live, and both Go binaries (service + worker) running. Auth was the messy bit. The original LoginPage predated Cloudflare Access and showed a manual bearer-token entry form that the user had no way to fill in. The fix was a silent token-exchange endpoint outside the bearer middleware:

// POST /api/auth/token — outside the bearer-protected group.
// Trusts Cf-Access-Jwt-Assertion presence; service is only reachable
// through the CF Access tunnel.
if !h.devBypass && r.Header.Get("Cf-Access-Jwt-Assertion") == "" {
    respondError(w, 401, "no_cf_assertion")
    return
}
respondJSON(w, 200, map[string]string{"token": h.bearerToken})
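
For context, here is a self-contained sketch of how that excerpt could sit next to the bearer middleware, exercised with net/http/httptest. The handler body follows the excerpt; authHandler, requireBearer, newMux, status, the POST-only check, and the 404 fallback standing in for the empty shell are assumptions of mine, not the service's actual wiring. Like the excerpt, it only checks that the Cloudflare header is present and does not validate the JWT.

```go
package main

import (
	"crypto/subtle"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type authHandler struct {
	devBypass   bool
	bearerToken string
}

func respondJSON(w http.ResponseWriter, status int, body any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(body)
}

func respondError(w http.ResponseWriter, status int, code string) {
	respondJSON(w, status, map[string]string{"error": code})
}

// token hands out the bearer token when the CF Access header is present.
func (h *authHandler) token(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost { // assumption: the endpoint is POST-only
		respondError(w, http.StatusMethodNotAllowed, "method_not_allowed")
		return
	}
	if !h.devBypass && r.Header.Get("Cf-Access-Jwt-Assertion") == "" {
		respondError(w, http.StatusUnauthorized, "no_cf_assertion")
		return
	}
	respondJSON(w, http.StatusOK, map[string]string{"token": h.bearerToken})
}

// requireBearer is a hypothetical middleware guarding everything else.
func requireBearer(token string, next http.Handler) http.Handler {
	want := []byte("Bearer " + token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("Authorization"))
		if subtle.ConstantTimeCompare(got, want) != 1 {
			respondError(w, http.StatusUnauthorized, "unauthorized")
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newMux registers the token exchange outside the protected group.
func newMux(h *authHandler) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/auth/token", h.token)
	mux.Handle("/api/", requireBearer(h.bearerToken, http.NotFoundHandler()))
	return mux
}

// status sends one in-memory request and returns the response code.
func status(h http.Handler, method, path string, hdr map[string]string) int {
	req := httptest.NewRequest(method, path, nil)
	for k, v := range hdr {
		req.Header.Set(k, v)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newMux(&authHandler{bearerToken: "s3cret"})
	cf := map[string]string{"Cf-Access-Jwt-Assertion": "present"}
	bearer := map[string]string{"Authorization": "Bearer s3cret"}
	fmt.Println(status(mux, "POST", "/api/auth/token", nil)) // 401
	fmt.Println(status(mux, "POST", "/api/auth/token", cf))  // 200
	fmt.Println(status(mux, "GET", "/api/pages", nil))       // 401
	fmt.Println(status(mux, "GET", "/api/pages", bearer))    // 404
}
```

The last two lines reproduce the Slice 0 acceptance check: 401 without a token, 404 with one, because the shell has no real routes behind the middleware yet.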

LoginPage became a “Connecting…” status screen with four states (connecting, connected, denied, error) and ProtectedRoute does the same on deep-link visits — including a cancelled flag to kill state updates after unmount, which code review caught and I never would have.

The other Slice 0 lesson was Flux. Work package WP-1 deleted the legacy myapps/graphknow-* manifests on the refactor/graphknow-personal-wiki branch. The cluster then spent ~24 hours rebuilding the old deployments from Flux’s inventory, because Flux only reads main and main still contained them. The fix was PR #384: merge to main so Flux could see the new world and prune. Merging the branch is the way to clean Flux state; running kubectl delete on Flux-owned resources is a busy-loop, since Flux recreates them on the next reconcile.

What this changes going forward

Pivots are cheap when the data plane is generic and expensive when the surface is opinionated. The Mongo / MinIO / Qdrant / Neo4j / Ollama stack survived untouched. The chat clone didn’t. Future-me checklist: data plane first, product surface last, never the other way around. The wiki is now the persistence layer the rest of the experiment writes into. #graphknow #pivot

AI workflow note

The chat-clone phase used a coordinated multi-agent team (researcher, tech lead, frontend engineers, code/security reviewers) and produced 215 polished tests around the wrong primitive. For the pivot itself I went the other way — the architect agent walked through the data-plane-vs-surface separation explicitly, and I forced myself to write 01-refactor-plan.md (vision, target UX, architecture, data model, ingest pipeline, build slices, access control, open questions) before any code. Letting agents go straight to implementation is what gets you 215 tests around messages when you wanted pages. Writing the plan first is what catches it. The doc-updater agent has been wired up as a recorder, appending entries to 99-progress.md after every meaningful event — that habit is what made this rewrite possible.


Hungjoon

I'm Hungjoon, a software engineer based in South Korea. This is my long-form notebook — homelab, Kubernetes, AI infra, and whatever else keeps me up at night.