KB Build Note

This directory is a wiki-style synthesis layer over the canonical by-photo corpus. It is not a replacement for the corpus.

The corpus lives in data/processed/markdown/by-photo/ and consists of 447 direct-transcription Markdown files, one per source photo. The KB exists to make those 447 pages discoverable, navigable, and cross-linked without summarising away their detail.

What This KB Adds

| Layer | Path | Role |
| --- | --- | --- |
| Index / sidebar | index.md, _sidebar.md | Landing page and navigation. |
| Maps | maps/ | Whole-corpus indexes (document-outline, nav-path-index, evidence-map, pilot-source-coverage) and topic-specific planning maps (alpha-synuclein-source-boundary). |
| Sections | sections/ | Topic-cluster aggregator pages. Each section lists every source assigned to it, grouped by nav_path root. |
| Source catalog | maps/source-catalog.md | Flat catalog of all 447 source notes in chronological capture order. |
| Source notes | sources/<stem>.md | One stub per by-photo file, carrying provenance fields (page label, nav path, headings, uncertain spans). |
| Topics | topics/ | Narrative synthesis pages. GBA-PD pilot range: gba-pd, gba-therapeutics, biomarkers. Whole-section / cross-section syntheses: parkin, inflammation, mitochondria, biomarkers-outcomes, pet-imaging, alpha-synuclein (Tier 1; see boundary map for Tier 2 / Tier 3 delegation), therapeutic-programs (program-routing map). |
| Per-nav indexes | topics/by-nav/ | 154 generated indexes, one per first-level nav_path value. Complements sections/: section pages aggregate by topic cluster, by-nav pages aggregate by exact Word heading. |
| Entities | entities/compounds/, entities/programs/ | Per-entity pages. Compounds: eliglustat, ambroxol, venglustat. Programs: pr001, parkn-gt (PFR-4249-100), nlrp3-inhibitor (Marianthi, PFR-4231-100). |
| Templates | templates/ | Boilerplate for new topic pages. |
| Log | log.md | KB build log. |
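
To illustrate how the per-nav indexes relate to the source notes, the grouping by first-level nav_path value can be sketched in a few lines of Python. This is a minimal sketch, not the generator that actually built topics/by-nav/. The " > " separator, the function names, and the example stems are all assumptions made for illustration; the real nav_path encoding may differ.

```python
from collections import defaultdict


def first_level(nav_path: str, sep: str = " > ") -> str:
    """Return the first segment of a nav_path (the separator is an assumption)."""
    return nav_path.split(sep, 1)[0].strip()


def group_by_first_level(sources: dict[str, str], sep: str = " > ") -> dict[str, list[str]]:
    """Map each first-level nav_path value to the sorted stems filed under it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for stem, nav_path in sources.items():
        groups[first_level(nav_path, sep)].append(stem)
    # Sort both the index keys and the stems so the output is deterministic.
    return {root: sorted(stems) for root, stems in sorted(groups.items())}


# Hypothetical stems and nav_path strings, for illustration only.
example = {
    "photo-b": "Pipeline of PD > Overview",
    "photo-a": "Pipeline of PD",
    "photo-c": "PET > tracer",
}
index = group_by_first_level(example)
```

Each key of the resulting mapping would correspond to one by-nav index page, and each stem under it to one linked source note.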

Build Rules

  1. Canonical content stays in by-photo Markdown. KB pages link back; they do not copy table content verbatim and they do not paraphrase it. When a table or figure matters, link the by-photo file and let the reader open it.
  2. No new operator narration. Section pages, source notes, and the catalog only state what is observable from front matter (page_label, nav_path, source_headings, related_photos, quality_metrics) or the by-photo body’s own headings. They never describe the photo, the work, or the worker.
  3. Provenance is required. Every section row, every source-note entry, and every catalog row links to a by-photo file via data/processed/markdown/by-photo/<stem>.md.
  4. No raw photo staging. data/raw/photos/ is gitignored. KB pages may reference the raw-photo path as text metadata, but never copy the .jpg into the asset folder or commit it.
  5. No image re-embedding. KB pages do not re-embed figure assets; the embedded-image policy lives at the by-photo level (docs/decisions/2026-04-29-body-purity-and-figure-only-embeds.md).
  6. Section assignment is heuristic, not authoritative. Sources are bucketed into a single primary section based on their nav_path root and selected keywords. The full nav_path is preserved in the source note and in maps/nav-path-index.md so a reader can disambiguate.
  7. Synthesis is opt-in. topics/ and entities/ pages contain narrative synthesis only when a human, or a future review pass, has actually read across the sources. New synthesis follows the template at templates/topic.md and keeps source links next to claims.
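
Rule 3 lends itself to a mechanical check. The following is a minimal sketch, assuming each row is available as one line of text and that links use the data/processed/markdown/by-photo/<stem>.md path literally. The function names and the example stems are hypothetical, and nothing here is part of the existing build tooling.

```python
import re

# Matches the by-photo link target required by rule 3 and captures <stem>.
BY_PHOTO_LINK = re.compile(r"data/processed/markdown/by-photo/([^\s)\]>#]+)\.md")


def rows_without_provenance(rows: list[str]) -> list[int]:
    """Return the 0-based indexes of rows that carry no by-photo link."""
    return [i for i, row in enumerate(rows) if not BY_PHOTO_LINK.search(row)]


def dangling_stems(rows: list[str], known_stems: set[str]) -> list[str]:
    """Return linked stems that do not correspond to a known by-photo file."""
    linked = {stem for row in rows for stem in BY_PHOTO_LINK.findall(row)}
    return sorted(linked - known_stems)


# Hypothetical rows, for illustration only.
rows = [
    "- [p1](data/processed/markdown/by-photo/IMG_0001.md)",
    "- a row with no link",
    "- [p3](../data/processed/markdown/by-photo/IMG_0003.md)",
]
missing = rows_without_provenance(rows)
dangling = dangling_stems(rows, known_stems={"IMG_0001"})
```

In practice known_stems would be derived from the file names present in the by-photo directory; it is passed in here so the sketch stays free of filesystem access.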

Section Catalogue

Eighteen sections cover all 447 sources. The mapping is in _sidebar.md. Boundaries follow the document’s own nav_path clusters rather than an externally imposed taxonomy:

  • gba-pd-asyn (198) - the dominant Pipeline of GD & GBA-PD arc and its α-synuclein supplement (animal models, antibodies, postmortem, propagation, biobanks/CEI).
  • parkin (46) - Parkin protein / PD / pS65-Ub, PARKN GT (PFR-4249-100), GAPFREE3, PINK-1.
  • inflammation (42) - Pipeline of Inflammation, NLRP3, pyroptosis, CAPS, Complement / C5aR1, Havrda, In Vivo strategy (Katy), 4 LPS.
  • biomarkers-outcomes (27) - [BIOMARKER] validation/qualification, clinical scales (UPDRS, MoCA, H&Y, SCOPA-AUT, RBD/RBDQ), NFL, SILK, retina, synaptic change.
  • mitochondria (19) - mtDNA, mitophagy, 31P MRS, MC1 PET, structure / Complex I / MAM, MEG / metabolomics / MIBG, assessment summary.
  • molecular-biology (18) - [MOLECULAR BIOLOGY] / [Protein], proteomics, transcriptome, omics, assays of protein.
  • pet-imaging (16) - PET / tracer / DATscan / neuromelanin / 7T MRI / VMAT-2, immunoPET, PET for astrocyte.
  • operations (14) - FY budgets, KPI-linked projects, milestones, workflow, sharefolder organisation, reactome safety, phospholipidosis.
  • pk-gt-pharmacology (11) - [PK] / [PHARMACOLOGY] / [GT], AAV / capsid / promoter, ICM / route of administration, life-cycle.
  • clinical-pd (9) - diagnosis of PD, prodromal PD, psychosis, dyskinesia, Pipeline of PD overview.
  • samples-collaborations (9) - MJF / brain banks / biobanks, NDU, P2P, Burton/Greenamyre/Pittsburgh labs, shipment, secondment.
  • genetics-pathway (9) - pathogenicity of variant, GWAS, PRS, eQTL, pathway analysis, genetic testing.
  • msa (9) - Diagnosis / outcome measures (UMSARS), aSyn in MSA, pathology, Pipeline MSA.
  • other-mechanisms (7) - 기타 MOA들 ("Other MOAs"; TREM2, TAU, TDP43, TMEM, σ1R, TRAP1, UPS, PGRN).
  • lysosome-autophagy (5) - lysosomal enzymes, macro / micro / CMA, TRPML1, Niemann-Pick / NPC.
  • lrrk2 (4) - DNL201 / DNL151, other LRRK2 pipeline.
  • cgas-cgamp (2) - cGAS / cGAMP / AGS / senescence.
  • microglia-imaging (2) - microglial imaging / TSPO.
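
Because each source is bucketed into a single primary section, the per-section counts above should add up to exactly 447 across eighteen sections. A minimal sketch of that sanity check follows; the counts are copied from the list above, while the function name and the idea of running it as a check are assumptions, not part of the existing build.

```python
# Per-section source counts, copied from the Section Catalogue list.
SECTION_COUNTS = {
    "gba-pd-asyn": 198, "parkin": 46, "inflammation": 42,
    "biomarkers-outcomes": 27, "mitochondria": 19, "molecular-biology": 18,
    "pet-imaging": 16, "operations": 14, "pk-gt-pharmacology": 11,
    "clinical-pd": 9, "samples-collaborations": 9, "genetics-pathway": 9,
    "msa": 9, "other-mechanisms": 7, "lysosome-autophagy": 5,
    "lrrk2": 4, "cgas-cgamp": 2, "microglia-imaging": 2,
}


def catalogue_problems(counts: dict[str, int],
                       expected_sections: int = 18,
                       expected_sources: int = 447) -> list[str]:
    """Return human-readable problems; an empty list means the tallies agree."""
    problems = []
    if len(counts) != expected_sections:
        problems.append(f"expected {expected_sections} sections, found {len(counts)}")
    total = sum(counts.values())
    if total != expected_sources:
        problems.append(f"expected {expected_sources} sources, counted {total}")
    return problems


problems = catalogue_problems(SECTION_COUNTS)
```

With the counts as listed, the check reports no problems: the eighteen figures sum to 447.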

Maintenance Rules

  • When the canonical by-photo file is edited, the source note in sources/<stem>.md should be checked for nav_path / heading drift.
  • When new section or topic pages are added, update _sidebar.md and index.md.
  • Do not add evidence_images_not_embedded paths or helper crops here; the corpus-only baseline rule is in docs/decisions/2026-05-01-audit-status.md.
  • Do not add operator narration to KB pages either. The body-purity rule applies upstream (in by-photo Markdown), but the KB inherits the same spirit: report observable provenance, don’t editorialise.
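
The drift check in the first maintenance rule can be sketched as a field-by-field comparison. This is a minimal sketch under stated assumptions: both the by-photo front matter and the source note are taken to be already parsed into dictionaries, and the source note is assumed to reuse the front-matter keys page_label, nav_path, and source_headings. The function name and example values are hypothetical.

```python
# Provenance fields compared for drift; reusing these keys in the source
# note is an assumption, not something the KB layout guarantees.
PROVENANCE_FIELDS = ("page_label", "nav_path", "source_headings")


def drifted_fields(by_photo_meta: dict, source_note_meta: dict,
                   fields: tuple[str, ...] = PROVENANCE_FIELDS) -> list[str]:
    """Return the fields whose values differ between the two records."""
    return [
        name for name in fields
        if by_photo_meta.get(name) != source_note_meta.get(name)
    ]


# Hypothetical records, for illustration only.
by_photo = {"page_label": "12", "nav_path": "A > B", "source_headings": ["H1"]}
stale_note = {"page_label": "12", "nav_path": "A", "source_headings": ["H1"]}
drift = drifted_fields(by_photo, stale_note)
```

A non-empty result would flag the source note for a manual look; the sketch deliberately does not rewrite anything, since the by-photo file remains canonical.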

References

  • Workflow: docs/workflow/direct-transcription.md
  • Body-purity decision: docs/decisions/2026-04-29-body-purity-and-figure-only-embeds.md
  • Audit status / corpus-only baseline (also defines the Uncertain Spans retention policy that KB pages preserve as review targets): docs/decisions/2026-05-01-audit-status.md
  • KB wiki v1 status / audit note (scope, completed inventory, verification snapshot, remaining follow-up): docs/decisions/2026-05-03-kb-wiki-v1-status.md
  • Repo guide: AGENTS.md