Audit dashboard
This dashboard is auto-regenerated on every build from the wiki’s audit harness. It is the single transparency surface for reading “what is the current Cochrane-level defensibility of the wiki, by dimension.” External reviewers, regulatory affairs, and brand legal teams auditing the wiki should land here.
Generated: 2026-05-17. Baseline snapshot: 2026-05-17-baseline.
Scorecard
| Dimension | Score | Status |
|---|---|---|
| Taxonomy v2.0 coverage (destination pages exist for HMTc rows) | 277 / 277 (100%) | PASS |
| Routing audit (unresolved source routings) | 0 unresolved | PASS |
| Routing rows generated | 661 | — |
| Ingredient mandatory sections (non-stub pages) | 237 / 237 (100%) | PASS |
| Product mandatory sections (non-stub pages) | 84 / 85 (99%) | WARN |
| Routing discipline check 1 (sources with product-shaped matrices but empty products[]) | 1 failure | WARN |
| Routing discipline check 2 (HMTc-locked products with zero routed sources) | 0 of 290 | INFO |
| Source page DOI completeness | 0 missing | PASS |
| Claim provenance — body-prose numerical claims with traceable citation | 3845/3845 (100%) | PASS |
Taxonomy v2.0 coverage
Total subcategories defined by HMTc Taxonomy v2.0: 277. Pages scaffolded or content-filled: 277 (100%). Missing: 0.
- Exact frontmatter declaration (hmtc_category + hmtc_row): 260
- Slug-alias match (frontmatter needs backfill): 17
- Missing destination pages: 0
Detail report: taxonomy-coverage-audit.md. Full PRISMA-equivalent: coverage.
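The three coverage buckets above can be illustrated with a minimal sketch. It assumes each taxonomy row carries a `category`, a `row`, and a list of `slugAliases`, and that each page exposes its `slug` plus any `hmtc_category` / `hmtc_row` frontmatter. The function name and the row shape are hypothetical and are not taken from tools/taxonomy/audit-coverage.mjs.

```javascript
// Hypothetical sketch: classify each taxonomy row as exact, alias, or missing.
// The shapes of `taxonomyRows` and `pages` are assumptions for illustration.
function classifyCoverage(taxonomyRows, pages) {
  const counts = { exact: 0, alias: 0, missing: 0 };
  for (const row of taxonomyRows) {
    // Exact match: the page declares both frontmatter fields for this row.
    const exact = pages.some(
      (p) => p.hmtc_category === row.category && p.hmtc_row === row.row
    );
    // Alias match: a page exists under a known slug, but its frontmatter needs backfill.
    const alias =
      !exact && pages.some((p) => (row.slugAliases ?? []).includes(p.slug));
    if (exact) counts.exact += 1;
    else if (alias) counts.alias += 1;
    else counts.missing += 1;
  }
  return counts;
}

const coverageDemo = classifyCoverage(
  [
    { category: "1", row: "1.1", slugAliases: ["rice-cereal"] },
    { category: "1", row: "1.2", slugAliases: ["oat-cereal"] },
    { category: "2", row: "2.1", slugAliases: ["fruit-juice"] },
  ],
  [
    { slug: "rice-cereal", hmtc_category: "1", hmtc_row: "1.1" },
    { slug: "oat-cereal" },
  ]
);
console.log(coverageDemo);
```

Counting alias matches separately keeps the backfill queue visible without reporting those rows as missing destinations.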
Routing audit
Routing layer (CLAUDE.md Part 5b) fans source-page declarations of products: [<slug>] to product-category pages. 661 routing rows in current audit.
- Unresolved targets (source frontmatter declares a slug that doesn’t exist): 0
- Malformed sources (missing required frontmatter fields): 568 (non-blocking)
Detail: product_source_routing_audit.csv, routing_unresolved.csv, routing_malformed.csv.
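As an illustration of the fan-out and the unresolved-target check, here is a minimal sketch. It assumes each source has already been parsed into an object with a `slug` and a `products` array, and that the set of existing product-page slugs is known. `buildRoutingRows` and the object shapes are hypothetical, not the implementation in tools/evidence/build-routing-audit.mjs.

```javascript
// Hypothetical sketch: fan out each source's products[] into routing rows
// and flag rows whose target slug has no product page.
function buildRoutingRows(sources, productSlugs) {
  const known = new Set(productSlugs);
  const rows = [];
  const unresolved = [];
  for (const source of sources) {
    for (const target of source.products ?? []) {
      const row = { source: source.slug, target };
      rows.push(row);
      if (!known.has(target)) unresolved.push(row);
    }
  }
  return { rows, unresolved };
}

const routingDemo = buildRoutingRows(
  [
    { slug: "source-a", products: ["rice-cereal", "infant-formula"] },
    { slug: "source-b", products: ["missing-page"] },
    { slug: "source-c", products: [] },
  ],
  ["rice-cereal", "infant-formula"]
);
console.log(routingDemo.rows.length, routingDemo.unresolved.length);
```

One source can produce several routing rows, which is why the row count can differ from the number of source pages.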
Routing discipline
Check 1 — sources with product-shaped matrices: slugs but empty products: [] and no pending_category_lock flag. These are sources that should route to a product page but the frontmatter doesn’t declare it.
- Sources scanned: 920
- Sources with product-shaped matrices: 358
- Failures: 1
Check 2 — HMTc-locked product pages (carrying hmtc_category + hmtc_row) with zero routed sources. These are scaffolded destinations that haven’t received their Phase 1 ingest yet.
- HMTc-locked products: 290
- Products with no routed sources: 0
Check 2 is informational, not a failure: the scaffolded universe (277 taxonomy rows) is larger than the current ingested universe (897 source pages); rows without sources are the priority queue for the next ingest pass, not defects.
Detail: routing-discipline-audit.csv.
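The two checks can be sketched as simple filters. This is an illustration under stated assumptions: the frontmatter is already parsed into objects, `isProductShaped` is a hypothetical predicate (the dashboard does not define how a matrix slug is judged product-shaped), and routing rows have a `target` field. None of these names are confirmed to match tools/audit-routing-discipline.mjs.

```javascript
// Check 1 (sketch): product-shaped matrices, empty products[], no pending_category_lock.
// `isProductShaped` is a hypothetical predicate supplied by the caller.
function check1Failures(sources, isProductShaped) {
  return sources.filter(
    (s) =>
      (s.matrices ?? []).some(isProductShaped) &&
      (s.products ?? []).length === 0 &&
      s.pending_category_lock !== true
  );
}

// Check 2 (sketch): HMTc-locked product pages that no routing row targets.
function check2Unrouted(products, routingRows) {
  const routed = new Set(routingRows.map((r) => r.target));
  return products.filter(
    (p) => p.hmtc_category != null && p.hmtc_row != null && !routed.has(p.slug)
  );
}

const productLike = new Set(["rice-cereal"]); // assumed lookup, for the demo only
const failures = check1Failures(
  [
    { slug: "s1", matrices: ["rice-cereal"], products: [] },
    { slug: "s2", matrices: ["rice-cereal"], products: [], pending_category_lock: true },
    { slug: "s3", matrices: ["soil"], products: [] },
    { slug: "s4", matrices: ["rice-cereal"], products: ["rice-cereal"] },
  ],
  (m) => productLike.has(m)
);
const unrouted = check2Unrouted(
  [
    { slug: "rice-cereal", hmtc_category: "1", hmtc_row: "1.1" },
    { slug: "juice", hmtc_category: "2", hmtc_row: "2.3" },
    { slug: "draft-page" },
  ],
  [{ source: "s4", target: "rice-cereal" }]
);
console.log(failures.map((s) => s.slug), unrouted.map((p) => p.slug));
```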
Ingredient mandatory sections
Non-stub ingredient pages must carry 8 H2 sections per CLAUDE.md Part 6 (Why this commodity accumulates / Heavy metal contamination profile / Ranges by source, region, and variety / Processing effects / Ingredient-derivative risk / Mitigation options / Regulatory limits that apply / Sources). Stubs (untouched pages with all-pending contamination_profile) are exempt.
- Non-stub passing: 237 of 237 (100%)
- Non-stub failing: 0
Detail: ingredient-mandatory-sections-audit.csv.
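A minimal sketch of the section check follows, assuming the page body is markdown and that a required section counts as present only when an H2 heading matches its title exactly. The function name and the exact-match rule are assumptions; the real matching in tools/audit-ingredient-mandatory-sections.mjs may be more lenient.

```javascript
// The eight required ingredient H2 titles, as listed in this section.
const INGREDIENT_H2 = [
  "Why this commodity accumulates",
  "Heavy metal contamination profile",
  "Ranges by source, region, and variety",
  "Processing effects",
  "Ingredient-derivative risk",
  "Mitigation options",
  "Regulatory limits that apply",
  "Sources",
];

// Hypothetical helper: return the required titles that have no matching "## " heading.
function missingSections(markdown, required) {
  const found = new Set(
    markdown
      .split("\n")
      .filter((line) => line.startsWith("## "))
      .map((line) => line.slice(3).trim())
  );
  return required.filter((title) => !found.has(title));
}

// Demo page that carries every required heading except "Sources".
const demoPage = INGREDIENT_H2.slice(0, 7)
  .map((title) => `## ${title}\nBody text.`)
  .join("\n");
const missing = missingSections(demoPage, INGREDIENT_H2);
console.log(missing);
```

The same helper would apply to product-category pages by passing the ten product H2 titles instead.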
Product mandatory sections
Non-stub product-category pages must carry 10 H2 sections per CLAUDE.md Part 6 (Who this page is for / Methodology / Literature Evidence Summary / Source Evidence Inventory / Broad Product Context / Federal/Regulatory Limits / Levers to reduce contamination / How standards math uses this page / Historical recalls and enforcement / Sources). Stubs (scaffolded pages with no routed sources) are exempt.
- Non-stub passing: 84 of 85 (99%)
- Non-stub failing: 1
- Stubs (exempt): 235
Detail: product-mandatory-sections-audit.csv.
Source page corpus
Total source pages in wiki/sources/: 920.
By evidence tier:
- A: 804
- B: 112
- C: 4
By source type:
- agency-report: 3
- book-chapter: 3
- conference-abstract: 1
- conference-proceedings: 1
- dataset: 1
- gov-data: 5
- gov-guidance: 3
- gov-regulation: 1
- gov-report: 53
- government-guidance: 1
- government-regulatory: 2
- government-report: 32
- government_dataset: 5
- government_documentation: 1
- government_report: 3
- industry: 1
- industry-application-note: 1
- news-article: 1
- news-commentary: 1
- ngo-report: 2
- nonprofit: 1
- peer-reviewed: 754
- regulation: 4
- regulatory: 2
- review: 31
- review-chapter: 1
- study: 1
- systematic-review: 1
- textbook-chapter: 3
- thesis: 1
DOI completeness: 920 of 920 sources have a DOI or a no_doi_assigned: true marker. Missing DOI metadata: 0.
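The completeness rule can be sketched as a single filter. The `no_doi_assigned` marker comes from the description above; the `doi` field name and the function name are assumptions for illustration.

```javascript
// Hypothetical sketch: a source passes if it has a non-empty DOI
// or explicitly declares no_doi_assigned: true.
function missingDoi(sources) {
  return sources.filter((s) => {
    const hasDoi = typeof s.doi === "string" && s.doi.trim() !== "";
    return !hasDoi && s.no_doi_assigned !== true;
  });
}

const doiDemo = missingDoi([
  { slug: "with-doi", doi: "10.1000/example" },
  { slug: "gov-report", no_doi_assigned: true },
  { slug: "blank-doi", doi: "  " },
  { slug: "no-fields" },
]);
console.log(doiDemo.map((s) => s.slug));
```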
Claim provenance (trace-every-claim audit v1)
Every numerical claim in body prose on ingredient and product pages — matches the regex <number> <unit> where unit is one of ppb, ppm, mg/kg, µg/kg, µg/g, mg/L, µg/L, % — must trace to a citation in the same H2 section (before or after the claim is acceptable; sections are treated as logical units of provenance). Valid citations: [[sources/...]], [[regulations/...]] (for regulatory limits), [[testing/...]] (for analytical-method floors).
- Total claims scanned: 3845
- Supported: 3845 (100%)
- Unsupported: 0
By page type:
- ingredient: 1997/1997 (100%)
- product: 1848/1848 (100%)
Cochrane target: >95% supported. Current baseline reports the gap to that target as a follow-up queue. The v1 audit does not yet verify that the cited source actually contains the claimed value (that requires PDF reading per claim) — only that a citation exists. v2 will add PDF-side verification.
Detail: claim-provenance-audit.csv (per-claim records with section context).
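The v1 rule can be illustrated with a short sketch. It is an approximation under stated assumptions: the regex allows optional whitespace between number and unit, sections are split at lines starting with `## `, and any of the three citation link types counts as support (the sketch does not check that `[[regulations/...]]` is used only for regulatory limits). The names `CLAIM_RE`, `CITATION_RE`, and `auditClaims` are hypothetical.

```javascript
// Number followed by one of the audited units (assumed pattern, not the harness regex).
const CLAIM_RE = /\d+(?:\.\d+)?\s*(?:ppb|ppm|mg\/kg|µg\/kg|µg\/g|mg\/L|µg\/L|%)/g;
// Any accepted citation link type.
const CITATION_RE = /\[\[(?:sources|regulations|testing)\/[^\]]+\]\]/;

function auditClaims(markdown) {
  // Each H2 section is one provenance unit; text before the first H2 is its own unit.
  const sections = markdown.split(/^## /m);
  const results = [];
  for (const section of sections) {
    const claims = section.match(CLAIM_RE) ?? [];
    // A citation anywhere in the section supports every claim in it.
    const supported = CITATION_RE.test(section);
    for (const claim of claims) results.push({ claim, supported });
  }
  return results;
}

const claimDemo = auditClaims(
  "## Profile\nArsenic at 12 ppb [[sources/example-study]].\n" +
    "## Ranges\nLead at 0.5 mg/kg with no citation.\n"
);
console.log(claimDemo);
```

Because support is decided per section, one citation covers every number in that section, which is exactly the limitation that PDF-side verification is meant to close.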
What this dashboard does not yet measure
The audits above are the mechanical, automatable dimensions of defensibility. The wiki’s Cochrane-level target also requires audits that this dashboard does not yet produce:
- PDF-side verification. Trace-every-claim audit v2 should follow each [[sources/...]] link to the cited source page, then to the raw PDF in raw/markdown/<FM_id>/ or raw/manual-fetch/..., and verify the claimed value appears in the source. Today’s v1 audit only verifies a citation exists; not that the citation is correct.
- Synthesis reproducibility audit. Pick 5 populated contamination_profile cells. Re-derive the values from contributing sources from scratch (without consulting the existing page). Compare. Target: 100% reconstructible. Not yet automated; requires deep reading of contributing sources.
- Inter-rater reliability. Re-ingest 20 source pages with a second extractor blind to the existing pages. Compare values + metadata. Compute κ. Target: κ > 0.7 extraction, > 0.8 inclusion. Not yet operationalized.
- External peer review. Three independent reviewers (toxicology academic, brand RA veteran, plaintiff’s-side class-action attorney) read the wiki and report defects. The commissioning brief is drafted at external-review-brief and is ready for Karen to send; reviewers not yet engaged.
These four audits are the Cochrane-level qualifying audits. They are documented here as gaps in the harness so the audit dashboard does not present a falsely complete picture.
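For the inter-rater reliability gap, the κ statistic could be computed along the following lines. The dashboard does not say which κ variant is intended; this sketch assumes Cohen's κ for two raters over categorical labels, and the function name and sample labels are hypothetical.

```javascript
// Cohen's kappa for two raters: (observed agreement - chance agreement) / (1 - chance agreement).
function cohensKappa(raterA, raterB) {
  if (raterA.length !== raterB.length || raterA.length === 0) {
    throw new Error("raters must label the same non-empty set of items");
  }
  const n = raterA.length;
  let agree = 0;
  const countsA = new Map();
  const countsB = new Map();
  for (let i = 0; i < n; i += 1) {
    if (raterA[i] === raterB[i]) agree += 1;
    countsA.set(raterA[i], (countsA.get(raterA[i]) ?? 0) + 1);
    countsB.set(raterB[i], (countsB.get(raterB[i]) ?? 0) + 1);
  }
  const observed = agree / n;
  // Chance agreement: sum over labels of the product of each rater's label frequency.
  let expected = 0;
  for (const [label, countA] of countsA) {
    expected += (countA / n) * ((countsB.get(label) ?? 0) / n);
  }
  if (expected === 1) return 1; // both raters used one identical label throughout
  return (observed - expected) / (1 - expected);
}

// Demo: inclusion decisions for four sources from two extractors.
const kappaDemo = cohensKappa(
  ["include", "include", "exclude", "exclude"],
  ["include", "include", "exclude", "include"]
);
console.log(kappaDemo);
```

In the demo, observed agreement is 0.75 and chance agreement is 0.5, so κ comes out at 0.5, below both targets stated above.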
Audit harness provenance
This dashboard is generated by tools/build-audit-dashboard.mjs on every prebuild. Inputs:
- data/evidence/taxonomy-coverage-audit.csv (output of tools/taxonomy/audit-coverage.mjs)
- data/evidence/product_source_routing_audit.csv, routing_unresolved.csv, routing_malformed.csv (output of tools/evidence/build-routing-audit.mjs)
- data/evidence/routing-discipline-summary.json (output of tools/audit-routing-discipline.mjs)
- Live re-run of tools/audit-ingredient-mandatory-sections.mjs and tools/audit-product-mandatory-sections.mjs
- Direct read of wiki/sources/*.md frontmatter for evidence tier + DOI accounting