Corpus coverage and methodology transparency

This page is the wiki’s PRISMA-equivalent: a single, auto-regenerated transparency surface showing how many papers were identified in the literature search, how many became source pages, how the HMTc Taxonomy v2.0 universe is covered, and where the gaps are. Brand legal teams, regulatory affairs leads, FDA reviewers, and plaintiff/defense experts auditing the wiki’s defensibility should land here.

The wiki is built against the locked HMTc Comprehensive Testing Category Taxonomy v2.0 (2026-03-30). Every product page, every routing decision, and every coverage count is reconciled to that taxonomy. The taxonomy itself is published at hmtc-v2.json for reference.

This page was regenerated on 2026-05-28 from the current state of the corpus.

The Cochrane-equivalent search-protocol publication — the 10 academic databases queried, the dedupe protocol, the scoring rubric for auto-fetch, and the inclusion / exclusion criteria — is published at search-strategy. Read it alongside this page for the complete defensibility picture.

Literature search flow (PRISMA-equivalent)

  Identification: 23,260 papers triaged from the literature search
         │
         │  Triage tiers (priority bands):
         │    P1: 42
         │    P2: 488
         │    P3: 18
         │    P4: 6,571
         │    P5: 16,141
         ▼
  Screening:      Papers triaged for ingest priority
         │
         │  Papers from year ≥ 2020: 14,343
         ▼
  Eligibility:    1,083 source pages currently in wiki/sources/
         │
         │    evidence_tier A: 875
         │    evidence_tier A-tier: 1
         │    evidence_tier B: 193
         │    evidence_tier C: 12
         │    evidence_tier unknown: 2
         ▼
  Included:       Source pages routed to destination pages
         │
         │  Routing rows in product_source_routing_audit.csv: 1,098
         ▼
  Synthesized:    Ingredient cells, product rows, metal pages, regulation pages

Of the 23,260 papers identified in the literature search, 1,083 have been promoted to source pages, or 4.7% of the universe. The remaining papers are tiered by priority (P1: HMT&C Path A candidates; P2: LOQ source candidates; P3: agency-affiliated; P4: high-evidence peer-reviewed, 2020+; P5: everything else) and are ingested in priority order per CLAUDE.md Part 11.
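The arithmetic behind these figures can be re-checked by hand. The following is a minimal sketch, not the wiki's build code: the counts are copied from the flow diagram above, and the helper names `sumCounts` and `percentOf` are illustrative assumptions.

```javascript
// Counts copied from the flow diagram on this page.
const triageTiers = { P1: 42, P2: 488, P3: 18, P4: 6571, P5: 16141 };
const evidenceTiers = { A: 875, "A-tier": 1, B: 193, C: 12, unknown: 2 };

// Sum the values of a { label: count } object (hypothetical helper).
function sumCounts(counts) {
  return Object.values(counts).reduce((total, n) => total + n, 0);
}

// Percentage of `part` in `whole`, rounded to one decimal place.
function percentOf(part, whole) {
  return Math.round((part / whole) * 1000) / 10;
}

const identified = sumCounts(triageTiers); // triage tiers should add up to the universe
const promoted = sumCounts(evidenceTiers); // evidence tiers should add up to the source pages
const promotionRate = percentOf(promoted, identified);

console.log(`${promoted} of ${identified} promoted (${promotionRate}%)`);
```

Under these inputs the five triage tiers sum to the identified total and the five evidence-tier buckets sum to the source-page count, so the flow diagram and the percentage quoted above agree with each other.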

This is a deliberately small fraction: depth over breadth. The defensibility argument is not ‘we ingested 23,000 papers,’ it is ‘every wiki claim traces to a peer-reviewed source page whose values match the source PDF on audit.’ The ingest pipeline is sized to maintain that property.

Wiki page counts by type

| Page type | Count |
| --- | --- |
| Source pages | 1,083 |
| Ingredient profiles | 264 |
| Product-category rows (total) | 352 |
| Product-category rows (non-stub, has scaffold or content) | 126 |
| Metal profiles | 36 |
| Regulation pages | 57 |
| Mitigation pages | 6 |
| Microbiome pages | 1 |
| Testing-method pages | 2 |

HMTc Taxonomy v2.0 coverage

Total subcategories defined by the taxonomy: 277. Pages scaffolded or content-filled against the taxonomy: 277 (100%).

Of the 277 matched: 260 carry exact hmtc_category + hmtc_row frontmatter (locked under Step 0 scaffolding); 17 are matched by slug-alias to legacy pages (frontmatter needs backfill). Pages still to be created: 0.

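| Cat | Name | Total | Exact | By alias | Missing | Coverage |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Infant and Child Foods (Ages 0-5) | 9 | 9 | 0 | 0 | 100% |
| 2 | Infant and Child Personal Care (Ages 0-5) | 7 | 7 | 0 | 0 | 100% |
| 3 | Grains, Cereals, and Rice Products | 11 | 9 | 2 | 0 | 100% |
| 4 | Fruits, Vegetables, and Produce | 9 | 9 | 0 | 0 | 100% |
| 5 | Beverages | 9 | 9 | 0 | 0 | 100% |
| 6 | Seafood | 6 | 5 | 1 | 0 | 100% |
| 7 | Oils, Condiments, and Specialty Foods | 12 | 6 | 6 | 0 | 100% |
| 8 | Water and Water-Based Products | 4 | 3 | 1 | 0 | 100% |
| 9 | Infant and Child Contact Products (Ages 0-5) | 8 | 6 | 2 | 0 | 100% |
| 10 | Infant and Child Durable Goods and Textiles (Ages 0-5) | 12 | 11 | 1 | 0 | 100% |
| 11 | Meat, Poultry, and Eggs | 8 | 7 | 1 | 0 | 100% |
| 12 | Household Cleaning and Dishwashing | 20 | 20 | 0 | 0 | 100% |
| 13 | Cosmetics and Personal Care — Leave-on | 15 | 14 | 1 | 0 | 100% |
| 14 | Cosmetics and Personal Care — Rinse-off | 11 | 11 | 0 | 0 | 100% |
| 15 | Feminine Care | 10 | 10 | 0 | 0 | 100% |
| 16 | Dietary Supplements (Human) | 21 | 21 | 0 | 0 | 100% |
| 17 | Pet Foods | 9 | 9 | 0 | 0 | 100% |
| 18 | Pet Supplements | 7 | 7 | 0 | 0 | 100% |
| 19 | Laundry and Fabric-Contact Home Products | 13 | 13 | 0 | 0 | 100% |
| 20 | Oral Care | 10 | 9 | 1 | 0 | 100% |
| 21 | Children’s Toys, Arts, and Crafts | 21 | 21 | 0 | 0 | 100% |
| 22 | Home Air and Inhalation-Adjacent Products | 15 | 15 | 0 | 0 | 100% |
| 23 | Food-Contact Consumer Goods and Kitchenware | 30 | 29 | 1 | 0 | 100% |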
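The per-category rows can be reconciled against the headline totals (277 subcategories, 260 exact, 17 by alias, 0 missing). This is an illustrative sketch only: the numbers are transcribed from the table above, and the array layout and variable names are assumptions rather than the format of taxonomy-coverage-audit.csv.

```javascript
// Rows transcribed from the table above: [category, total, exact, byAlias, missing].
const rows = [
  [1, 9, 9, 0, 0], [2, 7, 7, 0, 0], [3, 11, 9, 2, 0], [4, 9, 9, 0, 0],
  [5, 9, 9, 0, 0], [6, 6, 5, 1, 0], [7, 12, 6, 6, 0], [8, 4, 3, 1, 0],
  [9, 8, 6, 2, 0], [10, 12, 11, 1, 0], [11, 8, 7, 1, 0], [12, 20, 20, 0, 0],
  [13, 15, 14, 1, 0], [14, 11, 11, 0, 0], [15, 10, 10, 0, 0], [16, 21, 21, 0, 0],
  [17, 9, 9, 0, 0], [18, 7, 7, 0, 0], [19, 13, 13, 0, 0], [20, 10, 9, 1, 0],
  [21, 21, 21, 0, 0], [22, 15, 15, 0, 0], [23, 30, 29, 1, 0],
];

// Categories whose exact + alias + missing does not equal the row total.
const inconsistent = rows
  .filter(([, total, exact, byAlias, missing]) => exact + byAlias + missing !== total)
  .map(([category]) => category);

// Column totals across all categories.
const totals = rows.reduce(
  (acc, [, total, exact, byAlias, missing]) => ({
    total: acc.total + total,
    exact: acc.exact + exact,
    byAlias: acc.byAlias + byAlias,
    missing: acc.missing + missing,
  }),
  { total: 0, exact: 0, byAlias: 0, missing: 0 }
);

// Coverage counts both exact and alias matches as covered.
const coverage = ((totals.exact + totals.byAlias) / totals.total) * 100;

console.log(totals, `coverage ${coverage}%`, inconsistent);
```

With the rows as transcribed, the column totals match the headline figures and no category is internally inconsistent.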

Every taxonomy subcategory has a destination page on the wiki. Each source page declares its target rows in frontmatter as products: [<row-slug>], and every declared slug resolves to an existing destination page, so no source is dropped for lack of a destination.
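One way to picture the routing check is as a lookup that tries an exact slug match first and a legacy alias second. The sketch below is hypothetical throughout: the slugs, file names, alias map, and the `routeSource` function are invented for illustration and are not taken from the wiki's routing code.

```javascript
// Hypothetical destination index: canonical row slug -> page path.
const destinations = new Map([
  ["infant-formula", "wiki/products/infant-formula.md"],
  ["rice-cereal", "wiki/products/rice-cereal.md"],
]);

// Hypothetical legacy aliases: old slug -> canonical row slug.
const aliases = new Map([["baby-rice-cereal", "rice-cereal"]]);

// Resolve each declared slug to a destination page, recording how it matched.
function routeSource(declaredSlugs) {
  return declaredSlugs.map((slug) => {
    if (destinations.has(slug)) {
      return { slug, match: "exact", page: destinations.get(slug) };
    }
    const canonical = aliases.get(slug);
    if (canonical !== undefined && destinations.has(canonical)) {
      return { slug, match: "alias", page: destinations.get(canonical) };
    }
    return { slug, match: "missing", page: null };
  });
}

// A declaration with one exact hit, one alias hit, and one unknown slug.
const routed = routeSource(["infant-formula", "baby-rice-cereal", "unknown-row"]);
const dropped = routed.filter((r) => r.match === "missing").map((r) => r.slug);

console.log(routed, dropped);
```

In this toy input the third slug has no destination, which is the condition the coverage figures above report as never occurring in the current corpus (Missing = 0).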

Why this page matters

Cochrane systematic reviews open with a PRISMA flow diagram because the defensibility of every downstream claim depends on the reader understanding what was searched, what was screened, what was included, and on what grounds. The wiki’s defensibility argument (CLAUDE.md Part 1) rests on the same property: any hostile reader — a plaintiff’s expert, a regulatory reviewer, a competing standards body — must be able to see the universe the wiki is accountable to, not just the slice it has published.

Naming the gap is the work. The gap is not the failure mode; the failure mode is the gap being invisible.

Related pages

  • methodology — full methodology page (source selection, evidence tiers, extraction protocol, HMT&C firewall)
  • editorial-standards — editorial conventions, writing style, audience layering
  • overview — high-level orientation for newcomers
  • synthesis — current best synthesis of the corpus across metals and matrices

Provenance and reproducibility

This page is auto-regenerated by tools/build-coverage-page.mjs from on-disk data. Inputs:

  • raw/manifest/triage-manifest.csv — the triage universe (immutable record of the literature search)
  • wiki/sources/*.md — promoted source pages (frontmatter: evidence_tier, source_type)
  • wiki/ingredients/, wiki/products/, wiki/metals/, wiki/regulations/, wiki/mitigation/, wiki/microbiome/, wiki/testing/
  • data/taxonomy/hmtc-v2.json — locked HMTc Comprehensive Testing Category Taxonomy v2.0
  • data/evidence/taxonomy-coverage-audit.csv — output of tools/taxonomy/audit-coverage.mjs
  • data/evidence/product_source_routing_audit.csv — output of tools/evidence/build-routing-audit.mjs

No hand-maintained tallies. Every count derives from on-disk frontmatter or structured-evidence files. If a count looks wrong, the underlying data is what changed, not this page’s narrative.
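To make "every count derives from on-disk frontmatter" concrete, here is a minimal sketch of tallying evidence_tier values. It is not the code in tools/build-coverage-page.mjs: the function names, the sample page contents, the source_type values, and the simplified `key: value` parsing are all assumptions, and the sketch works on in-memory strings rather than reading wiki/sources/*.md.

```javascript
// Extract one scalar frontmatter field from a markdown page.
// Assumes a leading block delimited by `---` lines containing simple
// `key: value` pairs; a real build would likely use a full YAML parser.
function readFrontmatterField(markdown, key) {
  const block = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (block === null) return undefined;
  for (const line of block[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    if (line.slice(0, separator).trim() === key) {
      return line.slice(separator + 1).trim();
    }
  }
  return undefined;
}

// Tally pages by evidence_tier; pages without the field count as "unknown".
function tallyEvidenceTiers(pages) {
  const counts = {};
  for (const page of pages) {
    const tier = readFrontmatterField(page, "evidence_tier") ?? "unknown";
    counts[tier] = (counts[tier] ?? 0) + 1;
  }
  return counts;
}

// Hypothetical page contents standing in for files under wiki/sources/.
const samplePages = [
  "---\nevidence_tier: A\nsource_type: peer-reviewed\n---\nBody",
  "---\nevidence_tier: B\n---\nBody",
  "---\nsource_type: agency-report\n---\nBody",
  "---\nevidence_tier: A\n---\nBody",
];

const tierCounts = tallyEvidenceTiers(samplePages);
console.log(tierCounts);
```

Because the tally keys on whatever string the frontmatter contains, a non-normalized value would surface as its own bucket instead of being silently merged, which keeps data-entry inconsistencies visible in the counts.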