P2 Batch 1 Ingest Report

Date: 2026-05-12
Tier: P2, LOQ source candidates (488 handles in manifest)
Sub-batches processed: p2-sub1 (50), p2-sub2 (50), p2-sub3 (50), p2-sub4 (50), remaining-group1 (55), remaining-group2 (55), remaining-group3 (55)


1. Summary

Category                                                         Count
P2 handles in manifest                                           488
Sub-batch handles attempted (all 7 groups)                       ~365 unique
Files accessible in raw/markdown                                 ~210
Files missing from filesystem (raw 2/ not yet Marker-converted)  ~175
Source pages created                                             74
False positives skipped (out of scope)                           ~132
Food concentration papers (P1-grade finds in P2 tier)            4 primary
Analytical method / LOD-LOQ papers (true P2 content)             ~45
Environmental / exposure-context papers                          ~25

Source page total (all tiers, cumulative through P2 batch 1): 264


2. P1-Grade Food Concentration Finds

Four papers misclassified as P2 by the manifest text-mining heuristic contain primary food concentration data meeting HMT&C Path A criteria. All four were ingested; structured-evidence rows were added to data/evidence/values.jsonl for three of them, and rows for the fourth (chiutula2025) are deferred as noted below.

FM_11125852 — cantoral2024-lead-levels-mexican-foods

Reclassify: P2 → P1. Cantoral et al. 2024, Toxics 12(5):318. First systematic Pb monitoring across 103 foods and beverages in Mexico City retail. GF-AAS; LOQ 0.0025 mg/kg; duplicate analysis per sample. Key results:

  • Infant rice cereal (Brand 2): Pb 1,005 ppb wet weight — 5× FAO/WHO ML of 200 ppb; highest single-sample Pb value for that matrix in current wiki coverage.
  • Soy infant formula (Brand 2): Pb 35 ppb — 3.5× FAO/WHO ML of 10 ppb; 3 of 5 formula brands tested were <LOQ.
  • Pre-cooked rice: Pb 276 ppb — exceeds FAO/WHO ML 200 ppb.
  • Spices with detectable Pb: black pepper 239 ppb, turmeric 176 ppb, paprika 92 ppb (no FAO/WHO MLs for these matrices).
  • Overall: 19 of 103 items with detectable Pb; 4 exceeded FAO/WHO MLs.
  • Note: single retail purchase per brand; not a distribution estimate.

3 values.jsonl rows added: p2-infant-rice-cereal-cantoral2024-pb-single, p2-infant-formula-soy-cantoral2024-pb-single, p2-rice-grain-cantoral2024-pb-single.
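
The exceedance multiples quoted above are the measured concentration divided by the applicable ML. A minimal Python sketch of that arithmetic, using only the values reported in this section; the function name `exceedance_ratio` is illustrative and not part of the ingest pipeline:

```python
def exceedance_ratio(measured_ppb: float, ml_ppb: float) -> float:
    """Return the measured concentration as a multiple of the maximum level (ML)."""
    if ml_ppb <= 0:
        raise ValueError("ML must be positive")
    return measured_ppb / ml_ppb


# (measured Pb in ppb, FAO/WHO ML in ppb) as reported for Cantoral et al. 2024
cantoral_pb = {
    "infant rice cereal": (1005, 200),
    "soy infant formula": (35, 10),
    "pre-cooked rice": (276, 200),
}

ratios = {
    item: exceedance_ratio(measured, ml)
    for item, (measured, ml) in cantoral_pb.items()
}

for item, ratio in ratios.items():
    status = "exceeds ML" if ratio > 1 else "within ML"
    print(f"{item}: {ratio:.2f}x ML ({status})")
```

All three ratios come out above 1, matching the exceedances listed in the bullets.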

FM_11617688 — tian2024-voltammetric-ias-rice

Reclassify: P2 → P1. Tian et al. 2024, Food Chemistry (LC-ICP/MS arsenic speciation in Chinese commercial rice). Anion exchange HPLC confirms iAs speciation: As(III) at 2.5 min, As(V) at 8.0 min. 36 samples from 5 Chinese provinces.

  • Mean iAs: 188 ppb dry weight
  • Max iAs: 345 ppb (Sample 22) — 1.7× China GB 2762 ML of 200 ppb
  • P90: 267 ppb dry weight
  • Range: 101–345 ppb; all 36 samples exceeded EU ML for polished rice (100 ppb); 3 samples exceeded China GB 2762.
  • Primary method: LC-ICP/MS (speciation confirmed); voltammetric method used for comparison only — iAs classification based on chromatographic speciation, not total As.

3 values.jsonl rows added: p2-rice-tian2024-ias-cn-mean, p2-rice-tian2024-ias-cn-max, p2-rice-tian2024-ias-cn-p90.

FM_11009735 — wehmeier2023-ias-rice-cola-field-method

Reclassify: P2 → P1. Wehmeier et al. 2023, Communication — iAs in 30 Austrian market rice products by HPLC-ICP-MS vs. field-deployable Cola extraction method.

  • Range: 60–249 ppb iAs dry weight across 30 products
  • Highest (R27, unpolished rice): 249 ppb, just below the EU ML of 250 ppb for unpolished rice
  • 22 of 30 samples would fail if the infant rice cereal EU ML of 100 ppb were applied to all products
  • Method validation: Cola extraction compared to reference HPLC-ICP-MS for field deployment

2 values.jsonl rows added: p2-rice-wehmeier2023-ias-at-range-min, p2-rice-wehmeier2023-ias-at-range-max.

FM_12652890 — chiutula2025-wastewater-vegetables-malawi

Chiutula et al. 2025 — Cd, total Cr, Pb in wastewater-irrigated vegetables at Blantyre, Malawi. ICP-OES. Multiple FAO/WHO ML exceedances:

  • Total Cr up to 4,650 ppb (FAO/WHO ML 2,300 ppb for vegetables)
  • Cd up to 310 ppb (FAO/WHO ML 200 ppb for leafy vegetables)
  • Pb up to 4,090 ppb (FAO/WHO ML 300 ppb)
  • Note: total Cr only; paper does not speciate Cr-VI.

No values.jsonl rows added this pass (wastewater-irrigated matrix is geographically specific; queued for ingredient-level update when wastewater irrigation sub-profile is developed).


3. Analytical Method Papers (True P2 LOQ Content)

The P2 tier correctly identified approximately 45 analytical chemistry papers with validated LOD/LOQ data. These cover the main detection methods used in heavy metals food analysis:

Electrochemical sensors: Bozkurt 2025 (Pb, drinking water, portable voltammetric), Doan 2025 (MOF/rGO Pb sensor), Godja 2025 (Ni electrochemical), Jia 2025 (Au nanocluster simultaneous Pb/Cd), Ngok 2025 (ZnO-iron oxide-Au As(V)), Wang 2025 (MnO2-biochar Cd in rice), various others.

Optical / fluorescence sensors: Dhawale 2025 (benzidine chemosensor Hg in vegetable juice), Islam 2025 (AgNP colorimetric Hg), Chen 2024 (BODIPY Hg fluorescence in milk), Fei 2024 (Cd off-on fluorescence milk), Luo 2024 (Cd ADA-VBB food sensor), Zhao 2024 (ZnO-Si Cd fluorescence), various others.

SERS: Wang 2025 (nanogap SERS simultaneous Hg/Pb/Cd), Chepak 2023 (light-harvesting Hg nanoprobe), various others.

Mercury speciation (validated methods): Carter 2025 (FDA-validated TDA-AAS/SALLE for MeHg/tHg in finfish; LOD 3.8 ppb wet weight), Wu 2026 (whole-cell biosensor MeHg; LOD 0.04 nM), Yamashita 2024 (LAEP-OES Hg speciation in tuna), Qin 2025 (comparison AFS vs CAAS for Hg in soil).

Chromium-VI specific: Seesuan 2025 (DES-EDTA Cr-VI colorimetric), Wang 2025 (MFC Cr-VI sensor wastewater), Zandi 2025 (carbon quantum dots Cr-VI in coffee), Zheng 2025 (Pueraria carbon dots Cr-VI), Liu 2024 (Cr nanohorn Cr-VI water sensor).

LOD highlights:

  • Carter 2025 (FDA TDA-AAS MeHg): LOD 3.8 ppb, LOQ 27 ppb (wet weight fish tissue) — government-validated method
  • Wu 2026 (whole-cell biosensor MeHg): LOD 0.04 nM (~8 ng/L as Hg, or ~0.008 ppb) in pure solution
  • Kayani 2025 (ratiometric Hg sensor): LOD 0.83 nM in water
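
Sensor papers often report LODs in nM while food and water limits are in ppb, so a conversion helps when comparing them. A minimal sketch, assuming the concentration is expressed as elemental Hg (atomic mass 200.59 g/mol) and that 1 µg/L is roughly 1 ppb in dilute aqueous solution; the function name is illustrative:

```python
HG_MOLAR_MASS_G_PER_MOL = 200.59  # atomic mass of mercury


def nanomolar_to_ug_per_l(conc_nm: float,
                          molar_mass: float = HG_MOLAR_MASS_G_PER_MOL) -> float:
    """Convert nmol/L to ug/L (approximately ppb in dilute aqueous solution)."""
    # nmol/L * g/mol gives ng/L; dividing by 1000 gives ug/L
    return conc_nm * molar_mass / 1000.0


wu_lod_ppb = nanomolar_to_ug_per_l(0.04)      # Wu 2026, whole-cell biosensor
kayani_lod_ppb = nanomolar_to_ug_per_l(0.83)  # Kayani 2025, ratiometric sensor

print(f"Wu 2026 LOD:     {wu_lod_ppb:.4f} ppb as Hg")
print(f"Kayani 2025 LOD: {kayani_lod_ppb:.4f} ppb as Hg")
```

On this basis 0.04 nM is about 0.008 ppb (8 ng/L) and 0.83 nM is about 0.17 ppb, both well below the 3.8 ppb LOD of the Carter 2025 tissue method. The sensor figures are for solutions, not fish tissue, so the comparison is indicative only.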

Testing methodology pages flagged for creation (wiki/testing/): ICP-MS principles, arsenic speciation methods, mercury speciation methods. These stub pages would aggregate the LOD/LOQ data captured here.


4. Environmental and Exposure-Context Papers

Papers not reporting food concentrations but contributing supply-chain, environmental, or exposure context:

  • Bousquet 2024 (FM_11120698): Pb in drinking water at UNC-CH; 5,954 fixture tests; 8.43% >1 ppb LOD; max 1,100 ppb. Relevant for formula reconstitution water exposure pathway.
  • Zuhlke 2026 (FM_12947684 equivalent): Pb in US water kiosks; 15/20 kiosks >0.05 µg/L.
  • Rusko 2026 (FM_12984848 equivalent): Hg speciation in Latvian fish; risk-benefit framework.
  • Lepak 2025: MeHg correction factors for sport fish (Colorado); mercury method comparison.
  • Wang 2024 (FM_10970330): Environmental Pb from cemetery waste — soil pathway context.
  • Rodriguez-Rodriguez 2026: Sargassum biofertilizer and trace elements in tomato — supply-chain pathway (fertilizer → soil → crop).
  • Gundogdu 2025: Al in albumin infusion solutions — pharmaceutical exposure, not food; no product page update.
  • Kanazawa 2024: Hg speciation in ASGM communities (Kenya); artisanal gold mining supply chain context.

5. Additional Food Matrix Papers

Papers with food concentration data not meeting P1 threshold criteria but contributing evidence:

  • Wysok 2025: Pb mean 77 µg/kg, tAs mean 36 µg/kg in Polish sheep casings (n not specified; A-tier analytical chemistry journal; useful for meat-derivatives context).
  • Altunay 2023: Cd in food samples, Turkey; voltammetric method validation with real samples.
  • Brzezinska-Rojek 2023: Heavy metals in beetroot supplements; dietary supplement safety context.
  • Silva 2023: Rice iAs and co-occurring mycotoxins in Portuguese market rice; iAs and Ochratoxin A co-contamination.
  • Sirisangarunroj 2023: Heavy metals in Thai fish; health risk assessment.
  • Kovacik 2024: Heavy metals in grass carp muscle; fish matrix.
  • Dogruyol 2024: Heavy metals in Mediterranean mussels; seafood matrix; health risk assessment.
  • Valizadeh 2023: Heavy metals in canned beans (Iran); legume matrix.
  • Naccari 2025: Heavy metals in honey; n unspecified; Al, As, Cd, Pb, multi-element.
  • Zhang 2024 (MIP-Pb): Pb detection in honey (Cyprus); sensor validation with food matrix.
  • Kim 2024: Metal migration from food containers (Korea); packaging pathway.
  • Wang 2025 (MOF-Bi-Cd): Cd soil-to-cup pathway in tea; combines soil and beverage measurement.
  • Yang 2024 (LIBS Cd Panax): Cd in Panax notoginseng (Chinese traditional medicine plant); not a consumer food product.

6. False Positives — Not Ingested

~132 papers skipped as out of scope for heavy metals in food. Common categories:

  • Bacteria / pathogen detection sensors (most common false positive): papers testing aptasensors, immunosensors, or colorimetric sensors for Bacillus cereus, Salmonella, Listeria, E. coli, etc. that mention metals only in sensor fabrication.
  • Mycotoxin sensors: Aflatoxin B1, Ochratoxin A, fumonisin detection (metal mentioned as electrode material).
  • Pesticide and herbicide sensors: Fenitrothion, chloramphenicol, organophosphate detection.
  • Bisphenols and plasticizers: BPA, BPS detection sensors.
  • Non-food water quality: Groundwater arsenic removal, wastewater chromium treatment (no food measurement).
  • Pharmaceutical / clinical: Drug metabolite sensors with no food connection.
  • Materials science only: SERS substrate fabrication, photocatalysis, no analytical application to food.

The manifest’s P2 text-mining heuristic catches LOD/LOQ language but has a ~35–40% precision rate for true heavy-metals-in-food papers. This is expected given how broadly “LOQ” language appears in analytical chemistry literature.
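
The lower end of that range is consistent with the counts in the Summary if precision is taken as source pages created divided by all papers reviewed (created plus skipped). The report does not state the formula, so this is an assumed reconstruction:

```python
source_pages_created = 74       # from the Summary
false_positives_skipped = 132   # approximate count from the Summary

# Papers that were actually read and either ingested or rejected
reviewed = source_pages_created + false_positives_skipped

# Precision = true positives / (true positives + false positives)
precision = source_pages_created / reviewed

print(f"reviewed: {reviewed}")
print(f"precision: {precision:.1%}")
```

Because the false-positive count is approximate, the result should be read as a rough figure rather than an exact rate.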


7. Missing Handles

~175 P2 handles from the manifest are absent from raw/markdown/. These handles appear to correspond to papers in the untracked raw 2/ directory, whose PDFs have not yet been converted by Marker. Until Marker conversion is run on raw 2/, these papers cannot be ingested.

Action required (Karen): Run Marker conversion on raw 2/ so that the output lands in raw/markdown/. After conversion, re-run the remaining P2 handles against the updated filesystem.
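
The list of handles still missing can be regenerated after conversion with a short filesystem check. A minimal sketch, assuming each converted paper is stored as `<handle>.md` directly under the markdown directory; that naming scheme, the helper `find_missing_handles`, and the placeholder handles in the demo are all hypothetical:

```python
import tempfile
from pathlib import Path


def find_missing_handles(handles, markdown_dir):
    """Return the handles that have no <handle>.md file in markdown_dir."""
    present = {p.stem for p in Path(markdown_dir).glob("*.md")}
    return sorted(h for h in handles if h not in present)


# Demo against a throwaway directory; the handles are placeholders,
# not real manifest entries.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "FM_EXAMPLE_A.md").write_text("placeholder", encoding="utf-8")
    demo_missing = find_missing_handles(["FM_EXAMPLE_A", "FM_EXAMPLE_B"], tmp)

print(demo_missing)
```

In practice the handle list would come from the manifest and `markdown_dir` would point at raw/markdown/.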


8. New-Page Proposals

No new ingredient, product, or regulation pages are proposed from P2 batch 1. Source-page frontmatter links to existing pages. The wiki/testing/ stub pages for ICP-MS, arsenic speciation, and mercury speciation remain unresolved targets and are surfaced here as a standing proposal for Karen’s approval.


9. Values.jsonl Additions

8 new rows added (lines 550–557):

  • cantoral2024: Pb in infant rice cereal (MX, 1005 ppb), soy infant formula (MX, 35 ppb), rice grain (MX, 276 ppb)
  • tian2024: iAs in Chinese commercial rice — mean 188 ppb, max 345 ppb, p90 267 ppb
  • wehmeier2023: iAs in Austrian market rice — range min 60 ppb, range max 249 ppb

Total values.jsonl rows after P2 batch 1: 557 (was 549 after P1)
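
One way to confirm the total after a batch is to count the parseable rows in the file. A minimal sketch, assuming one JSON object per non-empty line; `count_jsonl_rows` is a hypothetical helper rather than an existing pipeline function, and the demo writes a throwaway file instead of reading data/evidence/values.jsonl:

```python
import json
import tempfile
from pathlib import Path


def count_jsonl_rows(path):
    """Count non-empty lines in a JSONL file, checking that each parses as JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # ignore blank lines
            json.loads(line)  # raises json.JSONDecodeError on a malformed row
            count += 1
    return count


# Demo on a temporary file with three placeholder rows (not real evidence rows).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "values.jsonl"
    demo_path.write_text('{"id": "a"}\n{"id": "b"}\n\n{"id": "c"}\n', encoding="utf-8")
    demo_count = count_jsonl_rows(demo_path)

# Expected total for this batch: rows after P1 plus rows added here.
expected_total = 549 + 8

print(demo_count, expected_total)
```

Run against the real file, the returned count would be compared with `expected_total` (557).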


10. Commits

  • ffb0c50 — P2 sub-batch 1: 26 analytical-method source pages (2025–2026 sensor/biosensor)
  • 7dd9201 — P2 sub-batch 2: Kayani 2025 Hg ratiometric sensor
  • 38d88bd — P2 sub-batch 3: cantoral2024, bousquet2024, atanasov2024
  • c4878be — P2 sub-batch 4: 7 analytical-method source pages
  • 1d941b6 — P2 remaining-group1: chiutula2025 + 4 sensors
  • 3030042 — P2 remaining-group2: 23 source pages (sensors + food matrices)
  • 3f107ba — P2 remaining-group3: 9 source pages (food matrix papers)
  • (this commit) — P2 batch 1 close: values.jsonl +8 rows, batch report, log entry