Transparency
Cross-check
COB’s cross-check pipeline asks a second (and sometimes third) independent model to re-render each drafted verse from the same source text, then measures how close the results are. Agreement is bucketed and surfaced on the verse. The point isn’t “AI consensus equals truth” — it’s that divergence is a useful flag. Low-agreement verses get extra review.
How it works
After GPT-5.4 drafts a verse, a second model is asked to translate the same source text (SBLGNT for NT, MT/WLC for OT) with the same prompt framework. The two renderings are compared with a semantic-agreement metric, not a string match: “Paul, a slave of Messiah Jesus” and “Paul, Jesus the Messiah’s slave” agree on meaning even though the surface forms differ.
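As a minimal sketch of what a semantic-agreement comparison looks like, here is a cosine similarity over normalized bag-of-words vectors. This is a crude stand-in, not COB's actual metric (the real pipeline presumably uses embeddings or a model-based judge); the function names and normalization rules are illustrative only.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Lowercase, strip punctuation, fold possessive 's, count words.
    (Illustrative normalization, not COB's actual preprocessing.)"""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w[:-2] if w.endswith("'s") else w for w in words)

def agreement(a: str, b: str) -> float:
    """Cosine similarity between the two word-count vectors, on a 0-1 scale.
    A crude proxy for semantic agreement: word order is ignored, so
    reorderings of the same words still score high."""
    ta, tb = tokens(a), tokens(b)
    dot = sum(ta[w] * tb[w] for w in ta)
    norm = math.sqrt(sum(v * v for v in ta.values())) * \
           math.sqrt(sum(v * v for v in tb.values()))
    return dot / norm if norm else 0.0

# The two renderings from the example above differ in surface form
# but share nearly all content words, so they score well above zero.
score = agreement("Paul, a slave of Messiah Jesus",
                  "Paul, Jesus the Messiah's slave")
```

A real semantic metric would also catch synonym substitutions ("slave" vs "servant"), which pure word overlap misses; that is why this sketch is only a lower bound on what the pipeline needs to do.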
The agreement score is reported on a 0–1 scale. Per METHODOLOGY.md, each verse falls into one of three buckets:
High agreement: the draft proceeds as-is. Divergences between models are noted as footnotes on the verse but don't block publication.
Medium agreement: divergences surface as footnotes and, where they affect meaning, as alternative readings in lexical_decisions.alternatives in the verse YAML. The chosen rendering still ships; the alternatives stay visible.
Low agreement: the disagreement is escalated into a public issue on the cartha-open-bible repo. The verse isn't shipped as a settled rendering until that issue is resolved, typically via a revision commit, a footnote, or both.
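The three outcomes above amount to a bucketing function over the score. The threshold values below are placeholders, since METHODOLOGY.md defines the actual cut-offs:

```python
# Placeholder thresholds -- METHODOLOGY.md defines the real cut-offs.
HIGH_AGREEMENT = 0.90
LOW_AGREEMENT = 0.70

def bucket(score: float) -> str:
    """Map a 0-1 agreement score to one of the three review outcomes."""
    if score >= HIGH_AGREEMENT:
        return "proceed"    # ship as-is; divergences become footnotes
    if score >= LOW_AGREEMENT:
        return "footnote"   # footnotes plus alternative readings in the YAML
    return "escalate"       # open a public issue; hold the verse
```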
Run it yourself
The cross-check pipeline is a single script in the translation repo. Given the same source text, model, and prompt hash, it reproduces the same agreement number — so anyone can verify a verse claim.
Output is a JSON blob with the two renderings, the agreement score, and a list of divergences. The same inputs produce the same output bit-for-bit, so claims on this page can be falsified by rerunning.
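A hypothetical shape for that blob is sketched below. The field names are assumptions for illustration; tools/cross_check.py defines the real schema. The point of the serialization choices is the determinism claim: sorted keys and fixed separators make the output byte-stable for identical inputs.

```python
import json

# Illustrative output record -- field names are assumptions,
# not the actual schema of tools/cross_check.py.
result = {
    "verse": "Romans 1:1",
    "renderings": {
        "draft": "Paul, a slave of Messiah Jesus",
        "cross_check": "Paul, Jesus the Messiah's slave",
    },
    "agreement": 0.73,
    "divergences": [
        {"span": "slave of Messiah Jesus",
         "note": "word order differs; meaning agrees"},
    ],
}

# Deterministic serialization (sorted keys, fixed separators) is what
# makes a bit-for-bit reproducibility claim checkable: rerunning with
# the same inputs must yield this exact byte string.
blob = json.dumps(result, sort_keys=True, separators=(",", ":"))
```

Re-serializing the parsed blob with the same options reproduces it exactly, which is the property a verifier relies on.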
What this is not
Three models agreeing on a rendering isn’t proof the rendering is correct. Large language models share training data, so they can share the same bias in the same direction. Cross-check agreement is a signal against hallucination, not a proof of scholarly soundness. The final trust layer is credentialed human review; until scholars sign individual verses, nothing on COB should be read as the last word.
See also
METHODOLOGY.md — full source pipeline, bucket rules, and reproducibility guarantees.
tools/cross_check.py — the cross-check script itself.