Tiny Spoon

Big AI news, in small bites

GOVERNANCE | OpenAI
OPENAI | AUDITOR

OpenAI published two governance documents in a single day. The Frontier Governance Framework lays out how OpenAI says it will manage safety as models grow. The "shared playbook for trustworthy third-party evaluations" sets out what an external safety evaluation should disclose - what claim, what system, what tooling, what safeguards.

OpenAI is now writing its own answer to Anthropic's Responsible Scaling Policy (RSP). Two documents. One day. The frontier-lab safety race just turned into a credentialing competition.

The Frontier Governance Framework commits OpenAI to update its own rules as models, evaluations, and regulation change. The Trustworthy Evaluations playbook says external assessors should describe three things: the claim being tested, the evaluation content, and the exact system under test (model, reasoning setting, tool access, harness, safeguards). That gives evaluations a common structure, and it is a soft attack on anyone who runs evals without disclosing harness and tool access (read: most public benchmarks).

For PMs: expect every frontier-lab vendor to publish a similar framework within 90 days. For execs: ask your AI vendor which framework it signs off on and which third-party assessor it uses. For governance teams: this is self-regulation racing the EU AI Act.


Why this matters

  • First time OpenAI publishes a structured governance framework comparable to Anthropic's Responsible Scaling Policy.
  • Sets a public standard for what an "external evaluation" should disclose - the harness, the safeguards, the claim being tested.
  • Comes one day after Anthropic's $65B raise. Reads as OpenAI defending its safety credibility on a different axis than valuation.

🔍 What happened

  • May 29, 2026. OpenAI publishes two posts: "OpenAI's Frontier Governance Framework" and "A shared playbook for trustworthy third-party evaluations."
  • The Framework commits OpenAI to continuously update its rules as model capabilities, evaluation methods, and regulatory requirements develop.
  • The Evaluations playbook says any third-party safety eval should specify three things: the claim (compare systems? estimate capability ceiling? test safeguards?), the evaluation content, and the system under test (model, reasoning setting, tool access, harness, safeguards).
  • Companion piece "Strengthening our safety ecosystem with external testing" links the framework to actual external partners.
  • Aligned to Preparedness Framework updates published earlier in the year.
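
For teams that want to check a vendor's eval report against that disclosure list, here is a minimal sketch in Python. It only mirrors the items named in this brief (claim, evaluation content, system under test). The class names, field names, and the `missing_fields` helper are hypothetical illustrations, not a schema OpenAI has published.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class SystemUnderTest:
    # Hypothetical field names, taken from the items listed in the brief.
    model: str = ""
    reasoning_setting: str = ""
    tool_access: str = ""
    harness: str = ""
    safeguards: str = ""


@dataclass
class EvalDisclosure:
    # What the evaluation is meant to show (compare systems, estimate a ceiling, test safeguards).
    claim: str = ""
    # What the evaluation actually contains.
    evaluation_content: str = ""
    system: SystemUnderTest = field(default_factory=SystemUnderTest)


def missing_fields(disclosure: EvalDisclosure) -> List[str]:
    """Return the names of disclosure items that were left blank."""
    missing = []
    if not disclosure.claim.strip():
        missing.append("claim")
    if not disclosure.evaluation_content.strip():
        missing.append("evaluation_content")
    for f in fields(disclosure.system):
        if not getattr(disclosure.system, f.name).strip():
            missing.append(f"system.{f.name}")
    return missing


# Example with placeholder values: a report that names the model and harness only.
report = EvalDisclosure(
    claim="test safeguards",
    evaluation_content="placeholder prompt set",
    system=SystemUnderTest(model="example-model", harness="example-harness"),
)
print(missing_fields(report))
```

An empty result means every item on the list was filled in. It says nothing about whether the disclosed details are accurate, which is the part a third-party assessor is there for.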

💬 Smart takes

  • OpenAI (Frontier Governance Framework): the company commits to update the framework "to reflect advancements in model capabilities, evaluation methods, and regulatory developments."
  • OpenAI (Trustworthy Evaluations): "Third party assessors add an independent layer of evaluation alongside internal work, strengthening rigor and providing additional protections against self-confirmation."
  • GovAI commentary: third-party compliance reviews are how AI safety frameworks get teeth; voluntary commitments alone are theatre.
  • Skeptic: Anthropic's RSP set the template two years ago. OpenAI publishing its own now reads as catch-up - and the test is whether either lab actually pauses a deployment when its own framework says it should.

🧭 Where this goes

  1. Google DeepMind, xAI, and Mistral publish equivalent governance frameworks within 90 days.
  2. EU AI Office cites these OpenAI documents in its general-purpose-AI implementation guidance by Q3.
  3. Insurance and procurement contracts start referencing "the OpenAI Trustworthy Evaluations playbook" or "Anthropic RSP equivalent" as required vendor disclosures.
  4. First high-profile case where a lab violates its own framework - and what happens next sets the precedent.

🎯 Implication

  • For PMs building on frontier models: the framework gives you ammo. Ask the vendor which version of their framework gates the model you're shipping on.
  • For execs: require your AI vendor to disclose which third-party assessor evaluated the model you're deploying, against which claim.
  • For policy teams: regulators implementing the EU AI Act now have two opt-in standards (Anthropic RSP, OpenAI Frontier Governance) they can converge on without inventing one.