Updates · Reports · Benchmarks · Research papers

Independent updates on AI governance, security, and the numbers behind them.

Regular notes on AI governance, agent security, and bias — backed by published reports, reproducible benchmarks, and a companion stream of original research papers.

Not pro-AI. Not anti-AI. Evidence-first.

Every issue is built around something verifiable — a fresh benchmark run, a public report, a reproducible experiment, or a working paper queued for submission. We cite primary sources, link to the data, and publish methodology alongside results. The goal is signal that holds up when the corporate press releases age out.

What We Cover

Four standing beats. Each issue picks one and reports against current evidence.

⚖️

Bias & Accountability

Fairness audits, hidden assumptions, dataset gaps, scoring systems, and whether bias is being reduced or simply renamed.

🛡️

Agent Security

What changes when AI agents access tools, accounts, files, browsers, infrastructure, and private workflows.

🧾

Governance Logs

Verifiable decision trails, audit layers, rollback records, signed event tokens, and policy-aware reasoning.

📊

Reports & Benchmarks

Reproducible benchmark runs, with data and methodology published alongside the results. No "trust us, it scored 9.4."

Latest Issues

Each issue links to its underlying report, benchmark dataset, or working paper.

Issue 002

The Coming Security Problem: Agents With Access

Benchmark — what changes when AI can read, write, move, trigger, and decide across connected tools.

Security

Issue 003

Bias Is a Governance Failure Stack

Field report — bias emerging from data, incentives, policies, and feedback loops (not just the model).

Bias

Research Papers

Original work supporting the newsletter — published, in review, or queued for submission. Citation-ready BibTeX is included alongside each PDF.

2026 · In review

Constitutional Multi-Agent Architectures: A Fleet Model

Trust-level isolation, intent-gate routing, and per-container constitutional rules. Submitted; the results dataset is open source.

In review

2026 · Drafting

The Overclaim Problem: Detecting Invented Track Record in LLM Output

Deterministic post-scan for fabricated customer counts, success metrics, and industry-position claims in agent-generated text. Eval set in progress.

Drafting
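
For readers wondering what a "deterministic post-scan" could look like in practice, here is a minimal sketch. It is an illustration only, not the method from the paper: the pattern list, the category names, the `Flag` type, and the `scan` function are all hypothetical, and a real scanner would need a far broader, evaluated rule set. The point is simply that the check is rule-based, so the same text always produces the same flags.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns, chosen for illustration only.
PATTERNS = {
    "customer_count": re.compile(
        r"\b(?:over|more than|trusted by)\s+\d[\d,]*\+?\s+"
        r"(?:customers|clients|users|companies)\b",
        re.IGNORECASE,
    ),
    "success_metric": re.compile(
        r"\b\d+(?:\.\d+)?%\s+(?:success|satisfaction|retention|accuracy)\b",
        re.IGNORECASE,
    ),
    "industry_position": re.compile(
        r"\b(?:industry[- ]leading|market leader|number one|world[- ]class)\b",
        re.IGNORECASE,
    ),
}


@dataclass(frozen=True)
class Flag:
    category: str  # which hypothetical rule fired
    text: str      # the matched span
    start: int     # character offset in the scanned text


def scan(output: str) -> list[Flag]:
    """Return every rule match in the text, ordered by position."""
    flags = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(output):
            flags.append(Flag(category, match.group(0), match.start()))
    return sorted(flags, key=lambda f: f.start)


if __name__ == "__main__":
    sample = (
        "Trusted by over 10,000 customers, we deliver a 99.9% success rate "
        "as the market leader."
    )
    for flag in scan(sample):
        print(f"{flag.category}: {flag.text!r} at {flag.start}")
```

A flag here means "this claim needs a source", not "this claim is false". Whether the claim is actually invented still has to be checked against a record of what is known to be true.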