Trust-block fail-closed contract (v0.11.73)

2026-05-10 · Engineering trust contract-change

Introduced: scam-intel addition, early v0.11.5x.   Detected: 2026-05-07 (homepage-rewrite review).   Fixed: v0.11.73 deployed 2026-05-07.   Disclosed: 2026-05-10.

Server v0.11.73 closes a silent-allow vector on /v1/trust-check that existed across the v0.11.5x → v0.11.72 patch window. When an upstream data source raised an exception, the previous contract emitted a response that omitted the affected factor with no flag; when all three sources raised simultaneously, it returned recommendation: "allow" with risk_score: 0 and an empty factors list — a silent allow on the gate that exists precisely to render allow / warn / block decisions. v0.11.73 makes both paths explicit: per-source raise emits a flagged factor with signal: "unreachable"; all-sources-unreachable forces recommendation: "warn". OFAC override is preserved at the top of the decision tree.

The vector

/v1/trust-check composes three upstream signals — OFAC SDN screening, anomaly heuristics, and a scam-intel aggregator that wraps GoPlus and Etherscan source-verification. Each signal can raise: rate limits, network blips, upstream maintenance windows, malformed responses. The pre-v0.11.73 implementation caught those raises and dropped the affected factor from the response. If only one source raised, you'd see two factors in the response instead of three with no indication the third had failed. If all three raised, you'd see zero factors with recommendation: "allow" and risk_score: 0.

That last shape — empty factors list with a no-risk verdict — is a silent allow on a trust gate. It rendered permission to proceed when the service had no data to evaluate. The vector existed since the scam-intel source was added in early v0.11.5x; it was masked in production because all three sources rarely fail simultaneously.
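The failure pattern described above can be illustrated with a minimal sketch. This is not the actual pre-v0.11.73 source, which is not reproduced in this post; the function name legacy_trust_block, the "score" field, the "clear" signal value, and the zero-score threshold are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

# Hypothetical factor shape; the real server types are not shown in this post.
Factor = Dict[str, Any]


def legacy_trust_block(sources: Dict[str, Callable[[], Factor]]) -> Dict[str, Any]:
    """Illustrates the flawed pattern: a raising source is dropped without a trace."""
    factors: List[Factor] = []
    for fetch in sources.values():
        try:
            factors.append(fetch())
        except Exception:
            # The bug: the failed source leaves no flagged factor behind.
            continue
    risk_score = sum(f.get("score", 0) for f in factors)
    # Assumed threshold, purely illustrative.
    recommendation = "allow" if risk_score == 0 else "warn"
    return {"risk_score": risk_score, "recommendation": recommendation, "factors": factors}


def source_down() -> Factor:
    raise ConnectionError("upstream unavailable")


def source_clean(name: str) -> Callable[[], Factor]:
    return lambda: {"source": name, "signal": "clear", "score": 0}


# One source down: two factors come back, nothing says a third was expected.
partial = legacy_trust_block({
    "ofac": source_clean("ofac"),
    "paladin.anomaly": source_clean("paladin.anomaly"),
    "scam_intel": source_down,
})

# All sources down: empty factors list, zero score, and an "allow".
all_down = legacy_trust_block({
    "ofac": source_down,
    "paladin.anomaly": source_down,
    "scam_intel": source_down,
})
```

In this sketch the summation over an empty list is what turns "no data" into "no risk": nothing in the scoring step can tell the two apart once the failed factors are gone.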

How it surfaced. Not from a customer report or an incident page. The vector was caught during a 3-adversary review of an unrelated homepage rewrite plan: a reviewer asked what compute_trust_block actually returns under exception conditions, and the trace landed on the silent-allow path.

Blast radius. Paid /v1/trust-check calls that landed during a transient outage of one or two upstream sources could have received the partial-drop shape; calls that landed during a simultaneous outage of all three could have received the silent-allow shape. The impact window is bounded by upstream source downtime, not by total call volume. We do not have audit infrastructure that can enumerate, per call, which historical response shape was emitted; building that is a separate follow-up. No customer report or external incident tied to this contract has reached us as of this writing.

The fix

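The decision tree is now explicit at three layers: the OFAC override sits at the top and is unchanged; if all three sources are unreachable, the recommendation is forced to "warn"; otherwise, each source that raised is reported as a flagged factor with signal: "unreachable" and the remaining sources compute the recommendation.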

The all-sources-unreachable shape — the one most affected by v0.11.73 because it's where the silent-allow used to land — is now:

{
  "risk_score": 0,
  "recommendation": "warn",
  "factors": [
    { "source": "ofac",            "signal": "unreachable",
      "details": "source unreachable", "real": false },
    { "source": "paladin.anomaly", "signal": "unreachable",
      "details": "source unreachable", "real": false },
    { "source": "scam_intel",      "signal": "unreachable",
      "details": "source unreachable", "real": false }
  ],
  "version": "1.1",
  "error": "RequestException"
}

Partial-unreachability — only one or two sources raised — does NOT escalate to warn on its own. The unreachable factor is included as above but contributes 0 to risk_score; the remaining sources compute the recommendation as they would have. Only the all-sources-raised path forces warn. Success-path factors do not carry a "real" field; the key only appears on unreachable factors as real: false.
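A minimal sketch of the post-fix decision order follows. The factor shape for unreachable sources is copied from the example response above; everything else is an assumption for illustration, including the function name trust_block, the "score" field, the "sanctioned" and "clear" signal values, and the 30/70 thresholds. The top-level "error" field is omitted here.

```python
from typing import Any, Callable, Dict, List

Factor = Dict[str, Any]
SOURCES = ("ofac", "paladin.anomaly", "scam_intel")


def unreachable_factor(source: str) -> Factor:
    # Static details phrase; the "real" key appears only on unreachable factors.
    return {"source": source, "signal": "unreachable",
            "details": "source unreachable", "real": False}


def trust_block(fetchers: Dict[str, Callable[[], Factor]]) -> Dict[str, Any]:
    factors: List[Factor] = []
    for source in SOURCES:
        try:
            factors.append(fetchers[source]())
        except Exception:
            # A raising source is flagged instead of dropped.
            factors.append(unreachable_factor(source))

    reachable = [f for f in factors if f["signal"] != "unreachable"]
    # Unreachable factors contribute 0 to the score.
    risk_score = sum(f.get("score", 0) for f in reachable)

    if any(f["source"] == "ofac" and f["signal"] == "sanctioned" for f in factors):
        recommendation = "block"   # OFAC override, top of the tree (hypothetical signal name)
    elif not reachable:
        recommendation = "warn"    # all sources unreachable: forced warn
    elif risk_score >= 70:         # hypothetical threshold
        recommendation = "block"
    elif risk_score >= 30:         # hypothetical threshold
        recommendation = "warn"
    else:
        recommendation = "allow"

    return {"risk_score": risk_score, "recommendation": recommendation,
            "factors": factors, "version": "1.1"}


def down() -> Factor:
    raise TimeoutError("upstream timed out")


def clean(source: str) -> Callable[[], Factor]:
    return lambda: {"source": source, "signal": "clear", "score": 0}


one_down = trust_block({"ofac": clean("ofac"),
                        "paladin.anomaly": clean("paladin.anomaly"),
                        "scam_intel": down})
all_down = trust_block({s: down for s in SOURCES})
ofac_hit = trust_block({"ofac": lambda: {"source": "ofac", "signal": "sanctioned", "score": 100},
                        "paladin.anomaly": down,
                        "scam_intel": down})
```

In this sketch one_down still yields "allow" but carries three factors, the third flagged as unreachable, while all_down yields "warn" with risk_score 0.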

Two adjacent fixes shipped in the same release. The public details field is now the static phrase "source unreachable" — never str(err), which previously could leak upstream API keys (Etherscan) through exception strings into the response body. And the build_outer_failure_block(err) helper centralizes the warn-with-three-flagged-factors shape (forced risk_score: 0, three unreachable factors, top-level error set to the exception class name) so the outer exception handlers on /v1/quote and /v1/trust-check can't drift apart on future contract changes. The internal TRUST_BLOCK_VERSION constant moved 1.0 → 1.1; the public response field is "version": "1.1" inside the trust block.
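As a rough sketch of what such a helper could look like, assuming only the properties listed above: the name build_outer_failure_block and the TRUST_BLOCK_VERSION constant come from this post, but the body below is an illustration, not the shipped implementation, and the example error message and secret are made up.

```python
import json
from typing import Any, Dict

TRUST_BLOCK_VERSION = "1.1"
SOURCES = ("ofac", "paladin.anomaly", "scam_intel")


def build_outer_failure_block(err: BaseException) -> Dict[str, Any]:
    """Build the warn-with-three-flagged-factors shape for an outer exception."""
    return {
        "risk_score": 0,
        "recommendation": "warn",
        "factors": [
            {"source": source, "signal": "unreachable",
             "details": "source unreachable",  # static phrase, never str(err)
             "real": False}
            for source in SOURCES
        ],
        "version": TRUST_BLOCK_VERSION,
        # Only the exception class name is exposed, not its message.
        "error": type(err).__name__,
    }


# Hypothetical exception whose message embeds a credential.
err = RuntimeError("GET https://upstream.invalid/api?apikey=HYPOTHETICAL_SECRET failed")
block = build_outer_failure_block(err)
serialized = json.dumps(block)
```

Because the response is built from constants plus type(err).__name__, nothing from the exception message can reach the serialized body, whatever the upstream client put in it.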

What it means for agent code

The schema is field-additive: no fields removed, one new signal enum value ("unreachable") added, version bumped to advertise the change. Clients that branch on recommendation and don't introspect individual factors continue working without modification.

The behavior change worth retesting: clients that hard-coded a path on recommendation === "allow" and assumed it implied "all sources evaluated successfully" should retest under partial-unreachable conditions. Pre-v0.11.73, those clients sometimes saw "allow" when one or two of the underlying sources had been silently dropped; now "allow" is returned only when at least one source was actually evaluated, and any source that was not appears as a flagged factor. Clients that want stricter behavior can additionally require that no factor has signal "unreachable".

For customers using the published plugins @paladinfi/eliza-plugin-trust and @paladinfi/agentkit-actions: both packages shipped v0.1.1 documentation patches on 2026-05-10 describing the new contract. The v0.1.1 releases are README-only. Against prior server versions, both plugins passed the trust block through to caller code without client-side defense, so agents that called the trust-check endpoint while the server ran v0.11.5x through v0.11.72 received whatever shape the server emitted, including the silent-allow shape during a triple outage. Customers wanting stricter behavior than the new server-side fail-closed default can add a client-side assertion such as factors.length === 3 && factors.every(f => f.signal !== "unreachable") before trusting an allow recommendation.
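For agents not written in JavaScript, the same client-side assertion can be sketched in Python. The helper name is_fully_evaluated_allow and the "clear" signal value in the sample blocks are assumptions for illustration; only the "unreachable" value and the three-factor count come from the contract described here.

```python
from typing import Any, Dict

EXPECTED_SOURCES = 3  # OFAC, anomaly heuristics, scam-intel


def is_fully_evaluated_allow(block: Dict[str, Any]) -> bool:
    """True only for an allow backed by all three sources, none unreachable."""
    factors = block.get("factors") or []
    return (
        block.get("recommendation") == "allow"
        and len(factors) == EXPECTED_SOURCES
        and all(f.get("signal") != "unreachable" for f in factors)
    )


# Pre-fix silent-allow shape: rejected because the factors list is empty.
silent_allow = {"risk_score": 0, "recommendation": "allow", "factors": []}

# Post-fix partial outage: allow, but one factor is flagged unreachable.
partial_allow = {
    "risk_score": 0,
    "recommendation": "allow",
    "factors": [
        {"source": "ofac", "signal": "clear"},
        {"source": "paladin.anomaly", "signal": "clear"},
        {"source": "scam_intel", "signal": "unreachable",
         "details": "source unreachable", "real": False},
    ],
}

# All three sources evaluated.
full_allow = {
    "risk_score": 0,
    "recommendation": "allow",
    "factors": [
        {"source": "ofac", "signal": "clear"},
        {"source": "paladin.anomaly", "signal": "clear"},
        {"source": "scam_intel", "signal": "clear"},
    ],
}
```

The length check is what protects a client that may still talk to a pre-v0.11.73 server, where a failed source was dropped rather than flagged.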

Discipline note

Trust gates have two failure modes: fail-open (return allow when the service can't evaluate) and fail-closed (withhold allow when the service can't evaluate). Fail-open is cheaper at the edge cases, because the agent's UX doesn't degrade during transient upstream outages, but it converts an evaluation gap into an implicit permission. For a service that exists to render permission decisions, fail-open is the wrong default. v0.11.73 chose fail-closed: an explicit warn when no source could be evaluated and a flagged factor for any source that could not, never a silent allow on missing data.

The other observation worth keeping: the vector was caught by a 3-adversary review of a different artifact — a homepage rewrite. Three reviewers asked structural questions that crossed surface boundaries ("what does the response shape look like under exception conditions?") and one of them landed on the silent-allow path. The bug had been latent across 23 patch releases. Adversarial review caught it before customer impact was reported. The concrete process change: all decision-rendering endpoints now ship with exception-path test fixtures asserting non-allow under each per-source-raise and the full all-sources-raise condition (tests/v0.11.73/test_fail_closed.py, 14 unit tests). The gap the bug exploited — exception paths not being treated as part of the decision surface — is closed at the test level, not just the code level.

One operational consequence buyers should monitor: a transient all-source outage now forces warn on every trust-check during the window instead of returning a stale-best-effort verdict. That's correct security posture but a new availability cost that agent ops should expect. We have not observed an all-source outage in production to date.
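One way agent ops could watch for this, sketched under assumptions: an outage-forced warn is recognizable because every factor is flagged unreachable, whereas a risk-driven warn has at least one evaluated factor. The helper names is_outage_warn and outage_warn_ratio, and the "flagged" and "clear" signal values in the sample data, are hypothetical.

```python
from typing import Any, Dict, Iterable


def is_outage_warn(block: Dict[str, Any]) -> bool:
    """True when a warn was forced because every source was unreachable."""
    factors = block.get("factors") or []
    return (
        block.get("recommendation") == "warn"
        and len(factors) > 0
        and all(f.get("signal") == "unreachable" for f in factors)
    )


def outage_warn_ratio(blocks: Iterable[Dict[str, Any]]) -> float:
    """Fraction of trust blocks in a window that were outage-forced warns."""
    window = list(blocks)
    if not window:
        return 0.0
    return sum(is_outage_warn(b) for b in window) / len(window)


def _unreachable(source: str) -> Dict[str, Any]:
    return {"source": source, "signal": "unreachable",
            "details": "source unreachable", "real": False}


outage_block = {
    "risk_score": 0,
    "recommendation": "warn",
    "factors": [_unreachable(s) for s in ("ofac", "paladin.anomaly", "scam_intel")],
}
risk_block = {
    "risk_score": 40,
    "recommendation": "warn",
    "factors": [{"source": "scam_intel", "signal": "flagged"}],
}
allow_block = {
    "risk_score": 0,
    "recommendation": "allow",
    "factors": [{"source": "ofac", "signal": "clear"}],
}

ratio = outage_warn_ratio([outage_block, risk_block, allow_block, allow_block])
```

A sustained nonzero ratio would point at upstream availability rather than at the addresses being checked, which is the distinction an alerting rule would need.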

Verify

Server version at swap.paladinfi.com/health should read 0.11.73. The trust-check endpoint contract is summarized at paladinfi.com/trust-check. The plugin updates are live on npm under the package names given above. Questions or repro cases: dev@paladinfi.com.