Hi there,
Anthropic just landed its first real banking deployment — not a pilot deck, not a proof of concept, but a production-bound AML agent co-built with FIS and heading into BMO and Amalgamated Bank by H2 2026. The headline is compelling: investigations compressed from days to minutes, fewer false positives, better SAR narratives. The production requirements underneath that headline are considerably harder to meet.
🔥 Featured Post
Anthropic's First Banking Agent Just Went Into AML. Here's the Production Architecture That Has to Hold.
- AML agents must assemble evidence across core banking systems, evaluate against typologies, and surface cases — all within a traceable, auditable chain that satisfies FinCEN and SR 26-2
- The "days to minutes" claim depends on retrieval latency from legacy core systems that were not designed for LLM query patterns
- LLM-generated SAR narratives are fluent, but FinCEN requires specific format compliance, so narrative drift is a real operational risk
- Reducing false positives without increasing false negatives is the hardest constraint in AML; getting one right while breaking the other is a regulatory failure
- The audit trail problem is unsolved: LLM reasoning chains are probabilistic, while bank examiners expect a decision trail that reads as deterministic and reproducible
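
One partial answer to the audit trail point is to make the record of what the agent did tamper-evident, even though the reasoning itself stays probabilistic. The sketch below is a minimal illustration, not the FIS or Anthropic design: it chains each logged agent step to the hash of the previous one, so any later edit breaks verification. Every name in it (`append_step`, `verify_trail`, the step fields, the IDs) is a hypothetical placeholder.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, step: dict) -> str:
    # Canonical JSON (sorted keys) so the same step always hashes the same way.
    payload = json.dumps({"prev": prev_hash, "step": step}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_step(trail: list, step: dict) -> None:
    # Link the new entry to the hash of the entry before it.
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    trail.append({"step": step, "prev": prev_hash, "hash": _digest(prev_hash, step)})


def verify_trail(trail: list) -> bool:
    # Recompute every hash in order; any edited or reordered entry fails.
    prev_hash = GENESIS
    for entry in trail:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["step"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical steps for one investigation (all values are made up).
trail = []
append_step(trail, {"action": "retrieve", "source": "core_banking", "record_id": "TXN-001"})
append_step(trail, {"action": "evaluate", "typology": "structuring", "model_version": "example-v1"})
append_step(trail, {"action": "escalate", "case_id": "CASE-001"})
```

Calling `verify_trail(trail)` on the untouched list returns `True`; changing any logged field afterwards makes it return `False`. This only proves the log was not altered after the fact. It does not make the model's output reproducible, which is the harder part of the problem.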
📚 In Case You Missed It
The MCP Bloat Tax: How 72% Context Burn and Cross-Vendor Data Egress Are Breaking Enterprise Agent Economics — MCP context bloat is burning up to 72% of agent context windows before any real work begins, and now ServiceNow and Atlassian are metering cross-vendor agent data access — exposing the hidden bill of enterprise multi-agent orchestration.
The $40,000 Benchmark: When AI Evals Cost More Than Training, Enterprise Quality Gates Break — AI evaluation costs have crossed a threshold where a single agent benchmark run can cost $2,829 and a statistically reliable eval suite can run $320K — meaning enterprise teams can no longer afford the evals needed to verify the agents they're deploying.
The Multi-Model Control Problem: 78% of Enterprises Run AI Inference Without a Unified Management Plane — F5's 2026 State of Application Strategy Report shows 78% of enterprises now run their own AI inference with an average of seven models in production — but only 28% have a unified control plane to manage, route, and govern them.
More posts dropping every day. Stay curious.
— Bhanu @ superml.dev
