// AI Tangle
From AI PoC to Production
Gemini Flash undercuts Pro. Anthropic hardens Claude. Verizon flags AI patch windows. Modal raises $355M to place their own bets.

Five days. Five different stories. One signal — the deployment stack moved from PoC to production this week. Google priced inference for scale at I/O. Anthropic shipped the two enterprise features that were keeping Claude agents in pilot. Verizon's annual breach report confirmed that AI attackers compressed the patch window from months to hours. And Modal raised $355 million on $300 million in revenue, proving the infrastructure to actually run all of it is now a fundable category. The week's real takeaway: production-grade AI is now table stakes, and the cost-versus-quality and security-versus-velocity conversations both moved at the same time.
// The Big AI Story
Google's Flash is now smarter than Pro — and cheaper than the cheap model
At I/O on Tuesday, Google shipped a new generation of Gemini Flash to general availability at $1.50 per million input tokens and $9 per million output tokens, with a one-million-token context window. The smaller, faster model beats the previous Gemini Pro on coding and agentic benchmarks at four times the output speed. Pro itself is held for June — Flash is carrying the I/O announcement on its own.
The price point is the news. At a buck-fifty per million input tokens and $9 per million output tokens, agentic workflows that were too expensive at production scale now pencil out. Customer support that wanted Claude or ChatGPT at flagship pricing can run on Flash. Document retrieval, contract review, code suggestions, lead enrichment: anything that was running at a thousand calls a day on a pilot can run at a million calls a day on a rollout without breaking the budget line. Google's bet is that the inference tier most enterprises actually buy is the cheap one, not the flagship, and they shipped Flash before Pro to make that point explicit.
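To make the pilot-versus-rollout math concrete, here is a rough sketch of the daily bill at the announced list prices. The per-call token counts (2,000 in, 300 out) are purely illustrative assumptions, not figures from Google; swap in your own workload's numbers.

```python
# List prices from the I/O announcement (USD per million tokens).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00

# Hypothetical per-call token profile; an assumption for illustration only.
ASSUMED_INPUT_TOKENS = 2_000
ASSUMED_OUTPUT_TOKENS = 300


def daily_cost(calls_per_day: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate daily spend in USD for a fixed per-call token profile."""
    input_cost = calls_per_day * input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = calls_per_day * output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Pilot volume versus rollout volume, as described above.
    for calls in (1_000, 1_000_000):
        cost = daily_cost(calls, ASSUMED_INPUT_TOKENS, ASSUMED_OUTPUT_TOKENS)
        print(f"{calls:>9,} calls/day -> ${cost:,.2f}/day")
```

Under these assumed token counts, the thousand-call pilot costs about $5.70 a day and the million-call rollout about $5,700 a day. Note that output tokens make up nearly half the bill despite being a small share of the volume, so verbose agents shift the math quickly.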
Anthropic and OpenAI are now flanked on cost per quality at this tier. The strategic question for buyers is whether Flash's agent and coding benchmarks hold up on their actual workloads — independent evals over the next two weeks will tell. If they do, the cost-versus-quality conversation that's been gating enterprise AI deployments for eighteen months just shifted, and every pilot that was stuck at "the model's good enough but the unit economics don't work" gets a second look. The companies that ran cost-prohibitive Pro pilots last year now have a Flash-priced Gemini that may match their quality bar at a fifth the per-token cost — and the procurement teams that approved those pilots get to take the savings to their CFOs in the same quarter the deployment ships. Expect a wave of "we ran the eval, switched to Flash, here's the cost savings" posts on LinkedIn over the next month. Some will be honest. Most won't disclose the workloads where Flash didn't match — but the directional move is real, and the floor on per-token economics for serious agentic work just dropped by a multiple.
// The Number
45%. Percentage of employees using unsanctioned AI tools at work, according to Verizon's 2026 breach report — tripled year-over-year. Shadow AI is no longer fringe. It's how most knowledge workers actually do their jobs.
// 4 Quick Hits
1. Anthropic ships the two features Claude needed for production
At Code with Claude London on Tuesday, Anthropic added self-hosted sandboxes and MCP tunnels to Claude Managed Agents. Self-hosted sandboxes let tool execution run on customer infrastructure (or on Cloudflare, Daytona, Modal, or Vercel) so the agent's working memory never touches Anthropic's servers. MCP tunnels open an outbound-only encrypted gateway so agents can reach internal MCP servers without poking inbound firewall holes. Together they remove the two architectural objections that have been keeping Claude agents in pilot: data exfiltration risk and inbound network exposure. The CISOs and platform leads who were blocking production agent deployments now have a vendor answer to both, and the burden shifts back to the business owner asking why the pilot isn't in production yet. The features pair with the Gemini Flash news above: Google ships inference priced for scale, Anthropic ships the trust posture that lets enterprises use it. The deployment stack got two important pieces simultaneously this week.
2. Verizon's DBIR: AI shrinks the patch window from months to hours
Vulnerability exploitation overtook stolen credentials as the top breach entry point this year — 31 percent of breaches, the first time it's led in the nineteen-year history of Verizon's Data Breach Investigations Report. The shift is attributed to AI-augmented attackers compressing exploit development from months to hours. The patching SLA your security team was operating against last quarter is now obsolete on the math alone. Board talking point for the next quarterly review: how fast can we patch a CVE on day one of disclosure, and what's the budget to get that number down? And how do we factor shadow AI — tripled year-over-year to 45 percent of employees — into the threat model?
3. Modal closes $355M on $300M ARR for serverless GPUs
Modal Labs raised $355 million at a $4.65 billion post-money valuation on Thursday, led by Redpoint and General Catalyst. Revenue has quintupled to roughly $300 million ARR since September, driven by biotech, hedge fund, and weather-forecasting workloads that previously sat on hyperscaler GPU clusters. For engineering leaders evaluating AI infrastructure spend, serverless GPU is no longer the experimental choice; it's the comparable that AWS, GCP, and Azure are now defending against on price-per-second and cold-start latency. The customer mix is itself a signal: biotech and hedge funds are notoriously cost-sensitive on compute, and weather forecasting runs continuously rather than in bursts. If workloads as sticky as a hedge fund's are migrating off provisioned GPUs to serverless, the procurement default for general-purpose AI inference is being pulled along with them.
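For readers who want the round in ratio form, a quick back-of-envelope sketch using only the figures reported above. ARR is "roughly" $300 million, so treat every output as approximate; the stake calculation also assumes the full $355 million was new primary capital, which the item does not confirm.

```python
# Figures as reported in the item above (USD).
RAISE = 355_000_000
POST_MONEY = 4_650_000_000
ARR_NOW = 300_000_000      # "roughly $300 million", so approximate
ARR_GROWTH_FACTOR = 5      # "quintupled" since September


def round_metrics(raise_usd, post_money_usd, arr_usd, growth_factor):
    """Derive simple ratios from the reported round and revenue figures."""
    return {
        # Post-money valuation as a multiple of current ARR.
        "valuation_to_arr": post_money_usd / arr_usd,
        # Share of the company the round represents (assumes all-primary capital).
        "stake_sold": raise_usd / post_money_usd,
        # ARR implied for September if revenue has grown by growth_factor since.
        "implied_prior_arr": arr_usd / growth_factor,
    }


if __name__ == "__main__":
    m = round_metrics(RAISE, POST_MONEY, ARR_NOW, ARR_GROWTH_FACTOR)
    print(f"Valuation / ARR:   {m['valuation_to_arr']:.1f}x")
    print(f"Stake sold:        {m['stake_sold']:.1%}")
    print(f"Implied Sept. ARR: ${m['implied_prior_arr'] / 1e6:,.0f}M")
```

On the reported numbers that works out to roughly 15.5x ARR, about 7.6 percent of the company, and an implied September ARR of about $60 million.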
4. Trump pulls AI executive order before signing
The administration postponed a planned executive order that would have required AI vendors to share frontier models with the federal government 14 to 90 days before public release. David Sacks, Elon Musk, and Mark Zuckerberg argued the pre-release window would slow US AI velocity against China, and the President opted to hold. Enterprise AI buyers can plan against a status-quo deregulatory US environment near-term and continue tracking EU AI Act compliance separately. Practical effect for legal and risk teams: the AI vendor diligence checklist gets one item shorter for the next six months, and the EU compliance posture is still the binding constraint for any company shipping into both markets. The bigger signal is that the US is choosing speed over safety review at the policy level, which puts more weight on internal AI governance frameworks for any enterprise that was hoping federal review would do some of the diligence work.
// 3 AI Tools
Manus — Browser-resident AI agent that plans and runs multi-step knowledge-work tasks end-to-end: research a competitor, build a slide deck, send the follow-up emails, populate a CRM. The novel use most teams miss is recurring: set the same Manus run on a Friday cadence for the weekly competitive-intel brief, and the agent compiles it while you sleep, drops it in your inbox by Monday, and gets sharper every week as you tune the prompt. Right pick when a task has more than three sub-steps and a clear deliverable; wrong pick when judgment matters at every decision point, because a run that goes wrong costs you a wasted afternoon.
Lovable — Build full-stack web apps from a natural-language description: type what you want, get a working app with database, auth, and deployment in minutes. Beyond the obvious "build my SaaS" use, the highest-leverage move is the throwaway one-off — the regional sales dashboard your CEO asks for the night before a board meeting that you ship before the BI team replies to your Slack, or the customer-facing pricing calculator marketing has been waiting six months for IT to scope. One of the fastest-growing software companies of the past two years and the default pick for non-dev internal tooling.
Clay — Sales-ops platform that pulls signals from 150-plus data sources, enriches prospect lists with AI research agents, and triggers personalized outreach at scale. Reach for it on the loss side as much as the win side — run last quarter's dead pipeline through Clay enrichment to see which segments to stop wasting time on, often a better ROI than yet another outbound sequence — then redirect the freed-up SDR cycles to chase the segments the data actually says will convert. The category-defining tool for AI-augmented outbound sales; most B2B GTM teams either use it, are piloting it, or are about to lose deals to a competitor who does.
// The Extra Read
Anna Tong · TechCrunch · News · 3 min
Talent migration is the cleanest leading indicator of where the frontier is moving. Karpathy joining Anthropic puts a named researcher behind the same enterprise-trust thesis Anthropic shipped product for this week — worth reading alongside the Big Story.
// From The AIE Network
Event · All Things AI 2027 — Durham, NC, March 22–23, 2027
The annual gathering for AI practitioners, founders, and operators building on the production stack — speakers, workshops, and the working dinner the Durham AI community is known for. Early-bird tickets are open.
That's the week that was. If even half the stories above hold their shape over the next two weeks, the back half of 2026 looks materially different from the first.

Your AI Sherpa,
Mark R. Hinkle
Founding Publisher, The AIE Network
Follow me on LinkedIn
If you want to get in contact or give me feedback, reply to this email. I read every single one of them.
