From the Lab

Insights on AgentOps, platform engineering, and shipping AI to production.

April 24, 2026

Memory, End-to-End: This Week in Omnia

Everyone on Reddit and X is shipping agentic memory this month. We've had basic memory in Omnia for a while; this week we cranked it up a bit more. Here's what landed between 2026-04-18 and 2026-04-24 — MemoryRetentionPolicy as a CRD, consent-revocation cascade, purpose-filtered retrieval, trust-aware redaction, summarisation-as-an-agent — and the facade auth chain that landed the same week, which is really the other half of the same story.

omnia, promptkit, status-update, memory, retention, auth, kubernetes

April 17, 2026

Skills, End-to-End: This Week in Omnia and PromptKit

For the last month 'Skills' meant slightly different things in each repo. This week both halves finally met — PromptKit shipped the primitives that make a skill safe to load, Omnia shipped the CRD, reconciler, runtime logging and dashboard that make declaring one a one-line change. Here's the status update: what shipped between 2026-04-11 and 2026-04-17, why it matters, and the PromptKit v1.4.5 release that dropped alongside it.

omnia, promptkit, status-update, skills, kubernetes

April 16, 2026

The First Rule of Fine-Tuning Is: You Don't Need to Fine-Tune

Fine-tuning isn't a model upgrade — it's a way of baking whatever data you already have into the model's wiring, permanently, in a way you can't edit afterwards. Here's what's actually happening inside, why LoRA and QLoRA made it cheap without changing anything about inference, and why the teams that win at it are the ones who did the unglamorous data work first.

llms, fine-tuning, training, machine-learning, infrastructure

April 14, 2026

The Two Families of Generative Inference: Autoregressive and Iterative Refinement

Every generative model in production today belongs to one of two architectural families. Text and music went autoregressive. Images and video went diffusion. Speech is a mess split across both. Here's how the two shapes differ, and why the choice settles almost every interesting operational question about the infrastructure underneath.

generative-ai, inference, gpu, diffusion, llms

April 10, 2026

Progressive Rollouts for AI Agents: Canary, Blue/Green, and Experiments in Six Phases

Two months ago we wrote about why prompt changes need canaries. This week we shipped the real thing — an Istio-backed, session-aware rollout system for AgentRuntime, built in six phases. Here's how it works and what we learned building it.

kubernetes, progressive-delivery, agentops, istio

April 10, 2026

Bulletproofing Streaming LLM Calls: Three Layers of Back-Pressure

A single HTTP/2 reset can kill 100 concurrent LLM streams at once. Naively retrying them makes it worse. Here's the three-layer back-pressure stack we built in PromptKit — and the benchmark showing it kept us 6× more efficient than LangChain at 2000 concurrent.

go, production, observability, llms

April 10, 2026

What Actually Happens When You Call an LLM API

Inside the token-by-token generation loop, the KV cache, vLLM's PagedAttention, and why 'just retry the request' is harder than it looks when the API you're calling isn't stateless at all.

llms, infrastructure, gpu, inference

March 15, 2026

Why 95% of AI Pilots Fail to Reach Production (And What to Do About It)

The barrier has shifted from AI technology to AI operations. Here's why most pilots die in deployment and what production AI actually requires.

agentops, enterprise-ai, production

March 13, 2026

How Transformer Attention Actually Works: A Worked Example

Attention, embeddings, Q/K/V, softmax — walked through by hand with two-dimensional numbers a platform engineer can verify on the back of an envelope. No machine-learning background required.

llms, gpu, inference, machine-learning

March 10, 2026

The Klarna Effect: What Happens When You Scale AI Agents Without Measurement

Klarna's AI went from triumph to cautionary tale. Here's what every CX leader deploying AI in 2026 needs to learn from their journey.

case-study, measurement, customer-experience

March 5, 2026

Why Platform Engineers Are the Next AI Engineers

If you've spent five years building on Kubernetes, you already have 90% of the skills needed to operate AI agents in production. Here's why the 'AI skills gap' is mostly a tooling gap.

platform-engineering, kubernetes, devops

March 3, 2026

The Framework Lock-In Trap: Why Your AI Agent Platform Shouldn't Pick Sides

Most agent deployment platforms force you into a single framework. Here's why framework-agnostic infrastructure matters and how to avoid costly lock-in.

agentops, enterprise-ai, platform-engineering

March 1, 2026

Self-Hosted AI Agents: Why You Shouldn't Need an Enterprise Contract

Most AI agent platforms gate self-hosted deployment behind enterprise sales calls. Here's why that model is broken and what self-hosted infrastructure should actually look like.

enterprise-ai, kubernetes, compliance, security

February 27, 2026

PromptPack: A Portable Standard for AI Agent Configuration

AI teams face the same configuration chaos that Docker solved for applications. PromptPack provides a portable, versioned standard for packaging AI agent prompts, tools, and configuration.

agentops, production, devops

February 25, 2026

Arena Fleet: Why AI Agents Need Unified Testing Infrastructure

AI agents require three types of testing -- load, evaluation, and data generation -- but most teams use fragmented tools. Here's why unified testing infrastructure changes the game.

testing, agentops, production

February 23, 2026

Kubernetes-Native AI Agents: Why the CNCF Is Betting on K8s for AI

Serverless doesn't fit AI agent workloads. Here's why Kubernetes is emerging as the foundation for production AI agent infrastructure, backed by CNCF investments.

kubernetes, platform-engineering, agentops

February 21, 2026

Context-Based Isolation: Solving the Multi-Session AI Compliance Problem

Most AI agent platforms have no concept of compliance-grade session isolation. Here's why context-based isolation matters for regulated industries and how to implement it.

compliance, governance, enterprise-ai, security

February 19, 2026

Voice AI Agents: The Three Execution Modes You Need to Understand

Building voice AI agents for production requires choosing between VAD pipelines, native audio LLMs, and hybrid architectures. Here's how each mode works and when to use it.

voice-ai, customer-experience, production

February 17, 2026

MCP: The Universal Protocol for AI Agent Tool Integration

Every AI framework handles tool integration differently. The Model Context Protocol provides a single standard that works everywhere -- build once, use with any agent.

agentops, platform-engineering, production

February 14, 2026

Observability for AI Agents: What Traditional APM Tools Miss

Your APM dashboard says everything is fine, but users say the AI is broken. Here's what AI-specific observability requires -- from conversation tracing to cost intelligence.

observability, agentops, production, cost-management

February 12, 2026

Go vs. Python for Production AI Agents: When Runtime Choice Matters

Python dominates the AI ecosystem, but production AI agents have infrastructure requirements that push teams toward Go. Here's the performance data and a practical hybrid approach.

production, agentops, platform-engineering

February 10, 2026

Canary Deployments for AI Prompts: Reducing the Blast Radius of Prompt Changes

Prompt changes have 100% blast radius by default and fail quietly. Here's how canary deployments -- the same pattern that made code releases safer -- can protect your AI agents.

agentops, production, devops

February 8, 2026

Multi-Provider LLM Strategy: Why Betting on One Provider Is a Risk

Single-provider lock-in creates outage risk, cost inflexibility, and capability gaps. Here's how to build a multi-provider LLM strategy with practical routing and failover patterns.

enterprise-ai, production, cost-management

February 5, 2026

Red-Teaming AI Agents: Finding Failures Before Your Users Do

Normal testing proves AI agents work. Red-teaming proves how they fail. Here's how to build automated adversarial testing into your AI agent deployment pipeline.

testing, security, production

February 3, 2026

Cost Intelligence for AI: Your Cloud Bill Doesn't Tell the Whole Story

Your cloud bill says you spent $80K on AI. That tells you almost nothing. Here's how to build application-level cost intelligence that actually enables decisions.

cost-management, observability, measurement

February 1, 2026

Cloud Agent Platforms Compared: AWS, Azure, Google, and the Open Alternative

Every major cloud provider now offers an AI agent platform. Here's an honest comparison of AWS Bedrock Agents, Azure AI Agent Service, Google Vertex AI, and cloud-agnostic alternatives.

enterprise-ai, platform-engineering, kubernetes

January 30, 2026

The AI Measurement Paradox: Why 79% Think It Works But Only 29% Can Prove It

Worldwide AI spending will hit $2.5 trillion in 2026, yet most enterprises can't prove their investments are paying off. Here's why measurement is the defining challenge of enterprise AI.

measurement, enterprise-ai, case-study

January 28, 2026

The Knowledge Codification Problem: Why Enterprise AI Is Stuck at Assist

The bottleneck for enterprise AI isn't model quality or infrastructure -- it's the inability to codify institutional knowledge into a form AI systems can execute. Here's how to break through.

enterprise-ai, agentops, production

January 26, 2026

From Connectors to Capabilities: Why Your AI Agent Needs More Than API Access

MCP solved the connector problem for AI agents. But connecting to Zendesk isn't the same as knowing how to handle a customer escalation. The next abstraction layer is codified operational knowledge.

agentops, platform-engineering, enterprise-ai

January 23, 2026

The Trust Plateau: Why 79% of Consumers Still Prefer Humans Over AI Agents

Consumer trust in AI agents remains stubbornly low despite massive enterprise investment. Here's why adoption is outpacing trust and how to build a deployment strategy that earns it.

customer-experience, enterprise-ai, measurement

January 21, 2026

AI Guardrails Stop Being Optional in 2026: What Your Agent Deployment Needs Now

The EU AI Act reaches full enforcement in August 2026. California's AI Transparency Act is already live. Here's what production-grade AI guardrails actually require.

compliance, security, governance

January 19, 2026

Data Sovereignty and AI: Why Where Your Agent Runs Matters More Than Which Model It Uses

93% of executives now rank data sovereignty as their top technology governance concern. Here's why the physical location of AI inference has become a first-order architecture decision.

compliance, security, enterprise-ai, kubernetes

January 17, 2026

The Integration Tax: Why Enterprises Need Six Tools to Run One AI Agent

Deploy one AI agent, watch six vendor contracts appear. The integration tax -- the cumulative cost of a fragmented AI stack -- is a primary driver of AI project failure.

platform-engineering, enterprise-ai, devops

January 14, 2026

RAG in Production: Why 72% of Enterprise Implementations Fail in Year One

Most enterprise RAG implementations fail not because of model limitations but because of knowledge organization failures. Here are the five failure modes and what actually works.

production, enterprise-ai, testing

January 12, 2026

The Agent Quality Crisis: Why AI-Generated Code Has 1.7x More Issues Than Human Code

AI-generated pull requests contain 1.7x more issues than human-written code, with 2.74x more XSS vulnerabilities. Speed was the 2025 story. Quality will be the 2026 reckoning.

testing, production, measurement

January 10, 2026

Why Your AI Agent Needs Memory: Building Persistent Relationships, Not Just Conversations

Every conversation with your AI agent starts from zero. Memory infrastructure -- episodic, semantic, and procedural -- is the layer that transforms transactional tools into trusted advisors.

agentops, production, customer-experience

January 8, 2026

The SI Opportunity: How Consulting Firms Can Turn AI Expertise Into Recurring Revenue

The AI consulting market is $11-22B and growing, but every engagement produces a deliverable, not a product. Here's how SIs can productize domain expertise into reusable, deployable bundles.

enterprise-ai, platform-engineering, case-study

January 5, 2026

Assist, Execute, Operate: A Practical Framework for AI Agent Maturity

40% of agentic AI projects will be canceled by 2027 because organizations skip maturity stages. Here's a three-level framework grounded in what the data shows actually works.

enterprise-ai, agentops, measurement

January 3, 2026

The METR Paradox: When AI Tools Make Experienced Developers 19% Slower

A rigorous randomized controlled trial found that AI coding tools made experienced developers 19% slower, even though the developers believed they were 20% faster. The implications for how we measure AI ROI are profound.

measurement, production, enterprise-ai

January 1, 2026

Testing AI Agents at Scale: Why You Can't Ship What You Can't Measure

42% of AI initiatives failed in 2025 and 39% of AI bots were pulled back due to quality issues. The root cause: deploying systems that can't be adequately tested with traditional methods.

testing, agentops, production

December 29, 2025

Enterprise AI in 2026: What's Real, What's Hype, and What's Next

The AI agent market is growing at 43% CAGR, but only 130 of thousands of vendors have genuine agent capabilities. Here's how to separate signal from noise.

enterprise-ai, agentops, measurement

December 27, 2025

Building Customer Support Agents That Don't Embarrass Your Brand

AI support agents can resolve 55-70% of tier-1 queries at a fraction of human cost. But 39% of companies pulled back their bots in Q1 2025. Here's what separates success from brand damage.

customer-experience, enterprise-ai, production

December 24, 2025

Reusable AI: Why Every Enterprise Implementation Should Produce a Product, Not Just a Project

42% of AI initiatives fail and each costs $6.8M on average. The root cause: every implementation starts from zero. Here's how to shift from project delivery to product delivery.

enterprise-ai, platform-engineering, agentops

December 22, 2025

Beyond Token Counts: The KPIs That Actually Prove Your AI Agent Works

79% of leaders perceive AI productivity gains but only 29% can measure ROI. Companies with AI-native KPIs see 3x the financial benefit -- yet only 34% have adopted them.

measurement, enterprise-ai, observability