This episode with Joachim Hill-Grannec asks: How do platforms bloat, and how do you keep them simple and fast with trunk-based dev and small batches? Which metrics prove it works—cycle time, uptime, or developer experience? Can security act as a partner that speeds delivery instead of a gate?
We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
DevSecOps Talks podcast LinkedIn page
DevSecOps Talks podcast website
DevSecOps Talks podcast YouTube channel
Summary

In this episode of DevSecOps Talks, Mattias speaks with Joachim Hill-Grannec, co-founder of Peltek, a boutique consulting firm specializing in high-availability, cloud-native infrastructure. Following up on a previous episode where Steve discussed cleaning up bloated platforms, Mattias and Joachim dig into why platforms get bloated in the first place and how platform teams should think when building from scratch. Their conversation spans cloud provider preferences, the primacy of cycle time, the danger of adding process in response to failure, and a strong argument for treating security and quality as enablers rather than gatekeepers.
Key Topics

Platform Teams Should Serve Delivery Teams

Joachim frames the core question of platform engineering around who the platform is actually for. His answer is clear: the delivery teams are the client. Platform engineers should focus on making it easier for developers to ship products, not on making their own work more convenient.
He connects this directly to platform bloat. In his experience, many platforms grow uncontrollably because platform engineers keep adding tools that help the platform team itself: "Look, I spent this week to make my job this much faster." But Joachim pushes back on this instinct — the platform team is an amplifier for the organization, and every addition should be evaluated by whether it helps a product get to production faster and gives developers better visibility into what they are working on.
Choosing a Cloud Provider: Preferences vs. Reality

The conversation briefly explores cloud provider choices. Joachim says GCP is his personal favorite from a developer perspective because of cleaner APIs and faster response times, though he acknowledges Google's tendency to discontinue services unexpectedly. He describes AWS as the market workhorse — mature, solid, and widely adopted, comparing it to "the Java of the land." Azure gets the coldest reception; both acknowledge it has improved over time, but Joachim says he still struggles whenever he is forced to use it.
They observe that cloud choices are frequently made outside engineering. Finance teams, investors, and existing enterprise agreements often drive the decision more than technical fit. Joachim notes a common pairing: organizations using Google Workspace for productivity but AWS for cloud infrastructure, partly because the Entra ID (formerly Azure AD) integration with AWS Identity Center works more smoothly via SCIM than the equivalent Google Workspace setup, which requires a Lambda function to sync groups.
Measuring Platform Success: Cycle Time Above All

When Mattias asks how a team can tell whether a platform is actually successful, Joachim separates subjective and objective measures.
On the subjective side, he points to developer happiness and developer experience (DX). Feedback from delivery teams matters, even if surveys are imperfect.
On the objective side, his favorite metric is cycle time — specifically, the time from when code is ready to when it reaches production. He also mentions uptime and availability, but keeps returning to cycle time as the clearest indicator that a platform is helping teams deliver faster. This aligns with DORA research, which has consistently shown that deployment frequency and lead time for changes are strong predictors of overall software delivery performance.
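A minimal sketch of how a team might compute this metric, assuming "ready" means merged to trunk (the records and helper names here are illustrative, not from the episode):

```python
from datetime import datetime, timedelta
from statistics import median

def cycle_time(ready_at: datetime, deployed_at: datetime) -> timedelta:
    """Time from code being ready (e.g. merged to trunk) to running in production."""
    return deployed_at - ready_at

# Hypothetical deployment records: (merged_at, deployed_at) pairs.
records = [
    (datetime(2024, 5, 1, 9, 0),  datetime(2024, 5, 1, 9, 40)),
    (datetime(2024, 5, 1, 13, 0), datetime(2024, 5, 2, 10, 0)),
    (datetime(2024, 5, 2, 11, 0), datetime(2024, 5, 2, 11, 25)),
]
times = [cycle_time(ready, deployed) for ready, deployed in records]
print(median(times))  # → 0:40:00; the median is less skewed by one slow release than the mean
```

Tracking the trend of this number over weeks, rather than a single snapshot, is what shows whether the platform is actually speeding teams up.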
Start With a Highway to Production

A major theme of the episode is that platforms should begin with the shortest possible route to production. Mattias calls this a "highway to production," and Joachim strongly agrees.
For greenfield projects, Joachim favors extremely fast delivery at first — commit goes to production, commit goes to production — even with minimal process. As usage and risk increase, teams can gradually add automation, testing, and safeguards. The critical thing is to keep the flow and then ask "how do we make those steps faster?" as you add them, rather than letting each new step slow down the pipeline unchallenged.
He also makes a strong case for tags and promotions over branch-based deployment, noting his instinctive reaction when someone asks "which branch are we deploying from?" is: "No branches — tags and promotions."
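The distinction can be modeled in a few lines. In this sketch (a toy model, not a real registry or CD API), an artifact is built exactly once and "promotion" only moves an environment pointer to the same immutable digest, whereas branch-based deployment typically rebuilds per environment:

```python
import hashlib

def build(source: str) -> str:
    """Build once from trunk; the digest identifies the immutable artifact."""
    return "sha256:" + hashlib.sha256(source.encode()).hexdigest()[:12]

# Each environment points at a digest; promotion just moves the pointer.
environments = {"dev": None, "staging": None, "prod": None}

def promote(digest: str, env: str) -> None:
    environments[env] = digest  # pointer update only; the artifact never changes

digest = build("app@commit-abc123")
promote(digest, "dev")
promote(digest, "staging")
promote(digest, "prod")

# Every environment runs the exact bits that were verified earlier in the flow.
assert environments["dev"] == environments["staging"] == environments["prod"]
```

The design point is that what reaches production is byte-for-byte what was tested, which a per-branch rebuild cannot guarantee.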
The Trap of Slowing Down After Failure

Joachim warns about a common and dangerous pattern: when a bug reaches production, the natural organizational reaction is not to fix the pipeline, but to add gates. A QA team does a full pass, a security audit is inserted, a manual review step appears. Each gate slows delivery, which leads to larger batches, which increases risk, which triggers even more controls.
He sees this as a vicious cycle. Organizations that respond to incidents by slowing delivery actually get worse security, worse quality, and worse throughput over time. He references a study — likely the research behind the book Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim — showing that faster delivery correlates with better security and quality outcomes. The organizations adding Engineering Review Boards (ERBs) and Architecture Review Boards (ARBs) in the name of safety often do not measure the actual impact, so they never see that the controls are making things worse.
Mattias connects this to AI-assisted development, where developers can now produce changes faster than ever. If the pipeline cannot keep up, the pile of unreleased changes grows, making each release riskier.
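The batch-size argument can be made concrete with a toy calculation. Assuming each change independently carries a small chance of a defect (the 2% figure is an illustrative assumption, not from the episode), the probability a release ships clean falls quickly as unreleased changes pile up:

```python
# If each change has an independent probability p of introducing a defect,
# a release bundling n changes is defect-free with probability (1 - p) ** n.
def clean_release_probability(batch_size: int, p: float = 0.02) -> float:
    return (1 - p) ** batch_size

for n in (1, 10, 50):
    print(f"{n:>2} changes -> {clean_release_probability(n):.1%} chance of a clean release")
# → 1 change 98.0%, 10 changes 81.7%, 50 changes 36.4%
```

Small batches also make the inverse problem easier: when something does break, a one-change release leaves little doubt about which change caused it.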
Getting Buy-In: Start With Small Experiments

Joachim does not recommend that a slow, process-heavy organization throw everything out overnight. Instead, he suggests starting with small experiments. Code promotions are a good entry point: teams can start producing artifacts more rapidly without changing how those artifacts are deployed. Once that works, the conversation shifts to delivering those artifacts faster.
He finds starting on the artifact pipeline side produces quicker wins and more organizational buy-in than starting with the platform deployment side, which tends to be more intertwined and higher-risk to change.
Guiding Principles Over a Rigid Golden Path

Mattias questions the idea of a single "golden path," saying the term implies one rigid way of working. Joachim leans toward guiding principles instead.
His strongest principle is simplicity — specifically, simplicity to understand, not necessarily simplicity to create. He references Rich Hickey's influential talk Simple Made Easy (from Strange Loop 2011), which distinguishes between things that are simple (not intertwined) and things that are easy (familiar or close at hand). Creating simple systems is hard work, but the payoff is systems that are easy to reason about, easy to change, and easy to secure.
His second guiding principle is replaceability. When evaluating any tool in the platform, he asks: "How hard would it be to yank this out and replace it?" If swapping a component would be extremely difficult, that is a smell — it means the system has become too intertwined. Even with a tool as established as Argo CD, his team thinks about what it would look like to switch it out.
Tooling Choices and Platform Foundations

Joachim outlines the patterns his team typically uses when building platforms, organized into two paths:
Delivery pipeline (artifact creation):
- Trunk-based development over GitFlow
- Release tags and promotions rather than branch-based deployment
- Containerization early in the pipeline
- Release Please for automated release management and changelogs
- Renovate for dependency updates, including promoting Helm chart and container image versions into production environments
Platform side (environment management):
- Kubernetes-heavy, typically EKS on AWS
- Karpenter for node scaling
- AWS Load Balancer Controller only as a backing service for a separate ingress controller (not using ALB Ingress directly, due to its rough edges)
- Argo CD for GitOps synchronization and deployment
- Argo CD Image Updater in lower environments to pull the latest images automatically
- Helm for packaging, despite its learning curve
He notes that the community-maintained NGINX ingress controller (ingress-nginx) is being retired, so teams need to evaluate alternatives for their ingress layer.
Developers Should Not Be Fully Shielded From Operations

One of the more nuanced parts of the conversation is how much operational responsibility developers should have. Joachim rejects both extremes. He does not think every developer needs to know everything about infrastructure, but he has seen too many cases where developers completely isolated from runtime concerns make poor decisions — missing simple code changes that would make a system dramatically easier to deploy and operate.
He advocates for transparency and collaboration. Platform repos should be open for anyone on the dev team to submit pull requests. When the platform team makes a change, they should pull in developers to work alongside them. This way, the delivery team gradually builds a deeper understanding of how the whole system works.
Joachim loves the open-source maintainer model applied inside organizations: platform teams are maintainers of their areas, but anyone in the organization should be able to introduce change. He warns against building custom CLIs or heavy abstractions that create dependencies — if a developer wants to do something the CLI does not support, the platform team becomes a bottleneck.
Mattias adds that opening up the platform to contributions also exposes assumptions. What feels easy to the person who built it may not be easy at all; it is just familiar. Outside contributors reveal where the system is actually hard to understand.
Designers, Not Artists: Detaching Ego From Code

Joachim shares an analogy he prefers over the common "developers as artists" framing. He sees developers more like designers than artists, because an artist's work is tied to their identity — they want it to endure. A designer, by contrast, creates something to serve a purpose and expects it to be replaced when something better comes along.
He applies this to platforms and infrastructure: "I want my thing to get wiped out. If I build something, I want it to get removed eventually and have something better replace it." Organizations where ego is tied to specific systems or tools tend to resist change, which leads to the kind of dysfunction that keeps platforms bloated and brittle.
Complexity Is the Enemy of Security

Mattias raises the difficulty of maintaining complex security setups over time, especially when the original experts leave. Joachim responds firmly: complexity is anti-security.
If people cannot comprehend a system, they cannot secure it well. He acknowledges that some problems are genuinely hard, but argues that much of the complexity engineers create is unnecessary — driven by ego rather than need. "The really smart people are the ones that create simple things," he says, wishing the industry would redirect its narrative from admiring complicated systems to admiring simple ones.
Security and QA as Internal Consulting, Not Gatekeeping

Joachim draws a parallel between security and QA. He dislikes calling a team "the quality team," preferring "verification" — they are one component of quality, not the entirety of it. Similarly, security is not one team's responsibility; it spans product design, development practices, tooling, and operations.
His ideal model is for security and QA teams to operate as internal consultants whose goal is to reduce risk and improve the overall system — not to catch every possible issue at any cost. The framing matters: if a security team's mandate is simply "block all security issues," the logical conclusion is to stop shipping or delete the product entirely. That may be technically secure, but it is useless.
He frames security as risk management: "Security is a risk management process, not just security for the sake of security. You're managing the risk to the business." The goal should be to deliver faster and more securely — an "and," not an "or."
Mattias recalls a PCI DSS consultant joking over drinks that a system being down is perfectly compliant — no one can steal card numbers if the system is unavailable. The joke lands because it exposes exactly the broken incentive Joachim describes.
Business Value as the Unifying Frame

The episode closes by tying everything back to business outcomes. Joachim argues that speed and security are not opposites; both contribute to business value. Fast delivery creates value directly, while security reduces business risk — and risk management is itself a business operation.
He explains why focusing on the highest-impact business bottleneck first builds trust. When you hit the big items first, you earn credibility, and subsequent changes become easier to justify. For example, one of his clients has a security group that is the slowest part of their organization. Speeding up that security process would have a massive impact on business delivery — more than optimizing the artifact pipeline.
Mattias reflects that he used to see platform work as separate from business concerns — "I don't care about the business, I'm here to build a platform for developers." Looking back, he would reframe that: using business impact as the measure of platform success does not mean abandoning the focus on developers, it means having a clearer way to prioritize and demonstrate value.
Summary

In this episode of DevSecOps Talks, Mattias and Paulina speak with Steve Wade, founder of Platform Fix, about why so many Kubernetes and platform initiatives become overcomplicated, expensive, and painful for developers. Steve has helped simplify over 50 cloud-native platforms and estimates he has removed around $100 million in complexity waste. The conversation covers how to spot a bloated platform, why "free" tools are never really free, how to systematically delete what you don't need, and why the best platform engineering is often about subtraction rather than addition.
Key Topics

Steve's Background: From Complexity Creator to Strategic Deleter

Steve introduces himself as the founder of Platform Fix — the person companies call when their Kubernetes migration is 18 months in, millions over budget, and their best engineers are leaving. He has done this over 50 times, and he is candid about why it matters so much to him: he used to be this problem.
Years ago, Steve led a migration that was supposed to take six months. Eighteen months later, the team had 70 microservices, three service meshes (they kept starting new ones without finishing the old), and monitoring tools that needed their own monitoring. Two senior engineers quit. The VP of Engineering gave Steve 90 days or the team would be replaced.
Those 90 days changed everything. The team deleted roughly 50 of the 70 services, ripped out all the service meshes, and cut deployment time from three weeks of chaos to three days, consistently. Six months later, one of the engineers who had left came back. That experience became the foundation for Platform Fix.
As Steve puts it: "While everyone's collecting cloud native tools like Pokemon cards, I'm trying to help teams figure out which ones to throw away and which ones to keep."
Why Platform Complexity Happens

Steve explains that organizations fall into a complexity trap by continuously adding tools without questioning whether they are actually needed. He describes walking into companies where the platform team spends 65–70% of their time explaining their own platform to the people using it. His verdict: "That's not a team, that's a help desk with infrastructure access."
People inside the complexity normalize it. They cannot see the problem because they have been living in it for months or years. Steve identifies several drivers: conference-fueled recency bias (someone sees a shiny tool at KubeCon and adopts it without evaluating the need), resume-driven architecture (engineers choosing tools to pad their CVs), and a culture where everyone is trained to add but nobody asks "what if we remove something instead?"
He illustrates the resume-driven pattern with a story from a 200-person fintech. A senior hire — "Mark" — proposed a full stack: Kubernetes, Istio, Argo, Crossplane, Backstage, Vault, Prometheus, Loki, Tempo, and more. The CTO approved it because "Spotify uses it, so it must be best practice." Eighteen months and $2.3 million later, six engineers were needed just to keep it running, developers waited weeks to deploy, and Mark left — with "led Kubernetes migration" on his CV. When Steve asked what Istio was actually solving, nobody could answer. It was costing around $250,000 to run, for a problem that could have been fixed with network policies.
He also highlights a telling sign: he asked three people in the same company how many Kubernetes clusters they needed and got three completely different answers. "That's not a technical disagreement. That's a sign that nobody's aligned on what the platform is actually for."
The AI Layer: Tool Fatigue Gets Worse

Paulina observes that the same tool-sprawl pattern is now being repeated with AI tooling — an additional layer of fatigue on top of what already exists in the cloud-native space. Steve agrees and adds three dimensions to the AI complexity problem: choosing which LLM to use, learning how to write effective prompts, and figuring out who is accountable when AI-written code does not work as expected. Mattias notes that AI also enables anyone to build custom tools for their specific needs, which further expands the toolbox and potential for sprawl.
How Leaders Can Spot a Bloated PlatformOne of the most practical segments is Steve's framework for helping leaders who are not hands-on with engineering identify platform bloat. He gives them three things to watch for:
Steve uses a memorable analogy: many platforms are like the Sagrada Familia in Barcelona — they look incredibly impressive and intricate, but they are never actually finished. The question leaders should ask is: what does an MVP platform look like, what tools does it need, and how do we start delivering business value to the developers who use it? Because, as Steve says, "if we're not building any business value, we're just messing around."
Who the Platform Is Really ForMattias asks the fundamental question: who is the platform actually for? Steve's answer is direct — the platform's customers are the developers deploying workloads to it. A platform without applications running on it is useless.
He distinguishes three stages: - Vanilla Kubernetes: the out-of-the-box cluster - Platform Kubernetes: the foundational workloads the platform needs to function (secret management, observability, perhaps a service mesh) - The actual platform: only real once applications are being deployed and business value is delivered
The hosts discuss how some teams build platforms for themselves rather than for application developers or the business, which is a fast track to unnecessary complexity.
Kubernetes: Standard Tool or Premature Choice?The episode explores when Kubernetes is the right answer and when it is overkill. Steve emphasizes that he loves Kubernetes — he has contributed to the Flux project and other CNCF projects — but only when it is earned. He gives an example of a startup with three microservices, ten users, and five engineers that chose Kubernetes because "Google uses it" and the CTO went to KubeCon. Six months later, they had infrastructure that could handle ten million users while serving about 97.
"Google needs Kubernetes, but your Series B startup needs to ship features."
Steve also shares a recent on-site engagement where he ran the unit economics on day two: the proposed architecture needed four times the CPU and double the RAM for identical features. One spreadsheet saved the company from a migration that would have destroyed the business model. "That's the question nobody asks before a Kubernetes migration — does the maths actually work?"
Mattias pushes back slightly, noting that a small Kubernetes cluster can still provide real benefits if the team already has the knowledge and tooling. Paulina adds an important caveat: even if a consultant can deploy and maintain Kubernetes, the question is whether the customer's own team can realistically support it afterward. The entry skill set for Kubernetes is significantly higher than, say, managed Docker or ECS.
Managed Services and "Boring Is Beautiful"Steve's recommendation for many teams is straightforward: managed platforms, managed databases, CI/CD that just works, deploy on push, and go home at 5 p.m. "Boring is beautiful, especially when you call me at 3 a.m."
He illustrates this with a company that spent 18 months and roughly $850,000 in engineering time building a custom deployment system using well-known CNCF tools. The result was about 80–90% as good as GitHub Actions. The migration to GitHub Actions cost around $30,000, and the ongoing maintenance cost was zero.
Paulina adds that managed services are not completely zero maintenance either, but the operational burden is orders of magnitude less than self-managed infrastructure, and the cloud provider takes on a share of the responsibility.
The New Tool Tax: Why "Free" Tools Are Never FreeA central theme is that open-source tools carry hidden costs far exceeding their license fee. Steve introduces the new tool tax framework with four components, using Vault (at a $40,000 license) as an example:
Total year-one cost: roughly $243,000 — a 6x multiplier over the $40,000 budget. And as Steve points out, most teams never present this full picture to leadership.
Mattias extends the point to tool documentation complexity, noting that anyone who has worked with Envoy's configuration knows how complicated it can be. Steve adds that Envoy is written in C — "How many C developers do you have in your organization? Probably zero." — yet teams adopt it because it offers 15 to 20 features that may or may not be useful.
This is the same total cost of ownership concept the industry has used for on-premises hardware, but applied to the seemingly "free" cloud-native landscape. The tools are free to install, but they are not free to manage and maintain.
Why Service Meshes Are Often the First to GoWhen Mattias asks which tool type Steve most often deletes, the answer is service meshes. Steve does not name a specific product but says six or seven times out of ten, service meshes exist because someone thought they were cool, not because the team genuinely needed mutual TLS, rate limiting, or canary deploys at the mesh level.
Mattias agrees: in his experience, he has never seen an environment that truly required a service mesh. The demos at KubeCon are always compelling, but the implementation reality is different. Steve adds a self-deprecating note — this was him in the past, running three service meshes simultaneously because none of them worked perfectly and he kept starting new ones in test mode.
A Framework for Deleting ToolsSteve outlines three frameworks he uses to systematically simplify platforms.
The Simplicity Test is a diagnostic that scores platform complexity across ten dimensions on a scale of 0 to 50: tool sprawl, deployment complexity, cognitive load, operational burden, documentation debt, knowledge silos, incident frequency, time to production, self-service capability, and team satisfaction. A score of 0–15 is sustainable, 16–25 is manageable, 26–35 is a warning, and 36–50 is crisis. Over 400 engineers have taken it; the average score is around 34. Companies that call Steve typically score 38 to 45.
The Four Buckets categorize every tool: Essential (keep it), Redundant (duplicates something else — delete immediately), Over-engineered (solves a real problem but is too complicated — simplify it), or Premature (future-scale you don't have yet — delete for now).
From one engagement with 47 tools: 12 were essential, 19 redundant, 11 over-engineered, and 5 premature — meaning 35 were deletable.
He then prioritizes by impact versus risk, tackling high-impact, low-risk items first. For example, a large customer had Datadog, Prometheus, and New Relic running simultaneously with no clear rationale. Deleting New Relic took three hours, saved $30,000, and nobody noticed. Seventeen abandoned databases with zero connections in 30 days were deprecated by email, then deleted — zero responses, zero impact.
The security angle matters here too: one of those abandoned databases was an unpatched attack surface sitting in production with no one monitoring it. Paulina adds a related example — her team once found a Flyway instance that had gone unpatched for seven or eight years because each team assumed the other was maintaining it. As she puts it, lack of ownership creates the same kind of hidden risk.
The 30-Day Cleanup SprintSteve structures platform simplification as a focused 30-day effort:
He illustrates this with a company whose VP of Engineering — "Sarah" — told him: "This isn't a technical problem anymore. This is a people problem." Two senior engineers had quit on the same day with the same exit interview: "I'm tired of fighting the platform." One said he had not had dinner with his kids on a weekend in six months. The team's morale score was 3.2 out of 10.
The critical insight: the team already knew what was wrong. They had known for months. But nobody had been given permission to delete anything. "That's not a cultural problem and it's not a knowledge problem. It's a permissions problem. And I gave them the permission."
Results: complexity score dropped from 42 to 26, monthly costs fell from $150,000 to $80,000 (roughly $840,000 in annual savings), and deployment time improved from two weeks to one day.
But Steve emphasizes the human outcome. A developer told him afterward: "Steve, I went home at 5 p.m. yesterday. It's the first time in eight months. And my daughter said, 'Daddy, you're home.'" That, Steve says, is what this work is really about.
Golden Paths, Guardrails, and Developer ExperienceMattias says he wants the platform he builds to compete with the easiest external options — Vercel, Netlify, and the like. If developers would rather go elsewhere, the internal platform has failed.
Steve agrees and describes a pattern he sees constantly: developers do not complain when the platform is painful — they route around it. He gives an example from a fintech where a lead developer ("James") needed a test environment for a Friday customer demo. The official process required a JIRA ticket, a two-day wait, YAML files, and a pipeline. Instead, James spun up a Render instance on his personal credit card: 12 minutes, deployed, did the demo, got the deal. Nobody knew for three months, until finance found the charges.
Steve's view: that is not shadow IT or irresponsibility — it is a rational response to poor platform usability. "The fastest path to business value went around the platform, not through it."
The solution is what Steve calls the golden path — or, as he reframes it using a bowling alley analogy, golden guardrails. Like the bumpers that keep the ball heading toward the pins regardless of how it is thrown, the guardrails keep developers on a safe path without dictating exactly how they get there. The goal is hitting the pins — delivering business value.
Mattias extends the guardrails concept to security: the easiest path should also be the most secure and compliant one. If security is harder than the workaround, the workaround wins every time. He aims to make the platform so seamless that developers do not have to think separately about security — it is built into the default experience.
Measuring Outcomes, Not FeaturesSteve argues that platform teams should measure developer outcomes, not platform features: time to first deploy, time to fix a broken deployment, overall developer satisfaction, and how secure and compliant the default deployment paths are.
He recommends monthly platform retrospectives where developers can openly share feedback. In these sessions, Steve goes around the room and insists that each person share their own experience rather than echoing the previous speaker. This builds a backlog of improvements directly tied to real developer pain.
Paulina agrees that feedback is essential but notes a practical challenge: in many organizations, only a handful of more active developers provide feedback, while the majority say they do not have time and just want to write code. Collecting representative feedback requires deliberate effort.
She also raises the business and management perspective. In her consulting experience, she has seen assessments include a third dimension beyond the platform team and developers: business leadership, who focus on compliance, security, and cost. Sometimes the platform enables fast development, but management processes still block frequent deployment to production — a mindset gap, not a technical one. Steve agrees and points to value stream mapping as a technique for surfacing these bottlenecks with data.
Translating Engineering Work Into Business ValueSteve makes a forceful case that engineering leaders must express technical work in business terms. "The uncomfortable truth is that engineering is a cost center. We exist to support profit centers. The moment we forget that, we optimize for architectural elegance instead of business outcomes — and we lose the room."
He illustrates this with a story: a CFO asked seven engineering leaders one question — "How long to rebuild production if we lost everything tomorrow?" Five seconds of silence. Ninety-four years of combined experience, and nobody could answer. "That's where engineering careers die."
The translation matters at every level. Saying "we deleted a Jenkins server" means nothing to a CFO. Saying "we removed $40,000 in annual costs and cut deployment failures by 60%" gets attention.
Steve challenges listeners to take their last three technical achievements and rewrite each one with a currency figure, a percentage, and a timeframe. "If you can't, you're speaking engineering, not business."
Closing Advice: Start Deleting This WeekSteve's parting advice is concrete: pick one tool you suspect nobody is using, check the logs, and if nothing has happened in 30 days, deprecate it. In 60 days, delete it. He also offers the simplicity test for free — it takes eight minutes, produces a 0-to-50 score with specific recommendations, and is available by reaching out to him directly.
"Your platform's biggest risk isn't technical — it's political. Platforms die when the CFO asks you a question you can't answer, when your best engineer leaves, or when the team builds for their CV instead of the business."
Highlights
We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
DevSecOps Talks podcast LinkedIn page
DevSecOps Talks podcast website
DevSecOps Talks podcast YouTube channel
Summary
In this episode of DevSecOps Talks, Mattias and Paulina speak with Steve Wade, founder of Platform Fix, about why so many Kubernetes and platform initiatives become overcomplicated, expensive, and painful for developers. Steve has helped simplify over 50 cloud-native platforms and estimates he has removed around $100 million in complexity waste. The conversation covers how to spot a bloated platform, why "free" tools are never really free, how to systematically delete what you don't need, and why the best platform engineering is often about subtraction rather than addition.
Key Topics

Steve's Background: From Complexity Creator to Strategic Deleter
Steve introduces himself as the founder of Platform Fix — the person companies call when their Kubernetes migration is 18 months in, millions over budget, and their best engineers are leaving. He has done this over 50 times, and he is candid about why it matters so much to him: he used to be this problem.
Years ago, Steve led a migration that was supposed to take six months. Eighteen months later, the team had 70 microservices, three service meshes (they kept starting new ones without finishing the old), and monitoring tools that needed their own monitoring. Two senior engineers quit. The VP of Engineering gave Steve 90 days or the team would be replaced.
Those 90 days changed everything. The team deleted roughly 50 of the 70 services, ripped out all the service meshes, and cut deployment time from three weeks of chaos to three days, consistently. Six months later, one of the engineers who had left came back. That experience became the foundation for Platform Fix.
As Steve puts it: "While everyone's collecting cloud native tools like Pokemon cards, I'm trying to help teams figure out which ones to throw away and which ones to keep."
Why Platform Complexity Happens
Steve explains that organizations fall into a complexity trap by continuously adding tools without questioning whether they are actually needed. He describes walking into companies where the platform team spends 65–70% of their time explaining their own platform to the people using it. His verdict: "That's not a team, that's a help desk with infrastructure access."
People inside the complexity normalize it. They cannot see the problem because they have been living in it for months or years. Steve identifies several drivers: conference-fueled recency bias (someone sees a shiny tool at KubeCon and adopts it without evaluating the need), resume-driven architecture (engineers choosing tools to pad their CVs), and a culture where everyone is trained to add but nobody asks "what if we remove something instead?"
He illustrates the resume-driven pattern with a story from a 200-person fintech. A senior hire — "Mark" — proposed a full stack: Kubernetes, Istio, Argo, Crossplane, Backstage, Vault, Prometheus, Loki, Tempo, and more. The CTO approved it because "Spotify uses it, so it must be best practice." Eighteen months and $2.3 million later, six engineers were needed just to keep it running, developers waited weeks to deploy, and Mark left — with "led Kubernetes migration" on his CV. When Steve asked what Istio was actually solving, nobody could answer. It was costing around $250,000 to run, for a problem that could have been fixed with network policies.
He also highlights a telling sign: he asked three people in the same company how many Kubernetes clusters they needed and got three completely different answers. "That's not a technical disagreement. That's a sign that nobody's aligned on what the platform is actually for."
The AI Layer: Tool Fatigue Gets Worse
Paulina observes that the same tool-sprawl pattern is now being repeated with AI tooling — an additional layer of fatigue on top of what already exists in the cloud-native space. Steve agrees and adds three dimensions to the AI complexity problem: choosing which LLM to use, learning how to write effective prompts, and figuring out who is accountable when AI-written code does not work as expected. Mattias notes that AI also enables anyone to build custom tools for their specific needs, which further expands the toolbox and potential for sprawl.
How Leaders Can Spot a Bloated Platform
One of the most practical segments is Steve's framework for helping leaders who are not hands-on with engineering identify platform bloat. He gives them three things to watch for:
Steve uses a memorable analogy: many platforms are like the Sagrada Familia in Barcelona — they look incredibly impressive and intricate, but they are never actually finished. The question leaders should ask is: what does an MVP platform look like, what tools does it need, and how do we start delivering business value to the developers who use it? Because, as Steve says, "if we're not building any business value, we're just messing around."
Who the Platform Is Really For
Mattias asks the fundamental question: who is the platform actually for? Steve's answer is direct — the platform's customers are the developers deploying workloads to it. A platform without applications running on it is useless.
He distinguishes three stages:
- Vanilla Kubernetes: the out-of-the-box cluster
- Platform Kubernetes: the foundational workloads the platform needs to function (secret management, observability, perhaps a service mesh)
- The actual platform: only real once applications are being deployed and business value is delivered
The hosts discuss how some teams build platforms for themselves rather than for application developers or the business, which is a fast track to unnecessary complexity.
Kubernetes: Standard Tool or Premature Choice?
The episode explores when Kubernetes is the right answer and when it is overkill. Steve emphasizes that he loves Kubernetes — he has contributed to the Flux project and other CNCF projects — but only when it is earned. He gives an example of a startup with three microservices, ten users, and five engineers that chose Kubernetes because "Google uses it" and the CTO went to KubeCon. Six months later, they had infrastructure that could handle ten million users while serving about 97 users.
"Google needs Kubernetes, but your Series B startup needs to ship features."
Steve also shares a recent on-site engagement where he ran the unit economics on day two: the proposed architecture needed four times the CPU and double the RAM for identical features. One spreadsheet saved the company from a migration that would have destroyed the business model. "That's the question nobody asks before a Kubernetes migration — does the maths actually work?"
Mattias pushes back slightly, noting that a small Kubernetes cluster can still provide real benefits if the team already has the knowledge and tooling. Paulina adds an important caveat: even if a consultant can deploy and maintain Kubernetes, the question is whether the customer's own team can realistically support it afterward. The entry skill set for Kubernetes is significantly higher than, say, managed Docker or ECS.
Managed Services and "Boring Is Beautiful"
Steve's recommendation for many teams is straightforward: managed platforms, managed databases, CI/CD that just works, deploy on push, and go home at 5 p.m. "Boring is beautiful, especially when you call me at 3 a.m."
He illustrates this with a company that spent 18 months and roughly $850,000 in engineering time building a custom deployment system using well-known CNCF tools. The result was about 80–90% as good as GitHub Actions. The migration to GitHub Actions cost around $30,000, and the ongoing maintenance cost was zero.
Paulina adds that managed services are not completely zero maintenance either, but the operational burden is orders of magnitude less than self-managed infrastructure, and the cloud provider takes on a share of the responsibility.
The New Tool Tax: Why "Free" Tools Are Never Free
A central theme is that open-source tools carry hidden costs far exceeding their license fee. Steve introduces the new tool tax framework with four components, using Vault (at a $40,000 license) as an example:
Total year-one cost: roughly $243,000 — a 6x multiplier over the $40,000 budget. And as Steve points out, most teams never present this full picture to leadership.
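The multiplier arithmetic can be sketched in a few lines. The hidden-cost category names below are illustrative assumptions, not the episode's actual breakdown; only the $40,000 license and the roughly $243,000 total come from the conversation.

```python
def year_one_cost(license_fee, hidden_costs):
    """Total first-year cost of a tool and its multiplier over the visible budget."""
    total = license_fee + sum(hidden_costs.values())
    return total, total / license_fee

# Category names and their split are invented for illustration; the point is
# that the license fee is only a fraction of what the tool actually costs.
hidden = {
    "integration_engineering": 90_000,
    "training_and_onboarding": 40_000,
    "operations_and_upgrades": 53_000,
    "incident_and_support_time": 20_000,
}
total, multiplier = year_one_cost(40_000, hidden)
print(total, round(multiplier, 1))  # 243000 6.1
```

Presenting this full figure, rather than just the license line item, is exactly the picture Steve says most teams never show leadership.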
Mattias extends the point to tool documentation complexity, noting that anyone who has worked with Envoy's configuration knows how complicated it can be. Steve adds that Envoy is written in C++ — "How many C++ developers do you have in your organization? Probably zero." — yet teams adopt it because it offers 15 to 20 features that may or may not be useful.
This is the same total cost of ownership concept the industry has used for on-premises hardware, but applied to the seemingly "free" cloud-native landscape. The tools are free to install, but they are not free to manage and maintain.
Why Service Meshes Are Often the First to Go
When Mattias asks which tool type Steve most often deletes, the answer is service meshes. Steve does not name a specific product but says six or seven times out of ten, service meshes exist because someone thought they were cool, not because the team genuinely needed mutual TLS, rate limiting, or canary deploys at the mesh level.
Mattias agrees: in his experience, he has never seen an environment that truly required a service mesh. The demos at KubeCon are always compelling, but the implementation reality is different. Steve adds a self-deprecating note — this was him in the past, running three service meshes simultaneously because none of them worked perfectly and he kept starting new ones in test mode.
A Framework for Deleting Tools
Steve outlines three frameworks he uses to systematically simplify platforms.
The Simplicity Test is a diagnostic that scores platform complexity across ten dimensions on a scale of 0 to 50: tool sprawl, deployment complexity, cognitive load, operational burden, documentation debt, knowledge silos, incident frequency, time to production, self-service capability, and team satisfaction. A score of 0–15 is sustainable, 16–25 is manageable, 26–35 is a warning, and 36–50 is crisis. Over 400 engineers have taken it; the average score is around 34. Companies that call Steve typically score 38 to 45.
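The banding logic can be sketched as a small function. One assumption is mine: that each of the ten dimensions is rated 0–5 and the ratings are summed, which is inferred from the 0–50 total rather than stated in the episode.

```python
def simplicity_band(score):
    """Map a 0-50 simplicity-test score to the episode's four bands."""
    if not 0 <= score <= 50:
        raise ValueError("score must be between 0 and 50")
    if score <= 15:
        return "sustainable"
    if score <= 25:
        return "manageable"
    if score <= 35:
        return "warning"
    return "crisis"

# Assumption: ten dimensions, each rated 0-5, summed into the overall score.
# These example ratings land on the reported average of around 34.
dimensions = [4, 3, 4, 3, 4, 3, 3, 4, 3, 3]
score = sum(dimensions)
print(score, simplicity_band(score))  # 34 warning
```

Note that the reported average of 34 sits at the top of the warning band, and the 38–45 range typical of Steve's clients is already in or near crisis.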
The Four Buckets categorize every tool: Essential (keep it), Redundant (duplicates something else — delete immediately), Over-engineered (solves a real problem but is too complicated — simplify it), or Premature (built for future scale you don't have yet — delete for now).
From one engagement with 47 tools: 12 were essential, 19 redundant, 11 over-engineered, and 5 premature — meaning 35 were deletable.
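The tally from that engagement works out as follows; the per-bucket counts come from the episode, while the tool names themselves are omitted here.

```python
# Bucket counts from the engagement described above. A real audit would list
# each tool by name under its bucket; only the totals are reproduced here.
buckets = {"essential": 12, "redundant": 19, "over_engineered": 11, "premature": 5}

total = sum(buckets.values())
# Everything outside the essential bucket is a deletion or simplification candidate.
deletable = total - buckets["essential"]
print(total, deletable)  # 47 35
```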
He then prioritizes by impact versus risk, tackling high-impact, low-risk items first. For example, a large customer had Datadog, Prometheus, and New Relic running simultaneously with no clear rationale. Deleting New Relic took three hours, saved $30,000, and nobody noticed. Seventeen abandoned databases with zero connections in 30 days were deprecated by email, then deleted — zero responses, zero impact.
The security angle matters here too: one of those abandoned databases was an unpatched attack surface sitting in production with no one monitoring it. Paulina adds a related example — her team once found a Flyway instance that had gone unpatched for seven or eight years because each team assumed the other was maintaining it. As she puts it, lack of ownership creates the same kind of hidden risk.
The 30-Day Cleanup Sprint
Steve structures platform simplification as a focused 30-day effort:
He illustrates this with a company whose VP of Engineering — "Sarah" — told him: "This isn't a technical problem anymore. This is a people problem." Two senior engineers had quit on the same day with the same exit interview: "I'm tired of fighting the platform." One said he had not had dinner with his kids on a weekend in six months. The team's morale score was 3.2 out of 10.
The critical insight: the team already knew what was wrong. They had known for months. But nobody had been given permission to delete anything. "That's not a cultural problem and it's not a knowledge problem. It's a permissions problem. And I gave them the permission."
Results: complexity score dropped from 42 to 26, monthly costs fell from $150,000 to $80,000 (roughly $840,000 in annual savings), and deployment time improved from two weeks to one day.
But Steve emphasizes the human outcome. A developer told him afterward: "Steve, I went home at 5 p.m. yesterday. It's the first time in eight months. And my daughter said, 'Daddy, you're home.'" That, Steve says, is what this work is really about.
Golden Paths, Guardrails, and Developer Experience
Mattias says he wants the platform he builds to compete with the easiest external options — Vercel, Netlify, and the like. If developers would rather go elsewhere, the internal platform has failed.
Steve agrees and describes a pattern he sees constantly: developers do not complain when the platform is painful — they route around it. He gives an example from a fintech where a lead developer ("James") needed a test environment for a Friday customer demo. The official process required a JIRA ticket, a two-day wait, YAML files, and a pipeline. Instead, James spun up a Render instance on his personal credit card: 12 minutes, deployed, did the demo, got the deal. Nobody knew for three months, until finance found the charges.
Steve's view: that is not shadow IT or irresponsibility — it is a rational response to poor platform usability. "The fastest path to business value went around the platform, not through it."
The solution is what Steve calls the golden path — or, as he reframes it using a bowling alley analogy, golden guardrails. Like the bumpers that keep the ball heading toward the pins regardless of how it is thrown, the guardrails keep developers on a safe path without dictating exactly how they get there. The goal is hitting the pins — delivering business value.
Mattias extends the guardrails concept to security: the easiest path should also be the most secure and compliant one. If security is harder than the workaround, the workaround wins every time. He aims to make the platform so seamless that developers do not have to think separately about security — it is built into the default experience.
Measuring Outcomes, Not Features
Steve argues that platform teams should measure developer outcomes, not platform features: time to first deploy, time to fix a broken deployment, overall developer satisfaction, and how secure and compliant the default deployment paths are.
He recommends monthly platform retrospectives where developers can openly share feedback. In these sessions, Steve goes around the room and insists that each person share their own experience rather than echoing the previous speaker. This builds a backlog of improvements directly tied to real developer pain.
Paulina agrees that feedback is essential but notes a practical challenge: in many organizations, only a handful of more active developers provide feedback, while the majority say they do not have time and just want to write code. Collecting representative feedback requires deliberate effort.
She also raises the business and management perspective. In her consulting experience, she has seen assessments include a third dimension beyond the platform team and developers: business leadership, who focus on compliance, security, and cost. Sometimes the platform enables fast development, but management processes still block frequent deployment to production — a mindset gap, not a technical one. Steve agrees and points to value stream mapping as a technique for surfacing these bottlenecks with data.
Translating Engineering Work Into Business Value
Steve makes a forceful case that engineering leaders must express technical work in business terms. "The uncomfortable truth is that engineering is a cost center. We exist to support profit centers. The moment we forget that, we optimize for architectural elegance instead of business outcomes — and we lose the room."
He illustrates this with a story: a CFO asked seven engineering leaders one question — "How long to rebuild production if we lost everything tomorrow?" Five seconds of silence. Ninety-four years of combined experience, and nobody could answer. "That's where engineering careers die."
The translation matters at every level. Saying "we deleted a Jenkins server" means nothing to a CFO. Saying "we removed $40,000 in annual costs and cut deployment failures by 60%" gets attention.
Steve challenges listeners to take their last three technical achievements and rewrite each one with a currency figure, a percentage, and a timeframe. "If you can't, you're speaking engineering, not business."
Closing Advice: Start Deleting This Week
Steve's parting advice is concrete: pick one tool you suspect nobody is using, check the logs, and if nothing has happened in 30 days, deprecate it. In 60 days, delete it. He also offers the simplicity test for free — it takes eight minutes, produces a 0-to-50 score with specific recommendations, and is available by reaching out to him directly.
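That 30-day log check can be sketched as a small script, assuming you can export a last-activity timestamp per tool from your logging system. The tool names and timestamps here are invented for illustration.

```python
from datetime import datetime, timedelta, timezone

def stale_tools(last_activity, now, idle_days=30):
    """Return tools with no recorded activity in the last `idle_days` days.

    `last_activity` maps tool name -> datetime of the most recent log entry;
    in practice these timestamps would come from your logging system.
    """
    cutoff = now - timedelta(days=idle_days)
    return sorted(name for name, ts in last_activity.items() if ts < cutoff)

# Illustrative data only: names and dates are made up.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
activity = {
    "ci-pipeline": datetime(2025, 5, 30, tzinfo=timezone.utc),
    "legacy-dashboard": datetime(2025, 3, 2, tzinfo=timezone.utc),
    "old-metrics-db": datetime(2024, 11, 15, tzinfo=timezone.utc),
}
print(stale_tools(activity, now))  # ['legacy-dashboard', 'old-metrics-db']
```

Anything the script flags goes into the deprecation email; per Steve's advice, deletion follows 30 days later if nobody objects.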
"Your platform's biggest risk isn't technical — it's political. Platforms die when the CFO asks you a question you can't answer, when your best engineer leaves, or when the team builds for their CV instead of the business."
Highlights
We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
DevSecOps Talks podcast LinkedIn page
DevSecOps Talks podcast website
DevSecOps Talks podcast YouTube channel
Summary
In this episode, Mattias, Andre, and Paulina welcome back returning guest Paul from System Initiative to continue a conversation that started in the previous episode about their project Swamp. The discussion digs into how AI-assisted software development has changed over the past year, and why the real shift is not "AI writes code" but humans orchestrating multiple specialized agents with strong guardrails. Paul walks through the practical workflows, multi-layered testing, architecture-first thinking, cost discipline, and security practices his team has adopted — while the hosts push on how this applies across enterprise environments, mentoring newcomers, and the uncomfortable question of who is responsible when AI-built software fails.
Key Topics

The industry crossroads: layoffs, fear, and a new reality
Before diving into technical specifics, Paul acknowledges that the industry is at "a real crazy crossroads." He references Block (formerly Square) cutting roughly 40% of their workforce, citing uncertainty about what AI means for their teams. He wants to be transparent that System Initiative also shrank — but clarifies the company did not cut people because of AI. The decision to reduce headcount came before they even knew what they were going to build next, let alone how they would build it. AI entered the picture only after they started prototyping the next version of their product.
Block's February 2026 layoffs, announced by CEO Jack Dorsey, eliminated over 4,000 positions. The move was framed as an AI-driven restructuring, making it one of the most visible examples of AI anxiety playing out in real corporate decisions.
From GenAI hype to agentic collaboration
Paul explains that AI coding quality shifted significantly around October–November of the previous year. Before that, results were inconsistent — sometimes impressive, often garbage. Then the models improved dramatically in both reasoning and code generation.
But the bigger breakthrough, in his view, was not the models themselves. It was the industry's shift from "Gen AI" — one-shot prompting where you hand over a spec and accept whatever comes back — to agentic AI, where the model acts more like a pair programmer. In that setup, the human stays in the loop, challenges the plan, adds constraints, and steers the result toward something that fits the codebase.
He gives a concrete early example: System Initiative had a CLI written in Deno (a TypeScript runtime). Because the models were well-trained on TypeScript libraries and the Deno ecosystem, they started producing decent code. Not beautiful, not perfectly architected — but functional. When Paul began feeding the agent patterns, conventions, and existing code to follow, the output became coherent with their codebase.
This led to a workflow where Paul would open six Claude Code sessions at once in separate Git worktrees — isolated copies of the repository on different branches — each building a small feature in parallel, feeding them bug reports and data, and continuously interacting with the results rather than one-shotting them.
Git worktrees let you check out multiple branches of the same repository simultaneously in separate directories. Each worktree is independent, so you can work on several features at once and merge them back via pull requests.
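The worktree setup described above can be sketched from TypeScript by shelling out to the git CLI — a minimal illustration, assuming git is on PATH; the branch names and throwaway temp-directory layout are invented for the example:

```typescript
// Hypothetical sketch of the parallel-worktree setup, driven from TypeScript by
// shelling out to git (assumed on PATH). Branch names are invented.
import { execFileSync } from "node:child_process";
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const git = (cwd: string, ...args: string[]): string =>
  execFileSync("git", args, { cwd, encoding: "utf8" });

// Throwaway repo to demonstrate against.
const repo = mkdtempSync(join(tmpdir(), "wt-demo-"));
git(repo, "init", "-b", "main");
git(repo, "-c", "user.email=demo@example.com", "-c", "user.name=demo",
    "commit", "--allow-empty", "-m", "init");

// One isolated checkout per agent session, each on its own new branch.
const worktreeBase = mkdtempSync(join(tmpdir(), "wt-sessions-"));
for (const branch of ["feature-a", "feature-b", "feature-c"]) {
  git(repo, "worktree", "add", "-b", branch, join(worktreeBase, branch));
}

// The main checkout plus three session worktrees, all independent directories.
console.log(git(repo, "worktree", "list").trim().split("\n").length); // 4
```

Each directory can then host its own agent session, and the branches merge back through ordinary pull requests.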
He later expanded this by running longer tasks on a Mac Mini accessible via Tailscale (a mesh VPN), while handling shorter tasks on his laptop — effectively distributing AI workloads across machines.
Why architecture matters more than ever
One of Paul's strongest themes is that AI shifts engineering attention away from syntax and back toward architecture. He argues that AI can generate plenty of code, but without design principles and boundaries it will produce spaghetti on top of existing spaghetti.
He introduces the idea of "the first thousand lines" — an anecdote he read recently claiming that the first thousand lines of code an agent helps write determine its path forward. If those lines are well-structured and follow clear design principles, the agent will build coherently on top of them. If they are messy and unprincipled, everything after will compound the mess.
Paul breaks software development into three layers:
He argues the industry spent the last decade obsessing over "taste" while often mocking "ivory tower architects" — the people who designed systems but didn't write code. In an AI-driven world, those architectural concerns become critical again because the agent needs clear boundaries, domain structure, and intent to produce coherent output.
Paulina agrees and observes that this trend may also blur traditional specialization lines, pushing engineers toward becoming more general "software people" rather than narrowly front-end, back-end, or DevOps specialists.
Encoding design docs, rules, and constraints into the repo
Paul describes how his team makes architecture actionable for AI by encoding system knowledge directly into the repository. Their approach has several layers:
Design documents — Detailed docs covering the model layer (the actual objects, their purposes, how they relate), workflow construction (how models connect and pass data), and expression language behavior. These live in a /design folder in the open-source repo and describe the intent of every part of the system.
Architectural rules — The agent is explicitly told to follow Domain-Driven Design: proper separation between domains, infrastructure, repositories, and output layers. The DDD skill is loaded so the agent understands and maintains bounded contexts.
Code standards — TypeScript strict mode, no any types, named exports, passing lint and format checks. License compliance is also enforced: because the project is AGPL v3, the agent cannot pull in dependencies with incompatible licenses.
Skills — A newer mechanism for lazy-loading contextual information into the AI agent. Rather than stuffing everything into one enormous prompt, skills are loaded on demand when the agent encounters a specific type of task. This keeps context windows lean and focused.
AGPL v3 (GNU Affero General Public License) is a copyleft license that requires anyone who runs modified software over a network to make the source code available. This creates strict constraints on what dependencies can be used.
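The skills mechanism described above — context loaded only when a task needs it — can be sketched roughly like this; the skill names, trigger patterns, and payloads are all invented for illustration and are not Claude Code's actual skill API:

```typescript
// Hedged sketch of lazy-loaded skills: context enters the prompt only when a
// task matches a skill's trigger. Names and triggers are illustrative only.
interface Skill {
  trigger: RegExp;
  load: () => string; // in reality this would read a skill file on demand
}

const skills: Record<string, Skill> = {
  ddd: { trigger: /domain|repository|bounded context/i, load: () => "DDD layering rules..." },
  license: { trigger: /dependency|package|install/i, load: () => "AGPL v3 compatibility rules..." },
};

function contextFor(task: string): string[] {
  // Only matching skills load, keeping the context window lean and focused.
  return Object.values(skills)
    .filter((s) => s.trigger.test(task))
    .map((s) => s.load());
}

console.log(contextFor("refactor the invoice domain").length); // 1: only the DDD skill loads
console.log(contextFor("write a haiku").length); // 0: nothing loads
```

The point of the pattern is the filter step: irrelevant rules never consume context-window tokens.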
Multi-agent development: the full chain
A major part of the discussion centers on how Paul's team works with multiple specialized AI agents rather than a single all-knowing assistant. The chain looks like this:
Issue triage agent — When a user opens a GitHub issue, an agent evaluates whether it is a legitimate feature request or bug report. The agent's summary is posted back to the issue immediately, creating context for later stages.
Planning agent — If the issue is legitimate, the system enters plan mode. A specification is generated and posted for the user to review. Users can push back ("that's not how I think it should work"), and the plan is revised until everyone agrees.
Implementation agent — The code is written based on the approved plan, with all the design docs, architectural rules, and skills loaded as context.
Happy-path reviewer — A separate agent reviews the code against standards, checking that it loads correctly and appears to function.
Adversarial reviewer — Added just days before the recording, this agent is told: "You are a grumpy DevOps engineer and I want you to pull this code apart." It looks for security injection points, failure modes, and anything the happy-path reviewer might miss.
Both review agents write their findings as comments on the pull request, creating a visible audit trail. The PR only merges when both agents approve. If the adversarial agent flags a security vulnerability, the implementation goes back for changes.
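The two-reviewer gate amounts to a simple all-approvals check. A minimal sketch — the Review shape is invented for illustration; in practice the gate is the PR's required approvals:

```typescript
// Minimal sketch of the dual-review merge gate. The Review shape is invented
// for illustration; the real gate lives in the PR's required approvals.
type ReviewAgent = "happy-path" | "adversarial";
interface Review {
  agent: ReviewAgent;
  approved: boolean;
  findings: string[];
}

function canMerge(reviews: Review[]): boolean {
  const required: ReviewAgent[] = ["happy-path", "adversarial"];
  // Every required reviewer must be present and approving.
  return required.every((agent) =>
    reviews.some((r) => r.agent === agent && r.approved),
  );
}

console.log(canMerge([
  { agent: "happy-path", approved: true, findings: [] },
  { agent: "adversarial", approved: false, findings: ["possible path traversal"] },
])); // false: the adversarial reviewer blocks the merge
```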
Paul says this "Jekyll and Hyde" review setup caught a path traversal bug in their CLI during its first week. While the CLI runs locally and the risk was limited, it proved the value of adversarial review.
Path traversal is a vulnerability where an attacker can access files outside the intended directory by manipulating file paths (e.g., using ../ sequences). Even in CLI tools, this can expose sensitive files on a user's machine.
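A typical guard against this class of bug — not Swamp's actual code — resolves the requested path and verifies it stays inside the allowed base directory:

```typescript
// Illustrative path-traversal guard (not Swamp's actual code): resolve the
// requested path and only accept it if it stays inside the base directory.
import { resolve, sep } from "node:path";

function resolveSafe(baseDir: string, userPath: string): string | null {
  const base = resolve(baseDir);
  const target = resolve(base, userPath);
  // Safe = the base itself, or strictly inside it. The trailing-separator
  // check stops "/srv/data-evil" from passing as inside "/srv/data".
  return target === base || target.startsWith(base + sep) ? target : null;
}

console.log(resolveSafe("/srv/data", "reports/q1.txt")); // stays inside the base
console.log(resolveSafe("/srv/data", "../../etc/passwd")); // null: escapes the base
```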
Mattias compares the overall process to a modernized CI/CD pipeline — the same stages exist (commit, test, review, promote, release), but AI replaces some of the manual implementation steps while humans stay focused on architecture, review, and acceptance.
Why external pull requests are disabled
One of the more provocative decisions Paul describes: the open-source Swamp project does not accept external pull requests. GitHub recently added a feature to disable PR creation from non-collaborators entirely, and the team turned it on immediately.
The reasoning is supply chain control. Because the project's code is 100% AI-generated within a tightly controlled context — design docs, architectural rules, skills, adversarial review — they want to ensure that all code entering the system passes through the same pipeline. External PRs would bypass that chain of custody.
Contributors are instead directed to open issues. The team will work through the design collaboratively, plan it together, and then have their agents implement it. Paul frames this not as rejecting collaboration but as controlling the process: "We love contributions, but in the AI world, we cannot control where that code is from or what that code is doing."
Self-reporting bugs: AI filing its own issues
The team built a skill into Swamp itself so that when the tool encounters a bug during use, it can check out the version of the source code the binary was built against, analyze the problem, and open a GitHub issue automatically with detailed context.
This creates high-quality bug reports that already contain the information needed to reason about a fix. When the implementation agent later picks up that issue, it has precise context — where the bug is, what triggered it, and what the expected behavior should be. Paul says the quality of issues generated this way is significantly higher than typical user-filed bugs.
Testing: the favorite part
Although the conversation starts with code generation, Paul says testing is actually his favorite part of the workflow. The team runs multiple layers:
Product-level tests:
- Unit and integration tests — standard code-level verification
- Architectural fitness tests — contract tests, property tests, and DDD boundary checks that verify the domain doesn't leak and the agent followed its instructions
Architectural fitness tests are automated checks that verify a system's structure conforms to its intended architecture. In DDD, this means ensuring bounded contexts don't leak dependencies across domain boundaries.
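A crude sketch of such a fitness check, here just pattern-matching import statements in domain-layer source for references to an assumed infrastructure layer (real implementations typically walk the module graph rather than grep lines):

```typescript
// Crude architectural fitness check: flag import lines in domain-layer source
// that reach into infrastructure or output layers. Layer names are assumed.
const forbiddenImports = [
  /from\s+["'][^"']*\/infrastructure\//,
  /from\s+["'][^"']*\/output\//,
];

function layerViolations(source: string): string[] {
  return source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => forbiddenImports.some((re) => re.test(line)));
}

const domainFile = `
import { Invoice } from "./invoice";
import { PgInvoiceRepo } from "../infrastructure/pg-invoice-repo";
`;
console.log(layerViolations(domainFile)); // flags only the infrastructure import
```

Run as part of CI, a check like this catches the agent (or a human) quietly breaking a bounded context.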
User-level tests (separate repo):
- User flow tests — written from the user's perspective against compiled binaries, not source code. These live in a different repository specifically so they are not influenced by how the code is written. They test scenarios like: create a repository, extend the system, create a workflow, run a model, handle wrong inputs.
Adversarial tests (multiple tiers):
1. Security boundary tests — path traversal, environment variable exposure, supply chain attack vectors. Paul references the recent Trivy incident, where a bot stole an API key and used it to delete all of Trivy's GitHub releases and publish a poisoned VS Code extension.
2. State corruption — what happens when someone tampers with the state layer
3. Concurrency — multiple writes, lock failures, race conditions
4. Resource exhaustion — handling pathological inputs like a 100MB stdout message injected into a workflow
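The resource-exhaustion tier can be illustrated with a simple capture cap — a sketch, with an assumed 1MB limit, of how a workflow might refuse to retain a pathological 100MB stdout burst:

```typescript
// Sketch of a resource-exhaustion guard: cap how much of a step's stdout the
// workflow retains. The 1MB limit is an assumption for illustration.
const MAX_CAPTURE_BYTES = 1024 * 1024;

function capCapture(chunks: string[], maxBytes = MAX_CAPTURE_BYTES): string {
  let out = "";
  for (const chunk of chunks) {
    if (out.length + chunk.length > maxBytes) {
      // Keep only what fits, and mark that output was dropped.
      return out + chunk.slice(0, maxBytes - out.length) + "\n[truncated]";
    }
    out += chunk;
  }
  return out;
}

const burst = "x".repeat(100 * 1024 * 1024); // the pathological 100MB message
console.log(capCapture([burst]).length); // the 1MB cap plus the truncation marker
```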
Only after all these layers pass does a build get promoted from nightly to stable. Paul can download and manually test any nightly build that maps back to a specific commit.
Paulina points out that if AI is a force multiplier, there is now even less excuse not to write tests. Paul agrees: "We were scraping the barrel before at coming up with reasons why there shouldn't be any tests. Now that's eliminated."
Plan mode as a safety rail
Paul repeatedly emphasizes "plan mode," particularly in Claude Code. Before the agent changes anything, it produces a detailed plan describing what it intends to do and why, and waits for human approval.
The hosts immediately draw a parallel to terraform plan — the value is not just automation, but the chance to inspect intended changes before applying them. Paul says this was one of the biggest improvements in AI-assisted development because it reduces horror-story scenarios where an agent goes off and deletes a database or rewrites an application.
He notes that other tools are starting to adopt plan mode because it produces better results across the board. But he also warns that plan mode only helps if people actually read the plan — just like Terraform, the safeguard depends on human discipline. "If there's a big line in the middle that says 'I'm going to delete a database' and you haven't read it — it's the same thing."
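The safeguard boils down to: surface destructive steps, then refuse to apply without explicit approval. A toy sketch — the destructive-action pattern and plan shape are invented for illustration:

```typescript
// Toy sketch of the plan-mode safeguard: list the plan, surface destructive
// steps, and refuse to apply without explicit approval. Pattern and plan
// shape are invented for illustration.
interface PlanStep {
  description: string;
}

const DESTRUCTIVE = /\b(delete|drop|destroy)\b/i;

function planWarnings(steps: PlanStep[]): string[] {
  return steps.filter((s) => DESTRUCTIVE.test(s.description)).map((s) => s.description);
}

function apply(steps: PlanStep[], approved: boolean): string {
  if (!approved) return "blocked: plan not approved";
  return `applied ${steps.length} steps`;
}

const plan: PlanStep[] = [
  { description: "add index to invoices table" },
  { description: "drop table old_logs" }, // the line you'd better actually read
];
console.log(planWarnings(plan)); // ["drop table old_logs"]
console.log(apply(plan, false)); // "blocked: plan not approved"
```

As Paul's warning implies, the machinery only helps if the human reads the warnings before passing `approved = true`.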
Practical lessons for getting good results
Paul shares several tactical lessons:
He also notes that the generated TypeScript is not always how a human would write it — but that matters less if the result is well-tested, secure, and respects domain boundaries. "The actual syntax of the code itself can change 12 times a day. It doesn't really matter as long as it adheres to the product."
Human oversight at every stage
Despite all the automation, Paul is adamant that humans remain involved at every stage. Plans are reviewed, implementations are questioned, pull request comments are inspected, binaries are tested before reaching stable release. He describes it as "continually interacting with Claude Code" rather than just letting things happen.
When Paulina pushes on whether a human still checks things before production, Paul makes clear: yes, always. The release pipeline goes from commit to nightly to manual verification to stable promotion. "I will always download something before it goes to stable."
The context-switching tax
Paul acknowledges that running multiple agents in parallel is not for everyone. Context switching has always been expensive for engineers, and commanding multiple agents simultaneously is a new form of it. His advice: if you work best focusing on a single task, don't force the multi-agent style. "It'll be such a context switching killer and it'll cause you to lose focus."
The key shift is that instead of writing code, you are "commanding architecture and commanding design." But that still requires focus and judgment.
AI as a force multiplier, not a replacement
Paulina captures the dynamic bluntly: "It's a multiplier. If there is a good thing, you'll get a lot of good thing. If it's a shit, you're going to get a lot of shit."
Paul argues that experienced software and operations people are still essential because they understand architecture, security, constraints, and tradeoffs. AI amplifies whatever is already there — good engineering or bad engineering alike.
He believes engineers who learn to use these tools well become "even more important to your company than you already are." But he also acknowledges that some people will not want to work this way, and that friction between AI-forward and AI-resistant teams is already happening in organizations.
The challenge for juniors and newcomers
Paulina raises this personally — she was recently asked to mentor someone entering IT and struggled with how to approach it. She doesn't have a formal IT education (she has an engineering background) and learned on the go. The skills she built through manual work — understanding when code needs refactoring as scale changes, knowing how to structure projects at different sizes — are hard to teach when AI handles so much of the writing.
Paul agrees this is an open question and says the industry is still figuring out the patterns. He believes teaching principles, architecture, and core engineering fundamentals becomes even more important, because tool-specific syntax is increasingly handled by AI. "Do you need to know how to write a Terraform module? Do you need to know how to write a Pulumi provider?" — these are becoming less essential as individual skills, while understanding how systems fit together matters more.
He frames this as an opportunity: "We are now in control of helping shape how this moves forward in the industry." As innovators and early adopters, current practitioners can set the patterns. If they don't, someone else will.
Security, responsibility, and the risk of low-code AI
Paulina raises a concrete example from Poland: someone built an app using AI to upload receipts to a centralized accounting system, released it publicly, and exposed all their customers' data.
This leads to a deeper question from Mattias about responsibility: if someone with no engineering background builds an insecure app using an AI tool, who is accountable? The user? The platform? The model provider? The episode doesn't settle this, but Paul argues it reinforces why skilled engineers remain essential. The AI doesn't know the security boundary unless someone explicitly teaches it — "it probably wasn't fed that information that it had to think about the security context."
He expects more specialized skills and agents focused on security, accessibility, and compliance to emerge — calling out the example of loading a security skill and an accessibility skill when you know an app will be public-facing. But he says the ecosystem is not fully there yet.
Cost discipline: structure beats vibe coding
Paul addresses economics directly. His five-person team at System Initiative all use Claude Max Pro at $200 per person per month. They do not exceed that cost for the full AI workflow — code generation, reviews, planning, and adversarial testing.
In contrast, he has seen other organizations spend $10,000–$12,000 per month per developer on AI tokens because they let tools roam with huge context windows and vague instructions. His conclusion: tightly scoped tasks are not just better for quality — they are far cheaper.
This maps directly to classic engineering wisdom. Tightly defined stories and tasks were always more efficient to push through a system than "go rebuild this thing and I'll see you in six months." The same principle applies to AI agents.
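The gap is stark when written out — a back-of-envelope comparison using the figures quoted in the episode:

```typescript
// Back-of-envelope comparison of the two spending models from the episode.
const teamSize = 5;
const flatPlanPerSeat = 200;      // USD/month per engineer on the flat plan
const tokenSpendPerDev = 10_000;  // USD/month, low end of the quoted $10k-$12k range

const flatTotal = teamSize * flatPlanPerSeat;   // 5 * 200 = 1000
const tokenTotal = teamSize * tokenSpendPerDev; // 5 * 10000 = 50000

// At the low end, unscoped token spending costs 50x the flat-plan team.
console.log(flatTotal, tokenTotal, tokenTotal / flatTotal); // 1000 50000 50
```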
How to introduce AI in cautious organizations
For teams in companies that ban or restrict AI, Paul suggests a pragmatic entry point: use agents to analyze, not to write code.
He describes a conversation with someone in London who asked how to get started. Paul's advice: if you already know roughly where a bug lives, ask the agent to analyze the same bug report. If it identifies the same area and the same root cause, you have evidence that the tool can accelerate diagnosis. Show your CTO: "I'm diagnosing bugs 50% faster with this agent. It's not writing code — it's helping me understand where the issue is."
Similar analysis-first use cases work for accessibility reviews, security scans, or code quality assessments. The point is to build trust before expanding scope. Paul notes this approach works faster in the private sector than the public sector, where technology adoption has always been slower.
The pace of change is accelerating
Paul believes the conversation has shifted dramatically in the past six months — from AI horror stories and commiserating over drinks to genuine success stories and conferences forming around agentic engineering practices. He points to two upcoming events:
His prediction: the pace is not linear. "We're honestly exponential at this moment in time." He sidesteps the ethics of AI companies (referencing tensions between Anthropic and OpenAI) to focus on the practical reality that models, reasoning, and tooling are all improving at a compounding rate.
Highlights
We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
DevSecOps Talks podcast LinkedIn page
DevSecOps Talks podcast website
DevSecOps Talks podcast YouTube channel
Paul agrees this is an open question and says the industry is still figuring out the patterns. He believes teaching principles, architecture, and core engineering fundamentals becomes even more important, because tool-specific syntax is increasingly handled by AI. "Do you need to know how to write a Terraform module? Do you need to know how to write a Pulumi provider?" — these are becoming less essential as individual skills, while understanding how systems fit together matters more.
He frames this as an opportunity: "We are now in control of helping shape how this moves forward in the industry." As innovators and early adopters, current practitioners can set the patterns. If they don't, someone else will.
Security, responsibility, and the risk of low-code AIPaulina raises a concrete example from Poland: someone built an app using AI to upload receipts to a centralized accounting system, released it publicly, and exposed all their customers' data.
This leads to a deeper question from Mattias about responsibility: if someone with no engineering background builds an insecure app using an AI tool, who is accountable? The user? The platform? The model provider? The episode doesn't settle this, but Paul argues it reinforces why skilled engineers remain essential. The AI doesn't know the security boundary unless someone explicitly teaches it — "it probably wasn't fed that information that it had to think about the security context."
He expects more specialized skills and agents focused on security, accessibility, and compliance to emerge — calling out the example of loading a security skill and an accessibility skill when you know an app will be public-facing. But he says the ecosystem is not fully there yet.
Cost discipline: structure beats vibe codingPaul addresses economics directly. His five-person team at System Initiative all use Claude Max Pro at $200 per person per month. They do not exceed that cost for the full AI workflow — code generation, reviews, planning, and adversarial testing.
In contrast, he has seen other organizations spend $10,000–$12,000 per month per developer on AI tokens because they let tools roam with huge context windows and vague instructions. His conclusion: tightly scoped tasks are not just better for quality — they are far cheaper.
This maps directly to classic engineering wisdom. Tightly defined stories and tasks were always more efficient to push through a system than "go rebuild this thing and I'll see you in six months." The same principle applies to AI agents.
How to introduce AI in cautious organizationsFor teams in companies that ban or restrict AI, Paul suggests a pragmatic entry point: use agents to analyze, not to write code.
He describes a conversation with someone in London who asked how to get started. Paul's advice: if you already know roughly where a bug lives, ask the agent to analyze the same bug report. If it identifies the same area and the same root cause, you have evidence that the tool can accelerate diagnosis. Show your CTO: "I'm diagnosing bugs 50% faster with this agent. It's not writing code — it's helping me understand where the issue is."
Similar analysis-first use cases work for accessibility reviews, security scans, or code quality assessments. The point is to build trust before expanding scope. Paul notes this approach works faster in the private sector than the public sector, where technology adoption has always been slower.
The pace of change is acceleratingPaul believes the conversation has shifted dramatically in the past six months — from AI horror stories and commiserating over drinks to genuine success stories and conferences forming around agentic engineering practices. He points to two upcoming events:
His prediction: the pace is not linear. "We're honestly exponential at this moment in time." He sidesteps the ethics of AI companies (referencing tensions between Anthropic and OpenAI) to focus on the practical reality that models, reasoning, and tooling are all improving at a compounding rate.
Highlights
Andrey and Mattias share a fast re:Invent roundup focused on AWS security. What do VPC Encryption Controls, post-quantum TLS, and org-level S3 block public access change for you? Which features should you switch on now, like ECR image signing, JWT checks at ALB, and air-gapped AWS Backup? Want simple wins you can use today?
We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
DevSecOps Talks podcast LinkedIn page
DevSecOps Talks podcast website
DevSecOps Talks podcast YouTube channel
SummaryIn this episode, Andrey and Mattias deliver a security-heavy recap of AWS re:Invent 2025 announcements, while noting that Paulina is absent and wishing her a speedy recovery. Out of the 500+ releases surrounding re:Invent, they narrow the list down to roughly 20 features that security-conscious teams can act on today — covering encryption, access control, detection, backups, container security, and organization-wide guardrails. Along the way, Andrey reveals a new AI-powered product called Boris that watches the AWS release firehose so you don't have to.
Key Topics AWS re:Invent Through a Security LensThe hosts frame the episode as the DevSecOps Talks version of a re:Invent recap, complementing a FivexL webinar held the previous month. Despite the podcast's name covering development, security, and operations, the selected announcements lean heavily toward security. Andrey is upfront about it: if security is your thing, stay tuned; otherwise, manage your expectations.
At the FivexL webinar, attendees were asked to prioritize areas of interest across compute, security, and networking. AI dominated the conversation, and people were also curious about Amazon S3 Vectors — a new S3 storage class purpose-built for vector embeddings used in RAG (Retrieval-Augmented Generation) architectures that power LLM applications. It is cost-efficient but lacks hybrid search at this stage.
VPC Encryption and Post-Quantum ReadinessOne of the first and most praised announcements is VPC Encryption Control for Amazon VPC, a pre-re:Invent release that lets teams audit and enforce encryption in transit within and across VPCs. The hosts highlight how painful it used to be to verify internal traffic encryption — typically requiring traffic mirroring, spinning up instances, and inspecting packets with tools like Wireshark. This feature offers two modes: monitor mode to audit encryption status via VPC flow logs, and enforce mode to block unencrypted resources from attaching to the VPC.
Mattias adds that compliance expectations are expanding. It used to be enough to encrypt traffic over public endpoints, but the bar is moving toward encryption everywhere, including inside the VPC. The hosts also call out a common pattern: offloading SSL at the load balancer and leaving traffic to targets unencrypted. VPC encryption control helps catch exactly this kind of blind spot.
The discussion then shifts to post-quantum cryptography (PQC) support rolling out across AWS services including S3, ALB, NLB, AWS Private CA, KMS, ACM, and Secrets Manager. AWS now supports ML-KEM (Module Lattice-Based Key Encapsulation Mechanism), a NIST-standardized post-quantum algorithm, along with ML-DSA (Module Lattice-Based Digital Signature Algorithm) for Private CA certificates.
The rationale: state-level actors are already recording encrypted traffic today in a "harvest now, decrypt later" strategy, betting that future quantum computers will crack current encryption. Andrey notes that operational quantum computing feels closer than ever, making it worthwhile to enable post-quantum protections now — especially for sensitive data traversing public networks.
S3 Security Controls and Access ManagementSeveral S3-related updates stand out. Attribute-Based Access Control (ABAC) for S3 allows access decisions based on resource tags rather than only enumerating specific actions in policies. This is a powerful way to scope permissions — for example, granting access to all buckets tagged with a specific project — though it must be enabled on a per-bucket basis, which the hosts note is a drawback even if necessary to avoid breaking existing security models.
The bigger crowd-pleaser is S3 Block Public Access at the organization level. Previously available at the bucket and account level, this control can now be applied across an entire AWS Organization. The hosts call it well overdue and present it as the ultimate "turn it on and forget it" control: in 2026, there is no good reason to have a public S3 bucket.
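The existing account-level control already has a public API (`PutPublicAccessBlock` in S3 Control); the new feature applies the same four settings across an AWS Organization in one step. A minimal sketch of the account-level version, assuming boto3 and valid credentials:

```python
# All four settings must be on for full protection; the new org-level
# control applies the same block across every account at once.
FULL_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

def block_account(account_id: str) -> None:
    """Apply the account-level block via the existing S3 Control API."""
    import boto3  # imported lazily so the sketch stays self-contained
    boto3.client("s3control").put_public_access_block(
        AccountId=account_id,
        PublicAccessBlockConfiguration=FULL_BLOCK,
    )
```

With the organization-level control, you set this once at the root instead of looping `block_account` over every member account.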
Container Image SigningAmazon ECR Managed Image Signing is a welcome addition. ECR now provides a managed service for signing container images, leveraging AWS Signer for key management and certificate lifecycle. Once configured with a signing rule, ECR automatically signs images as they are pushed. This eliminates the operational overhead of setting up and maintaining container image signing infrastructure — previously a significant barrier for teams wanting to verify image provenance in their supply chains.
Backups, Air-Gapping, and Ransomware ResilienceAWS Backup gets significant attention. The hosts discuss air-gapped AWS Backup Vault support as a primary backup target, positioning it as especially relevant for teams where ransomware is on the threat list. These logically air-gapped vaults live in an Amazon-owned account and are locked by default with a compliance vault lock to ensure immutability.
The strong recommendation: enable AWS Backup for any important data, and keep backups isolated in a separate account from your workloads. If an attacker compromises your production account, they should not be able to reach your recovery copies. Related updates include KMS customer-managed key support for air-gapped vaults for better encryption flexibility, and GuardDuty Malware Protection for AWS Backup, which can scan backup artifacts for malware before restoration.
Data Protection in DatabasesDynamic data masking in Aurora PostgreSQL draws praise from both hosts. Using the new pg_columnmask extension, teams can configure column-level masking policies so that queries return masked data instead of actual values — for example, replacing credit card numbers with wildcards. The data in the database remains unmodified; masking happens at query time based on user roles.
Mattias compares it to capabilities already present in databases like Snowflake and highlights how useful it is when sharing data with external partners or other teams. When the idea of using masked production data for testing comes up, the hosts gently push back — don't do that — but both agree that masking at the database layer is a strong control because it reduces the risk of accidental data exposure through APIs or front-end applications.
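To make the behavior concrete — this is a plain-Python illustration of what query-time masking does, not the pg_columnmask API, and the "auditor" role is a hypothetical example:

```python
def mask_card(value: str) -> str:
    """Replace all but the last four digits with '*', keeping separators."""
    digits = [c for c in value if c.isdigit()]
    remaining = len(digits) - 4  # digits to hide
    out = []
    for c in value:
        if c.isdigit() and remaining > 0:
            out.append("*")
            remaining -= 1
        else:
            out.append(c)
    return "".join(out)

def select_card(role: str, stored: str) -> str:
    # Data at rest is unmodified; masking happens per-role at query time,
    # which is the key property of the pg_columnmask approach.
    return stored if role == "auditor" else mask_card(stored)
```

So `select_card("analyst", "4111 1111 1111 1234")` yields `"**** **** **** 1234"`, while the stored value is untouched — which is why a leak through an API or front end exposes only the masked form.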
Identity, IAM, and Federation ImprovementsThe episode covers several IAM-related features. AWS IAM Outbound Identity Federation allows federating AWS identities to external services via JWT, effectively letting you use AWS identity as a platform for authenticating to third-party services — similar to how you connect GitHub or other services to AWS today, but in the other direction.
The AWS Login CLI command provides short-lived credentials for IAM users who don't have AWS IAM Identity Center (SSO) configured. The hosts see it as a better alternative than storing static IAM credentials locally, but also question whether teams should still be relying on IAM users at all — their recommendation is to set up IAM Identity Center and move on.
The AWS Source VPC ARN condition key gets particular enthusiasm. It allows IAM policies to check which VPC a request originated from, enabling conditions like "allow this action only if the request comes from this VPC." For teams doing attribute-based access control in IAM, this is a significant addition.
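A policy using the new condition would look roughly like the sketch below. Note the condition key name `aws:SourceVpcArn` is an assumption based on the announcement's wording — check the IAM global condition key reference for the exact name — and the VPC and account identifiers are placeholders:

```python
import json

# ASSUMPTION: key name inferred from the announcement; verify against
# the IAM global condition context keys reference before using.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {
            "StringEquals": {
                "aws:SourceVpcArn": "arn:aws:ec2:eu-north-1:111122223333:vpc/vpc-0abc123"
            }
        },
    }],
}

policy_json = json.dumps(policy, indent=2)
```

The appeal for ABAC-style setups is that the ARN carries the account and region, so one condition can pin access to a specific VPC in a specific account rather than just a VPC ID.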
AWS Secrets Manager Managed External Secrets is another useful feature that removes a common operational burden. Previously, rotating third-party SaaS credentials required writing and maintaining custom Lambda functions. Managed external secrets provides built-in rotation for partner integrations — Salesforce, BigID, and Snowflake at launch — with no Lambda functions needed.
Better Security at the Network and Service LayerJWT verification in AWS Application Load Balancer simplifies machine-to-machine and service-to-service authentication. Teams previously had to roll their own Lambda-based JWT verification; now it is supported out of the box. The recommendation is straightforward: drop the Lambda and use the built-in capability.
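To appreciate what the built-in capability replaces, here is the shape of the DIY verification teams used to maintain in a Lambda. Real machine-to-machine setups typically use RS256 with a JWKS endpoint; this self-contained sketch uses HS256 with a shared secret purely to show the moving parts:

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def b64url_decode(data: str) -> bytes:
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))

def verify_hs256(token: str, secret: bytes) -> dict:
    """Minimal DIY verification: check signature and expiry, return claims."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = b64url(hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                               hashlib.sha256).digest())
    if not hmac.compare_digest(expected, sig_b64.encode()):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims

# Build a token the same way to demonstrate the round trip.
secret = b"shared-secret"
header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()).decode()
payload = b64url(json.dumps({"sub": "service-a", "exp": int(time.time()) + 60}).encode()).decode()
sig = b64url(hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()).decode()
claims = verify_hs256(f"{header}.{payload}.{sig}", secret)
```

Every line of this — key distribution, expiry handling, constant-time comparison — is code someone had to own and patch; moving verification into the load balancer deletes that maintenance burden.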
AWS Network Firewall Proxy is in public preview. While the hosts have not explored it deeply, their read is that it could help with more advanced network inspection scenarios — not just outgoing internet traffic through NAT gateways, but potentially also traffic heading toward internal corporate data centers.
Developer-Oriented: REST API StreamingAlthough the episode is mainly security-focused, the hosts include REST API streaming in Amazon API Gateway as a nod to developers. This enables progressive response payload streaming, which is especially relevant for LLM use cases where streaming tokens to clients is the expected interaction pattern. Mattias notes that applications are moving beyond small JSON payloads — streaming is becoming table stakes as data volumes grow.
Centralized Observability and DetectionCloudWatch unified management for operational, security, and compliance data promises cross-account visibility from a single pane of glass, without requiring custom log aggregation pipelines built from Lambdas and glue code. The hosts are optimistic but immediately flag the cost: CloudWatch data ingest pricing can escalate quickly when dealing with high-volume sources like access logs. Deep pockets may be required.
Detection is a recurring theme throughout the episode. The hosts discuss CloudTrail Insights for data events (useful if you are already logging data-plane events — another deep-pockets feature), extended threat detection for EC2 and ECS in GuardDuty using AI-powered analysis to correlate security signals across network activity, runtime behavior, and API calls, and the public preview of AWS Security Agent for automated security investigation.
On GuardDuty specifically, the recommendation is clear: if you don't have it enabled, go enable it — it gives you a good baseline that works out of the box across your services with minimal setup. You can always graduate to more sophisticated tooling later, but GuardDuty is the stopgap you start with.
Mattias drives the broader point home: incidents are inevitable, and what you can control is how fast you detect and respond. AWS is clearly investing heavily in the detection side, and teams should enable these capabilities as fast as possible.
Control Tower, Organizations, and Guardrails at ScaleSeveral updates make governance easier to adopt at scale:
- Dedicated controls for AWS Control Tower without requiring a full Control Tower deployment — you can now use Control Tower guardrails à la carte.
- Automatic enrollment in Control Tower — a feature the hosts feel should have existed already.
- Required tags in Organizations stack policies — enforcing tagging standards at the organization level.
- Amazon Inspector organization-wide management — centralized vulnerability scanning across all accounts.
- Billing transfer for AWS Organizations — useful for AWS resellers managing multiple organizations.
- Delete protection for CloudWatch Log Groups — a small but important safeguard.
Mattias says plainly: everyone should use Control Tower.
MCP Servers and AWS's Evolving AI ApproachThe conversation shifts to the public preview of AWS MCP (Model Context Protocol) servers. Unlike traditional locally-hosted MCP servers that proxy LLM requests to API calls, AWS is taking a different approach with remote, fully managed MCP servers hosted on AWS infrastructure. These allow AI agents and AI-native IDEs to interact with AWS services over HTTPS without running anything locally.
AWS launched four managed MCP servers — AWS, EKS, ECS, and SageMaker — that consolidate capabilities like AWS documentation access, API execution across 15,000+ AWS APIs, and pre-built agent workflows. However, the IAM model is still being worked out: you currently need separate permissions to call the MCP server and to perform the underlying AWS actions it invokes. The hosts treat this as interesting but still evolving.
Boris: AI for AWS Change AwarenessToward the end of the episode, Andrey reveals a personal project: Boris (getboris.ai), an AI-powered DevOps teammate he has been building. Boris connects to the systems an engineering team already uses and provides evidence-based answers and operational automation.
The specific feature Andrey has been working on takes the AWS RSS feed — where new announcements land daily — and cross-references it against what a customer actually has running in their AWS Organization. Instead of manually sifting through hundreds of releases, Boris sends a digest highlighting only the announcements relevant to your environment and explaining how you would benefit.
Mattias immediately connects this to the same problem in security: teams are overwhelmed by the constant flow of feature updates and vulnerability news. Having an AI that filters and contextualizes that information is, in his words, "brilliant."
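The cross-referencing idea reduces to a small filtering step. The sketch below is illustrative only — sample feed items and the service inventory are invented, and Boris's actual implementation is not public — but it shows the core move of matching announcements against what an organization actually runs:

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the shape of the AWS "What's New" RSS feed
# (illustrative items, not real announcements).
SAMPLE_FEED = """<rss><channel>
  <item><title>Amazon GuardDuty adds example capability</title></item>
  <item><title>Amazon SageMaker example update</title></item>
  <item><title>AWS Backup example feature</title></item>
</channel></rss>"""

# Hypothetical set of services deployed in the customer's AWS Organization.
inventory = {"GuardDuty", "Backup", "ECS"}

def relevant_items(feed_xml: str, services: set) -> list:
    """Keep only announcements that mention a service the org runs."""
    titles = [i.findtext("title") for i in ET.fromstring(feed_xml).iter("item")]
    return [t for t in titles if any(s in t for s in services)]

digest = relevant_items(SAMPLE_FEED, inventory)
```

Here the SageMaker item is dropped because nothing in the inventory matches it — the digest only carries the GuardDuty and Backup announcements, which is the "only what's relevant to your environment" behavior Andrey describes.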
Andrey also announces that Boris has been accepted into the Tehnopol AI Accelerator in Tallinn, Estonia — a program run by the Tehnopol Science and Business Park that supports early-stage AI startups — selected from more than 100 companies.
Highlights
We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
DevSecOps Talks podcast LinkedIn page
DevSecOps Talks podcast website
DevSecOps Talks podcast YouTube channel
SummarySystem Initiative has undergone a dramatic transformation: from a visual SaaS infrastructure platform with 17 employees to Swamp, a fully open-source CLI built for AI agents, maintained by a five-person team whose initials literally spell the product name. Paul Stack returns for his third appearance on the show to explain why the old model failed — and why handing an AI agent raw CLI access to your cloud is, as Andrey puts it, just "console-clicking in the terminal." The conversation gets sharp when the hosts push on what problem Swamp actually solves, whether ops teams are becoming the next bottleneck in AI-era delivery, and why Paul believes the right move is not replacing Terraform but giving AI a structured system it can reason about. Paul also drops a parting bombshell: he hasn't written a single line of code in four weeks.
Key Topics System Initiative's pivot from visual editor to AI-first CLIPaul Stack explains that System Initiative spent over five years iterating on a visual infrastructure tool where users could drag, drop, and connect systems. Despite the ambition, the team eventually concluded that visual composition was too slow, too cumbersome, and too alien for practitioners accustomed to code, artifacts, and reviewable changes.
The shift started in summer 2025 when Paul spiked a public OpenAPI-spec API. A customer then built an early MCP (Model Context Protocol) server on top of it — a prototype that worked but had no thought given to token usage or tool abstraction. System Initiative responded by building its own official MCP server and pairing it with a CLI. The results were dramatically better: customers could iterate easily from the command line or through AI coding tools like Claude Code.
By Christmas 2025 the writing was on the wall. The CLI-plus-agent approach was producing better outcomes, while the company was still carrying hundreds of thousands of lines of code for a distributed SaaS platform built for a previous product direction. In mid-January 2026, the company made the call to rethink everything from first principles.
The team behind the nameThe restructuring was painful. System Initiative went from 17 people to five. Paul explains the reasoning candidly: when you don't know what the tool is going to be, keeping a large team around is unfair to them, bad for their careers, and expensive. The five who stayed were the CEO, VP of Business, COO, Paul (who ran product), and Nick Steinmetz, the head of infrastructure — who also happened to be System Initiative's most active internal user, having used the platform to build System Initiative itself.
Those five people's initials spell SWAMP. The name was unintentional but stuck — and Paul notes with a grin that if they ever remove the "P," it becomes "SWAM," so he's safe even if he leaves. Beyond the joke, the name fits: Swamp stores operational data in a local .swamp/ directory — not a neatly formatted data lake, but a structured store that AI agents can pull from to reason about infrastructure state and history.
Why raw AI agent access to infrastructure is dangerousA major theme in the conversation is that letting an AI agent operate infrastructure directly — through the AWS CLI or raw API calls — is fundamentally unreliable. Andrey lays out the problem clearly: this kind of interaction is equivalent to clicking around the cloud console, just automated through a terminal. It is not repeatable, not reviewable, and inherits the non-deterministic behavior of LLMs. If the agent's context window fills up, it starts to forget earlier decisions and improvises — a terrifying prospect for production infrastructure.
What made System Initiative's earlier MCP-based direction compelling, in Andrey's view, was the combination of guardrails, repeatability, and human review. The agent generates a structured specification, a human reviews it, and only then is it applied. Paul agrees and calls this the "agentic loop with the human loop" — the strongest pattern they found.
Token costs and the case for local-first architecturePaul shares a hard-won lesson from building MCP integrations: a poorly designed MCP server burns enormous amounts of tokens and creates unnecessary costs for users. He spent three weeks in December reworking the server to use progressive context reveal rather than flooding the model with data. Even so, the fundamental problem with a SaaS-first architecture remained — constantly transmitting context between a central API and the user's agent was expensive regardless of optimization.
That experience pushed the team toward a local-first design. Swamp keeps data on the user's machine, close to where the agent operates, giving AI the context it needs without the round-trip overhead and cost of a remote service.
What Swamp actually isSwamp is a general-purpose, open-source CLI automation tool — not just another infrastructure-as-code framework. Its core building blocks are models, workflows, and the local .swamp/ data directory that agents read from.
Critically, Swamp ships with zero built-in models — no pre-packaged AWS EC2, VPC, or GCP resource definitions. Instead, the AI agent uses installed skills to generate models on the fly. Paul describes a user who joined the Discord that very morning, asked Swamp to create a schema for managing Let's Encrypt certificates, and it worked on the first attempt without writing any code.
Nick Steinmetz provides another example: he manages his homelab Proxmox hypervisor entirely through Swamp — creating and starting VMs, inspecting hypervisor state, and monitoring utilization. He recently connected it to Discord so friends can run commands like @swamp create vm to spin up Minecraft and gaming servers on demand.
How Swamp fits with AI coding toolsThe hosts spend significant time pinning down where Swamp sits relative to tools like Claude Code, bash access, and existing automation. Paul is clear: Swamp is not an AI wrapper or chatbot. It is a structured runtime that gives agents guardrails and reusable patterns.
Mattias works through several analogies to help frame it — is it like n8n or Zapier for the CLI? A CLI-based Jenkins where jobs are agents? Paul settles on this: it is a workflow engine driven by typed models, where data can be chained between steps using CEL (Common Expression Language) expressions — the same dot-notation referencing used in Kubernetes API declarations. A simple example: create a VPC in step one, then reference VPC.resource.attributes.vpcid as input to a subnet model in step two.
In Paul's personal workflow, he uses Claude Code to generate models and workflows, checks them into Git for peer review, and then runs them manually or through CI at a time of his choosing. He has explicitly configured Claude with a permission deny on workflow run — the agent helps build automation but never executes it. The same CLI works whether a person or an agent runs it; the difference is timing and approval.
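Claude Code supports deny rules in its settings file, which is one way to implement the "build but never execute" split Paul describes. A hedged sketch: the `permissions.deny` mechanism is real Claude Code configuration, but the exact rule string for blocking Swamp runs is an assumption based on his description:

```json
{
  "permissions": {
    "deny": [
      "Bash(swamp workflow run:*)"
    ]
  }
}
```

With a rule like this in `.claude/settings.json`, the agent can still generate and edit workflow files, but any attempt to execute them from the terminal is refused, leaving execution to a human or CI.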
Reusability, composition, and Terraform interopSwamp workflows are parameterized and reusable across environments. If they grow unwieldy, workflows can orchestrate other workflows, collect outputs, and manage success conditions — similar to GitHub Actions calling other actions.
Paul also demonstrates that Swamp can sit alongside existing tooling rather than replacing it. In a live Discord session, he built infrastructure models in Swamp and then asked the AI agent to generate the equivalent Terraform configuration. Because the agent had typed models with explicit relationships, it produced correct Terraform with proper resource dependencies. This positions Swamp less as a replacement mandate and more as a reasoning and control layer that can output to whatever format teams already use.
When one of the hosts compares Swamp to general build systems like Gradle, Paul draws a key distinction: traditional tools were designed for humans to write, review, and debate. Swamp is designed for AI agents to inspect and operate within. He references Anton Babenko's widely-used terraform-aws-vpc module — with its 237+ input variables — as an example of a human-centric design that agents struggle with due to version dependencies, module structure complexity, and stylistic decisions baked in over years. Swamp instead provides the agent with structured context, explicit typing, and historical artifacts it can query.
Open source, AGPL v3, and monetizationPaulina asks the natural question: if Swamp is fully open source under AGPL v3, how does the company make money?
Paul is candid that monetization is not the immediate priority — the focus is building a tool that resonates with users first. But he outlines a potential model: a marketplace-style ecosystem where users can publish their own models and workflows, while System Initiative offers supported, maintained, and paid-for versions of commonly needed building blocks. He draws a loose comparison to Docker Hub's model of community images alongside official ones.
The deeper argument is strategic: Paul believes there is no longer a durable moat in software. If users dislike a tool today, AI makes it increasingly feasible to build their own. Rather than trying to control all schemas and code, the team wants to make Swamp so extensible that users build on top of it rather than walking away from it.
Are ops teams becoming the next bottleneck?Paul argues that software development productivity is accelerating so fast with AI that ops teams risk becoming the next bottleneck — echoing earlier industry transitions from physical servers to cloud and from manual provisioning to infrastructure as code. Development teams can now move at a pace that traditional infrastructure workflows cannot match.
Andrey agrees with the premise but pushes back on where the bottleneck actually sits today. In his experience — spending "day and night burning tokens" on AI-assisted development — the real constraint is testing, not deployment. He describes pipelines that can go from idea to pull request automatically, but stall without a strong test harness and end-to-end validation. Without sufficient tests, you never even reach the deployment phase.
Paul accepts the framing and says the goal of Swamp is to strip away lower-value friction — fighting with file layouts, naming conventions, writing boilerplate models — so teams can invest their time where engineering rigor still matters most: testing, validation, and production safety.
Swamp as an addition, not a forced replacementPaul closes with an important positioning point: Swamp does not require teams to discard their Terraform, Pulumi, or existing infrastructure investments. It can be introduced alongside current tooling to interrogate infrastructure, validate what existing IaC does, and extend automation in AI-native ways. The extensibility is the point — users control when things run, what models to build, and how to integrate with their existing stack.
Highlights "Giving an agent raw CLI access to your cloud is basically console-clicking in the terminal." — AndreyAndrey challenges the assumption that AI-driven infrastructure is automatically safer. If an agent is just shelling out to the AWS CLI, the result may be fast — but it is non-deterministic, non-repeatable, and forget-prone once the context window fills up.
The future of infra automation needs guardrails before it needs speed. Listen to hear why structured workflows beat flashy demos.
"The best loop was the agentic loop with the human loop." — Paul StackThe breakthrough was not autonomous infrastructure execution. It was letting the AI generate structured specs while humans stay in charge of review and execution. Paul even blocks Claude Code from running workflows directly on his machine.
If "human in the loop" sounds conservative, this episode makes the case that it is the only production-safe pattern we have. Listen for the full argument.
"There is no longer a moat in software." — Paul StackPaul argues that AI has changed the economics of building software so fundamentally that no team can rely on implementation complexity as a competitive advantage. If users dislike your tool, they can build their own — faster than ever before.
That belief is why Swamp is open source, extensible, and ships with zero built-in models. Listen for a candid take on product strategy when anyone can clone your work.
"Ops teams are going to become the bottlenecks that we once were." — Paul StackAs development velocity explodes with AI, Paul warns that infrastructure teams risk slowing everything down — the same pattern that played out in the shifts from physical servers to cloud and from cloud to IaC.
Andrey fires back: the real bottleneck today is testing, not deployment. Listen for a sharp debate on where delivery pipelines are actually stuck.
"I haven't written a single line of code in four weeks." — Paul StackPaul reveals that the entire Swamp repository is AI-generated, with four machines running in parallel to churn out plans and implementations — including customer feature requests. The team teases a future episode to compare notes on AI-driven development workflows.
If that claim doesn't make you want to hear the follow-up, nothing will.
ResourcesSwamp CLI on GitHub — The open-source, AGPL v3 licensed CLI tool discussed in the episode. Models, workflows, and a local .swamp/ data directory designed for AI agent interaction.
System Initiative — The company behind Swamp, originally known for its visual infrastructure platform, now pivoted to AI-native CLI automation.
Model Context Protocol (MCP) — Anthropic's open protocol for connecting AI models to external tools and data sources. Paul discusses the challenges of building MCP servers that are token-efficient.
Claude Code — Anthropic's agentic coding tool that runs in the terminal. Used throughout the episode as the primary AI agent interface for Swamp workflows.
CEL — Common Expression Language — The expression language Swamp uses for chaining data between workflow steps, similar to how Kubernetes uses it for API declarations and validation policies.
Proxmox Virtual Environment — The open-source hypervisor platform that Nick Steinmetz manages entirely through Swamp in his homelab, including Discord-driven VM creation.
terraform-aws-modules/vpc — Anton Babenko's widely-used Terraform VPC module, referenced by Paul as an example of human-centric IaC design with 237+ inputs that agents struggle to navigate.
We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
DevSecOps Talks podcast LinkedIn page
DevSecOps Talks podcast website
DevSecOps Talks podcast YouTube channel
It’s been a while since OpenTofu was released to the public, so we wanted to check in on where it stands today. How is the community adopting it? What’s the public sentiment? And how does it differ from Terraform in terms of features?
This time we’re joined by Cole Bittel, an experienced SRE, platform engineer, and contributor to OpenTofu. He shares his hands-on experience migrating to OpenTofu, and we look into the problems teams face with infrastructure as code and how both Terraform and OpenTofu approach solving them.
Still pasting tokens into Slack? What types of secrets are at risk, and which tools fit which consumer—humans, CI/CD, or workloads? Where do most teams stumble, and how do you fix it fast? Hear our no-nonsense checklist.
Connect with us on LinkedIn or X (see info at https://devsecops.fm/about/). We are happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners.
The video version of this episode is available on our YouTube channel
LinkedIn page of the DevSecOps Talks team is here
Passkeys are gaining attention as a new way to log in without passwords. How do they work, and how do they compare to traditional multi-factor authentication (MFA)? In this episode, we explore the history of passwords, the strengths and weaknesses of common MFA methods, and the potential of passkeys to enhance security. What threats do passkeys mitigate, and what still remain?
In this guest episode, we chat with Davlet Dzhakishev, co-founder of Cloudgeni, who’s working on an AI-powered approach to fixing compliance issues in IaC. What’s the state of tools in this space? Where does his idea fit in? And how should we think about the relationship between compliance and security?
We are looking into recently announced AWS Resource Control Policies. What are they? How are they different from Service Control Policies? What is a Data Perimeter? Tune in to find out!
Andrey has been exploring GitHub Actions and has some insights to share. How have CI/CD solutions transformed over time, and what innovations do GitHub Actions bring to the table? Julien drops a few tools that could be useful for GitHub Actions users.
Welcome to the first DevSecOps Talks episode of the new year! It's been a whole year since ChatGPT hit the scene – but how has AI adoption shaped our world since then? Join Julien, Mattias, and Andrey as they dive into the impact of AI on their workflows. How have their daily tech tools and practices evolved with AI integration? Plus, Julien gives us an insider's look at running models locally. Are these AI tools enhancing our efficiency? Tune in to find out how these advancements are reshaping the landscape of DevSecOps.
Is the grass greener outside the cloud? This episode dives into the trend of companies (notably Hey and Dropbox) migrating away from cloud services. Why are they leaving, and who would benefit from such a move? We also scrutinize the common belief that public clouds are overly expensive. Join us as we dissect various cloud cost optimization tools and techniques.
You know our fondness for Terraform, but we are also open to exploring other tools. This episode is no different. We are joined by Igor Soroka, an expert in AWS serverless technology whose tool of choice is AWS CDK, but at the same time, he is no stranger to Terraform. We ask him practical questions about the tool and get answers based on his experience applying it to real-life projects. If you have been curious about CDK, how it functions, and if it's appropriate for you, then tune in to learn more.
In this episode, Mattias is joined by Ben Goodman, the founder of dragondrop.cloud, a platform that offers Terraform Best Practices as a Pull Request. They discuss the best workflows for Terraform, open-source tools that can be used in conjunction with Terraform, the most effective best practices, and common pitfalls to avoid when implementing infrastructure as code using Terraform.
In this episode of DevSecOps Talks, join Andrey, Julien, and Mattias as they dive into the world of Backstage, the notable internal development platform. Mattias is keen to peel back the layers and understand what makes people think of Backstage as a must-have in modern DevOps toolchains. Andrey highlights the platform's core feature: a comprehensive registry that keeps track of every software service running within a company. Could this signify a revival of IT change management, but with a twist? We've moved on from the days of cumbersome ticketing systems to streamlined internal development platforms. The team also ponders the future role of infrastructure engineers as they navigate the rising tides of AI—will AI become the new face behind these developer portals? Tune in to find out!
Our dialogue with Paul Stack resumes on DevSecOps Talks, almost two years after our initial podcast about his work on Pulumi (episode 25). As a warm-up, we talk about what prompted his move from Pulumi and his take on the OpenTF drama. The main topic of the episode is Paul's current focus, System Initiative; we probe into its purpose, the progress so far, and the promise it holds for redefining how we think of doing Infrastructure as Code and DevSecOps workflows in general.
In this episode of DevSecOps Talks, we dive deep into HashiCorp's recent shift to the Business Source License and its implications. Join Andrey, Julien, and Mattias as they unpack what this means for practitioners and explore the timeline of the OpenTF initiative. Stay informed about what lies ahead with our latest discussion. Tune in!
We had the opportunity to talk with Neatsun Ziv, one of the founders of Ox Security, about the Open Source Software Supply Chain Attack Reference Framework (https://pbom.dev). We delved deeper into possible attack vectors and explored ways to mitigate some of them. During our discussions, we also had a couple of unusual takes on supply chain security. If you are looking to understand the Open Source Software Supply Chain, then this episode is perfect for you.
This time we got to talk about Lingon, an open-source project developed by Julien and Jacob, a frequent podcast guest. Discover the motivations behind Lingon's creation and how it bridges the gap between Terraform and Kubernetes. Learn how Lingon simplifies infrastructure management, tackles frustrations with YAML and HCL, and offers greater control and automation.
Diving into the world of bare-metal servers, Mattias takes the helm solo for this episode. He's accompanied by special guests Michael Wagner and Ian Evans from Metify, the company that powers Mojo - a leading platform for bare-metal provisioning automation.
While we often chat about the big cloud service providers, this time we're switching gears. If you've been curious about how real-world, physical servers are set up and managed, this episode is just for you. Join Mattias, Michael, and Ian as they dive into the nuts and bolts of setting up servers - a topic that Mattias is super passionate about.
In this episode, we discuss the evolution of AWS networking capabilities from EC2-classic to VPC and advanced networking features. Andrey highlights that while many companies only use VPC and VPC peerings, there are lesser-known features that can significantly change how we approach networking setups on AWS.
This is a mixed-bag episode: we chat about all sorts of digital tools and security practices that we use in our day-to-day lives. We start by talking about password managers and why Julien is still using LastPass after the recent LastPass data breach. Julien gives us the lowdown on his personal approach to handling passwords and two-factor authentication (2FA) tokens, showing us why strong security measures matter.
Julien also shares his favorite email alias service and we discuss services for sharing sensitive information to keep mail inboxes cleaner and more private.
We also spoke about ChatGPT, an AI language model from OpenAI - will it replace jobs? should we be using it? And how?
Just a heads up, we aren't sponsored by companies we mention in this episode. We're just sharing our personal experiences and the stuff we like to use.
Julien has extensive experience building data platforms for data engineering, so we got him talking and sharing. If infra for data engineering is your cup of tea, then this episode is for you.
We discussed tracing before but never got around to explaining details such as fundamentals, terminology, etc. This time Julien goes into detail about what tracing is, what the benefits are, the basic terms you need to understand, and where to start. Great episode for those who are considering adding tracing capabilities to their systems.
We are happy to welcome back Jacob Lärfors, CEO and Senior Consultant at Verifa, to talk about software supply chain attacks. It feels important to raise this topic since these attacks are being used more and more often by sophisticated adversaries. At the same time, software supply chain security is something that companies often overlook. We as practitioners have so many things to consider and do that, in most cases, we do not have much cognitive capacity left for looking into our library sources. What are the things we need to be aware of, and what are the low-hanging fruits we could use to help developers do their job securely?
Have you heard any recent news from Docker? We haven't. That is why we decided to check up on Docker to see how it is doing and go through the tool's history and adoption. Clueless about the difference between Docker, containerd, and CRI-O? We've got you covered. We also highlight a couple of handy capabilities added recently.
We are excited about the new breed of tools coming to the market. We have often had to cobble tools together to find out what was in production and what broke it. Your monitoring tools only go as far as telling you that something isn't working as expected, not why, and then you have to scramble to figure out what versions of services are in production, whether there were any recent deploys, and so on, so you can understand what has changed and narrow down possible causes. Our good friend Mike and his team are building a tool to answer exactly such questions, so we thought you might be interested in hearing him out.
We discuss what has happened in the Terraform world since the 1.0 release last year, whether there are new features worth mentioning, trends in Terraform development, and more. We also recap the road to 1.0 and how long it took us to get there.
If you follow the CloudNative hype wave, you might feel that Prometheus is the must-use monitoring tool for everything CloudNative. Plus, almost everything nowadays has a Prometheus exporter. Just get that Helm chart installed, and there you go - the metrics question is sorted out. Want to monitor endpoints? Here is the Blackbox exporter for you. Want to get notifications? Alertmanager has you covered. And so on and so on. But is it all rainbows and unicorns? You probably guessed: it depends. This time, Semyon joins us to air his grievances with Prometheus and share insights on how to cook it if you decide to go down this route.
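As a taste of the wiring involved, here is a minimal, illustrative Prometheus scrape config that probes an endpoint through the Blackbox exporter; the target URL, module name, and exporter address are placeholders you would adapt to your setup:

```yaml
scrape_configs:
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [http_2xx]        # Blackbox module: expect an HTTP 2xx response
    static_configs:
      - targets:
          - https://example.com # endpoint to probe (placeholder)
    relabel_configs:
      # Pass the original target as the ?target= query parameter
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed endpoint as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # Actually scrape the Blackbox exporter, not the target itself
      - target_label: __address__
        replacement: blackbox-exporter:9115
```

The relabeling dance is the part people most often get wrong: Prometheus scrapes the exporter, which in turn probes the real target.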
Communication in co-located teams is quite often complicated. It is even more complex, and at the same time more important, in distributed teams. Have you ever gotten an issue report that just says "this thing is failing"? No logs, no explanation of context, nothing. Pretty sure we've all been in such situations. How do you step up your communication game? This episode of DevSecOps Talks is about great communication tips for DevSecOps practitioners in distributed (and not only distributed) teams.
Connect with us on LinkedIn or Twitter https://devsecops.fm/about/ and tell us about your questions, and we will answer them in the show.
web3 has gotten a lot of attention lately; thus, it is time for us to separate facts from the hype. In this episode, we are trying to understand its implications for us as DevSecOps practitioners.
Andrey feels frustrated that he has to develop a way to configure environments for every customer. Think about it: you arrive at a new project or company, it is day one, and you need to get the right tools as well as the correct environment configuration. During this episode, we try to figure out how companies solve this. Is there a standard solution? What are the options?
"us-east-1 will never go down, and if it did, half of the internet would go down with it." That is what people used to say. Well, us-east-1 went down big time. What does it mean for us as practitioners? What should we consider going forward? In this episode, we talk through the incident and the disaster recovery strategies you can consider to keep your company up.
We have had Git around for more than 15 years, and during that time it has become the de facto standard for sharing code and tracking code changes. While Git is a superior version control system compared to most of what came before, it has been 15 years since the first release. Should we be looking for new approaches to version control systems? Is the time right for the next generation of tools in this area?
Our first episode was about Infrastructure as Code, and we feel it is time to revisit the topic after almost two years. Another reason is the release of the second edition of the Infrastructure as Code book by Kief Morris. Thus, in this episode, we revisit the definition of Infrastructure as Code and try to summarize what has changed over the years. We hope you like it!
Julien gives his impressions of Google Cloud Next 2021, and Andrey recaps HashiConf Global 2021 and gives his take, with a twist, on why we might need HashiCorp Waypoint.
Everyone seems to be talking about service mesh. Mattias, Julien, and Andrey try to separate the hype from the real value. Most importantly, they dig into when the right time is for an organization to embrace a service mesh, and what the prerequisites are.
As a follow-up to the [last episode about hiring an infrastructure automation person](https://devsecops.fm/episodes/31-hiring/), we decided to reverse the view and talk about how you get hired as an infrastructure automation person. This episode is full of career advice, both for people fresh out of university and for people who already have experience in the industry.
Have you ever conducted an interview to hire an infrastructure automation person? What would you ask? How do you check their skills? And what skills are essential? Tune in for our tips on hiring and finding the right person for your team!
Logs, metrics, and traces are the three pillars of observability. Where should you start? What are the common mistakes to avoid? And if you are to pick one - which one should you do?
This time we are talking unikernels! Ian Eyberg from NanoVMs joins us to discuss how far this technology is from prime time. It turns out that you don't have to be a kernel developer to take advantage of unikernels. Today, there are tools available to package, distribute, and run them locally as well as in the public cloud. While talking to Ian, it felt that the state of the technology is very similar to that of Linux containers in the early 2010s, just before Docker made Linux containers available for everyone.
The real cloud lock-in is security! Every service and cloud provider has its own levels of granularity for resources. Cloud engineering is mainly about compute, storage, and networking, and how to make them scale. Scaling security is often left out, as it is hard to measure on so many levels.
We think that is a myth: we can measure how many steps it takes to add, modify, or remove access rights. It all starts with monitoring - knowing what is in your cloud infrastructure is a very good first step. By making access rights easy to see and manage, we make it easier for ourselves to keep resources secure.
Visit https://devsecops.fm to see show notes and https://gitter.im/devsecopstalks/community to join a discussion.
AWS released Bottlerocket OS in March 2020, and version 1.0.0 was released in August 2020. What is it? Should you be using it? What are the benefits? Is it ready for prime time? We answer all of those questions in this episode of DevSecOps Talks. Tune in!
Johan Abildskov (@RandomSort, see episode 6) is back, and we are talking branching strategies! In particular, why you shouldn't be doing git-flow, and what the other options out there are. This conversation takes us down memory lane to a broader discussion about version control systems, mono-repositories, and continuous integration and delivery. We hope you will like it!
This time we are joined by Paul Stack (@stack72, Pulumi developer, former Terraform developer) and podcast friend Jacob Lärfors to talk about:
- What is Pulumi?
- The difference between Pulumi and Terraform (and whether we should compare them at all)
- What is hard about Pulumi?
- What do people ask the most? What are the common points of confusion?
- Cross-language infra libraries? How is that even possible?!
- Is there a possibility of a supply chain attack via a Pulumi library?
Last week (week 6, 2021), seven data breaches were announced. In this episode, we discuss possible scenarios for preventing attackers from getting hold of your data, whether private or company data, and share tips on how to mitigate the consequences of data leaks when you have no control over data management (think of a breach of a 3rd-party service).
How do you run Kubernetes in the cloud? Still using Kops? Or is it time to jump to the managed offerings? We go through the list of things you might be missing out on if you are not yet using a managed solution. Also in this episode: what do you always configure in a k8s cluster? CNI, Ingress, IAM, and even more!
It's been almost a year since we started the podcast, but we never took the time to explain who we are and what problems we solve for our customers and employers. So in this episode, you will find more details about us and, as usual, references to useful tools, talks, and techniques.
AWS had a severe incident at the end of November. Kinesis in us-east-1 went dark for quite some time, and a ripple effect caused degradation of other services like CloudWatch, ECS, and others. As a cloud engineering practitioner, how do you get yourself and your organization ready for such a turn of events?
Andrey wants monitoring to be more magical - or does he want the wrong thing? What are the sane defaults? And why do we have to set up boilerplate monitoring again and again? Mattias shares what he does for monitoring security events. Julien explains why using logs to debug a microservices architecture is costly and inefficient.
How do you decommission resources from your cloud environment to keep it clean?
What do you do when a resource is created without being in the infrastructure code?
Andrey goes through a checklist he uses to delete resources and the utility serverless functions he wrote.
ArgoCD is a project that does GitOps and automatically deletes resources in Kubernetes namespaces if they are not defined.
We talked about the different layers of abstraction for infrastructure as code and where it makes sense to have a Terraform controller in a Kubernetes cluster to manage application dependencies.
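To illustrate the ArgoCD behavior mentioned above, here is a minimal, hypothetical Application manifest with automated sync and pruning enabled; the repo URL, paths, and names are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app            # placeholder name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/repo.git  # placeholder repo
    targetRevision: main
    path: manifests/my-app
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true     # delete cluster resources that are no longer defined in Git
      selfHeal: true  # revert manual changes made directly in the cluster
```

With `prune: true`, removing a manifest from Git is enough to decommission the corresponding resource, which is exactly the cleanup behavior discussed in the episode.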
Initially, we planned this episode as a discussion about HashiCorp Nomad and invited Jacob Lärfors. He recently published a great article about his experience working with Nomad (see the link in the show notes). However, because of a few postponements, and with HashiConf having happened just a week ago, we decided to extend the episode's scope to go over all of the announcements HashiCorp made during the conference. So here it is - a HashiConf special: all you need to know about everything HashiCorp announced during the conference, plus a discussion about Nomad!
This is the first episode in the new format - short and crisp 30-minute episodes, i.e., less water and fewer side discussions, focused on the topic, with a duration under (well, almost under) 30 minutes. We hope you like it!
The topic of this episode is building Docker images: automation, security, and best practices. Some of the information overlaps with episode #3 (https://devsecops.fm/episodes/docker-secure-build/) but greatly extends what was provided before.
In this episode, we discuss options for splitting your deployment stages. We hear people coming up with all possible types of environments - dev, test/QA, integration, stage, prod, etc. How many do you actually need? What is the reason for having all those stages? Maybe you need fewer? Why not deploy directly to production using some fancy technique?
To put it simply: to stage or not to stage?
Visit https://devsecops.fm to see show notes and https://gitter.im/devsecopstalks/community to join a discussion
Let's talk about security in the era of remote work. Most of us have experienced a flaky VPN connection. What are the alternatives? SSH certificates? YubiKeys?
We discussed various topics around security both inside and outside the cluster.
Visit https://devsecops.fm to see show notes and https://gitter.im/devsecopstalks/community to join a discussion
This time, we are joined by Henrik Høegh, who shares his unique perspective on applying the Theory of Constraints to IT transformation, as well as how it applies in the world of Cloud Native. We go back to the origin of DevOps, discussing the various problems companies face when transforming their organizations and adopting cultural changes.
Visit https://devsecops.fm to see show notes and https://gitter.im/devsecopstalks/community to join a discussion
Mattias wants to set up HashiCorp Vault and quizzes Andrey on how to do it.
We cover a lot of ground - from basic Vault concepts to setting it up and hardening it.
Julien and Andrey got together to discuss ways to automate the scaling of your infrastructure in response to changes in load patterns. What are the prerequisites for implementing scaling? What are cool-down, warm-up, horizontal and vertical scaling, scale-up, and scale-in? Which metrics could be useful for making scaling decisions? And last but not least, the very unexpected spin that Julien gives to the conversation.
Visit https://devsecops.fm to see show notes and https://gitter.im/devsecopstalks/community to join a discussion
This time we are discussing the white paper by Summit Route - AWS Security Maturity Roadmap 2020. Tune in to learn more about the white paper and the recommendations we pile on top of it. To view show notes, visit https://devsecops.fm. Chat with the hosts and suggest topics for upcoming episodes in our Gitter channel: https://gitter.im/devsecopstalks/community
Our guest speaker is Anton Babenko - a DevSecOps Talks podcast fan, AWS Community Hero, Terraform fanatic, HashiCorp Ambassador, and prolific open-source contributor. After listening to episode #9 (Terraform in CI) and episode #1 (Infrastructure as Code), Anton decided that enough is enough and volunteered to give his point of view on Terragrunt, since he thought we were missing a few important points. In this episode, we discuss the use cases of Terragrunt, a wrapper around Terraform for working with multiple environments and modules.
How do you start to implement a CI pipeline when dealing with infrastructure as code implemented via Terraform? What are the security concerns when the credentials to the whole kingdom are used in an automated process? In this episode, we discuss the various security and feasibility aspects of using Terraform in a CI pipeline.
We start the episode by catching up with what we’ve been working on. Feel free to skip to 11:52 if you want to go directly to the topic. Having an automated process to deploy and manage infrastructure has advantages such as fast feedback and collaboration. The code for the infrastructure is treated like an application that is versioned, tested, and deployed.
Show notes are available at https://devsecops.fm/episodes/terraform-in-ci/
Andrey tells us the story of how DevOps came into existence and took over the market. We discuss the marketing around it and its relationship with DevSecOps, and we try to shed light on what is merely a marketing strategy versus actually implementing DevOps in an organization. We also compare DevOps to SRE (Site Reliability Engineering).
In this episode, Mattias, Julien, and Andrey share tips and tricks on how to stay on top of what is going on in the industry, and the resources they use for continuous learning. Make sure to visit devsecops.fm to check out the show notes, which contain references to the resources mentioned during the discussion and more.
This time Johan Abildskov, a Senior Consultant with Praqma/Eficode, joins us to talk about SemVer (Semantic Versioning), and we finally get to hear what Julien has to say about it. We get to explore different options regarding versioning and how it helps humans communicate. At the end of the podcast, everyone gets to share their approach and recommendations for versioning things.
We had a couple of possible topics for this episode, but before getting started with them we decided to discuss which technological problems we had been solving during the last two weeks. Well, it turns out there was quite a lot to discuss. Tune in for tips on SSH session logging on the server side, preventing downloads from AWS S3 even if you have read access, credentials in a Git repository 🤦, why you should (or should not) do K8s, and more.
SummaryIn this free-form early episode of DevSecOps Talks, a casual "what have you been up to" catch-up turns into a sharp exchange on the gap between security in theory and security in practice. One host discovers plaintext service account keys, database passwords, and a production SSH tunnel all committed straight into a Git repository — and the team walks through how to unwind that without breaking delivery. Julien Bisconti argues that security tooling is fundamentally failing developers because it is too hard to use under real delivery pressure. The episode also delivers strong opinions on why teams should not default to Kubernetes, the hidden complexity of S3 encryption with KMS keys, and why Google's BeyondCorp model makes VPNs look like a relic.
Key Topics SSH session logging, bastion hosts, and compliance visibilityThe episode opens with a deep dive into SSH session logging for bastion hosts in AWS. One of the hosts explains how AWS Systems Manager Session Manager can be used to access instances without VPNs or direct inbound connectivity — the SSM agent on each instance calls home to AWS, and AWS proxies the connection back. That model is attractive for hybrid and on-prem environments because it removes networking complexity around NAT, port forwarding, and VPN setup. It also provides session logging, IAM-based access control, and command output recording.
But the drawbacks surface quickly. Session Manager logs users in as a generic SSM agent user with /usr/bin as the working directory. Documentation is sparse, and Bash is launched in shell mode to support color interpretation, which pollutes session logs with escape characters. A bigger concern is that access control rests entirely on IAM credentials — in an environment with fully dynamic, short-lived credentials that is manageable, but it becomes risky anywhere static keys exist.
The host describes trying to map Session Manager logins to individual users, only to find that it requires static IAM identities with specially named tags containing usernames — a non-starter for environments where everything is dynamic.
That leads into alternative approaches. An AWS blog post describes forcing SSH connections through the Unix script utility to record sessions, then uploading logs to S3. But even that is fragile: logs are owned by the user, so technically the user can delete or overwrite them. A more robust path is tlog, a terminal I/O logger that writes session data in JSON format to the systemd journal, where it cannot be easily tampered with. From there, the CloudWatch agent can export journal data to S3 for long-term storage.
The broader point is that command logging sounds simple in compliance conversations, but in practice it becomes a deep rabbit hole full of bypasses, noise, and design tradeoffs.
Monitoring user activity without drowning in logsThe hosts compare notes on monitoring shell activity. One host mentions using auditd to track user actions on bastion hosts in a previous environment, but the log volume was overwhelming — even Elasticsearch struggled to keep up with the ingestion rate.
That sparks a discussion around anomaly detection and heuristics. The real challenge is not collecting logs but determining what is unusual and worth investigating. Failed SSH login alerts are mentioned as a useful signal, though another host pushes back: "Should you have SSH with the password at all? You should have a key." The point stands — without careful tuning, even sensible alerts generate noise faster than teams can act on them.
The exchange captures a recurring DevSecOps reality: collecting telemetry is the easy part; turning it into something actionable is where most teams get stuck.
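The tuning problem above can be sketched as a toy heuristic: count failed-login events per source and alert only once a threshold is crossed, rather than on every event. This is a minimal illustration, not the hosts' setup; the log format and threshold are assumptions.

```python
import re
from collections import Counter

# Illustrative sshd log format; real auth.log lines vary by distro.
FAILED_RE = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+)")

def noisy_sources(log_lines, threshold=3):
    """Return source IPs whose failed-login count reaches the threshold.

    The point from the discussion: raw failures are mostly noise, so
    alert only on repeat offenders instead of every single event.
    """
    counts = Counter()
    for line in log_lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}

sample = [
    "sshd[811]: Failed password for root from 203.0.113.9 port 4711 ssh2",
    "sshd[812]: Failed password for invalid user admin from 203.0.113.9 port 4712 ssh2",
    "sshd[813]: Failed password for root from 203.0.113.9 port 4713 ssh2",
    "sshd[814]: Failed password for alice from 198.51.100.7 port 4714 ssh2",
]
print(noisy_sources(sample))  # only the repeat offender crosses the threshold
```

Even this toy version shows why tuning matters: lower the threshold and every stray typo from a colleague becomes an alert.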
S3 bucket security, public access controls, and KMS encryption surprisesThe conversation shifts to AWS S3 security. Public buckets remain a common source of breaches, but AWS now offers S3 Block Public Access — account- and bucket-level settings that prevent public access regardless of individual object ACLs. In Terraform, this is a dedicated resource block.
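In Terraform that resource is aws_s3_bucket_public_access_block; the same four settings can also be applied through boto3's put_public_access_block. A minimal sketch (the bucket name and client wiring here are illustrative, and the call itself needs AWS credentials):

```python
# The four flags behind S3 Block Public Access, in the shape that boto3's
# put_public_access_block expects. The helper takes the client as an
# argument so it can be exercised without touching AWS.
BLOCK_ALL = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

def apply_block_public_access(s3_client, bucket):
    """Apply the strictest Block Public Access settings to one bucket."""
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=BLOCK_ALL,
    )
```

Applied at the account level instead of per bucket, the same flags prevent anyone from accidentally creating a public bucket in the first place.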
The more nuanced insight is about encryption. The host explains the difference between S3 server-side encryption with the default AWS-managed key (SSE-S3) and encryption with a customer-managed KMS key (SSE-KMS). With SSE-S3, S3 decrypts objects transparently for any client with read access to the bucket. With a customer-managed KMS key, S3 cannot decrypt the object unless the requester also has kms:Decrypt permission on that specific key.
This became a real problem in a cross-account, cross-region workflow involving Go Lambda binaries. Go Lambdas require the deployment artifact to reside in the same region as the function. The team was copying artifacts between accounts and regions, had granted S3 read permissions, but downloads kept failing. CloudTrail logs revealed the real culprit: "I cannot decrypt." The consumers lacked KMS key access. In that case, the fix was switching to SSE-S3 since the artifacts did not require the stronger protection of a customer-managed key.
The host is careful to note that AWS documentation on cross-account S3 access does not prominently flag this encryption interaction — a gap that can cost teams hours of debugging.
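The SSE-S3 versus SSE-KMS distinction shows up concretely in the upload parameters. As a hedged sketch, here is how a deploy script might choose between them with boto3 (the parameter names match boto3's ExtraArgs; the selection logic is our own illustration):

```python
def upload_extra_args(kms_key_id=None):
    """Build boto3 ExtraArgs for an S3 upload.

    Without a key ID, objects get SSE-S3 (AES256): S3 decrypts
    transparently for any principal with read access to the bucket.
    With a customer-managed key, every reader additionally needs
    kms:Decrypt on that key - the cross-account pitfall described above.
    """
    if kms_key_id is None:
        return {"ServerSideEncryption": "AES256"}
    return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key_id}

# Typical use (client setup omitted):
# s3.upload_file("lambda.zip", bucket, key, ExtraArgs=upload_extra_args())
```

The debugging lesson from the episode maps directly onto this: if consumers in other accounts start failing with access errors, check which branch the artifact was uploaded with before blaming the bucket policy.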
Plaintext secrets in Git: a frighteningly common anti-patternOne of the most memorable segments comes when a host describes reviewing an application stack and finding service account keys committed in cleartext in the repository root. The repository also contained a large configuration file with usernames, passwords, API credentials for mail services, login providers, and multiple environments (dev, prod) — all in plain text.
But the worst part: for local development, the team SSH-tunneled into the production SQL server, mapping remote port 3306 to local port 3307. An SSH key providing direct access to the production database was sitting right there in the repo.
The reaction is immediate — this is exactly the kind of setup that accumulates when convenience wins over security for too long. But rather than proposing a risky teardown, the host outlines an incremental migration plan:
Andrey pushes the thinking further: injecting secrets at build time is still risky because anyone who gets the Docker image gets the secrets. The better model is runtime secret retrieval — workloads authenticate dynamically at startup and fetch only the secrets they need. HashiCorp Vault is the concrete example: in a Kubernetes environment, a pod uses its Kubernetes service account to authenticate to Vault, obtains a short-lived token, and retrieves static or dynamic secrets. If someone steals the image and runs it outside the cluster, they cannot authenticate and get nothing.
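The flow Andrey describes can be sketched roughly as follows. The endpoint paths follow Vault's Kubernetes auth API; the HTTP calls are injected as functions so the sequence is visible without a running Vault, and the role and secret path are illustrative assumptions.

```python
import json

# Inside a pod, the service account JWT lives at this well-known path.
SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def fetch_secret(http_post, http_get, vault_addr, role, secret_path, sa_jwt):
    """Exchange a Kubernetes service account JWT for a short-lived Vault
    token, then read a secret with it.

    Outside the cluster there is no valid JWT to present, so a stolen
    container image authenticates to nothing and gets nothing.
    """
    # Step 1: log in via Vault's Kubernetes auth method.
    login = http_post(
        f"{vault_addr}/v1/auth/kubernetes/login",
        json.dumps({"jwt": sa_jwt, "role": role}),
    )
    client_token = login["auth"]["client_token"]
    # Step 2: read the secret using the short-lived token.
    resp = http_get(
        f"{vault_addr}/v1/{secret_path}",
        headers={"X-Vault-Token": client_token},
    )
    return resp["data"]
```

The key property is that nothing sensitive is baked into the image: the only credential is the pod's own identity, which exists only inside the cluster.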
Vault versus cloud-native secret managementThe secrets discussion expands into a broader comparison. Andrey, who has been doing public speaking about Vault and fielding consulting requests around it, frames the choice pragmatically.
For hybrid-cloud or multi-cloud environments, Vault is likely the best option because it provides a unified interface for secret management, dynamic credentials, and synchronization across providers.
For single-cloud commitments — say, all-in on AWS — native services can cover many of the same use cases: AWS STS for temporary credentials, RDS IAM authentication for database logins, AWS Secrets Manager (which may even be running Vault underneath, as one host speculates), and AWS Certificate Manager for TLS certificates. If the organization is not going multi-cloud, the overhead of running Vault may not be justified.
The recommendation is not ideological. It depends on architecture, portability needs, and operational complexity.
When Vault works technically but fails organizationallyJulien Bisconti adds an important caveat from experience. He describes deploying Vault in a multi-availability-zone setup with full redundancy — technically solid. But the project "went to a halt completely" when it hit governance questions: who should access what, under which rules, and who owns the policies. It became a political war, and the entire deployment had to be rolled back.
The lesson: security tools are good at automating technical workflows, but if the underlying organizational process is broken, you automate a broken process. Security, monitoring, deployment, and access control are deeply entangled, and tooling alone cannot untangle them.
Security tooling fails because developers cannot use itJulien brings the strongest developer-empathy argument of the episode. Developers do not ignore security because they are careless — they bypass it because secure workflows are too awkward under delivery pressure. A manager does not understand why the developer is blocked, pressure mounts, and the result is // just hardcode that here, I don't care, it works.
Even simple tasks illustrate the problem. Julien asks: can you generate an SSL certificate with OpenSSL from memory right now? Most engineers cannot — it is something they do every few months and have to look up each time. He references the famous XKCD comic about entering the correct tar command with ten seconds left.
This evolves into a philosophical observation. One host identifies as a "tool builder" rather than a "product builder" — someone who enjoys building mechanisms but does not always think deeply about end-user experience. That mindset, common among infrastructure and security engineers, may explain why so many DevSecOps tools are powerful but painful to adopt. The gap is not in capability but in usability.
VPNs, zero trust, and the BeyondCorp modelJulien argues that VPNs are an increasingly painful abstraction. Even Cisco — the company that essentially built enterprise VPN technology — had to raise capacity limits during the COVID-19 pandemic because their own infrastructure could not handle the load. Split tunneling introduces its own vulnerabilities, and full-tunnel VPN creates a bottleneck for everything.
He points to Google's BeyondCorp model, published in 2014, which established the principle that network location should not determine access. The analogy: do you build a castle with walls where anyone inside has full access, or do you put a guard in every room checking credentials? The latter — zero trust — is harder to implement, but it limits blast radius and removes the binary "in or out" problem.
Andrey connects this to the emerging service mesh ecosystem. Technologies like Consul Connect implement zero-trust networking at the application level with mutual TLS and identity-based authorization. The hosts note that the service mesh space is still fragmented — just as there was a "war of orchestrators" before Kubernetes emerged as the default, there is now a "war of service meshes" still playing out.
Kubernetes hype versus simpler orchestrationA significant portion of the episode is a productive debate about orchestration choices. Andrey argues strongly against defaulting to Kubernetes. He describes a hybrid-cloud project in Africa running the full HashiCorp stack: Consul for service discovery, configuration, and networking; Nomad for workload scheduling. A team member with relatively little experience got the stack up and running in days.
Andrey outlines the operational weight of Kubernetes: cluster version upgrades where in-place upgrades may skip new security defaults (making full cluster recreation the recommended path), autoscaler configuration layers (pod autoscaler, cluster autoscaler, resource limits), ingress management, YAML sprawl from Helm charts, and a platform that evolves so rapidly it demands continuous learning. He especially warns against running databases in Kubernetes — the statefulness adds pain.
For single-cloud AWS, he argues that ECS is often the better choice: the control plane is free (or nearly so), the per-node overhead is minimal compared to Kubernetes, and AWS handles the operational burden.
Mattias pushes back with a practical counterpoint. Kubernetes provides a consistent platform for diverse workloads — containers, databases, monitoring, custom jobs — all managed through the same interface. Helm charts for common components like nginx-ingress, cert-manager, and external-dns make the ecosystem approachable. The value is in standardization and adaptability.
The hosts also note GKE's pricing evolution: Google introduced a per-cluster management fee (roughly $0.10/hour per control plane) to discourage sprawl and encourage consolidation — a signal that even managed Kubernetes has real costs.
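A quick back-of-envelope on that fee (using the $0.10/hour figure quoted in the episode; current GCP pricing may differ):

```python
FEE_PER_CLUSTER_HOUR = 0.10  # USD; the figure quoted in the episode
HOURS_PER_MONTH = 730        # common billing approximation

def monthly_control_plane_fees(clusters):
    """Monthly management fee for a fleet of GKE clusters."""
    return clusters * FEE_PER_CLUSTER_HOUR * HOURS_PER_MONTH

print(monthly_control_plane_fees(1))   # roughly $73/month for one cluster
print(monthly_control_plane_fees(20))  # why per-team cluster sprawl adds up
```

One cluster is cheap; a cluster per team quickly is not, which is exactly the consolidation incentive the hosts describe.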
The disagreement is honest but constructive. The shared conclusion: start with what the business needs, then pick the simplest tool that gets you there. "The best battle is the battle you don't fight." And as Julien notes, teams that avoid the Kubernetes default often demonstrate deeper architectural thinking — choosing based on the hype is an insurance policy, but it is not the same as choosing based on needs.
Slack bots, workflow automation, and the security surfaceNear the end, Mattias raises the topic of Slack bots for operational tasks — deployment reporting, status checks, and interactive queries. Andrey reframes the conversation around security: if Slack becomes part of a privileged control plane — for example, a bot that handles privilege escalation by requesting approvals through Slack messages — then request spoofing, account compromise, and weak isolation become serious concerns.
The idea of a privilege-escalation bot is interesting (request access via Slack, get approval from designated approvers, receive time-limited credentials with full audit logging), but the attack surface is real. Slack provides a powerful collaboration platform for building workflows without custom UIs, but once it handles access decisions, security design matters as much as convenience.
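One concrete piece of that security design is request authentication. Slack signs each incoming request with a shared signing secret using its documented v0 HMAC-SHA256 scheme, and a bot that makes access decisions should verify that signature and reject stale timestamps before running any approval logic. A minimal sketch:

```python
import hashlib
import hmac

def verify_slack_request(signing_secret, timestamp, body, signature,
                         now, max_age=300):
    """Verify a Slack request signature (Slack's documented v0 scheme).

    A bot that grants privileged access must authenticate every request
    and reject replays before any approval logic runs.
    """
    # Reject stale requests to prevent replay attacks.
    if abs(now - int(timestamp)) > max_age:
        return False
    basestring = f"v0:{timestamp}:{body}"
    expected = "v0=" + hmac.new(
        signing_secret.encode(), basestring.encode(), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(expected, signature)
```

Signature verification is necessary but not sufficient: account compromise and approver impersonation, as raised in the episode, still require separate controls such as time-limited credentials and audit logging.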
Highlights "All the service account keys were in clear text. In the repo."A host describes opening up a client's application stack and finding cloud service keys, usernames, passwords, API credentials, and an SSH key that tunnels directly into the production SQL server — all committed to Git in plain text. It is the kind of discovery that instantly explains years of hidden risk.
How do you unwind that without breaking delivery? The hosts walk through an incremental migration plan in this episode of DevSecOps Talks.
"Security tooling is actually not that usable."Julien Bisconti delivers a sharp truth: developers do not bypass security because they are careless. They do it because secure workflows are too slow, too confusing, and too far removed from how they actually work. When the pressure comes from a manager who does not understand the blocker, the shortcut wins every time.
A candid take on why hardcoded secrets keep showing up in real codebases. Listen to the full discussion on DevSecOps Talks.
"I really applaud people who don't choose Kubernetes — that means they actually know what they're doing."One of the spicier platform takes of the episode. The argument is not that Kubernetes is bad, but that defaulting to it without analyzing your actual needs is a sign of hype-driven architecture. If a simpler stack solves the problem, picking the biggest platform just creates more operational burden.
Hear the full Kubernetes-versus-Nomad-versus-ECS debate on DevSecOps Talks.
"If your process is not good, you're going to automate a bad process."Julien recounts deploying Vault with full HA and multi-AZ redundancy, only to have the project grind to a halt over organizational politics — who should access what, and who decides. The tooling worked perfectly. The organization did not.
A reminder that DevSecOps maturity is not just about picking better tools. Catch the full story on DevSecOps Talks.
"Once somebody is inside, they have the keys to the kingdom."The VPN and zero-trust discussion delivers one of the strongest security arguments of the episode. Julien explains why broad network access — the castle-and-moat model — is the wrong abstraction for modern systems, and why identity-based, fine-grained access control is worth the implementation cost.
If the old perimeter model still shapes how your team thinks about infrastructure security, this part of the episode will resonate. Listen on DevSecOps Talks.
ResourcesAWS Systems Manager Session Manager — AWS documentation for Session Manager, which provides secure instance access without SSH keys, open ports, or bastion hosts, with built-in session logging.
tlog — Terminal I/O Logger — Open-source terminal session recording tool that logs to systemd journal in JSON format, making sessions searchable and tamper-resistant. Discussed in the episode as a more robust alternative to the Unix script command.
AWS S3 Block Public Access — AWS documentation on account- and bucket-level settings to prevent public access to S3 resources, regardless of individual object ACLs or bucket policies.
Troubleshooting Cross-Account Access to KMS-Encrypted S3 Buckets — AWS guidance on the exact issue discussed in the episode: S3 downloads failing because the requester lacks KMS key permissions, even when bucket-level access is granted.
BeyondCorp: A New Approach to Enterprise Security — Google's foundational 2014 paper on zero-trust networking, which established the principle that network location should not determine access. Referenced by Julien in the VPN discussion.
HashiCorp Nomad — A lightweight workload orchestrator with native Consul and Vault integrations. Discussed as a simpler alternative to Kubernetes, especially for hybrid-cloud and small-team environments.
Consul Service Mesh (Consul Connect) — HashiCorp's service mesh solution providing zero-trust networking through mutual TLS and identity-based authorization. Mentioned as the networking layer in the Africa hybrid-cloud project.
XKCD 1168: tar — The comic Julien references about the impossibility of remembering command-line flags — a humorous illustration of why security tooling needs better usability.
We had a couple of possible topics for this episode but before getting started with them we decided to discuss what technological problems we were solving during the last two weeks. Well, turns out there was quite a lot to discuss. Tune in for tips on ssh session logging on the ssh server, preventing downloads from AWS S3 even if you got read access, credentials in Git repository 🤦, why you should (or should not) do K8S and more.
SummaryIn this free-form early episode of DevSecOps Talks, a casual "what have you been up to" catch-up turns into a sharp exchange on the gap between security in theory and security in practice. One host discovers plaintext service account keys, database passwords, and a production SSH tunnel all committed straight into a Git repository — and the team walks through how to unwind that without breaking delivery. Julien Bisconti argues that security tooling is fundamentally failing developers because it is too hard to use under real delivery pressure. The episode also delivers strong opinions on why teams should not default to Kubernetes, the hidden complexity of S3 encryption with KMS keys, and why Google's BeyondCorp model makes VPNs look like a relic.
Key Topics SSH session logging, bastion hosts, and compliance visibilityThe episode opens with a deep dive into SSH session logging for bastion hosts in AWS. One of the hosts explains how AWS Systems Manager Session Manager can be used to access instances without VPNs or direct inbound connectivity — the SSM agent on each instance calls home to AWS, and AWS proxies the connection back. That model is attractive for hybrid and on-prem environments because it removes networking complexity around NAT, port forwarding, and VPN setup. It also provides session logging, IAM-based access control, and command output recording.
But the drawbacks surface quickly. Session Manager logs users in as a generic SSM agent user with /usr/bin as the working directory. Documentation is sparse, and Bash is launched in shell mode to support color interpretation, which pollutes session logs with escape characters. A bigger concern is that access control rests entirely on IAM credentials — in an environment with fully dynamic, short-lived credentials that is manageable, but it becomes risky anywhere static keys exist.
The host describes trying to map Session Manager logins to individual users, only to find that it requires static IAM identities with specially named tags containing usernames — a non-starter for environments where everything is dynamic.
That leads into alternative approaches. An AWS blog post describes forcing SSH connections through the Unix script utility to record sessions, then uploading logs to S3. But even that is fragile: logs are owned by the user, so technically the user can delete or overwrite them. A more robust path is tlog, a terminal I/O logger that writes session data in JSON format to the systemd journal, where it cannot be easily tampered with. From there, the CloudWatch agent can export journal data to S3 for long-term storage.
The broader point is that command logging sounds simple in compliance conversations, but in practice it becomes a deep rabbit hole full of bypasses, noise, and design tradeoffs.
Monitoring user activity without drowning in logsThe hosts compare notes on monitoring shell activity. One host mentions using auditd to track user actions on bastion hosts in a previous environment, but the log volume was overwhelming — even Elasticsearch struggled to keep up with the ingestion rate.
That sparks a discussion around anomaly detection and heuristics. The real challenge is not collecting logs but determining what is unusual and worth investigating. Failed SSH login alerts are mentioned as a useful signal, though another host pushes back: "Should you have SSH with the password at all? You should have a key." The point stands — without careful tuning, even sensible alerts generate noise faster than teams can act on them.
The exchange captures a recurring DevSecOps reality: collecting telemetry is the easy part; turning it into something actionable is where most teams get stuck.
S3 bucket security, public access controls, and KMS encryption surprisesThe conversation shifts to AWS S3 security. Public buckets remain a common source of breaches, but AWS now offers S3 Block Public Access — account- and bucket-level settings that prevent public access regardless of individual object ACLs. In Terraform, this is a dedicated resource block.
The more nuanced insight is about encryption. The host explains the difference between S3 server-side encryption with the default AWS-managed key (SSE-S3) and encryption with a customer-managed KMS key (SSE-KMS). With SSE-S3, S3 decrypts objects transparently for any client with read access to the bucket. With a customer-managed KMS key, S3 cannot decrypt the object unless the requester also has kms:Decrypt permission on that specific key.
This became a real problem in a cross-account, cross-region workflow involving Go Lambda binaries. Go Lambdas require the deployment artifact to reside in the same region as the function. The team was copying artifacts between accounts and regions, had granted S3 read permissions, but downloads kept failing. CloudTrail logs revealed the real culprit: "I cannot decrypt." The consumers lacked KMS key access. In that case, the fix was switching to SSE-S3 since the artifacts did not require the stronger protection of a customer-managed key.
The host is careful to note that AWS documentation on cross-account S3 access does not prominently flag this encryption interaction — a gap that can cost teams hours of debugging.
Plaintext secrets in Git: a frighteningly common anti-patternOne of the most memorable segments comes when a host describes reviewing an application stack and finding service account keys committed in cleartext in the repository root. The repository also contained a large configuration file with usernames, passwords, API credentials for mail services, login providers, and multiple environments (dev, prod) — all in plain text.
But the worst part: for local development, the team SSH-tunneled into the production SQL server, mapping remote port 3306 to local port 3307. An SSH key providing direct access to the production database was sitting right there in the repo.
The reaction is immediate — this is exactly the kind of setup that accumulates when convenience wins over security for too long. But rather than proposing a risky teardown, the host outlines an incremental migration plan.
Andrey pushes the thinking further: injecting secrets at build time is still risky because anyone who gets the Docker image gets the secrets. The better model is runtime secret retrieval — workloads authenticate dynamically at startup and fetch only the secrets they need. HashiCorp Vault is the concrete example: in a Kubernetes environment, a pod uses its Kubernetes service account to authenticate to Vault, obtains a short-lived token, and retrieves static or dynamic secrets. If someone steals the image and runs it outside the cluster, they cannot authenticate and get nothing.
Vault versus cloud-native secret managementThe secrets discussion expands into a broader comparison. Andrey, who has been doing public speaking about Vault and fielding consulting requests around it, frames the choice pragmatically.
For hybrid-cloud or multi-cloud environments, Vault is likely the best option because it provides a unified interface for secret management, dynamic credentials, and synchronization across providers.
For single-cloud commitments — say, all-in on AWS — native services can cover many of the same use cases: AWS STS for temporary credentials, RDS IAM authentication for database logins, AWS Secrets Manager (which may even be running Vault underneath, as one host speculates), and AWS Certificate Manager for TLS certificates. If the organization is not going multi-cloud, the overhead of running Vault may not be justified.
The recommendation is not ideological. It depends on architecture, portability needs, and operational complexity.
When Vault works technically but fails organizationallyJulien Bisconti adds an important caveat from experience. He describes deploying Vault in a multi-availability-zone setup with full redundancy — technically solid. But the project "went to a halt completely" when it hit governance questions: who should access what, under which rules, and who owns the policies. It became a political war, and the entire deployment had to be rolled back.
The lesson: security tools are good at automating technical workflows, but if the underlying organizational process is broken, you automate a broken process. Security, monitoring, deployment, and access control are deeply entangled, and tooling alone cannot untangle them.
Security tooling fails because developers cannot use itJulien brings the strongest developer-empathy argument of the episode. Developers do not ignore security because they are careless — they bypass it because secure workflows are too awkward under delivery pressure. A manager does not understand why the developer is blocked, pressure mounts, and the result is // just hardcode that here, I don't care, it works.
Even simple tasks illustrate the problem. Julien asks: can you generate an SSL certificate with OpenSSL from memory right now? Most engineers cannot — it is something they do every few months and have to look up each time. He references the famous XKCD comic about entering the correct tar command with ten seconds left.
This evolves into a philosophical observation. One host identifies as a "tool builder" rather than a "product builder" — someone who enjoys building mechanisms but does not always think deeply about end-user experience. That mindset, common among infrastructure and security engineers, may explain why so many DevSecOps tools are powerful but painful to adopt. The gap is not in capability but in usability.
VPNs, zero trust, and the BeyondCorp modelJulien argues that VPNs are an increasingly painful abstraction. Even Cisco — the company that essentially built enterprise VPN technology — had to raise capacity limits during the COVID-19 pandemic because their own infrastructure could not handle the load. Split tunneling introduces its own vulnerabilities, and full-tunnel VPN creates a bottleneck for everything.
He points to Google's BeyondCorp model, published in 2014, which established the principle that network location should not determine access. The analogy: do you build a castle with walls where anyone inside has full access, or do you put a guard in every room checking credentials? The latter — zero trust — is harder to implement, but it limits blast radius and removes the binary "in or out" problem.
Andrey connects this to the emerging service mesh ecosystem. Technologies like Consul Connect implement zero-trust networking at the application level with mutual TLS and identity-based authorization. The hosts note that the service mesh space is still fragmented — just as there was a "war of orchestrators" before Kubernetes emerged as the default, there is now a "war of service meshes" still playing out.
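At the transport layer, the "guard in every room" idea boils down to mutual TLS: the server verifies the client's certificate instead of trusting anything that reaches it on the network. A minimal sketch with Python's standard `ssl` module (the CA file path is an assumption; a service mesh automates exactly this, plus certificate issuance and rotation):

```python
# Hedged sketch: a server-side TLS context that REQUIRES a client
# certificate (mutual TLS) rather than trusting network location.
# The CA bundle path is a placeholder assumption.
import ssl

def mtls_server_context(ca_file=None):
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED   # no client cert, no connection
    if ca_file:
        # Trust only identities signed by our own CA.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```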
Kubernetes hype versus simpler orchestrationA significant portion of the episode is a productive debate about orchestration choices. Andrey argues strongly against defaulting to Kubernetes. He describes a hybrid-cloud project in Africa running the full HashiCorp stack: Consul for service discovery, configuration, and networking; Nomad for workload scheduling. A team member with relatively little experience got the stack up and running in days.
Andrey outlines the operational weight of Kubernetes: cluster version upgrades where in-place upgrades may skip new security defaults (making full cluster recreation the recommended path), autoscaler configuration layers (pod autoscaler, cluster autoscaler, resource limits), ingress management, YAML sprawl from Helm charts, and a platform that evolves so rapidly it demands continuous learning. He especially warns against running databases in Kubernetes — the statefulness adds pain.
For single-cloud AWS, he argues that ECS is often the better choice: the control plane is free (or nearly so), the per-node overhead is minimal compared to Kubernetes, and AWS handles the operational burden.
Mattias pushes back with a practical counterpoint. Kubernetes provides a consistent platform for diverse workloads — containers, databases, monitoring, custom jobs — all managed through the same interface. Helm charts for common components like nginx-ingress, cert-manager, and external-dns make the ecosystem approachable. The value is in standardization and adaptability.
The hosts also note GKE's pricing evolution: Google introduced a per-cluster management fee (roughly $0.10/hour per control plane) to discourage sprawl and encourage consolidation — a signal that even managed Kubernetes has real costs.
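That per-cluster fee compounds quickly with sprawl, which is presumably the point. A quick back-of-the-envelope check — the hourly rate is the approximate figure from the episode (check current GKE pricing), and the cluster count is hypothetical:

```python
# Rough arithmetic on the management fee mentioned above.
# Rate is the episode's approximate figure; cluster count is hypothetical.
HOURLY_FEE = 0.10          # USD per control plane per hour (approximate)
clusters = 12              # e.g. one small cluster per team
monthly_cost = HOURLY_FEE * 24 * 30 * clusters
print(f"${monthly_cost:,.2f}/month")   # → $864.00/month
```

A dozen per-team clusters costs real money before a single node runs, which nudges organizations toward consolidation — exactly the behavior the fee was designed to encourage.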
The disagreement is honest but constructive. The shared conclusion: start with what the business needs, then pick the simplest tool that gets you there. "The best battle is the battle you don't fight." And as Julien notes, teams that avoid the Kubernetes default often demonstrate deeper architectural thinking: picking the hyped option can feel like an insurance policy, but it is not the same as choosing based on needs.
Slack bots, workflow automation, and the security surfaceNear the end, Mattias raises the topic of Slack bots for operational tasks — deployment reporting, status checks, and interactive queries. Andrey reframes the conversation around security: if Slack becomes part of a privileged control plane — for example, a bot that handles privilege escalation by requesting approvals through Slack messages — then request spoofing, account compromise, and weak isolation become serious concerns.
The idea of a privilege-escalation bot is interesting (request access via Slack, get approval from designated approvers, receive time-limited credentials with full audit logging), but the attack surface is real. Slack provides a powerful collaboration platform for building workflows without custom UIs, but once it handles access decisions, security design matters as much as convenience.
Highlights "All the service account keys were in clear text. In the repo."A host describes opening up a client's application stack and finding cloud service keys, usernames, passwords, API credentials, and an SSH key that tunnels directly into the production SQL server — all committed to Git in plain text. It is the kind of discovery that instantly explains years of hidden risk.
How do you unwind that without breaking delivery? The hosts walk through an incremental migration plan in this episode of DevSecOps Talks.
"Security tooling is actually not that usable."Julien Bisconti delivers a sharp truth: developers do not bypass security because they are careless. They do it because secure workflows are too slow, too confusing, and too far removed from how they actually work. When the pressure comes from a manager who does not understand the blocker, the shortcut wins every time.
A candid take on why hardcoded secrets keep showing up in real codebases. Listen to the full discussion on DevSecOps Talks.
"I really applaud people who don't choose Kubernetes — that means they actually know what they're doing."One of the spicier platform takes of the episode. The argument is not that Kubernetes is bad, but that defaulting to it without analyzing your actual needs is a sign of hype-driven architecture. If a simpler stack solves the problem, picking the biggest platform just creates more operational burden.
Hear the full Kubernetes-versus-Nomad-versus-ECS debate on DevSecOps Talks.
"If your process is not good, you're going to automate a bad process."Julien recounts deploying Vault with full HA and multi-AZ redundancy, only to have the project grind to a halt over organizational politics — who should access what, and who decides. The tooling worked perfectly. The organization did not.
A reminder that DevSecOps maturity is not just about picking better tools. Catch the full story on DevSecOps Talks.
"Once somebody is inside, they have the keys to the kingdom."The VPN and zero-trust discussion delivers one of the strongest security arguments of the episode. Julien explains why broad network access — the castle-and-moat model — is the wrong abstraction for modern systems, and why identity-based, fine-grained access control is worth the implementation cost.
If the old perimeter model still shapes how your team thinks about infrastructure security, this part of the episode will resonate. Listen on DevSecOps Talks.
ResourcesAWS Systems Manager Session Manager — AWS documentation for Session Manager, which provides secure instance access without SSH keys, open ports, or bastion hosts, with built-in session logging.
tlog — Terminal I/O Logger — Open-source terminal session recording tool that logs to systemd journal in JSON format, making sessions searchable and tamper-resistant. Discussed in the episode as a more robust alternative to the Unix script command.
AWS S3 Block Public Access — AWS documentation on account- and bucket-level settings to prevent public access to S3 resources, regardless of individual object ACLs or bucket policies.
Troubleshooting Cross-Account Access to KMS-Encrypted S3 Buckets — AWS guidance on the exact issue discussed in the episode: S3 downloads failing because the requester lacks KMS key permissions, even when bucket-level access is granted.
BeyondCorp: A New Approach to Enterprise Security — Google's foundational 2014 paper on zero-trust networking, which established the principle that network location should not determine access. Referenced by Julien in the VPN discussion.
HashiCorp Nomad — A lightweight workload orchestrator with native Consul and Vault integrations. Discussed as a simpler alternative to Kubernetes, especially for hybrid-cloud and small-team environments.
Consul Service Mesh (Consul Connect) — HashiCorp's service mesh solution providing zero-trust networking through mutual TLS and identity-based authorization. Mentioned as the networking layer in the Africa hybrid-cloud project.
XKCD 1168: tar — The comic Julien references about the impossibility of remembering command-line flags — a humorous illustration of why security tooling needs better usability.
In this episode, Mattias tries to convince his co-hosts that running Docker in Kubernetes is more secure than VMs. Did he succeed? Listen and find out.
SummaryMattias makes a bold claim: Docker containers are more secure than virtual machines. Andrey and Julien push back hard — and by the end, the three hosts explicitly agree to disagree. Along the way, they dig into why container breakouts are harder than people assume, how Lambda micro VMs can be exploited through warm TMP folders, why "containers do not contain" without extra kernel controls, and whether good monitoring matters more for security than any isolation technology. Recorded during COVID-19 lockdowns in 2020, the debate captures a moment when the container-vs-VM argument was far from settled.
Key Topics Docker vs. VM security: technology vs. ways of workingMattias opens the main debate by arguing that Docker containers are more secure than VMs in practice. His reasoning: containers are smaller, more focused, and more ephemeral than traditional virtual machines, which reduces attack surface. In a typical VM, you find mail agents, host-based intrusion detection, syslog, monitoring tools, and other services all coexisting with the application. In a container, you ideally run only the application itself.
Andrey pushes back immediately. He argues Mattias is comparing operational models, not technology. A well-run VM can also be immutable and minimal — you redeploy from a new image the same way you replace a container. Likewise, a badly built container can be long-lived, bloated, and full of unnecessary tools. Andrey has seen enterprises that run containers for months, SSH into them, and treat them like VMs.
Mattias concedes the point but maintains that the standard approach differs: VMs are typically kept running longer with more tools, while the standard approach for containers in Kubernetes is to rotate them and keep a smaller footprint. Andrey counters that most Docker images run as root by default, giving attackers more privilege than they would have on a typical VM where processes run under limited service accounts. This is one of the sharpest exchanges in the episode — better tooling does not fix insecure defaults.
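The root-by-default problem has a small, standard fix in the image itself: create and switch to an unprivileged user. A hedged Dockerfile sketch — base image, user name, and file names are illustrative assumptions:

```dockerfile
# Hedged sketch: drop root inside the image so the container does not
# inherit Docker's root-by-default behavior. Names are placeholders.
FROM python:3.12-slim
RUN useradd --create-home appuser
USER appuser
COPY app.py /home/appuser/app.py
CMD ["python", "/home/appuser/app.py"]
```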
The hosts eventually agree that both technologies can be secured well, but do not reach consensus on which is easier. Andrey summarizes it cleanly: containers make it "a little bit easier" to do the right thing because they narrow the focus to the application rather than the entire operating system, but it is absolutely possible to reach the same security level with VMs.
Why container breakout is not as trivial as people implyMattias challenges the common assumption that containers are unsafe because "you can break out of them." He points out that every container breakout CVE he has reviewed requires significant preconditions: either running an attacker-controlled image or running in privileged mode. You cannot take a standard Ubuntu container image, run a single command, and escape. The threat is real but requires chained attacks, not a single exploit.
Julien and Andrey accept the premise but note that the comparison matters. VM isolation is fundamentally stronger at the hypervisor level. Container breakout may be hard, but it is architecturally easier than VM escape. The discussion reframes the question: runtime security is less about one isolation boundary and more about how many obstacles an attacker must pass through.
Micro VMs, Firecracker, and Lambda attack vectorsAndrey brings up an important middle ground between containers and VMs: micro VMs. AWS Lambda runs on Firecracker, an open-source micro VM monitor. Lambdas are ephemeral, have read-only file systems, minimal tooling, and no access to source code or settings — making them quite secure by design.
But Andrey describes a real attack path researchers have demonstrated. The /tmp directory in Lambda is writable. If an attacker exploits a vulnerability to get code execution within the Lambda, and the Lambda is kept warm (invoked within 15 minutes so it stays in memory), the /tmp folder persists between invocations. An attacker can download tools incrementally across multiple Lambda runs, building up capability over time. From there, they can explore IAM permissions, exfiltrate data by encoding it in resource tags, or even override the Lambda function itself.
The point is that even well-designed ephemeral environments have attack paths when defenders are not paying attention. Security depends on hardening and monitoring, not just on the isolation primitive.
Containers do not contain: AppArmor, Seccomp, and policy controlsJulien delivers the episode's sharpest technical point: "Containers do not contain." They are primarily Linux namespace isolation and need additional kernel controls — AppArmor profiles and Seccomp filters — to properly restrict what applications can do at runtime. Without those extra layers, a container running as root is effectively root on the host machine, and a container with host network access is the same as running directly on the server.
This shifts security responsibility in uncomfortable ways. In VM environments, operations and security teams traditionally handle access controls. In containerized environments, developers are often expected to define security profiles for their workloads — but they may not know which system calls or privileges their applications need. Julien describes this as a fundamental organizational gap: the people writing the workload and the people securing the workload are rarely working hand in hand.
Mattias suggests that platform teams can solve this by enforcing policies centrally. He references tools like Open Policy Agent to set standards for what gets deployed into a cluster, rather than relying on every developer to configure security correctly.
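Centrally enforced policy of the kind Mattias describes might look like this in OPA's Rego language — a hedged sketch of an admission rule rejecting pods that do not declare a non-root security context (package name and field paths follow common Kubernetes admission-control examples, not any specific production setup):

```rego
# Hedged sketch: reject pods whose containers don't opt out of root.
# Package name and input shape follow common OPA admission examples.
package kubernetes.admission

deny[msg] {
  input.request.kind.kind == "Pod"
  some i
  container := input.request.object.spec.containers[i]
  not container.securityContext.runAsNonRoot
  msg := sprintf("container %s must set runAsNonRoot", [container.name])
}
```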
Kubernetes makes monitoring and response easierMattias makes a strong case for container platforms as detection and response environments. He describes working with Falco, a runtime security tool, and highlights a powerful capability: if someone opens a shell inside a container, Falco can detect that behavior and the container is killed automatically. That kind of automated response is natural in an environment built around disposable workloads. On a VM, shells are a normal part of operations, making the same detection much harder to act on.
Julien extends this into a broader argument about monitoring and security being inseparable. He argues that when monitoring is poor, access control becomes chaotic — developers need broad production access just to debug issues. But with strong observability, teams can use feature flags, targeted routing, and centralized logging instead of SSH-ing into production. Good monitoring reduces the need for risky access patterns.
Julien offers a practical example: instead of blocking developers from opening shells in containers, observe that they are doing it and ask why. If they need logs, build a secure log access API. If they need to debug, improve the observability tooling. Monitoring turns security violations into product requirements.
Minimizing container imagesJulien mentions using DockerSlim (now SlimToolkit) to strip unnecessary components from container images, reducing attack surface without requiring deep knowledge of every dependency. It is not a complete security solution, but it is an easy first step that removes much of the bloat containers inherit from their base images.
For organizations with compliance requirements, Julien notes that third-party security vendors provide validated runtime solutions — useful for audit purposes where you need a third party to confirm that the running workload matches what was built internally.
Bundling dependencies with the applicationMattias raises a concern about how containerization changes dependency management. In older models, operations maintained the web server (Apache, Nginx) separately from the application. In containers, the web server, runtime, and application are bundled together. That means patching the web server requires rebuilding and redeploying the entire container, even when the application code has not changed.
Andrey reframes this as a different packaging model, not a new problem. With Java WAR files deployed to Tomcat, you already had dependency coupling — you just managed it differently. Containers actually improve the situation in one way: each application owns its own dependency lifecycle instead of sharing an application server. One application can upgrade independently without affecting others on the same host.
Both hosts note that dedicated application servers are fading. Modern applications in Go, Python, and Node.js often handle HTTP directly, removing the need for a separate web server entirely. The ingress controller in Kubernetes handles routing at the cluster level, which is a separate concern from the application.
The hosts agree to disagreeThe episode ends without consensus. Mattias remains firmly convinced that containers, run properly in Kubernetes, are more secure than VMs. Julien's final position: "Containers can be as secure as VMs, but they need more work to get there." Andrey advocates for a layered approach — use both VMs and containers, with container security focused on application concerns and VM security focused on operational and resource isolation. He also notes that CoreOS, once the go-to minimal container OS, had recently been discontinued by IBM, leaving teams to find alternatives like Fedora CoreOS.
Highlights "Containers do not contain."Julien delivers the episode's most quotable line, reminding listeners that containers are mostly Linux namespacing — not real isolation boundaries. Without AppArmor, Seccomp, and careful configuration, a container is far less restrictive than people assume. A sharp reality check for anyone treating "containerized" as synonymous with "secure." Listen to the full episode on DevSecOps Talks to hear why container security is never just about packaging.
"If somebody pops a shell in a container, that container is killed."Mattias describes working with Falco and highlights a capability that captures the strongest pro-container argument: disposable workloads change the incident response model entirely. On a VM, a shell is normal. In a container, it is an alarm — and the platform can act on it automatically. Listen to the episode to hear how the hosts connect runtime detection, monitoring, and automated response.
"Most of the Docker images out there are running as root."Just when the debate leans in Docker's favor, Andrey brings it crashing back. On VMs, running as root is rare. In containers, it is the default. Better tooling does not fix insecure defaults — and this remains one of the most practical risks in container environments. Hear the full back-and-forth on DevSecOps Talks.
"We have to separate apples from bananas — the technology from the ways of working."Mattias draws a sharp line that reframes the entire debate. Are containers actually more secure, or are teams comparing modern container practices against outdated VM operations? A useful reminder that architecture arguments often hide workflow arguments underneath. Listen to the full conversation for the spirited disagreement that follows.
"Monitoring very much goes hand in hand with security."Julien makes the case that bad observability leads directly to bad access control. When developers cannot see what is happening in production safely, they need more privileges, more access, and more risky workarounds. Fix the monitoring, and many security problems solve themselves. Listen to the episode on DevSecOps Talks to hear why observability might be the most underrated security control.
"Containers can be as secure as VMs, but they need more work."Julien's final verdict — delivered over Mattias's loud objections — perfectly captures the episode's unresolved tension. The hosts explicitly agree to disagree, making this one of the more honest security debates you will hear on a podcast. Catch the full exchange on DevSecOps Talks.
ResourcesFalco — CNCF-graduated runtime security tool that detects anomalous behavior in containers and Kubernetes using eBPF. Mentioned by Mattias for its ability to automatically kill containers when suspicious activity like shell access is detected.
Firecracker — Open-source micro VM monitor built by AWS, powering Lambda and Fargate. Discussed by Andrey as an example of ephemeral, hardened execution environments and their attack surfaces.
SlimToolkit (formerly DockerSlim) — Tool for analyzing and minimizing container images, automatically generating AppArmor and Seccomp profiles. Mentioned by Julien as a practical way to reduce attack surface without deep security expertise.
Open Policy Agent (OPA) — General-purpose policy engine for enforcing security and operational policies across Kubernetes clusters. Referenced by Mattias for centrally enforcing deployment standards.
AppArmor — Linux kernel security module that restricts application capabilities through mandatory access control profiles. Discussed by Julien as an essential add-on for meaningful container isolation.
Seccomp (Secure Computing Mode) — Linux kernel facility that restricts which system calls a process can make. Used by Docker and Kubernetes to reduce the container attack surface by blocking unnecessary syscalls.
Fedora CoreOS — Successor to CoreOS Container Linux (discontinued 2020), a minimal, auto-updating operating system designed for running containerized workloads. Relevant context for Andrey's mention of CoreOS being killed by IBM.
In this episode Mattias is trying to convince that running docker in k8s is more security then VM. Did he success ? listen and find out.
SummaryMattias makes a bold claim: Docker containers are more secure than virtual machines. Andrey and Julien push back hard — and by the end, the three hosts explicitly agree to disagree. Along the way, they dig into why container breakouts are harder than people assume, how Lambda micro VMs can be exploited through warm TMP folders, why "containers do not contain" without extra kernel controls, and whether good monitoring matters more for security than any isolation technology. Recorded during COVID-19 lockdowns in 2020, the debate captures a moment when the container-vs-VM argument was far from settled.
Key Topics Docker vs. VM security: technology vs. ways of workingMattias opens the main debate by arguing that Docker containers are more secure than VMs in practice. His reasoning: containers are smaller, more focused, and more ephemeral than traditional virtual machines, which reduces attack surface. In a typical VM, you find mail agents, host-based intrusion detection, syslog, monitoring tools, and other services all coexisting with the application. In a container, you ideally run only the application itself.
Andrey pushes back immediately. He argues Mattias is comparing operational models, not technology. A well-run VM can also be immutable and minimal — you redeploy from a new image the same way you replace a container. Likewise, a badly built container can be long-lived, bloated, and full of unnecessary tools. Andrey has seen enterprises that run containers for months, SSH into them, and treat them like VMs.
Mattias concedes the point but maintains that the standard approach differs: VMs are typically kept running longer with more tools, while the standard approach for containers in Kubernetes is to rotate them and keep a smaller footprint. Andrey counters that most Docker images run as root by default, giving attackers more privilege than they would have on a typical VM where processes run under limited service accounts. This is one of the sharpest exchanges in the episode — better tooling does not fix insecure defaults.
The hosts eventually agree that both technologies can be secured well, but do not reach consensus on which is easier. Andrey summarizes it cleanly: containers make it "a little bit easier" to do the right thing because they narrow the focus to the application rather than the entire operating system, but it is absolutely possible to reach the same security level with VMs.
Why container breakout is not as trivial as people implyMattias challenges the common assumption that containers are unsafe because "you can break out of them." He points out that every container breakout CVE he has reviewed requires significant preconditions: either running an attacker-controlled image or running in privileged mode. You cannot take a standard Ubuntu container image, run a single command, and escape. The threat is real but requires chained attacks, not a single exploit.
Julien and Andrey accept the premise but note that the comparison matters. VM isolation is fundamentally stronger at the hypervisor level. Container breakout may be hard, but it is architecturally easier than VM escape. The discussion reframes the question: runtime security is less about one isolation boundary and more about how many obstacles an attacker must pass through.
Micro VMs, Firecracker, and Lambda attack vectorsAndrey brings up an important middle ground between containers and VMs: micro VMs. AWS Lambda runs on Firecracker, an open-source micro VM monitor. Lambdas are ephemeral, have read-only file systems, minimal tooling, and no access to source code or settings — making them quite secure by design.
But Andrey describes a real attack path researchers have demonstrated. The /tmp directory in Lambda is writable. If an attacker exploits a vulnerability to get code execution within the Lambda, and the Lambda is kept warm (invoked within 15 minutes so it stays in memory), the /tmp folder persists between invocations. An attacker can download tools incrementally across multiple Lambda runs, building up capability over time. From there, they can explore IAM permissions, exfiltrate data by encoding it in resource tags, or even override the Lambda function itself.
The point is that even well-designed ephemeral environments have attack paths when defenders are not paying attention. Security depends on hardening and monitoring, not just on the isolation primitive.
Containers do not contain: AppArmor, Seccomp, and policy controlsJulien delivers the episode's sharpest technical point: "Containers do not contain." They are primarily Linux namespace isolation and need additional kernel controls — AppArmor profiles and Seccomp filters — to properly restrict what applications can do at runtime. Without those extra layers, a container running as root is effectively root on the host machine, and a container with host network access is the same as running directly on the server.
This shifts security responsibility in uncomfortable ways. In VM environments, operations and security teams traditionally handle access controls. In containerized environments, developers are often expected to define security profiles for their workloads — but they may not know which system calls or privileges their applications need. Julien describes this as a fundamental organizational gap: the people writing the workload and the people securing the workload are rarely working hand in hand.
Mattias suggests that platform teams can solve this by enforcing policies centrally. He references tools like Open Policy Agent to set standards for what gets deployed into a cluster, rather than relying on every developer to configure security correctly.
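As a sketch of what such a centrally enforced policy might look like (Rego syntax; the package name and rule body are illustrative, not taken from the episode), an admission policy could reject any workload that does not explicitly refuse to run as root:

```rego
package kubernetes.admission

# Illustrative OPA policy: deny any container in an incoming pod spec
# that does not declare runAsNonRoot in its security context.
deny[msg] {
    container := input.request.object.spec.containers[_]
    not container.securityContext.runAsNonRoot
    msg := sprintf("container %q must set securityContext.runAsNonRoot", [container.name])
}
```

The point of the pattern is that one policy, maintained by the platform team, covers every deployment — no individual developer has to remember to configure it.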
Kubernetes makes monitoring and response easier
Mattias makes a strong case for container platforms as detection and response environments. He describes working with Falco, a runtime security tool, and highlights a powerful capability: if someone opens a shell inside a container, Falco can detect that behavior and the container is killed automatically. That kind of automated response is natural in an environment built around disposable workloads. On a VM, shells are a normal part of operations, making the same detection much harder to act on.
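Falco ships a built-in rule along these lines; the sketch below is a simplified, illustrative version showing the shape of such a detection (Falco's bundled "Terminal shell in container" rule is more complete):

```yaml
# Simplified, illustrative Falco-style rule: alert when an interactive
# shell starts inside a container. `spawned_process` and `container`
# are macros provided by Falco's default rule set.
- rule: Shell spawned in container
  desc: An interactive shell was started inside a container
  condition: >
    spawned_process and container and proc.name in (bash, sh, zsh, ash)
  output: Shell in container (user=%user.name container=%container.name proc=%proc.name)
  priority: WARNING
```

Paired with a response engine that deletes the offending pod, this is the "shell opens, container dies" workflow Mattias describes.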
Julien extends this into a broader argument about monitoring and security being inseparable. He argues that when monitoring is poor, access control becomes chaotic — developers need broad production access just to debug issues. But with strong observability, teams can use feature flags, targeted routing, and centralized logging instead of SSH-ing into production. Good monitoring reduces the need for risky access patterns.
Julien offers a practical example: instead of blocking developers from opening shells in containers, observe that they are doing it and ask why. If they need logs, build a secure log access API. If they need to debug, improve the observability tooling. Monitoring turns security violations into product requirements.
Minimizing container images
Julien mentions using DockerSlim (now SlimToolkit) to strip unnecessary components from container images, reducing attack surface without requiring deep knowledge of every dependency. It is not a complete security solution, but it is an easy first step that removes much of the bloat containers inherit from their base images.
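Assuming a recent SlimToolkit release (image names here are placeholders), the minification step is a single command:

```shell
# Analyze the image's runtime behavior and emit a minimal variant.
# Older releases use the `docker-slim` binary name instead of `slim`.
slim build --target my-app:latest --tag my-app:slim
```

The tool observes what the application actually uses at runtime and discards the rest, which is why it works without deep knowledge of every dependency.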
For organizations with compliance requirements, Julien notes that third-party security vendors provide validated runtime solutions — useful for audit purposes where you need a third party to confirm that the running workload matches what was built internally.
Bundling dependencies with the application
Mattias raises a concern about how containerization changes dependency management. In older models, operations maintained the web server (Apache, Nginx) separately from the application. In containers, the web server, runtime, and application are bundled together. That means patching the web server requires rebuilding and redeploying the entire container, even when the application code has not changed.
Andrey reframes this as a different packaging model, not a new problem. With Java WAR files deployed to Tomcat, you already had dependency coupling — you just managed it differently. Containers actually improve the situation in one way: each application owns its own dependency lifecycle instead of sharing an application server. One application can upgrade independently without affecting others on the same host.
Both hosts note that dedicated application servers are fading. Modern applications in Go, Python, and Node.js often handle HTTP directly, removing the need for a separate web server entirely. The ingress controller in Kubernetes handles routing at the cluster level, which is a separate concern from the application.
The hosts agree to disagree
The episode ends without consensus. Mattias remains firmly convinced that containers, run properly in Kubernetes, are more secure than VMs. Julien's final position: "Containers can be as secure as VMs, but they need more work to get there." Andrey advocates for a layered approach — use both VMs and containers, with container security focused on application concerns and VM security focused on operational and resource isolation. He also notes that CoreOS, once the go-to minimal container OS, had recently been discontinued by IBM, leaving teams to find alternatives like Fedora CoreOS.
Highlights

"Containers do not contain."
Julien delivers the episode's most quotable line, reminding listeners that containers are mostly Linux namespacing — not real isolation boundaries. Without AppArmor, Seccomp, and careful configuration, a container is far less restrictive than people assume. A sharp reality check for anyone treating "containerized" as synonymous with "secure." Listen to the full episode on DevSecOps Talks to hear why container security is never just about packaging.

"If somebody pops a shell in a container, that container is killed."
Mattias describes working with Falco and highlights a capability that captures the strongest pro-container argument: disposable workloads change the incident response model entirely. On a VM, a shell is normal. In a container, it is an alarm — and the platform can act on it automatically. Listen to the episode to hear how the hosts connect runtime detection, monitoring, and automated response.

"Most of the Docker images out there are running as root."
Just when the debate leans in Docker's favor, Mattias himself brings it crashing back. On VMs, running as root is rare. In containers, it is the default. Better tooling does not fix insecure defaults — and this remains one of the most practical risks in container environments. Hear the full back-and-forth on DevSecOps Talks.

"We have to separate apples from bananas — the technology from the ways of working."
Mattias draws a sharp line that reframes the entire debate. Are containers actually more secure, or are teams comparing modern container practices against outdated VM operations? A useful reminder that architecture arguments often hide workflow arguments underneath. Listen to the full conversation for the spirited disagreement that follows.

"Monitoring very much goes hand in hand with security."
Julien makes the case that bad observability leads directly to bad access control. When developers cannot see what is happening in production safely, they need more privileges, more access, and more risky workarounds. Fix the monitoring, and many security problems solve themselves. Listen to the episode on DevSecOps Talks to hear why observability might be the most underrated security control.

"Containers can be as secure as VMs, but they need more work."
Julien's final verdict — delivered over Mattias's loud objections — perfectly captures the episode's unresolved tension. The hosts explicitly agree to disagree, making this one of the more honest security debates you will hear on a podcast. Catch the full exchange on DevSecOps Talks.
Resources
Falco — CNCF-graduated runtime security tool that detects anomalous behavior in containers and Kubernetes using eBPF. Mentioned by Mattias for its ability to automatically kill containers when suspicious activity like shell access is detected.
Firecracker — Open-source micro VM monitor built by AWS, powering Lambda and Fargate. Discussed by Andrey as an example of ephemeral, hardened execution environments and their attack surfaces.
SlimToolkit (formerly DockerSlim) — Tool for analyzing and minimizing container images, automatically generating AppArmor and Seccomp profiles. Mentioned by Julien as a practical way to reduce attack surface without deep security expertise.
Open Policy Agent (OPA) — General-purpose policy engine for enforcing security and operational policies across Kubernetes clusters. Referenced by Mattias for centrally enforcing deployment standards.
AppArmor — Linux kernel security module that restricts application capabilities through mandatory access control profiles. Discussed by Julien as an essential add-on for meaningful container isolation.
Seccomp (Secure Computing Mode) — Linux kernel facility that restricts which system calls a process can make. Used by Docker and Kubernetes to reduce the container attack surface by blocking unnecessary syscalls.
Fedora CoreOS — Successor to CoreOS Container Linux (discontinued 2020), a minimal, auto-updating operating system designed for running containerized workloads. Relevant context for Andrey's mention of CoreOS being killed by IBM.
Your Docker images and builds are becoming the base for your platform. But are they secure? In this episode we talk about how you can secure your Docker images.
Summary
In this early DevSecOps Talks episode, Mattias, Andrey, and Julien dig into Docker security as a supply chain problem — and quickly dismantle the assumption that a signed container means you know what is inside. Julien pushes back sharply: signing only gives a "semantic guarantee" that an image is what it claims to be, not that it is safe. Mattias argues that containers were designed to be convenient, not secure by default, while Andrey points out that containerization has fundamentally changed the patching game — once the OS, web server, and application are packaged together, every security fix becomes a rebuild-and-redeploy exercise. The hosts make the case for layered scanning, slim runtime images, multi-stage builds, and continuous rebuilding as the practical path to running containers safely in production.
Key Topics

Container images vs. running containers
The conversation starts by separating two distinct parts of container security: the image and the running container.
Mattias explains that a container image can be treated much like any other file or archive — a zip or tar file sitting on disk. Because of that, teams can sign images cryptographically to verify origin and integrity, similar to how Node.js developers sign releases with their private keys. That gives consumers confidence that the image came from a known source and has not been tampered with.
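Docker's built-in mechanism for this is Content Trust. As a brief sketch (the registry and image names are placeholders), enabling it makes pulls fail unless the tag carries a valid signature:

```shell
# With Content Trust enabled, docker pull/push verify Notary signatures
# before accepting an image.
export DOCKER_CONTENT_TRUST=1
docker pull registry.example.com/my-app:1.0   # rejected if the tag is unsigned
```

This enforces authenticity at the client — which, as the next point makes clear, is necessary but not sufficient.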
But Julien pushes back on a common misunderstanding: signing does not mean the contents are inherently safe. As he puts it, you get a "semantic guarantee that this image is what it's pretending to be" — but not proof that everything inside is secure. Authenticity is not the same as security.
The hosts frame this as a trust problem. In a production cluster, teams often want to prevent engineers or workloads from pulling arbitrary images and running them without controls. Signed images and curated registries help, but they do not eliminate the need for careful validation.
Trust, Docker Hub, and the container supply chain
A major part of the episode focuses on how much trust teams should place in public images, including those from Docker Hub.
Andrey raises the practical reality: if you are running four different languages, you cannot build and maintain base images for all of them. It is much easier to grab the latest Node.js, Python, Ruby, or Java images from Docker Hub and build from there. Julien and Mattias acknowledge that reality, but caution against treating "official" or branded images as automatically secure.
Julien walks through the different trust levels on Docker Hub.
That leads into a broader discussion of supply chain attacks. Julien references real examples where Node.js libraries on npm were taken over by malicious parties after the original maintainer walked away. The same risk applies to container images.
Julien points out that large organizations sometimes go as far as rebuilding all dependencies from source — he mentions having heard of teams that do not pull jar files from Maven Central but build their own from source to verify exactly what they are shipping. While that is not feasible for every team, the principle stands: reduce blind trust and increase verification where the environment demands it.
Why container security is not just image signing
The discussion then shifts from image authenticity to runtime security.
Mattias explains that containers rely on Linux kernel primitives — namespaces for process isolation, along with controls for networking, memory, and disk. These low-level APIs are useful for resource sharing and scaling, but they were not originally designed as strong security boundaries. As he puts it, "the container does not contain things, it's just an abstraction." Container breakout vulnerabilities matter because an attacker who can exploit the runtime or host interface may reach beyond the container itself.
This leads to one of the episode's sharpest observations from Mattias: containers became popular because they are efficient and convenient to operate — you can bin-pack them on the same hardware and run far more applications per server. But from a security perspective, "it was not designed to be secure by default, it was designed to be convenient." That gap between convenience and security is what teams must actively address through scanning, hardening, and runtime controls.
CVE scanning: registries, dependencies, and source code
The hosts spend a good amount of time discussing scanning tools and where each fits in the security pipeline.
Mattias notes that most container registries now offer built-in vulnerability scanning, sometimes called container analysis APIs. Julien suggests a practical AWS-based pattern: if you do not want to pay for Docker Hub premium but still want to use public images, you can pull from Docker Hub, push into AWS Elastic Container Registry (ECR), and take advantage of its built-in CVE scanning. Then you restrict your production orchestrators to pull only from ECR.
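The pattern Julien describes can be sketched as follows (account ID, region, and image are placeholders; `scanOnPush` is a real ECR repository setting):

```shell
# Create an ECR repository whose built-in CVE scan runs on every push.
aws ecr create-repository --repository-name node \
    --image-scanning-configuration scanOnPush=true

# Mirror the public image into ECR.
docker pull node:20-slim
docker tag node:20-slim <account-id>.dkr.ecr.<region>.amazonaws.com/node:20-slim
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/node:20-slim

# Production clusters are then restricted to pull from ECR only.
```

The scan results surface in the ECR console or API, giving the team a CVE report on public images without a paid Docker Hub plan.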
Julien draws an important distinction between the different types of scanning.
Julien initially states that registry scans do not cover source code, then corrects himself to clarify the distinction more precisely: registries scan installed OS packages, while separate tools scan programming language dependencies. Neither deeply analyzes your own custom code. That leaves an unknown component in the stack that teams need to address through other means — code review, testing, and secure development practices.
Andrey also mentions using Anchore, which he describes as the foundation for many of these CVE scanning capabilities.
The shift from OS patching to image rebuilding
One of the most practical insights comes from Andrey, who compares containers to older operational models.
In traditional environments, teams could patch the operating system or update components like Nginx independently of the application. With containers, those layers are packaged together. If a new Nginx vulnerability is disclosed, the team needs to rebuild and redeploy the entire image that contains both the web server and the application code.
This changes patching from an infrastructure task into an application delivery task. Security updates are no longer something ops handles in isolation — they flow through the same build-and-deploy pipeline as feature code.
The hosts argue that this is why security must be a concern from the earliest stages. As Andrey puts it, referencing Julien's earlier point: security belongs in the first commit, because that is when it is cheapest and easiest to get right. A green build today does not guarantee a safe deployment tomorrow if new CVEs are published against the packages already running in production.
Slim images, distroless approaches, and DockerSlim
Mattias argues strongly for reducing container contents to the bare minimum. He highlights DockerSlim (now SlimToolkit), a project he uses frequently that strips images down to only the components essential for the application. In his example, a Maven-based application image dropped from roughly 600 MB to 140 MB — with no bash shell or other standard OS tooling left in the result.
Julien reinforces the security rationale: "the less code you have, the less vulnerability you have, and that's what you want in production." He mentions Alpine Linux and Google's distroless images as complementary approaches that aim for the same goal — minimal OS footprint in production containers.
The common theme is that production containers should not carry build tools, shells, package managers, or debugging utilities. Every unnecessary binary is a potential attack surface. The best production image is not the one easiest to build, but the one that contains the least unnecessary code.
Multi-stage builds and separate build vs. runtime images
The hosts spend considerable time on one of the most practical Docker security patterns: multi-stage builds.
Julien explains the concept of build stages — using an intermediate container with all build dependencies to compile the application, then copying only the final artifact into a much smaller production image. This separation means the production image does not need compilers, package managers, or the full dependency tree.
Andrey confirms this maps directly to Docker's multi-stage build feature: "You just build your Docker build in one stage and then just copy build results to the next stage." He also points out the developer experience benefit — since the build environment is defined inside the Dockerfile itself, developers do not need to set up different language toolchains on their local machines when working across multiple microservices.
Julien adds a performance angle: pulling a pre-built container image with cached dependencies is often much faster than resolving and fetching all dependencies from scratch. He has seen Maven builds that took 20 minutes purely because they had to re-fetch all artifacts every time. Pre-building and caching the dependency layer can dramatically improve total build-to-production time.
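A minimal Dockerfile sketch of the pattern (image tags and paths are illustrative) covers both points — a separate build stage, and the dependency layer cached independently of the source code:

```dockerfile
# Stage 1: full toolchain, used only to build the artifact.
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /src
COPY pom.xml .
RUN mvn -q dependency:go-offline      # dependencies cached as their own layer
COPY src ./src
RUN mvn -q package -DskipTests

# Stage 2: slim runtime image; no compilers, package managers, or build tools.
FROM eclipse-temurin:21-jre
COPY --from=build /src/target/app.jar /app.jar
ENTRYPOINT ["java", "-jar", "/app.jar"]
```

Because `pom.xml` is copied before the source, the dependency layer is only rebuilt when dependencies actually change — the caching win Julien describes.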
Continuous rebuilding and reducing attacker persistence
Andrey recommends reducing the lifetime of deployed images by rebuilding base images and all derived containers regularly — potentially every week — pulling in the latest patches each time. While this adds operational overhead, it shortens the window of exposure and makes it significantly harder for attackers to maintain persistence in stale environments.
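In a CI system this is typically a scheduled pipeline; sketched here in GitHub Actions syntax (assumed — the episode does not name a CI tool, and the image name is a placeholder):

```yaml
# Rebuild and republish images weekly so base-image patches flow through
# even when no application code has changed.
on:
  schedule:
    - cron: "0 3 * * 1"   # every Monday, 03:00 UTC
jobs:
  rebuild:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build --pull -t my-app:latest .   # --pull refreshes the base image
      - run: docker push my-app:latest                # registry auth omitted for brevity
```

The `--pull` flag is the key detail: without it, a cached stale base image can quietly defeat the whole exercise.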
Julien frames this as a recurring maintenance budget that every engineering team must accept. As he puts it, "if you don't spend at least one day per week updating the stuff, it's going to accumulate over a year or something. And then you have to spend two weeks fixing all that." The compound interest on security debt is steep.
Tags, digests, signing, and private registries
Near the end of the episode, Mattias raises a practical deployment question: how should teams store and reference images securely? He contrasts mutable tags (which can be overwritten on Docker Hub) with immutable SHA-based digests, image signing, and private registries — and admits there are so many options it is hard to know where to start.
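The tag-versus-digest difference is easy to show on the command line (the image name is illustrative): a tag can be re-pointed at new content at any time, while a digest pins the exact bytes:

```shell
# Look up the immutable digest behind a mutable tag...
docker inspect --format '{{index .RepoDigests 0}}' nginx:latest

# ...then reference the image by digest in deployment manifests:
#   nginx@sha256:<digest>
# Pulls by digest always resolve to the same content; pulls by tag may not.
```

Pinning by digest is one of the cheapest controls on Mattias's list, which is why it is often a sensible starting layer.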
Julien recommends implementing all of these controls, but not all at once. He advocates for an incremental approach: define your security objectives, then build toward them layer by layer. Start with what gives the most immediate protection and expand from there.
The hosts do not present a single silver bullet. Instead, they emphasize defense in depth: scanning at every level (code dependencies, container base images, production images), signing for authenticity, private registries for access control, and infrastructure-level enforcement.
Build pipeline security and handling secrets
The episode closes by touching on a problem the hosts agree deserves its own dedicated discussion: securing the build system itself.
Mattias points out that the build server has access to source code, credentials, signing keys, registries, and deployment systems. If an attacker compromises it, they can inject malicious code during the build process — effectively poisoning everything downstream.
The hosts then discuss the challenge of passing credentials into container builds for private dependencies. Andrey notes that recent Docker versions support passing SSH agents and secrets more safely during builds. He recommends using short-lived credentials (like AWS STS tokens with 15-minute expiration) so that even if credentials leak into image layers, they are already expired by the time anyone could exploit them. He also mentions using IMG, a daemonless image builder, as an alternative to Docker that avoids the need for a Docker daemon during builds.
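The safer mechanisms Andrey alludes to are BuildKit's secret and SSH mounts; a hedged sketch (the file paths, secret IDs, and repository URL are illustrative):

```shell
# BuildKit mounts secrets only for the duration of a single RUN step,
# so they never end up baked into an image layer. In the Dockerfile:
#   RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
#   RUN --mount=type=ssh git clone git@github.com:example/private-dep.git
DOCKER_BUILDKIT=1 docker build \
    --secret id=npmrc,src=$HOME/.npmrc \
    --ssh default \
    -t my-app:latest .
```

Combined with short-lived credentials, even a secret that did leak would be useless by the time an attacker found it.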
Julien takes a different approach to runtime secrets: encrypting them with KMS and storing them in a cloud bucket, then fetching them only at container startup. He observes that the real cloud vendor lock-in is never the runtime — "it's always the IAM" — because authorization and access control mechanisms are deeply cloud-specific and difficult to migrate.
Julien adds that handling build secrets often becomes an awkward "dance" of fetching credentials, granting temporary access, and cleaning up afterward. It works, but it remains operationally clumsy.
The hosts agree that two topics deserve future episodes of their own: hardening the build server, and the connection between security and cost management — natural partners, Julien briefly notes, since understanding who has access to what benefits both.
Highlights

"You don't know what's inside — you only have a semantic guarantee."
Julien cuts through a common assumption in container security: signing an image proves origin, not safety. That distinction shapes the entire episode, as the hosts explore why authenticity, trust, and actual security are three separate problems. Listen to this episode of DevSecOps Talks for a grounded discussion on what image signing can — and cannot — guarantee.

"Containers were designed to be convenient, not secure by default."
Mattias makes one of the sharpest points of the episode: containers became popular because they are efficient and easy to operate, not because they provide strong isolation. The container "does not contain things, it's just an abstraction." That is why runtime hardening and vulnerability management still matter so much. Listen to DevSecOps Talks to hear why container adoption created as many security questions as it solved.

"Official on Docker Hub doesn't mean secure — scan a Jenkins image and you'd be surprised."
Julien challenges the idea that a branded or official image should be trusted blindly. Even well-known organization-backed images can contain a surprising number of CVEs, and reputable sources can still introduce malicious changes — intentionally or by mistake. Listen to this DevSecOps Talks episode for a practical conversation about defining trust in your container supply chain.

"The less code you have, the less vulnerability you have."
Julien sums up a recurring theme: smaller runtime images are not just cleaner — they are fundamentally safer. From DockerSlim shrinking a 600 MB Maven image to 140 MB, to Alpine and distroless approaches, the hosts argue for removing everything production does not absolutely need. Listen to DevSecOps Talks to hear why image size and security are more connected than many teams realize.

"Nginx gets a CVE? Now you have to rebuild your entire app."
Andrey highlights how containerization merged the OS patching cycle with the application delivery cycle. In the old world, ops could patch Nginx without touching the app. In the container world, every security update means a full image rebuild and redeploy — making security an application delivery concern, not just an infrastructure one. Listen to this DevSecOps Talks episode for a practical take on why modern patching must flow through the CI/CD pipeline.

"If you don't spend one day a week updating, you'll spend two weeks fixing it later."
Julien describes dependency and image maintenance as a non-negotiable recurring budget. Skip the updates and the security debt compounds fast — turning routine maintenance into an emergency remediation project. Listen to DevSecOps Talks for an honest take on the operational cost of staying secure in containerized environments.

"The real lock-in is never the runtime — it's always the IAM."
In a brief but pointed aside about handling secrets in containers, Julien observes that authorization and access control are the truly cloud-specific parts of any architecture. Runtime workloads can move; IAM policies cannot. Listen to this DevSecOps Talks episode for a candid discussion on where the real complexity lies in cloud-native security.
Resources
SlimToolkit (formerly DockerSlim) — Open-source tool that minifies container images by removing non-essential components, reducing image size and attack surface without code changes. Mentioned by Mattias in the episode.
Google Distroless Container Images — Minimal container base images from Google that contain only the application and its runtime dependencies, stripping out shells, package managers, and OS utilities.
Docker Multi-Stage Builds — Official Docker documentation on using multiple build stages to produce smaller, cleaner production images by separating the build environment from the runtime image.
Docker Content Trust — Docker's built-in mechanism for cryptographic signing and verification of image integrity and publisher identity using Notary.
Amazon ECR Image Scanning — AWS documentation on scanning container images for OS and language package vulnerabilities in Elastic Container Registry, mentioned by Julien as a practical alternative to paid Docker Hub scanning.
Snyk Container — Developer security tool for scanning container images and application dependencies for known vulnerabilities, with remediation guidance and base image upgrade recommendations.
Anchore Container Scanning — SBOM-powered container vulnerability scanning platform, referenced by Andrey as the engine behind many registry-level CVE scanning capabilities.
Alpine Linux Docker Image — Minimal 5 MB base image built on musl libc and BusyBox, widely used as a lightweight, security-conscious alternative to full Linux distribution base images.
Andrey confirms this maps directly to Docker's multi-stage build feature: "You just build your Docker build in one stage and then just copy build results to the next stage." He also points out the developer experience benefit — since the build environment is defined inside the Dockerfile itself, developers do not need to set up different language toolchains on their local machines when working across multiple microservices.
Julien adds a performance angle: pulling a pre-built container image with cached dependencies is often much faster than resolving and fetching all dependencies from scratch. He has seen Maven builds that took 20 minutes purely because they had to re-fetch all artifacts every time. Pre-building and caching the dependency layer can dramatically improve total build-to-production time.
Continuous rebuilding and reducing attacker persistenceAndrey recommends reducing the lifetime of deployed images by rebuilding base images and all derived containers regularly — potentially every week — pulling in the latest patches each time. While this adds operational overhead, it shortens the window of exposure and makes it significantly harder for attackers to maintain persistence in stale environments.
Julien frames this as a recurring maintenance budget that every engineering team must accept. As he puts it, "if you don't spend at least one day per week updating the stuff, it's going to accumulate over a year or something. And then you have to spend two weeks fixing all that." The compound interest on security debt is steep.
Tags, digests, signing, and private registriesNear the end of the episode, Mattias raises a practical deployment question: how should teams store and reference images securely? He contrasts mutable tags (which can be overwritten on Docker Hub) with immutable SHA-based digests, image signing, and private registries — and admits there are so many options it is hard to know where to start.
Julien recommends implementing all of these controls, but not all at once. He advocates for an incremental approach: define your security objectives, then build toward them layer by layer. Start with what gives the most immediate protection and expand from there.
The hosts do not present a single silver bullet. Instead, they emphasize defense in depth: scanning at every level (code dependencies, container base images, production images), signing for authenticity, private registries for access control, and infrastructure-level enforcement.
Build pipeline security and handling secretsThe episode closes by touching on a problem the hosts agree deserves its own dedicated discussion: securing the build system itself.
Mattias points out that the build server has access to source code, credentials, signing keys, registries, and deployment systems. If an attacker compromises it, they can inject malicious code during the build process — effectively poisoning everything downstream.
The hosts then discuss the challenge of passing credentials into container builds for private dependencies. Andrey notes that recent Docker versions support passing SSH agents and secrets more safely during builds. He recommends using short-lived credentials (like AWS STS tokens with 15-minute expiration) so that even if credentials leak into image layers, they are already expired by the time anyone could exploit them. He also mentions using IMG, a daemonless image builder, as an alternative to Docker that avoids the need for a Docker daemon during builds.
Julien takes a different approach to runtime secrets: encrypting them with KMS and storing them in a cloud bucket, then fetching them only at container startup. He observes that the real cloud vendor lock-in is never the runtime — "it's always the IAM" — because authorization and access control mechanisms are deeply cloud-specific and difficult to migrate.
Julien adds that handling build secrets often becomes an awkward "dance" of fetching credentials, granting temporary access, and cleaning up afterward. It works, but it remains operationally clumsy.
The hosts agree that build server hardening and the connection between security and cost management (which Julien briefly mentions as natural partners, since understanding who has access to what benefits both) are topics worthy of their own future episodes.
Highlights "You don't know what's inside — you only have a semantic guarantee."Julien cuts through a common assumption in container security: signing an image proves origin, not safety. That distinction shapes the entire episode, as the hosts explore why authenticity, trust, and actual security are three separate problems. Listen to this episode of DevSecOps Talks for a grounded discussion on what image signing can — and cannot — guarantee.
"Containers were designed to be convenient, not secure by default."Mattias makes one of the sharpest points of the episode: containers became popular because they are efficient and easy to operate, not because they provide strong isolation. The container "does not contain things, it's just an abstraction." That is why runtime hardening and vulnerability management still matter so much. Listen to DevSecOps Talks to hear why container adoption created as many security questions as it solved.
"Official on Docker Hub doesn't mean secure — scan a Jenkins image and you'd be surprised."Julien challenges the idea that a branded or official image should be trusted blindly. Even well-known organization-backed images can contain a surprising number of CVEs, and reputable sources can still introduce malicious changes — intentionally or by mistake. Listen to this DevSecOps Talks episode for a practical conversation about defining trust in your container supply chain.
"The less code you have, the less vulnerability you have."Julien sums up a recurring theme: smaller runtime images are not just cleaner — they are fundamentally safer. From DockerSlim shrinking a 600 MB Maven image to 140 MB, to Alpine and distroless approaches, the hosts argue for removing everything production does not absolutely need. Listen to DevSecOps Talks to hear why image size and security are more connected than many teams realize.
"Nginx gets a CVE? Now you have to rebuild your entire app."Andrey highlights how containerization merged the OS patching cycle with the application delivery cycle. In the old world, ops could patch Nginx without touching the app. In the container world, every security update means a full image rebuild and redeploy — making security an application delivery concern, not just an infrastructure one. Listen to this DevSecOps Talks episode for a practical take on why modern patching must flow through the CI/CD pipeline.
"If you don't spend one day a week updating, you'll spend two weeks fixing it later."Julien describes dependency and image maintenance as a non-negotiable recurring budget. Skip the updates and the security debt compounds fast — turning routine maintenance into an emergency remediation project. Listen to DevSecOps Talks for an honest take on the operational cost of staying secure in containerized environments.
"The real lock-in is never the runtime — it's always the IAM."In a brief but pointed aside about handling secrets in containers, Julien observes that authorization and access control are the truly cloud-specific parts of any architecture. Runtime workloads can move; IAM policies cannot. Listen to this DevSecOps Talks episode for a candid discussion on where the real complexity lies in cloud-native security.
ResourcesSlimToolkit (formerly DockerSlim) — Open-source tool that minifies container images by removing non-essential components, reducing image size and attack surface without code changes. Mentioned by Mattias in the episode.
Google Distroless Container Images — Minimal container base images from Google that contain only the application and its runtime dependencies, stripping out shells, package managers, and OS utilities.
Docker Multi-Stage Builds — Official Docker documentation on using multiple build stages to produce smaller, cleaner production images by separating the build environment from the runtime image.
Docker Content Trust — Docker's built-in mechanism for cryptographic signing and verification of image integrity and publisher identity using Notary.
Amazon ECR Image Scanning — AWS documentation on scanning container images for OS and language package vulnerabilities in Elastic Container Registry, mentioned by Julien as a practical alternative to paid Docker Hub scanning.
Snyk Container — Developer security tool for scanning container images and application dependencies for known vulnerabilities, with remediation guidance and base image upgrade recommendations.
Anchore Container Scanning — SBOM-powered container vulnerability scanning platform, referenced by Andrey as the engine behind many registry-level CVE scanning capabilities.
Alpine Linux Docker Image — Minimal 5 MB base image built on musl libc and BusyBox, widely used as a lightweight, security-conscious alternative to full Linux distribution base images.
GitOps: a new concept in DevOps. What is it, and how can you use it when deploying and setting up your Kubernetes cluster?
SummaryGitOps sounds simple — put Kubernetes manifests in Git and let the cluster pull changes — but the episode quickly reveals the real debate is not about Git at all. Andrey argues the only genuinely novel thing about GitOps is the pull-based model where an in-cluster agent reconciles state, while Julien questions whether GitOps is good for day-2 operations or just for bootstrapping clusters. The spiciest moment: Andrey declares "life is too short to do pull requests" and advocates pushing straight to master with strong CI/CD guardrails instead.
Key Topics What GitOps actually is — and what it is notAndrey frames the discussion by separating what is genuinely new about GitOps from what teams have already been doing for years. Storing deployment specifications in Git, he argues, is just version control — teams have done that for a decade. The meaningful difference is the deployment model: instead of an external CI/CD server pushing changes into Kubernetes by calling the cluster API, GitOps places an agent inside the cluster that either receives a webhook or polls a Git repository, pulls in the desired state, and applies it from within.
That pull-based model is what Andrey identifies as the core innovation. It eliminates the need to expose the Kubernetes API externally — a real concern when using hosted CI services like CircleCI, which would otherwise need network access to the cluster. As Andrey puts it, exposing the API externally is risky "unless you want someone mining bitcoin on your cluster."
He references the tooling landscape at the time: Weaveworks (the company that coined the term "GitOps" and created WeaveNet, a Kubernetes CNI driver), Flux, Argo, and Jenkins X. He notes that Flux and Argo were joining forces at the time of recording. He also mentions Jenkins X as a potential GitOps tool, since it runs CI/CD jobs natively in Kubernetes, but expresses skepticism about using Kubernetes for build workloads — Kubernetes is declarative about desired state, but "you cannot declare my build is successful because you have no idea how your build gonna go."
Editor's note: Weaveworks, the company that originated the term "GitOps," shut down in February 2024. Flux continues as a CNCF graduated project. The GitOps principles have since been formalized by the OpenGitOps project under the CNCF.
The Weaveworks definition, read straight from the sourceAndrey reads Weaveworks' concise GitOps definition from their blog and walks through its key points:
Andrey also raises a nuance about Helm: since Helm templates can produce different output depending on input variables, true GitOps implies committing not only the Helm charts but also the rendered manifests — because the generated output is what actually represents the declarative desired state.
He draws a comparison to GitHub's earlier promotion of ChatOps, noting that many of the same ideas — observable, verifiable changes driven through a central workflow — were already part of GitHub's operational philosophy, just with a different interface.
Two layers: infrastructure-as-code and in-cluster GitOpsJulien offers a more practical framing, splitting the problem into two distinct layers:
In Julien's model, a Git repository becomes the authoritative inventory of everything that should exist in the cluster. He describes the ideal: "if anything else is running here, alert me or kill it." That gives teams confidence that the observed cluster state matches the intended one, and helps prevent configuration drift — a problem the hosts discussed in their earlier infrastructure-as-code episode.
Day-2 operations: where the model gets testedWhile Julien appreciates GitOps for defining and bootstrapping cluster state, he is openly skeptical about its effectiveness for long-running operations. He distinguishes between two very different challenges: "setting up things" versus "running things for a long time — they're not the same."
Real environments drift. People intervene manually during incidents. Urgent fixes happen outside the normal workflow. The clean desired-state model becomes harder to maintain once the messiness of day-2 operations enters the picture. Julien frames this as an open question rather than a settled answer: GitOps may be excellent for establishing a clean baseline, but whether it holds up as a complete long-term operating model remains to be proven.
Who controls changes: developers, operators, or both?Andrey raises a governance concern: GitOps can look like a direct developer-to-cluster pathway. If a developer changes a YAML file, commits it, and the cluster automatically applies the change, operations staff are effectively bypassed — "there is nowhere an operation person can interfere with this."
Julien pushes back, arguing that the workflow — not the tooling — determines who has control. If changes go through pull requests with review and approval, it does not matter whether the author is a developer or an operator. Both participate in the same process. The mechanism is the same one used for application code: propose a change, review it, merge it.
Pull requests, compliance, and "push to master"The conversation takes its most opinionated turn when the topic shifts to pull requests.
Andrey is blunt: "Life is too short to do pull requests. You never get anything done. You do a pull request, you ask for review and then you hunt the person for two days." His preference is to push directly to master and build CI/CD pipelines strong enough to catch mistakes — "you build your system to defend yourself from the fools."
He does acknowledge an important exception: regulated industries where every production deployment must be peer-reviewed or approved. In those environments, formal review is not just a process preference but a compliance mechanism that can significantly reduce legal exposure when something goes wrong.
Andrey also shares a personal practice: because he frequently switches between projects and loses context, the first thing he does is document every verification step as part of the CI/CD pipeline. That way, when he returns to a project months later, the pipeline already encodes everything he would need to remember. "There is no guarantee that someone else has a better understanding of what I did."
Observability gaps in GitOps pipelinesAndrey identifies a practical developer-experience problem with GitOps: the visibility gap.
In a traditional pipeline, a developer can trace a change end-to-end — build, test, deploy — in one place. With GitOps, the CI pipeline ends when it commits changes to a repository. The actual deployment happens later, inside the cluster, through a separate reconciliation process. "My pipeline stops at the place where I do commit, push, done. Since then, pipeline doesn't have much to absorb."
To understand whether a deployment succeeded, the developer needs to inspect cluster state rather than the original pipeline. Bridging that gap requires additional tooling and represents a real paradigm shift in how teams observe deployments.
He also flags a repository-structure problem: if source code and deployment manifests live in the same repository, updating manifests can trigger the source-code pipeline again — requiring conditional logic to prevent unnecessary rebuilds.
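The conditional logic Andrey describes is commonly expressed as a path filter in the CI configuration. The hosts do not name a CI system; this sketch uses GitHub Actions syntax as one illustration:

```yaml
name: build-and-test
on:
  push:
    branches: [master]
    # Commits that only touch deployment manifests skip the app build,
    # avoiding the unnecessary rebuild loop described above.
    paths-ignore:
      - "deploy/**"
      - "**.md"
```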
Deployment ordering and full-system validationJulien closes the discussion with a practical concern: deployment order matters in real systems. A proxy may need a backend to exist first. Some components cannot be rolled out in arbitrary order without causing failures.
He also questions the validation model. In a software build pipeline, teams rebuild and test the entire application from the main branch to verify the whole system works. But with GitOps, a change to one part of the cluster may be applied incrementally without validating the full cluster state end-to-end. "I will never test the full master branch and rebuild the full cluster from it, except everything goes."
That leaves an open question the hosts do not fully resolve: how can teams preserve the elegance of declarative Git-driven deployment while managing sequencing, dependencies, and whole-system confidence?
Highlights "Unless you want someone mining bitcoin on your cluster"Andrey explains the security motivation behind the pull-based GitOps model — if you use an external CI system, you need to expose your Kubernetes API, which is not exactly ideal. His colorful warning about cryptocurrency miners makes the point memorable.
Listen to the episode for Andrey's full breakdown of why the pull-vs-push distinction is the real heart of GitOps.
"Life is too short to do pull requests."The spiciest take of the episode. Andrey argues that pull requests slow teams to a crawl — you open one, ask for review, then spend two days hunting the reviewer. His alternative: push to master and build pipelines strong enough to protect against mistakes. He does carve out an exception for regulated industries where peer review is legally required.
Listen to the episode and decide whether you agree or strongly disagree.
"GitOps is a nice way to set up your Kubernetes cluster — but is it a good tool to keep it running? I'm not sure."Julien draws a sharp line between bootstrapping a cluster and operating it long-term. Setting up things and running things for a long time are "not the same." It is a refreshingly honest admission that a clean architecture pattern does not automatically solve the messy reality of day-2 operations.
Listen to the episode for a take that many GitOps advocates skip over.
"You build your system to defend yourself from the fools."Andrey's philosophy in one sentence. Rather than relying on human review processes, invest in CI/CD pipelines and automated guardrails that prevent mistakes regardless of who pushes the change. He backs this up with a personal habit: encoding every verification step into the pipeline so future-him does not have to remember anything.
Listen to the episode for a practical argument in favor of automation over process.
"If anything else is running here — alert me or kill it."Julien describes the appeal of GitOps as an authoritative inventory of what should exist in a cluster. If the Git repository defines the desired state and the cluster enforces it, anything unauthorized can be flagged or removed. It is one of the clearest expressions of why teams are drawn to the GitOps model.
Listen to the episode for a practical view of GitOps as cluster hygiene.
The daughter interruptionMid-argument about observability gaps, Andrey's daughter walks in wanting to share something exciting. It is a charming reminder that even deep infrastructure debates happen in real life with real interruptions.
Listen to the episode for the unscripted moment — and Andrey's smooth recovery.
ResourcesGitops a new concept on devops. Whats is it and how can you use it when deploy and setup your k8s cluster.
SummaryGitOps sounds simple — put Kubernetes manifests in Git and let the cluster pull changes — but the episode quickly reveals the real debate is not about Git at all. Andrey argues the only genuinely novel thing about GitOps is the pull-based model where an in-cluster agent reconciles state, while Julien questions whether GitOps is good for day-2 operations or just for bootstrapping clusters. The spiciest moment: Andrey declares "life is too short to do pull requests" and advocates pushing straight to master with strong CI/CD guardrails instead.
Key Topics What GitOps actually is — and what it is notAndrey frames the discussion by separating what is genuinely new about GitOps from what teams have already been doing for years. Storing deployment specifications in Git, he argues, is just version control — teams have done that for a decade. The meaningful difference is the deployment model: instead of an external CI/CD server pushing changes into Kubernetes by calling the cluster API, GitOps places an agent inside the cluster that either receives a webhook or polls a Git repository, pulls in the desired state, and applies it from within.
That pull-based model is what Andrey identifies as the core innovation. It eliminates the need to expose the Kubernetes API externally — a real concern when using hosted CI services like CircleCI, which would otherwise need network access to the cluster. As Andrey puts it, exposing the API externally is risky "unless you want someone mining bitcoin on your cluster."
He references the tooling landscape at the time: Weaveworks (the company that coined the term "GitOps" and created WeaveNet, a Kubernetes CNI driver), Flux, Argo, and Jenkins X. He notes that Flux and Argo were joining forces at the time of recording. He also mentions Jenkins X as a potential GitOps tool, since it runs CI/CD jobs natively in Kubernetes, but expresses skepticism about using Kubernetes for build workloads — Kubernetes is declarative about desired state, but "you cannot declare my build is successful because you have no idea how your build gonna go."
Editor's note: Weaveworks, the company that originated the term "GitOps," shut down in February 2024. Flux continues as a CNCF graduated project. The GitOps principles have since been formalized by the OpenGitOps project under the CNCF.
The Weaveworks definition, read straight from the sourceAndrey reads Weaveworks' concise GitOps definition from their blog and walks through its key points:
Andrey also raises a nuance about Helm: since Helm templates can produce different output depending on input variables, true GitOps implies committing not only the Helm charts but also the rendered manifests — because the generated output is what actually represents the declarative desired state.
He draws a comparison to GitHub's earlier promotion of ChatOps, noting that many of the same ideas — observable, verifiable changes driven through a central workflow — were already part of GitHub's operational philosophy, just with a different interface.
Two layers: infrastructure-as-code and in-cluster GitOpsJulien offers a more practical framing, splitting the problem into two distinct layers:
In Julien's model, a Git repository becomes the authoritative inventory of everything that should exist in the cluster. He describes the ideal: "if anything else is running here, alert me or kill it." That gives teams confidence that the observed cluster state matches the intended one, and helps prevent configuration drift — a problem the hosts discussed in their earlier infrastructure-as-code episode.
Day-2 operations: where the model gets testedWhile Julien appreciates GitOps for defining and bootstrapping cluster state, he is openly skeptical about its effectiveness for long-running operations. He distinguishes between two very different challenges: "setting up things" versus "running things for a long time — they're not the same."
Real environments drift. People intervene manually during incidents. Urgent fixes happen outside the normal workflow. The clean desired-state model becomes harder to maintain once the messiness of day-2 operations enters the picture. Julien frames this as an open question rather than a settled answer: GitOps may be excellent for establishing a clean baseline, but whether it holds up as a complete long-term operating model remains to be proven.
Who controls changes: developers, operators, or both?Andrey raises a governance concern: GitOps can look like a direct developer-to-cluster pathway. If a developer changes a YAML file, commits it, and the cluster automatically applies the change, operations staff are effectively bypassed — "there is nowhere an operation person can interfere with this."
Julien pushes back, arguing that the workflow — not the tooling — determines who has control. If changes go through pull requests with review and approval, it does not matter whether the author is a developer or an operator. Both participate in the same process. The mechanism is the same one used for application code: propose a change, review it, merge it.
Pull requests, compliance, and "push to master"The conversation takes its most opinionated turn when the topic shifts to pull requests.
Andrey is blunt: "Life is too short to do pull requests. You never get anything done. You do a pull request, you ask for review and then you hunt the person for two days." His preference is to push directly to master and build CI/CD pipelines strong enough to catch mistakes — "you build your system to defend yourself from the fools."
He does acknowledge an important exception: regulated industries where every production deployment must be peer-reviewed or approved. In those environments, formal review is not just a process preference but a compliance mechanism that can significantly reduce legal exposure when something goes wrong.
Andrey also shares a personal practice: because he frequently switches between projects and loses context, the first thing he does is document every verification step as part of the CI/CD pipeline. That way, when he returns to a project months later, the pipeline already encodes everything he would need to remember. "There is no guarantee that someone else has a better understanding of what I did."
Observability gaps in GitOps pipelinesAndrey identifies a practical developer-experience problem with GitOps: the visibility gap.
In a traditional pipeline, a developer can trace a change end-to-end — build, test, deploy — in one place. With GitOps, the CI pipeline ends when it commits changes to a repository. The actual deployment happens later, inside the cluster, through a separate reconciliation process. "My pipeline stops at the place where I do commit, push, done. Since then, pipeline doesn't have much to absorb."
To understand whether a deployment succeeded, the developer needs to inspect cluster state rather than the original pipeline. Bridging that gap requires additional tooling and represents a real paradigm shift in how teams observe deployments.
He also flags a repository-structure problem: if source code and deployment manifests live in the same repository, updating manifests can trigger the source-code pipeline again — requiring conditional logic to prevent unnecessary rebuilds.
Deployment ordering and full-system validationJulien closes the discussion with a practical concern: deployment order matters in real systems. A proxy may need a backend to exist first. Some components cannot be rolled out in arbitrary order without causing failures.
He also questions the validation model. In a software build pipeline, teams rebuild and test the entire application from the main branch to verify the whole system works. But with GitOps, a change to one part of the cluster may be applied incrementally without validating the full cluster state end-to-end. "I will never test the full master branch and rebuild the full cluster from it, except everything goes."
That leaves an open question the hosts do not fully resolve: how can teams preserve the elegance of declarative Git-driven deployment while managing sequencing, dependencies, and whole-system confidence?
Highlights "Unless you want someone mining bitcoin on your cluster"Andrey explains the security motivation behind the pull-based GitOps model — if you use an external CI system, you need to expose your Kubernetes API, which is not exactly ideal. His colorful warning about cryptocurrency miners makes the point memorable.
Listen to the episode for Andrey's full breakdown of why the pull-vs-push distinction is the real heart of GitOps.
"Life is too short to do pull requests."The spiciest take of the episode. Andrey argues that pull requests slow teams to a crawl — you open one, ask for review, then spend two days hunting the reviewer. His alternative: push to master and build pipelines strong enough to protect against mistakes. He does carve out an exception for regulated industries where peer review is legally required.
Listen to the episode and decide whether you agree or strongly disagree.
"GitOps is a nice way to set up your Kubernetes cluster — but is it a good tool to keep it running? I'm not sure."Julien draws a sharp line between bootstrapping a cluster and operating it long-term. Setting up things and running things for a long time are "not the same." It is a refreshingly honest admission that a clean architecture pattern does not automatically solve the messy reality of day-2 operations.
Listen to the episode for a take that many GitOps advocates skip over.
"You build your system to defend yourself from the fools."Andrey's philosophy in one sentence. Rather than relying on human review processes, invest in CI/CD pipelines and automated guardrails that prevent mistakes regardless of who pushes the change. He backs this up with a personal habit: encoding every verification step into the pipeline so future-him does not have to remember anything.
Listen to the episode for a practical argument in favor of automation over process.
"If anything else is running here — alert me or kill it."
Julien describes the appeal of GitOps as an authoritative inventory of what should exist in a cluster. If the Git repository defines the desired state and the cluster enforces it, anything unauthorized can be flagged or removed. It is one of the clearest expressions of why teams are drawn to the GitOps model.
Listen to the episode for a practical view of GitOps as cluster hygiene.
The daughter interruption
Mid-argument about observability gaps, Andrey's daughter walks in wanting to share something exciting. It is a charming reminder that even deep infrastructure debates happen in real life with real interruptions.
Listen to the episode for the unscripted moment — and Andrey's smooth recovery.
Resources
Is infrastructure as code always the best way to go, and if not, when and where should you use it? Here we try to better understand when it is good to use and when it is not.
Summary
In this inaugural episode, Mattias, Andrey, and Julien discuss what infrastructure as code really means, why teams adopt it, and where it can go wrong. They explore the evolution from manual server management to declarative infrastructure, the differences between configuration management and infrastructure provisioning, the growing complexity of tools like Terraform and CloudFormation, and why culture, process, and operational discipline matter as much as the tooling itself.
Key Topics
What Infrastructure as Code Actually Solves
The discussion starts with Mattias describing the shift from manually editing Apache configs over SSH to defining cloud environments in code. He recalls the progression: first managing individual servers by hand, then adopting configuration management tools like Puppet, Chef, and Ansible, and finally arriving at cloud-native tools like AWS CloudFormation that can provision entire environments declaratively.
Andrey pushes the conversation toward first principles, arguing that it is important to separate the "what" from the "how." He explains that infrastructure as code depends on having APIs — software-defined interfaces that allow infrastructure to be created and managed programmatically. Without that kind of interface, teams are limited to SSH and the manual tools they had before. The rise of public cloud providers and platforms like OpenStack finally gave teams the APIs they needed to describe infrastructure declaratively in definition files.
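The chain Andrey describes, from a declarative definition file down to provider API calls, can be sketched in a few lines of Python. Everything here is an invented stand-in: `FakeCloudAPI`, the resource names, and the spec fields all substitute for a real provider interface such as the AWS API.

```python
# Illustrative sketch: a declarative definition applied through a
# hypothetical cloud API. Real tools (CloudFormation, Terraform) do
# this at far larger scale; all names here are invented.

DESIRED = {
    "vm-web-1": {"type": "vm", "size": "small"},
    "db-main":  {"type": "database", "engine": "postgres"},
}

class FakeCloudAPI:
    """Stand-in for a provider API; IaC depends on such an interface existing."""
    def __init__(self):
        self.resources = {}

    def create(self, name, spec):
        self.resources[name] = dict(spec)

def apply(api, desired):
    """Create every declared resource that does not yet exist."""
    created = []
    for name, spec in desired.items():
        if name not in api.resources:
            api.create(name, spec)
            created.append(name)
    return created

api = FakeCloudAPI()
created = apply(api, DESIRED)        # first run creates everything
created_again = apply(api, DESIRED)  # second run is a no-op
```

Running `apply` a second time creates nothing, which is the idempotence that makes declarative definitions safe to re-run.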
Configuration Management vs Infrastructure as Code
A key distinction in the episode is the difference between server configuration tools and true infrastructure as code. Andrey notes that tools like Puppet, Chef, and Ansible were originally conceived as server configuration management tools — designed to automate the provisioning and configuration of servers, not to define infrastructure itself.
He acknowledges this is a gray area, since tools like Ansible can now call AWS APIs and manage infrastructure directly. But historically, the configuration management era was about fighting configuration drift on existing servers, while the cloud era introduced the ability to declare entire environments as code. If you asked the vendors selling Chef, they would tell you Chef is "all about infrastructure as code" — but the original intent was different.
When to Automate — and When Not To
The hosts caution against automating too early. Andrey says he tends not to automate things until they genuinely need automation. If creating one cluster with a few nodes and one database is all you need, full automation may be premature. But if you know you will eventually manage hundreds or thousands, starting early makes sense.
Julien reinforces this point with a memorable gym analogy: "You go to the gym, you see Arnold Schwarzenegger lifting 200 kilos from the ground and you say, he does it, I can do it. And then you pick up the little weight and find out that if you start with 200 kilos, you're gonna break your back." His point is that infrastructure as code tools get you up and running fast — that is what they are designed for — but day two operations always come knocking. The automation itself can become a burden if you are not careful about what you automate and when.
Infrastructure as Documentation and Source of Truth
Mattias describes one of his main reasons for using infrastructure as code: knowing what is actually running. He sees the codebase as documentation and as proof of the intended state of the environment — a way to verify that what he thinks is deployed matches what is actually in the cloud.
The hosts agree with that idea, but they also point out the tension between declared state and reality. If people still make manual changes in the cloud console, the code drifts away from what is actually running. Andrey notes the problem: if undocumented manual changes are not reflected back into code, the next infrastructure deployment could recreate the original broken state — "you're back to the fire state, basically."
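The drift problem the hosts describe can be made concrete by comparing declared and actual state as two dictionaries. The resource names and attributes below are invented; real tools such as `terraform plan` perform this comparison against live provider APIs.

```python
# Toy drift check: declared state (from code) vs. actual state (from
# the cloud). Names and attributes are illustrative only.

declared = {
    "web-sg":      {"port": 443},
    "bucket-logs": {"versioning": True},
}

actual = {
    "web-sg":      {"port": 443},
    "bucket-logs": {"versioning": False},  # flipped by hand in the console
    "debug-vm":    {"size": "large"},      # created manually, never codified
}

def detect_drift(declared, actual):
    """Report resources that differ from code, exist only by hand, or are gone."""
    drifted = [n for n in declared if n in actual and actual[n] != declared[n]]
    unmanaged = [n for n in actual if n not in declared]
    missing = [n for n in declared if n not in actual]
    return {"drifted": drifted, "unmanaged": unmanaged, "missing": missing}

report = detect_drift(declared, actual)
```

Andrey's "back to the fire state" warning corresponds to the `drifted` list: the next deployment would overwrite the manual fix with whatever the code still says.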
The Terraform Complexity Problem
Julien brings up Terraform as "the elephant in the room" and argues that it has become significantly more complex over time. He says the language started out as purely descriptive, but newer features in HCL2 — such as for loops, conditionals, and sequencing logic — have pushed it closer to a general-purpose programming language.
His concern is that this makes infrastructure definitions harder to read and reason about. Instead of simply describing desired state, users now have to mentally execute the code to understand what it will produce. Andrey agrees there is a legitimate need for this evolution — once a declarative setup grows large enough, you genuinely want loops and conditionals — but acknowledges it creates a tension between readability and expressiveness.
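Julien's point about mentally executing the code can be illustrated in Python rather than HCL: the loop-and-conditional pattern below mirrors what HCL2's `for_each` and `count` make possible. The environment names and sizing rules are invented; the takeaway is that the resource list is no longer visible by reading, only by running the logic in your head.

```python
# Python analogue of loop-driven resource definitions: you cannot see
# what gets created without evaluating the logic. Names are invented.

def render(environments, enable_backups):
    resources = {}
    for env in environments:
        # Conditional sizing: more instances only in prod.
        resources[f"app-{env}"] = {"instances": 3 if env == "prod" else 1}
        if enable_backups and env == "prod":
            # Conditionally created resource, like count = var.x ? 1 : 0.
            resources[f"backup-{env}"] = {"schedule": "daily"}
    return resources

plan = render(["dev", "staging", "prod"], enable_backups=True)
```

Reading the function tells you far less than reading four literal resource blocks would, which is exactly the readability cost Julien worries about.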
Declarative vs Imperative Approaches
The episode explores the difference between declarative and imperative models. Andrey explains that shell scripts are imperative — you tell the system exactly what to do, step by step — while a declarative tool lets a team state the desired outcome and rely on the platform to converge on that state.
Kubernetes is presented as a strong example of the declarative model. You submit manifests that declare what you want, and operators work to make reality match that intent — not necessarily immediately, but as soon as all requirements are fulfilled. Andrey suggests infrastructure tooling may evolve in this direction, with systems that continuously enforce declared state rather than only applying changes on demand. He gives a security example: an intruder stops AWS CloudTrail, but a reactive system — like a Kubernetes operator — detects the deviation and turns it back on automatically.
Julian adds that this is already happening. He mentions that a Kubernetes operator exists to bridge the gap to cloud APIs, allowing teams to define infrastructure resources inside Kubernetes YAML manifests and have the operator create them in the cloud. Google Cloud's Config Connector is a concrete example of this pattern, letting teams manage GCP resources as native Kubernetes objects.
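The operator behaviour the hosts describe reduces to a reconcile loop: observe actual state, compare it with declared intent, and repair any deviation. This sketch uses an in-memory `World` and invented setting names in place of real cloud APIs; Andrey's CloudTrail example maps to re-enabling a setting an intruder switched off.

```python
# Sketch of the reconcile loop behind the Kubernetes operator pattern.
# All names are illustrative stand-ins for real infrastructure.

desired = {"cloudtrail": "enabled", "flow-logs": "enabled"}

class World:
    """Stand-in for infrastructure that can be observed and changed."""
    def __init__(self, state):
        self.state = dict(state)

    def observe(self):
        return dict(self.state)

    def set(self, key, value):
        self.state[key] = value

def reconcile(world, desired):
    """One pass of the loop: repair every observed deviation from intent."""
    fixed = []
    observed = world.observe()
    for key, want in desired.items():
        if observed.get(key) != want:
            world.set(key, want)
            fixed.append(key)
    return fixed

# An "intruder" has disabled audit logging; the loop restores it.
world = World({"cloudtrail": "disabled", "flow-logs": "enabled"})
fixed = reconcile(world, desired)
```

A real operator runs this loop continuously, which is the difference Andrey highlights between enforcement and only applying changes on demand.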
Immutable Infrastructure and Emergency Changes
Andrey strongly advocates for immutable infrastructure: baking golden images using tools like Packer, deploying them as-is, and replacing systems rather than patching them in place. In that model, people should not be logging into systems or making changes manually. If you need a change, you burn a new image and roll it out. SSH should not even be enabled in a proper cloud setup.
Mattias raises a practical challenge: in real incidents, people with admin access to the cloud console often need to click a button to resolve the problem quickly. He describes his own experience — the team started with read-only production access but had to grant write access once on-call responsibilities kicked in. Andrey agrees that teams should not be dogmatic when production is on fire: "You go and do whatever it takes to put fire down." But those emergency fixes must be reflected back into code, and the team must know exactly what was changed. Otherwise, the next deployment may recreate the original problem.
Culture and Process Matter More Than Tools
One of the clearest themes in the conversation is that infrastructure as code is not just a tooling choice. Julien argues that it does not matter what technology you use if your process and culture are not aligned with security and best practices: "You can fix the technology only so much, but it's mainly about people."
Mattias describes a setup where Jenkins applies all CloudFormation changes, and every modification to the cloud goes through pull requests, code review, and change management — the same workflow used for application code. This means infrastructure changes become auditable, reviewable, and easier to track. Andrey sees this as applying development principles to infrastructure: version history, visibility into who changed what, the ability to ask someone why they made a change, and code review before changes are applied.
Guardrails for Manual Changes
Andrey shares a practical example from a previous engagement where developers had near-admin access to the AWS console and would create EC2 instances, S3 buckets, and other resources outside of Terraform or CloudFormation. To control cost and reduce unmanaged resources, the team built a system using specific tags generated by a Terraform module.
A Lambda function ran every night, scanned for resources without the required tags, posted a Slack notification saying "I found these, gonna delete them next day," and tagged them for deletion. The following night, anything still tagged for deletion was removed. This gave developers flexibility for experimentation — they could spin up resources manually and try things out — while preventing forgotten resources from becoming permanent, invisible infrastructure. It also helped keep costs under control.
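The two-night sweep is simple enough to sketch. The tag names, and the in-memory stand-ins for the AWS resource scan and the Slack notification, are assumptions for illustration rather than the team's actual implementation.

```python
# Sketch of the nightly tag sweep: first pass warns and marks untagged
# resources, second pass deletes anything still marked. Tag names are
# invented; real code would call the AWS and Slack APIs.

REQUIRED_TAG = "managed-by"
DELETE_TAG = "marked-for-deletion"

def nightly_sweep(resources, notify):
    """One nightly run: delete previously marked resources, mark new strays."""
    deleted = []
    for name, tags in list(resources.items()):
        if tags.get(DELETE_TAG):               # marked last night: delete now
            del resources[name]
            deleted.append(name)
        elif REQUIRED_TAG not in tags:         # untagged stray: warn and mark
            notify(f"I found {name}, gonna delete it next day")
            tags[DELETE_TAG] = True
    return deleted

messages = []
resources = {
    "prod-db":    {"managed-by": "terraform"},  # managed, never touched
    "scratch-vm": {},                           # created by hand in the console
}
night1 = nightly_sweep(resources, messages.append)  # warns, deletes nothing
night2 = nightly_sweep(resources, messages.append)  # removes the marked VM
```

The one-night grace period is the key design choice: developers keep their freedom to experiment, but nothing unmanaged survives silently.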
Tooling Is Only the Start
Julien stresses that adopting infrastructure as code does not automatically make systems reliable, immutable, or resilient. In his view, it is "just the beginning of the journey." He warns against the myth that infrastructure as code equals immutable infrastructure — you can absolutely build stateful, mutable systems with code if you choose to.
He also pushes back on the assumption that automation always saves time, admitting with self-awareness: "I automated a task, it took me two days to automate it, and I saved barely 10 seconds of my life." His advice is to measure the actual benefit rather than being seduced by the marketing brochure. Data will tell you more about a tool's real value than excitement will.
Abstraction, Code Generation, and Developer Experience
The hosts discuss the challenge of making infrastructure easy for developers who just want a database and a connection string, not a deep understanding of DBA work and security configuration. Andrey argues that abstracting best practices away from developers saves enormous organizational time, since developer time is expensive and holds back feature delivery.
He describes a third approach beyond declarative and imperative: code generators. Large companies with resources sometimes build internal generators that take simplified YAML inputs and output fully declarative specs. This creates another level of abstraction on top of existing tools, allowing developers to be productive without needing to understand infrastructure details. It is controversial — in some ways it takes power away from people — but it can dramatically simplify the developer experience.
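A minimal version of such a generator might look like the sketch below: the developer supplies only a name, and organizational best practices are filled in. The output shape is invented; a real generator would emit Terraform or CloudFormation rather than a Python dict.

```python
# Toy code generator: a simplified developer-facing request expands into
# a fuller declarative spec with best practices baked in. The field names
# are illustrative assumptions.

def generate_database_spec(request):
    """Expand a simplified request into a hardened resource definition."""
    return {
        "resource": "database",
        "name": request["name"],
        "engine": request.get("engine", "postgres"),
        # Best practices the developer never has to think about:
        "encrypted": True,
        "backups": {"retention_days": 30},
        "network": {"public_access": False},
    }

spec = generate_database_spec({"name": "orders"})
```

The trade-off Andrey names is visible even here: the developer gains simplicity, but also loses the power to change anything the generator hard-codes.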
Pulumi vs Terraform and Community Support
Andrey introduces Pulumi as an interesting new branch of infrastructure tooling that lets teams describe infrastructure in general-purpose languages like TypeScript, Python, or Go instead of domain-specific languages like HCL. He notes that while it feels familiar to developers — you stay in your comfort zone — you still need to learn a new DSL embedded in that language. It is "not entirely like you just described infrastructure in the language you know."
Julien says he tried Pulumi and found it appealing for developers who want consistency across their codebase. But he remains cautious, arguing that "code is a liability" — referencing Kelsey Hightower's satirical GitHub project nocode ("write nothing, deploy nowhere, run securely") to make the point that less code means fewer problems. For beginners, Julien recommends starting with Terraform or the native tooling from cloud providers, mainly because the community is larger, tutorials are more abundant, and there are meetup groups where people can learn from each other. His advice is pragmatic: "Just make sure that you ditch Terraform the minute it gets in your way."
Start With the Problem, Not the Tool
Andrey repeatedly returns to the same question: what problem is being solved? He argues that teams should choose tools based on business needs and existing team capabilities, not because a tool is fashionable. "A lot of people and developers, they like shiny tools — and there's nothing wrong about that — but you always have to ask, what is the problem we are solving?"
Andrey's framing connects tool selection to team dynamics: if your team already has knowledge of a particular tool, relearning a new one just because it is trendy does not make sense. What you need is not a fancy tool but to deliver business value with the capabilities you have.
Migration, Legacy, and Incremental Adoption
The hosts acknowledge that many teams are not starting from a clean slate. Andrey points out that legacy infrastructure exists for a reason: it helped the business survive and grow. As Mattias puts it bluntly: "Legacy pays the bills."
For organizations with years of manually built systems and hybrid environments, Andrey suggests doing value stream mapping to identify the biggest pain point and tackling that first. A greenfield project can serve as a success story to demonstrate the approach before trying to transform everything. He emphasizes that coming into an organization with a shiny idea and telling people "whatever you did before was crap" is a sure way to lose allies. Technology boils down to working with people — the tools are fine, but they do not replace the people running them.
The Templating Dilemma
Mattias raises a specific frustration with infrastructure as code: templating. He likes looking at his Git repository and seeing exactly what is running, but heavy use of variables and templates means he sees placeholder names instead of actual values. This tension — between reusable, DRY templates and readable, concrete definitions — is a real challenge that the hosts acknowledge without a clean resolution.
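Mattias's complaint can be shown with Python's standard `string.Template`: the version stored in Git is reusable but opaque, while only the rendered form shows the concrete values. The variable names and values below are illustrative.

```python
# The templating dilemma in miniature: what Git shows vs. what runs.
from string import Template

template = Template("instance $env-web size=$size replicas=$replicas")
values = {"env": "prod", "size": "m5.large", "replicas": "3"}

stored_in_git = template.template              # placeholders: DRY but opaque
actually_running = template.substitute(values)  # concrete but not reusable
```

Neither form alone answers Mattias's question "what is actually running?", which is why the hosts leave the tension unresolved.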
Resilience and Recovery
Near the end of the episode, Andrey gives a concrete example of losing a Kubernetes cluster in production. Because the environment had been defined as code, the team was able to recreate it and recover in about one to two hours. Some things that were not properly documented slowed them down; with complete documentation the recovery could have been as fast as 15 to 20 minutes — mostly just waiting for AWS to provision the resources after the API calls.
Julien adds context to this: he argues that even with infrastructure as code, recreating a Kubernetes cluster and failing over traffic while maintaining service is genuinely hard. The concept is sound, but having the safety net to actually do it takes time, practice, and a lot of work. His advice for building confidence is to adopt the mentality of immutable infrastructure and get into the habit of regularly recreating things and practicing failovers.
Final Advice
Andrey recommends education first. He specifically mentions the book Infrastructure as Code by Kief Morris (published by O'Reilly, now in its third edition) as a strong foundation. His broader advice: understand the domain, define the problem clearly, ask yourself what outcome you want to deliver for the business, and let the answers to those questions guide your tool decisions.
Julien's closing thought is that in a large organization, a dedicated infrastructure team using infrastructure as code can manage everything — on-prem or cloud — with a single workflow. That team can abstract complexity so developers do not need to learn Terraform, CloudFormation, or any other tool. The specialization pays off by reducing onboarding friction and letting each team focus on what they do best.
Highlights