Remember when the biggest debate in your sprint planning was whether that story was a 5 or an 8? Good times. Simpler times. Times that are about to get very, very weird.
Agentic AI (the kind that doesn’t just autocomplete your for-loops but actually goes off, reads your Jira tickets, writes the code, runs the tests, and opens a pull request while you’re still arguing about acceptance criteria) is changing how software gets built. And that means the rituals we’ve built around building software are due for a serious shake-up.
Let’s look at a few beloved Scrum ceremonies and development practices, appreciate them for what they are, and then talk about what’s coming next.
Sprint Planning: From Poker to… Prompt Engineering?
How it works today: The team gathers (in person or, more realistically, in a Teams call where three people are muted and one is definitely making coffee). The product owner presents the backlog. The team discusses. Story points get assigned through planning poker, a process that somehow combines democratic voting with the psychological dynamics of a high-stakes card game.
The whole thing takes one to two hours, during which at least one developer will utter the phrase “it depends on what we mean by done.”
How it’s changing: When AI agents can implement entire features from spec to PR, the nature of planning shifts dramatically. As CIO reports, stories can now be much larger in scope because “AI agents can perform more work in less time than humans.” That 5-pointer? It might become a 40-pointer, but the agent handles it in an afternoon.
Sprint planning starts to look less like “how much can we squeeze into two weeks” and more like “what are the right problems to solve next.” Some are already calling this shift from Sprint Planning to “Intent Design”: defining goals, constraints, and guardrails rather than decomposing everything into bite-sized tasks.
The irony? You’re now spending planning time writing really good prompts and specs instead of really good user stories. Meet the new boss, same as the old boss.
Estimation and Refinement: The Robots Don’t Need Fibonacci
How it works today: Refinement is where the team takes vague requirements and turns them into slightly less vague requirements. You discuss edge cases, draw diagrams on virtual whiteboards, and eventually land on something everyone can pretend to agree on.
Estimation follows, typically using story points, a unit of measurement so abstract that no two teams in the world use it the same way. The entire ritual is built on the premise that humans are terrible at predicting how long things take (we are) but that relative sizing helps (it does, sort of).
How it’s changing: AI is getting genuinely good at analyzing past sprints, commit histories, and code complexity to produce effort estimates that are more data-driven than our gut feelings ever were. Some teams report estimation meetings shrinking by up to 60%.
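The mechanics behind data-driven estimates needn’t be exotic. Here’s a minimal sketch of the idea, with an entirely hypothetical data model (a `files_touched` scope proxy mined from merged PRs, actuals from time tracking) and plain least-squares fitting, not any real Jira or estimation-tool API:

```python
from dataclasses import dataclass

@dataclass
class CompletedItem:
    files_touched: int   # scope proxy mined from the merged PR
    hours_spent: float   # actual effort, from time tracking

def fit_rate(history: list[CompletedItem]) -> float:
    """Least-squares rate (hours per file touched), line through the origin."""
    num = sum(i.files_touched * i.hours_spent for i in history)
    den = sum(i.files_touched ** 2 for i in history)
    return num / den

def estimate_hours(rate: float, expected_files: int) -> float:
    """Project the historical rate onto the next item's expected scope."""
    return rate * expected_files

# Toy history: three finished items and their actuals.
history = [CompletedItem(3, 6.0), CompletedItem(5, 11.0), CompletedItem(8, 15.5)]
rate = fit_rate(history)
print(round(estimate_hours(rate, 4), 1))  # → 8.0 hours for a 4-file item
```

A real system would use far richer features (diff churn, test coverage, reviewer load), but the shape is the same: regress actuals against measurable scope instead of polling the room for Fibonacci numbers.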
But here’s the deeper shift: when an AI agent is the one building the feature, estimation becomes less about “how long will this take a human” and more about “how good is this spec.” A vague requirement that a senior developer could muscle through might completely trip up an AI agent. Refinement sessions become spec quality reviews, making sure the intent is precise enough for an autonomous agent to execute correctly. Storey’s Triple Debt Model calls this intent debt: when goals, constraints, and specifications are poorly articulated, neither humans nor AI agents can work effectively — and the paper advocates for intent-first workflows as the antidote.
GitHub’s Spec Kit is already formalizing this idea: specs as the central artifact that drive AI implementation, checklists, and task breakdowns. Your refinement session just became a spec-writing workshop.
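Part of that spec-quality review can even be automated. A hypothetical lint pass illustrates the flavor; the section names and vague-phrase list below are illustrative inventions, not Spec Kit’s actual schema:

```python
# Required sections and red-flag phrases are assumptions for this sketch.
REQUIRED_SECTIONS = ("## Goal", "## Constraints", "## Acceptance Criteria")
VAGUE_PHRASES = ("as appropriate", "handle errors properly", "user-friendly")

def lint_spec(spec: str) -> list[str]:
    """Return human-readable problems that would trip up an autonomous agent."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if section not in spec:
            problems.append(f"missing section: {section}")
    for phrase in VAGUE_PHRASES:
        if phrase in spec.lower():
            problems.append(f"vague phrase: {phrase!r}")
    return problems

spec = """## Goal
Let users export reports as CSV.

## Acceptance Criteria
- Export completes in under 5 seconds for 10k rows.
- Handle errors properly.
"""
print(lint_spec(spec))
# → ["missing section: ## Constraints", "vague phrase: 'handle errors properly'"]
```

A senior developer would shrug at “handle errors properly” and do the right thing; an agent will happily implement its own interpretation. Catching that before generation is the whole point of the refinement-as-spec-review shift.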
The Iteration Cycle: Two Weeks Feels Like an Eternity
How it works today: The classic two-week sprint. You plan, you build, you demo, you retro. It’s a rhythm. It’s comforting. It’s also a tempo that was designed around the speed at which humans can build, test, and deliver working software.
How it’s changing: When agents can go from ticket to working code in hours rather than days, the two-week sprint starts to feel like sending a letter by horse when you have a telephone. Steve Jones from Capgemini provocatively argued that “Agentic SDLCs are too fast for Agile,” that the traditional cadence simply doesn’t match the velocity AI enables.
Others disagree. Sonya Siderova from Nave offers a more nuanced take: “Agile isn’t dead. It’s optimizing a constraint that moved.” The bottleneck isn’t code production anymore; it’s decision-making and validation. You can generate ten features in an afternoon, but can you review them all properly? Can you validate they actually solve user problems?
The iteration cycle doesn’t disappear; it just shifts what you’re iterating on. Less “build, test, ship.” More “specify, generate, validate, refine the spec, regenerate.” The loop gets tighter, but the human judgment inside it becomes more critical, not less.
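That tighter loop can be sketched in a few lines. Everything here is a stand-in: `generate`, `validate`, and `refine` are hypothetical callables representing the agent, the validation gate, and the humans revising intent; nothing maps to a real framework:

```python
from typing import Callable

def intent_loop(
    spec: str,
    generate: Callable[[str], str],            # agent: spec -> implementation
    validate: Callable[[str], list[str]],      # tests/humans: implementation -> defects
    refine: Callable[[str, list[str]], str],   # humans: spec + defects -> better spec
    max_rounds: int = 5,
) -> str:
    """Specify, generate, validate, refine the spec, regenerate."""
    for _ in range(max_rounds):
        code = generate(spec)
        defects = validate(code)
        if not defects:
            return code                  # validated implementation ships
        spec = refine(spec, defects)     # fix the intent, not the generated code
    raise RuntimeError("spec never converged; escalate to the team")

# Toy run: the "agent" echoes the spec; validation demands timeout handling.
result = intent_loop(
    "retry the request",
    generate=lambda s: f"code for: {s}",
    validate=lambda c: [] if "timeout" in c else ["no timeout handling"],
    refine=lambda s, d: s + " with a timeout",
)
print(result)  # → code for: retry the request with a timeout
```

Notice where the human effort lives: in `validate` and `refine`, exactly the decision-making and validation bottleneck Siderova describes. The agent’s `generate` step is the cheap part.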
Feature Discovery: Everyone’s a Product Manager Now
How it works today: Product discovery involves user research, stakeholder interviews, prototype testing, and a product manager synthesizing all of that into a coherent roadmap. Developers typically enter the picture after decisions about what to build have already been made.
How it’s changing: This is where it gets interesting. When the cost of building a prototype drops to near-zero (just describe it and an agent builds it), the entire discovery process can become more experimental. Instead of debating whether Feature A or Feature B is the right call, you can… just build both. Test them. Throw away the loser.
As CIO notes, agentic engineering is “transforming all members of the development team into product managers since their main purpose is to specify what a software product should do rather than building the code necessary to do it.”
This sounds liberating until you realize that being a product manager is really hard. Knowing what to build is the hard part; it always has been. AI just makes the gap between “having a good idea” and “having working software” much shorter. That’s powerful, but it doesn’t magically make you better at having good ideas.
The Uncomfortable Middle Ground
We’re in an awkward transition phase. Kent Beck (yes, that Kent Beck) draws a useful distinction between “augmented coding” (maintaining engineering rigor while letting AI handle implementation) and undisciplined “vibe coding” (letting AI generate stuff and hoping for the best). One of these approaches leads to maintainable software. The other leads to tech debt that would make your current codebase look like a pristine museum exhibit.
Margaret-Anne Storey’s Triple Debt Model puts a finer point on the risks. She identifies three forms of debt that accumulate when teams lean on AI without sufficient rigor: cognitive debt (the erosion of team understanding when AI generates code faster than humans can build the mental models needed to safely change it), intent debt (missing rationale and specifications that neither humans nor agents can recover later), and cognitive surrender (adopting AI outputs with minimal scrutiny, which inflates the team’s confidence even when the AI is wrong). Undisciplined vibe coding triggers all three simultaneously.
Meanwhile, Forrester reports that 95% of professionals still affirm Agile’s relevance, but nearly half are already integrating generative AI into their agile practices. The ceremonies aren’t dying; they’re evolving.
So What Should You Actually Do?
If you’re a team lead or scrum master reading this and feeling mildly panicked, some practical advice:
Invest in spec-writing skills. The quality of your specifications is about to matter a lot more than it used to. Vague requirements that smart developers could “figure out” won’t cut it when agents are doing the building. Storey’s research on intent debt warns that this compounds over time: “each generation of AI-assisted development not only carries the debt forward but compounds it.” Investing in spec-writing is urgent, not optional.
Shorten your feedback loops. If you’re not already doing continuous deployment, this is your nudge. The value of a two-week demo cycle drops when you could be validating daily.
Focus planning on outcomes, not outputs. Stop planning tasks and start planning experiments. What hypothesis are you testing? What will you learn?
Keep humans in the validation loop. AI can build fast, but it can also build the wrong thing fast. The research is clear: “AI agents generally have a higher likelihood of implementing code that causes end-user problems.” Storey’s work on cognitive debt and cognitive surrender adds another warning: teams experience inflated confidence (“the team feels they understand the system better than they do”), which is exactly why human review is critical even when things seem to be working fine.
Stay curious, stay skeptical. The hype cycle is real. Not every ceremony needs to be “disrupted.” Some things (like talking to your teammates about what you’re building and why) remain valuable regardless of who (or what) is writing the code.
The future of software development isn’t about AI replacing developers. It’s about the entire team getting much better at knowing what to build, and letting agents handle the how. That’s a real paradigm shift, and it’s happening whether your sprint retro is ready for it or not.
Further Reading:
- 5 Ways Agentic Engineering Transforms Agile Practices — CIO
- How Agentic AI Will Reshape Engineering Workflows in 2026 — CIO
- Does AI Make the Agile Manifesto Obsolete? — InfoQ
- AI-Enhanced Sprint Planning — Scrum.org
- The Agile AI Manifesto — Scrum.org
- Toward Agentic Software Project Management — arXiv
- From Technical Debt to Cognitive and Intent Debt — Margaret-Anne Storey (arXiv)