Sprint Zero on XM Cloud projects used to eat up two to three weeks of my time—crawling the legacy site, drafting requirements, building the backlog, documenting architecture decisions. Most of that work is necessary but tedious: reading RFPs, cross-referencing stakeholder notes, sizing stories, keeping documents in sync.
Over the past year I’ve been offloading chunks of that work to AI agents. Not as some futuristic experiment, but because I got tired of the grind. The agents handle the synthesis and formatting; I focus on the parts that actually need judgment—architecture trade-offs, stakeholder conversations, risk assessment.
Here’s what the pipeline looks like now:
## Building a ground truth workspace
Before any planning happens, I dump everything into a private knowledge base—RFPs, stakeholder notes, past audits, CRM transcripts, whatever I can get my hands on. The goal is one place where I (and the agents) can query project context without getting hallucinated answers.
I use NotebookLM for the conversational interface—it’s good for quick summaries and the “Audio Overview” feature is surprisingly useful for exec briefings. For anything that needs proper access control or will be used in production prompts, I mirror the same docs into Azure OpenAI On Your Data with Azure AI Search underneath. That setup enforces citations and keeps things grounded.
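When I want a scripted check that the grounded setup actually answers with citations before wiring it into production prompts, a minimal sketch like this does the job. It assumes the standard "On Your Data" request shape for Azure OpenAI chat completions; the deployment name, index name, and environment variable names are placeholders.

```typescript
// Minimal sketch: ask a grounded question against the project knowledge base.
// The deployment name, index name, and env var names are placeholders.
const AOAI_ENDPOINT = process.env.AOAI_ENDPOINT!;     // e.g. https://my-aoai.openai.azure.com
const SEARCH_ENDPOINT = process.env.SEARCH_ENDPOINT!; // e.g. https://my-search.search.windows.net

async function askKnowledgeBase(question: string): Promise<void> {
  const url = `${AOAI_ENDPOINT}/openai/deployments/gpt-4o/chat/completions?api-version=2024-02-01`;
  const res = await fetch(url, {
    method: "POST",
    headers: { "api-key": process.env.AOAI_KEY!, "Content-Type": "application/json" },
    body: JSON.stringify({
      messages: [
        { role: "system", content: "Answer only from the retrieved project documents and cite them." },
        { role: "user", content: question },
      ],
      // "On Your Data": route retrieval through the Azure AI Search index.
      data_sources: [
        {
          type: "azure_search",
          parameters: {
            endpoint: SEARCH_ENDPOINT,
            index_name: "sprint-zero-kb",
            authentication: { type: "api_key", key: process.env.SEARCH_KEY! },
          },
        },
      ],
    }),
  });
  const data: any = await res.json();
  const message = data.choices?.[0]?.message;
  console.log(message?.content);
  // Citation placement varies by API version; recent versions return them here:
  console.log(message?.context?.citations?.map((c: { title?: string }) => c.title));
}

askKnowledgeBase("Which KPIs did the client prioritize in the RFP?");
```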
The other thing I set up early is a simple taxonomy spreadsheet—personas, journeys, KPIs. Nothing fancy, but it gives every agent prompt a consistent vocabulary.
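For reference, this is roughly the shape I export that spreadsheet to so scripts and prompts share the same IDs. The field names are just my convention, nothing XM Cloud requires.

```typescript
// Hypothetical shape for the taxonomy once exported from the spreadsheet, so
// scripts and prompts reference the same persona/journey/KPI names verbatim.
interface ProjectTaxonomy {
  personas: { id: string; name: string; primaryGoal: string }[];
  journeys: { id: string; name: string; personaIds: string[] }[];
  kpis: { id: string; name: string; target?: string }[];
}

const taxonomy: ProjectTaxonomy = {
  personas: [{ id: "p1", name: "Returning customer", primaryGoal: "Reorder quickly" }],
  journeys: [{ id: "j1", name: "Repeat purchase", personaIds: ["p1"] }],
  kpis: [{ id: "k1", name: "Checkout conversion", target: "+10%" }],
};
```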
**What comes out:** a queryable knowledge base, an index definition, and a taxonomy that anchors everything else.
## Recon: knowing the site better than the client
By day two, I want to know the legacy site inside out. That used to mean hours of clicking around and taking notes. Now I run a Playwright crawler that captures URLs, headings, ARIA roles, and component patterns. The output is JSON that feeds directly into agent prompts.
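A stripped-down version of that crawler, assuming Playwright is installed and you seed it with a URL list (the real one also follows internal links, throttles requests, and captures DOM snapshots):

```typescript
// Minimal crawl sketch: capture titles, headings, landmarks, and a rough
// component fingerprint per page, then write JSON for the agent prompts.
import { chromium } from "playwright";
import { writeFileSync } from "node:fs";

async function crawl(urls: string[]): Promise<void> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  const results: Record<string, unknown>[] = [];

  for (const url of urls) {
    await page.goto(url, { waitUntil: "domcontentloaded" });
    results.push({
      url,
      title: await page.title(),
      headings: await page.locator("h1, h2, h3").allTextContents(),
      ariaLandmarks: await page
        .locator("[role], main, nav, header, footer, aside")
        .evaluateAll((els) => els.map((el) => el.getAttribute("role") ?? el.tagName.toLowerCase())),
      // Rough component fingerprint: distinct class names seen on the page.
      classNames: await page
        .locator("[class]")
        .evaluateAll((els) => [...new Set(els.flatMap((el) => [...el.classList]))].slice(0, 200)),
    });
  }

  await browser.close();
  writeFileSync("crawl.json", JSON.stringify(results, null, 2));
}

crawl(["https://www.example.com/"]);
```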
I hand that crawl data to Claude Code with a prompt that classifies pages, flags outdated patterns, and surfaces content gaps. It’s not perfect—sometimes it misses context that’s obvious to a human—but it’s a solid first pass that I can validate with stakeholders.
**Prompt I use for site analysis:**
<context>
You are a senior Sitecore solutions architect analyzing a legacy website for migration to XM Cloud. You have access to Playwright crawl data (JSON) containing URLs, page titles, headings, meta tags, ARIA landmarks, and DOM structure snapshots.
</context>
<task>
Analyze the crawl data and produce a comprehensive site audit for Sprint Zero planning.
</task>
<crawl_data>
[Attach or reference the Playwright JSON output]
</crawl_data>
<instructions>
1. **Classify pages by template type:**
- Identify patterns (home, landing, article, product, listing, form, utility)
- Note URL structure conventions
- Flag orphan pages or inconsistent patterns
2. **Inventory component patterns:**
- List recurring UI patterns (heroes, cards, accordions, carousels, CTAs)
- Estimate frequency across the site
- Note variations that may need consolidation
3. **Assess content structure:**
- Identify content hierarchies and navigation patterns
- Flag deeply nested or flat structures that may need IA work
- Note any personalization or audience targeting visible in markup
4. **Surface technical concerns:**
- Outdated patterns (inline styles, table layouts, legacy JS)
- Accessibility gaps (missing ARIA, heading hierarchy issues)
- SEO issues (missing meta, duplicate titles, broken structured data)
- Performance red flags (large DOM, render-blocking patterns)
5. **Identify unknowns for stakeholder validation:**
- Content that appears dynamic but source is unclear
- Gated or authenticated sections
- Third-party integrations visible in markup
</instructions>
<output_format>
## Site Audit Summary
### Page Classification
| Template Type | Count | Example URLs | Notes |
|---------------|-------|--------------|-------|
### Component Inventory
| Component Pattern | Frequency | Variations | XM Cloud Mapping |
|-------------------|-----------|------------|------------------|
### Content Structure Assessment
[Narrative summary with specific findings]
### Technical Debt & Concerns
| Issue | Severity | Affected Pages | Remediation Notes |
|-------|----------|----------------|-------------------|
### Stakeholder Questions
1. [Specific question requiring client input]
2. [...]
</output_format>
In parallel, I document whatever I can find about the existing marketing stack: CDP rules, personalization setup, current Sitecore content structure. The agents need that context to reason about integrations later.
**What comes out:** annotated sitemap, component frequency report, notes on what needs stakeholder confirmation.
## From crawl data to requirements
This is where the crawl data turns into something useful. I feed the Playwright JSON into an agent prompt that generates component specs—one markdown file per component with props, datasource shape, analytics events, accessibility notes. I align the naming with XM Cloud’s component terminology so there’s no translation needed later.
**Prompt I use for component spec generation:**
<context>
You are a Sitecore XM Cloud technical analyst creating component specifications from site audit data. Your output will feed directly into the development backlog and XM Cloud template design.
Reference documentation:
- XM Cloud Components: https://doc.sitecore.com/xmc/en/users/xm-cloud/components.html
- Content SDK field types: https://doc.sitecore.com/xmc/en/developers/content-sdk/index-en.html
</context>
<task>
Generate a component specification document for each unique UI pattern identified in the site audit.
</task>
<inputs>
- Site audit component inventory (from recon phase)
- Sample URLs for each component pattern
- Project taxonomy (personas, journeys)
</inputs>
<instructions>
For each component pattern:
1. **Define the component:**
- Name (PascalCase, XM Cloud-friendly)
- Category (Hero, Navigation, Content Block, Card, Form, Media, etc.)
- Purpose (one sentence)
2. **Specify fields for XM Cloud templates:**
- Map visible content to Sitecore field types
- Distinguish datasource fields from rendering parameters
- Note required vs optional fields
- Include validation hints for authors
3. **Document behavior:**
- Interactive states (hover, active, expanded)
- Responsive breakpoints
- Loading/empty states
- Accessibility requirements (ARIA, keyboard nav)
4. **Capture analytics requirements:**
- Events to track (impressions, clicks, conversions)
- Data attributes needed for tagging
5. **Note variants:**
- Visual variants (themes, sizes)
- Functional variants that might be separate components
</instructions>
<output_format>
# [ComponentName]
## Overview
- **Category:** [Category]
- **Purpose:** [One sentence description]
- **Source URLs:** [Sample URLs from crawl]
## Fields
| Field Name | Type | Required | Notes |
|------------|------|----------|-------|
| Title | Single-Line Text | Yes | Max 60 chars for SEO |
| Body | Rich Text | No | Supports links, lists |
| Image | Image | Yes | 16:9 aspect ratio |
| CTA | General Link | No | Button or text link |
## Rendering Parameters
| Parameter | Type | Default | Options |
|-----------|------|---------|---------|
| Theme | Droplist | "light" | light, dark, brand |
## Behavior
- **Responsive:** [Breakpoint behavior]
- **Accessibility:** [ARIA requirements]
- **States:** [Interactive states]
## Analytics
- **Impression:** Fire when component enters viewport
- **Click:** Track CTA clicks with destination URL
## Variants
- [List any visual or functional variants]
## Notes
- [Implementation considerations, edge cases]
</output_format>
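To turn that prompt into one markdown file per component, a loop like this is enough. It's a sketch assuming the Anthropic TypeScript SDK, a component inventory JSON from the recon step, and the prompt above saved to disk; the model name and file paths are placeholders.

```typescript
// Sketch of the spec-generation loop: one agent call and one markdown file
// per component pattern from the audit.
import Anthropic from "@anthropic-ai/sdk";
import { readFileSync, writeFileSync, mkdirSync } from "node:fs";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const promptTemplate = readFileSync("prompts/component-spec.md", "utf8");
const inventory: { name: string; sampleUrls: string[] }[] = JSON.parse(
  readFileSync("artifacts/component-inventory.json", "utf8"),
);

async function generateSpecs(): Promise<void> {
  mkdirSync("specs", { recursive: true });
  for (const component of inventory) {
    const message = await client.messages.create({
      model: "claude-sonnet-4-5", // placeholder model name
      max_tokens: 4000,
      messages: [
        {
          role: "user",
          content: `${promptTemplate}\n\n<component>\n${JSON.stringify(component, null, 2)}\n</component>`,
        },
      ],
    });
    // Keep only the text blocks from the response.
    const text = message.content.map((block) => (block.type === "text" ? block.text : "")).join("");
    writeFileSync(`specs/${component.name}.md`, text);
  }
}

generateSpecs();
```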
For epics and acceptance criteria, I use a different prompt persona—more BA-flavored—that works through each persona and journey from the taxonomy. The key is making it cite sources from the knowledge base. That way I can trace every requirement back to something the client actually said or wrote.
For IA diagrams I usually use Gemini—it’s fast for visual drafts. Export as SVG, drop into the deck, move on.
**What comes out:** component matrix in the repo, IA diagrams, draft epics with acceptance criteria.
## Building the backlog
Epics are nice, but you can’t price epics. I need stories with sizes.
I push the epics into Jira (or Azure DevOps) via their APIs and have an agent break them into user stories, tagging each one with a swim lane—composable build, Content Hub work, search integration, whatever categories make sense for the project. This keeps reporting clean later.
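As a rough illustration, creating one generated story in Jira Cloud via the standard REST API looks something like this; the site URL, project key, and label scheme are placeholders, and Azure DevOps has an equivalent work item API.

```typescript
// Sketch: push one agent-generated story into Jira Cloud, tagged with its
// swim lane as a label. Auth uses an Atlassian API token.
const JIRA_BASE = "https://your-site.atlassian.net";
const auth = Buffer.from(`${process.env.JIRA_EMAIL}:${process.env.JIRA_API_TOKEN}`).toString("base64");

async function createStory(summary: string, description: string, swimLane: string): Promise<string> {
  const res = await fetch(`${JIRA_BASE}/rest/api/3/issue`, {
    method: "POST",
    headers: { Authorization: `Basic ${auth}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      fields: {
        project: { key: "XMC" },       // placeholder project key
        issuetype: { name: "Story" },
        summary,
        labels: [swimLane],
        // Jira Cloud expects Atlassian Document Format for descriptions.
        description: {
          type: "doc",
          version: 1,
          content: [{ type: "paragraph", content: [{ type: "text", text: description }] }],
        },
      },
    }),
  });
  if (!res.ok) throw new Error(`Jira rejected the story: ${res.status}`);
  const created = (await res.json()) as { key: string };
  return created.key;
}

createStory("As an author, I want a Hero component so that...", "Acceptance criteria: ...", "xm-cloud-build");
```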
**Prompt I use for story breakdown and sizing:**
<context>
You are a Sitecore delivery lead breaking epics into estimable user stories for an XM Cloud project. You understand:
- XM Cloud development phases (content modeling, component build, integration, testing)
- Typical complexity drivers (custom integrations, personalization, content migration)
- Story sizing conventions (S/M/L mapped to approximate point ranges)
Sizing heuristics:
- **Small (S, 1-3 pts):** Single component, no integration, clear spec
- **Medium (M, 5-8 pts):** Multiple components OR simple integration OR template changes
- **Large (L, 13+ pts):** Complex integration, new patterns, significant unknowns → break down further
</context>
<task>
Break each epic into implementable user stories with preliminary size estimates.
</task>
<epic>
[Paste epic title and description]
</epic>
<instructions>
1. **Decompose the epic** into stories that can be completed in one sprint
2. **Assign swim lanes** to each story:
- `xm-cloud-build`: Templates, components, serialization
- `head-development`: Next.js components, styling, client-side logic
- `content-hub`: DAM/CMP integration, content modeling
- `search`: Sitecore Search widgets, indexing
- `personalization`: CDP/Personalize rules, testing
- `integration`: External APIs, data sync
- `content-migration`: Content entry, migration scripts
- `infrastructure`: Environments, CI/CD, monitoring
3. **Size each story** using the heuristics above
4. **Flag risks and dependencies** that affect estimates
5. **Identify stories needing stakeholder input** before sizing
</instructions>
<output_format>
## Epic: [Epic Title]
### Stories
| ID | Story Title | Swim Lane | Size | Dependencies | Notes |
|----|-------------|-----------|------|--------------|-------|
| S-001 | [As a... I want... so that...] | xm-cloud-build | M | None | |
| S-002 | [Story] | head-development | S | S-001 | Needs design sign-off |
### Risks & Unknowns
| Risk | Impact on Estimate | Mitigation |
|------|-------------------|------------|
| [Risk] | Could inflate S-002 to L | [Action] |
### Stories Requiring Clarification
- S-003: Need to confirm [specific question] before sizing
### Epic Total
- Story count: [N]
- Estimated range: [X-Y] points
- Confidence: High / Medium / Low
</output_format>
For sizing, the agent proposes S/M/L based on heuristics I’ve tuned over time. I review anything that touches compliance or complex integrations—those are where AI estimates go sideways. Dependencies go into a RAID log that both presales and delivery can see.
**What comes out:** a populated backlog with linked epics, rough sizes, and a RAID log.
## Technical runway: environments and architecture
Before I can price anything, I need to nail down the infrastructure story. No surprises later.
I map out the Experience Edge setup—which endpoints for delivery vs preview, where API keys live in each environment, rotation schedule. The Experience Edge best practices doc is the reference here. I have an agent maintain a table of all this, but I sanity-check it manually.
For the head, I document the Next.js App Router + Content SDK pattern we’ll use. The Content SDK guide covers the basics, but I also capture project-specific constraints: CDN setup, identity provider, Content Hub integrations, Search JS SDK usage if it’s in scope.
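The smoke test I keep next to the environment/key matrix is a bare GraphQL call against the Experience Edge delivery endpoint. This is a sketch, not project code: the item path is a placeholder, and the auth header assumes an Edge API key; projects on the newer context ID setup will wire credentials differently.

```typescript
// Smoke test against Experience Edge delivery GraphQL; item path and env var
// are placeholders, and the auth header assumes an Edge API key (sc_apikey).
const EDGE_URL = "https://edge.sitecore.cloud/api/graphql/v1";

async function checkEdge(): Promise<void> {
  const res = await fetch(EDGE_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json", sc_apikey: process.env.EDGE_DELIVERY_KEY! },
    body: JSON.stringify({
      query: `
        query SmokeTest($path: String!, $language: String!) {
          item(path: $path, language: $language) {
            name
            path
          }
        }`,
      variables: { path: "/sitecore/content/MySite/Home", language: "en" },
    }),
  });
  const body = (await res.json()) as {
    data?: { item?: { name: string; path: string } | null };
    errors?: unknown;
  };
  if (body.errors) throw new Error(JSON.stringify(body.errors));
  console.log(body.data?.item); // null means the path or language is wrong
}

checkEdge();
```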
**What comes out:** architecture diagram, environment/key matrix, technical assumptions doc.
## Serialization and governance
This is the part that bites you if you skip it. Serialization boundaries need to be clear before anyone starts pulling content.
I draft the SCS module structure with an agent that references the SCS overview—which items go in which module, what `allowedPushOperations` to set, whether deletes are ever okay. Then I have a DevOps-flavored prompt write the CLI scripts for `ser pull`, `ser diff`, and `ser push` with the right safeguards from the CLI reference.
**Prompt I use for SCS module design:**
<context>
You are a Sitecore DevOps engineer designing the serialization strategy for an XM Cloud project. You understand:
- SCS module boundaries and include/exclude patterns
- The difference between Foundation, Feature, and Project layers
- Safe vs dangerous push operations (CreateUpdateAndDelete vs CreateAndUpdate)
- Multi-environment serialization workflows
Reference: https://doc.sitecore.com/xmc/en/developers/xm-cloud/sitecore-content-serialization.html
</context>
<task>
Design the SCS module structure and serialization governance for the project.
</task>
<inputs>
- Component inventory (from requirements phase)
- Template hierarchy plan
- Environment list (dev, staging, prod)
- Team structure (who touches what)
</inputs>
<instructions>
1. **Define module boundaries:**
- Group related items that change together
- Separate volatile content from stable schema
- Consider team ownership and deployment frequency
2. **Set appropriate push operations:**
- `CreateAndUpdate` for most modules (safe default)
- `CreateUpdateAndDelete` only where cleanup is needed AND items are dev-owned
- Never allow deletes on content-heavy paths
3. **Design include/exclude rules:**
- Include paths for each module
- Exclude patterns for generated or volatile items
- Handle language versions appropriately
4. **Document the workflow:**
- When to pull vs push
- Conflict resolution process
- Environment promotion path
</instructions>
<output_format>
## SCS Module Structure
### Module Overview
| Module | Layer | Path | Push Operations | Owner |
|--------|-------|------|-----------------|-------|
| Foundation.Content | Foundation | /sitecore/templates/Foundation | CreateAndUpdate | Platform team |
| Feature.Navigation | Feature | /sitecore/templates/Feature/Navigation | CreateAndUpdate | Dev team |
| Project.Site | Project | /sitecore/templates/Project/Site | CreateAndUpdate | Dev team |
| Project.Site.Content | Project | /sitecore/content/Site | CreateAndUpdate | Content team |
### Module: [ModuleName]
```json
{
"namespace": "[Namespace]",
"items": {
"includes": [
{
"name": "[name]",
"path": "/sitecore/[path]",
"allowedPushOperations": "CreateAndUpdate"
}
]
}
}
```
### Serialization Workflow
**Daily Development:**
- `dotnet sitecore ser pull` — before starting work
- Make changes in XM Cloud
- `dotnet sitecore ser pull` — capture changes
- Commit to feature branch

**Environment Promotion:**
- PR merged to main
- CI runs `dotnet sitecore ser push` to staging
- QA validation
- Production deployment with `--dry-run` first

### Governance Rules
| Action | Requires Approval | Approver |
|---|---|---|
| New template | No | - |
| Template field deletion | Yes | Tech lead |
| Content tree restructure | Yes | Content + Tech lead |
| Push to production | Yes | Release manager |

### Safety Checklist
- No `CreateUpdateAndDelete` on content paths
- Language fallback configured correctly
- Excluded: `__Standard Values` GUIDs, media blobs
- CI validates serialization on every PR
</output_format>
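For the safeguards piece of those CLI runbooks, even a tiny wrapper pays off. A minimal sketch, assuming the Sitecore CLI is installed as a local dotnet tool, you're already logged in to the target environment, and serialized items live under `./serialization` (adjust the path to your project):

```typescript
// Guarded push: validate serialized items and refuse to push on top of
// uncommitted serialization changes.
import { execSync } from "node:child_process";

function run(cmd: string): void {
  console.log(`\n> ${cmd}`);
  execSync(cmd, { stdio: "inherit" });
}

// Validate serialized items before anything leaves the repo.
run("dotnet sitecore ser validate");

// Refuse to push if the working tree has uncommitted serialization changes.
const dirty = execSync("git status --porcelain -- ./serialization").toString().trim();
if (dirty) {
  console.error("Uncommitted serialization changes detected; commit or discard them first.");
  process.exit(1);
}

run("dotnet sitecore ser push");
```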
I also document the governance stuff that presales sometimes forgets to ask about: who signs off on schema changes, who rotates keys, what needs approval before it hits production.
**What comes out:** SCS module diagrams, CLI runbooks, governance RACI.
---
## Pricing
Now it's time to turn all this research into a number the client can say yes to.
I have an estimator prompt combine story sizes, environment work, and risk buffers into a forecast. But I always add a manual buffer—usually around 30%—because AI consistently underestimates organizational drag: approval cycles, content migration surprises, integration hiccups that nobody mentioned in the RFP.
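The sanity check I run beside the agent's output is deliberately small: raw points to sprints, then the manual buffer on top. The velocity and 30% buffer here are my assumptions, not something the agent produces.

```typescript
// Back-of-envelope forecast: total points -> sprints, then a manual buffer.
function sprintForecast(totalPoints: number, velocityPerSprint: number, buffer = 0.3) {
  const rawSprints = totalPoints / velocityPerSprint;
  const buffered = rawSprints * (1 + buffer);
  return { rawSprints: Math.ceil(rawSprints), bufferedSprints: Math.ceil(buffered) };
}

// e.g. 240 points at 30 points/sprint: 8 raw sprints, 11 with a 30% buffer
console.log(sprintForecast(240, 30));
```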
**Prompt I use for estimate synthesis:**
- Baseline velocity assumption: [X] story points per sprint per developer
- Add standard XM Cloud phases:
  - Environment setup and CI/CD (typically 1-2 sprints)
  - Content migration (varies widely—flag if unclear)
  - UAT and bug fixing (typically 15-20% of dev effort)
  - Go-live preparation and hypercare
- Apply risk buffers:
  - Integration risk: +10-25% for each external system
  - Content migration risk: +15-30% if volume/complexity unclear
  - Organizational risk: +10-20% for new-to-Sitecore clients
  - Never let AI optimism remove these buffers
- Generate scenarios:
  - Optimistic: Everything goes smoothly (rare)
  - Expected: Normal project friction
  - Pessimistic: Key risks materialize
<output_format>
## Estimate Summary
### Effort Breakdown
| Phase | Stories | Points | Sprints | Notes |
|---|---|---|---|---|
| Content Modeling | [N] | [N] | [N] | Templates, serialization |
| Component Build | [N] | [N] | [N] | BYOC, Storybook |
| Integration | [N] | [N] | [N] | [List systems] |
| Content Migration | [N] | [N] | [N] | [Volume estimate] |
| Testing & QA | - | - | [N] | 20% of dev |
| Total | [N] | [N] | [N] | |
### Risk Adjustments
| Risk | Probability | Impact | Buffer Added |
|---|---|---|---|
| [Risk from RAID] | Medium | +2 sprints | 1 sprint |
| Integration delays | High | +3 sprints | 2 sprints |
### Scenarios
| Scenario | Sprints | Calendar Weeks | Confidence |
|---|---|---|---|
| Optimistic | [N] | [N] | 20% |
| Expected | [N] | [N] | 60% |
| Pessimistic | [N] | [N] | 20% |
### Assumptions
- Team: [Size and composition]
- Velocity: [Points per sprint]
- Sprint length: [2 weeks]
- Dependencies: [What must be true for this estimate to hold]
### Excluded from Estimate
- [Items explicitly out of scope]
- [Client responsibilities]
### Recommendation
[1-2 sentences on which scenario to quote and why]
</output_format>
The exec deck pulls together NotebookLM summaries, IA diagrams, and architecture decisions. Agent-drafted, but I edit it for tone—executives don’t want to read something that sounds like it was written by a bot.
Finally, I generate a scope appendix listing every artifact we’ll hand over: component repo, backlog export, RAID log, governance doc. This is what gets attached to the SOW.
**What comes out:** estimate model, exec presentation, scope appendix.
## Day one
By the time the client signs, I’ve got:
- A populated backlog and component repo
- IA diagrams and architecture decisions documented
- Environment and serialization setup ready for engineering
- A knowledge base that new team members can query immediately
Sprint Zero proper starts by validating assumptions with stakeholders—I walk them through the crawl findings, the IA, the backlog structure. If we have time, I demo a basic Next.js proof of concept hitting Experience Edge Preview just to prove the pipes work.
The nice thing about having agents draft everything with citations is that I can trace each recommendation back to Sitecore’s own docs. Clients appreciate that. It’s not “the AI said so”—it’s “here’s the Sitecore guidance, here’s how we’re applying it.”
## Is this worth it?
Honestly, it depends. The setup cost is real—building the RAG workspace, tuning the prompts, wiring up the crawlers. If you’re doing one XM Cloud project and never touching it again, probably not worth it.
But if you’re doing this repeatedly, the compounding returns are significant. Each project refines the prompts and templates. The agents get better at generating useful artifacts. And I spend less time on formatting and more time on the work that actually needs a human.
It’s not magic. The agents still make mistakes, still miss context, still need supervision. But they’ve cut my Sprint Zero prep time roughly in half, and the artifacts I hand off are more consistent than when I was doing everything manually.
## Useful links
- Azure OpenAI – Use your data concepts: https://learn.microsoft.com/azure/ai-services/openai/how-to/use-your-data
- XM Cloud – Experience Edge best practices: https://doc.sitecore.com/xmc/en/developers/xm-cloud/experience-edge-best-practices.html
- XM Cloud – Next.js App Router Content SDK guide: https://doc.sitecore.com/xmc/en/developers/content-sdk/getting-started-with-next-js-app-router-using-content-sdk.html
- Sitecore Content Serialization overview: https://doc.sitecore.com/xmc/en/developers/xm-cloud/sitecore-content-serialization.html
- Sitecore CLI reference: https://doc.sitecore.com/xmc/en/developers/xm-cloud/the-sitecore-command-line-interface.html
- Components in XM Cloud: https://doc.sitecore.com/xmc/en/users/xm-cloud/components.html
Related posts in the AI-Powered Stack series: