AI Workflows — Editorial copilot for XM Cloud pages with "on your data" AI

Generic AI suggestions don’t work for enterprise content. When an editor asks for help rewriting a product description or localizing a landing page, the AI needs to understand your brand voice, product terminology, and content standards—not hallucinate generic marketing copy.

This post walks through building an editorial copilot for Sitecore XM Cloud that grounds all suggestions in your own content and guidelines using Azure OpenAI “On Your Data” with Azure AI Search. The focus is on practical implementation: how to structure the knowledge base, configure retrieval for quality results, design prompts that editors can actually use, and wire in human-in-the-loop approval before anything touches live content.

Architecture overview

The copilot sits between XM Cloud and Azure OpenAI, using a RAG (Retrieval-Augmented Generation) pattern to ground every suggestion in your own content.

Key design decisions:

Experience Edge Preview API fetches unpublished content so editors can get suggestions on drafts
Azure AI Search with hybrid search (vector + keyword + semantic ranking) retrieves relevant brand guidelines and reference content
Azure OpenAI “On Your Data” handles the RAG orchestration, citation generation, and response grounding
Human approval is mandatory before any suggestions are applied back to XM Cloud

Building the knowledge base

The quality of suggestions depends entirely on what you put in the knowledge base. I index three categories of content:

1. Brand and style guidelines

These are the non-negotiable rules for how content should sound:

Brand voice documentation (tone, personality, values)
Writing style guides (sentence length, active voice, terminology)
Product naming conventions and trademark usage
Legal disclaimers and required statements

2. Reference content

High-quality examples that demonstrate what “good” looks like:

Published pages that represent your best work
Approved marketing copy and campaign content
Product descriptions that have been through legal review
Localized content that passed native speaker review

3. Terminology and glossary

Domain-specific vocabulary the model needs to get right:

Product names and feature terminology
Industry jargon and acronyms
Competitor mentions and how to handle them
Translation glossaries for localized content

Chunking strategy

The default chunk size used by Azure OpenAI “On Your Data” ingestion (1,024 tokens) works for most content, but editorial content benefits from adjustments:

# Recommended chunking for editorial content
CHUNK_CONFIG = {
    "brand_guidelines": {
        "chunk_size": 512,      # Smaller chunks for dense rules
        "overlap": 128,         # 25% overlap for context
    },
    "reference_content": {
        "chunk_size": 1024,     # Larger chunks for full examples
        "overlap": 256,         # More overlap to preserve context
    },
    "glossary": {
        "chunk_size": 256,      # Small chunks, one term per chunk
        "overlap": 0,           # No overlap needed
    }
}

For brand guidelines, smaller chunks (512 tokens) with 25% overlap make it less likely that an individual rule is split across chunks. For reference content, larger chunks (1,024 tokens) preserve enough context for the model to understand complete examples.
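
To make the size and overlap numbers concrete, here is a minimal sketch of a sliding-window chunker. It is not how Azure AI Search chunks internally; `chunk_tokens` is a hypothetical helper, and whitespace splitting stands in for a real tokenizer.

```python
from typing import Iterator


def chunk_tokens(tokens: list[str], chunk_size: int, overlap: int) -> Iterator[list[str]]:
    """Yield windows of chunk_size tokens; each shares `overlap` tokens with the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    for start in range(0, len(tokens), step):
        yield tokens[start:start + chunk_size]
        # Stop once a window has reached the end of the input
        if start + chunk_size >= len(tokens):
            break


# Assumption: whitespace-separated words approximate tokens for illustration only
words = "a b c d e f g h i j".split()
chunks = list(chunk_tokens(words, chunk_size=4, overlap=1))
```

With `chunk_size=4` and `overlap=1`, every chunk starts with the last word of the previous chunk, which is the same effect the 25% overlap has on rules that straddle a chunk boundary.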

Indexing with Azure AI Search

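Use integrated vectorization so Azure AI Search chunks and embeds documents at indexing time; the Document Layout skill can provide structure-aware chunking. The index that receives those chunks is defined as follows: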

{
  "name": "editorial-copilot-index",
  "fields": [
    {"name": "id", "type": "Edm.String", "key": true},
    {"name": "content", "type": "Edm.String", "searchable": true},
    {"name": "content_vector", "type": "Collection(Edm.Single)", "searchable": true, "dimensions": 1536, "vectorSearchProfile": "default"},
    {"name": "category", "type": "Edm.String", "filterable": true},
    {"name": "locale", "type": "Edm.String", "filterable": true},
    {"name": "product_line", "type": "Edm.String", "filterable": true},
    {"name": "last_updated", "type": "Edm.DateTimeOffset", "filterable": true}
  ],
  "vectorSearch": {
    "algorithms": [{"name": "hnsw-default", "kind": "hnsw"}],
    "vectorizers": [{
      "name": "ada-002-vectorizer",
      "kind": "azureOpenAI",
      "azureOpenAIParameters": {
        "resourceUri": "https://your-instance.openai.azure.com",
        "deploymentId": "text-embedding-ada-002",
        "modelName": "text-embedding-ada-002"
      }
    }],
    "profiles": [{
      "name": "default",
      "algorithm": "hnsw-default",
      "vectorizer": "ada-002-vectorizer"
    }]
  },
  "semantic": {
    "configurations": [{
      "name": "semantic-config",
      "prioritizedFields": {
        "prioritizedContentFields": [{"fieldName": "content"}]
      }
    }]
  }
}

The category, locale, and product_line fields enable filtered retrieval—when an editor is working on a German product page, the copilot retrieves German brand guidelines and product-specific terminology.
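
A filter for that German product page could be assembled roughly as follows. This is a sketch: `build_filter` and `odata_quote` are hypothetical helpers, and the `brand_guidelines` category value is an assumption about how you tag documents. The one firm rule is that OData string literals escape a single quote by doubling it.

```python
from typing import Optional


def odata_quote(value: str) -> str:
    """Quote a string literal for an OData filter; single quotes are escaped by doubling."""
    return "'" + value.replace("'", "''") + "'"


def build_filter(**equals: Optional[str]) -> Optional[str]:
    """Join `field eq 'value'` conditions with `and`, skipping unset values."""
    conditions = [
        f"{field} eq {odata_quote(value)}"
        for field, value in equals.items()
        if value
    ]
    return " and ".join(conditions) or None


# Assumed category value; use whatever labels your index actually contains
german_filter = build_filter(category="brand_guidelines", locale="de-DE", product_line=None)
```

Escaping matters because locale and product line values usually arrive from the editor's request, and an unescaped quote would otherwise break or alter the filter expression.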

Fetching content from Experience Edge

The copilot needs to read the current content before suggesting improvements. Experience Edge Preview API provides access to unpublished drafts.

GraphQL query for page content

query GetPageContent($itemId: String!, $language: String!) {
  item(path: $itemId, language: $language) {
    id
    name
    path
    template {
      name
    }
    fields {
      name
      value
      ... on RichTextField {
        value
      }
      ... on TextField {
        value
      }
    }
    children(first: 50) {
      results {
        id
        name
        template {
          name
        }
        fields {
          name
          value
        }
      }
    }
  }
}

Python client for Experience Edge

import re

import httpx


class XMCloudClient:
    """Minimal async client for the Experience Edge GraphQL endpoint."""

    def __init__(self, edge_url: str, api_key: str):
        self.edge_url = edge_url
        self.api_key = api_key

    async def fetch_item(self, item_id: str, language: str = "en") -> dict:
        query = """
        query GetItem($id: String!, $lang: String!) {
          item(path: $id, language: $lang) {
            id
            name
            path
            fields { name value }
          }
        }
        """
        async with httpx.AsyncClient(timeout=30) as client:
            response = await client.post(
                self.edge_url,
                json={
                    "query": query,
                    "variables": {"id": item_id, "lang": language}
                },
                headers={
                    "sc_apikey": self.api_key,
                    "Content-Type": "application/json"
                }
            )
            response.raise_for_status()
            payload = response.json()
            # GraphQL reports query errors in the response body, often with HTTP 200
            if payload.get("errors"):
                raise RuntimeError(f"Experience Edge query failed: {payload['errors']}")
            item = payload["data"]["item"]
            if item is None:
                raise LookupError(f"Item {item_id} not found for language {language}")
            return item

    def extract_text_fields(self, item: dict) -> dict[str, str]:
        """Extract text content from item fields for copilot processing."""
        text_fields = {}
        for field in item.get("fields", []):
            name = field["name"]
            value = field.get("value", "")
            # Skip system fields and empty values
            if name.startswith("__") or not value:
                continue
            # Strip HTML for plain text processing
            text_fields[name] = self._strip_html(value)
        return text_fields

    def _strip_html(self, html: str) -> str:
        """Basic HTML stripping: replace tags with spaces, then collapse whitespace."""
        clean = re.sub(r'<[^>]+>', ' ', html)
        return ' '.join(clean.split())
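
The regex approach leaves HTML entities such as `&amp;` undecoded. If that matters for your rich text fields, a parser-based variant using only the standard library is one option. This is a sketch; `html_to_text` is a hypothetical replacement for `_strip_html`, and it does not skip the contents of script or style tags.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes; convert_charrefs=True decodes entities like &amp;."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the text content of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Join text nodes with spaces so adjacent block elements do not run together
    return " ".join(" ".join(parser.parts).split())


sample = html_to_text("<p>Fast &amp; secure<br>checkout</p>")
```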

Experience Edge rate-limits uncached requests to 80 per second. Sitecore documents this limit for the Delivery API, so check what applies to your Preview endpoint before relying on it. For batch operations, queue requests, and use the Delivery API when published content is sufficient.
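
Request queuing for batch operations can be as simple as sending fixed-size batches with a pause between them. The sketch below assumes nothing about Experience Edge itself; `run_throttled` is a hypothetical helper, and the `sleep` parameter exists only so the pause can be replaced in tests.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_throttled(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    max_per_second: int = 80,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> list[R]:
    """Run worker over items, starting at most max_per_second calls per batch."""
    results: list[R] = []
    for start in range(0, len(items), max_per_second):
        batch = items[start:start + max_per_second]
        # Calls within a batch run concurrently; results keep input order
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
        if start + max_per_second < len(items):
            await sleep(1.0)  # wait a full second before starting the next batch
    return results
```

In the copilot this would wrap the client call, for example `await run_throttled(item_ids, xm_client.fetch_item)`. Because each batch finishes before the pause starts, the approach is conservative: actual throughput stays below the limit.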

Configuring Azure OpenAI “On Your Data”

The “On Your Data” feature handles the RAG pipeline: intent generation, retrieval, filtering, reranking, and response generation with citations.

Key configuration parameters

from typing import Optional

from openai import AsyncAzureOpenAI

# Async client, so the FastAPI handler can await generate_suggestion()
client = AsyncAzureOpenAI(
    azure_endpoint="https://your-instance.openai.azure.com",
    api_key="your-api-key",
    api_version="2024-02-01"
)

def _odata_literal(value: str) -> str:
    """Quote a value for an OData filter (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"

async def generate_suggestion(
    current_content: str,
    instruction: str,
    filters: Optional[dict] = None
) -> dict:
    """Generate grounded editorial suggestions."""

    # Build filter expression for targeted retrieval
    conditions = []
    for field in ("locale", "product_line"):
        value = (filters or {}).get(field)
        if value:
            conditions.append(f"{field} eq {_odata_literal(value)}")

    search_params = {
        "endpoint": "https://your-search.search.windows.net",
        "index_name": "editorial-copilot-index",
        "authentication": {
            "type": "api_key",
            "key": "your-search-key"
        },
        "query_type": "vector_semantic_hybrid",
        # Vector query types need an embedding deployment to embed the query
        # (deployment name assumed to match the index vectorizer)
        "embedding_dependency": {
            "type": "deployment_name",
            "deployment_name": "text-embedding-ada-002"
        },
        "semantic_configuration": "semantic-config",
        "strictness": 3,           # 1-5, higher = stricter filtering
        "top_n_documents": 5,      # Number of chunks to include
        "in_scope": True,          # Only answer from retrieved docs
    }
    if conditions:
        search_params["filter"] = " and ".join(conditions)

    # PROMPT_TEMPLATES already embed the content; append it only when the
    # instruction does not contain it, so the model never sees it twice
    user_message = instruction
    if current_content and current_content not in instruction:
        user_message = f"{instruction}\n\nCurrent content:\n{current_content}"

    response = await client.chat.completions.create(
        model="gpt-4",  # Name of your Azure OpenAI deployment
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message}
        ],
        extra_body={
            "data_sources": [{
                "type": "azure_search",
                "parameters": search_params
            }]
        },
        temperature=0.3  # Lower temperature for consistent, grounded output
    )

    return {
        "suggestion": response.choices[0].message.content,
        "citations": extract_citations(response),
        "model": response.model
    }

def extract_citations(response) -> list[dict]:
    """Extract source citations from the response."""
    citations = []
    if hasattr(response.choices[0].message, 'context'):
        for citation in response.choices[0].message.context.get('citations', []):
            citations.append({
                "content": citation.get("content"),
                "title": citation.get("title"),
                "url": citation.get("url")
            })
    return citations

Tuning retrieval quality

The strictness and top_n_documents parameters are critical:

| Parameter | Low value | High value | Editorial use case |
| --- | --- | --- | --- |
| `strictness` | 1-2: More documents, some irrelevant | 4-5: Fewer documents, highly relevant | Start at 3, increase if getting off-topic suggestions |
| `top_n_documents` | 3-5: Focused context | 10-15: Broader context | Use 5 for rewrites, 10 for comprehensive style checks |

If the copilot isn’t finding relevant guidelines, reduce strictness. If it’s hallucinating or citing irrelevant content, increase strictness and verify your index contains the right documents.
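
The "reduce strictness when nothing relevant is found" advice can also be automated as a fallback. The sketch below is one possible shape, not part of “On Your Data” itself: `suggest_with_fallback` is a hypothetical wrapper, and `generate` is assumed to be any callable that takes a strictness value and returns a dict with a `citations` list.

```python
from typing import Callable


def suggest_with_fallback(
    generate: Callable[[int], dict],
    start_strictness: int = 3,
    min_strictness: int = 1,
) -> dict:
    """Call generate(strictness), relaxing strictness until citations come back."""
    result: dict = {}
    for strictness in range(start_strictness, min_strictness - 1, -1):
        result = generate(strictness)
        result["strictness_used"] = strictness
        if result.get("citations"):
            break  # Retrieval found something relevant; stop relaxing
    return result


# Stand-in for a real request: pretends guidelines only match at strictness <= 2
def fake_generate(strictness: int) -> dict:
    citations = [{"title": "Brand voice guide"}] if strictness <= 2 else []
    return {"suggestion": "stub", "citations": citations}


result = suggest_with_fallback(fake_generate)
```

Recording `strictness_used` alongside the suggestion makes it visible in the audit log when a suggestion was grounded only after relaxing retrieval, which is a useful signal that the index may be missing content.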

Query type selection

The query_type parameter accepts several values:

simple: Keyword (full-text) matching, fast but misses semantic similarity
vector: Embedding-based search, good for semantic similarity
semantic: Adds AI-powered reranking to keyword search
vector_semantic_hybrid: Combines all three; recommended for editorial use

Hybrid search catches both exact terminology matches (critical for product names) and semantically similar content (finding relevant examples even with different wording).

Prompt templates for editorial tasks

Each editorial action gets a dedicated prompt template. The system prompt establishes the copilot’s role and constraints:

System prompt

SYSTEM_PROMPT = """You are an editorial assistant for Sitecore XM Cloud content. Your role is to help editors improve content while maintaining brand voice and accuracy.

CONSTRAINTS:
- Base all suggestions on the retrieved brand guidelines and reference content
- Never invent product features, pricing, or claims not in the source material
- Preserve all product names, trademarks, and legal disclaimers exactly as written
- If you cannot find relevant guidance in the retrieved documents, say so explicitly
- Always cite which guidelines or examples informed your suggestions

OUTPUT FORMAT:
- Provide the revised content first
- Then list key changes with brief rationale
- Include citations in [Source: document name] format"""

Task-specific prompts

PROMPT_TEMPLATES = {
    "rewrite_clarity": """Rewrite the following content for improved clarity and readability.

Requirements:
- Maintain the same meaning and key messages
- Use shorter sentences where possible (aim for 15-20 words)
- Use active voice
- Keep the same approximate length

Current content:
{content}""",

    "adjust_tone": """Adjust the tone of this content to be more {target_tone}.

Target tone: {target_tone}
Target audience: {audience}

Maintain:
- All factual claims and product details
- Trademark and legal language
- Overall message and call to action

Current content:
{content}""",

    "shorten": """Shorten this content to approximately {target_length} characters while preserving the key message.

Priority order for what to keep:
1. Main value proposition
2. Key differentiators
3. Call to action
4. Supporting details

Current content ({current_length} characters):
{content}""",

    "localize": """Adapt this content for {target_locale} market.

Requirements:
- Translate to {target_language}
- Keep product names in English unless locale-specific names exist in guidelines
- Adapt idioms and cultural references appropriately
- Maintain brand voice as defined in localized guidelines
- Flag any content that may need legal review for this market

Source content ({source_locale}):
{content}""",

    "seo_optimize": """Optimize this content for the target keyword while maintaining readability.

Target keyword: {keyword}
Secondary keywords: {secondary_keywords}

Requirements:
- Include target keyword in first 100 characters if natural
- Use keyword variations, not repetition
- Maintain brand voice and clarity
- Suggest a meta description (150-160 characters)

Current content:
{content}"""
}
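
Templates with several placeholders fail with a bare `KeyError` when a value is missing. A small rendering helper can report all missing values at once. This is a sketch: `render_prompt` and `required_fields` are hypothetical helpers, and `SHORTEN` is an abbreviated copy of the shorten template used only for illustration.

```python
from string import Formatter


def required_fields(template: str) -> set[str]:
    """Return the placeholder names used in a str.format template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render_prompt(template: str, **values: str) -> str:
    """Fill a template, raising a clear error when placeholders are missing."""
    missing = required_fields(template) - set(values)
    if missing:
        raise ValueError(f"missing template values: {sorted(missing)}")
    return template.format(**values)


# Abbreviated stand-in for PROMPT_TEMPLATES["shorten"]
SHORTEN = (
    "Shorten this content to approximately {target_length} characters.\n\n"
    "Current content ({current_length} characters):\n{content}"
)

text = "Our platform helps teams ship content faster."
prompt = render_prompt(
    SHORTEN,
    content=text,
    target_length="30",
    current_length=str(len(text)),
)
```

Only the template is parsed for placeholders, so braces inside the editor's content are passed through unchanged.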

Few-shot examples for consistent output

For critical tasks like localization, include examples in the prompt:

LOCALIZATION_EXAMPLES = """
Example 1:
Source (en-US): "Get started with a free trial today!"
Target (de-DE): "Starten Sie noch heute mit einer kostenlosen Testversion!"
Note: Formal "Sie" used as per German brand guidelines.

Example 2:
Source (en-US): "XM Cloud powers your digital experiences."
Target (de-DE): "XM Cloud unterstützt Ihre digitalen Erlebnisse."
Note: Product name "XM Cloud" kept in English as per terminology guidelines.
"""

Human-in-the-loop approval workflow

The EU AI Act requires human oversight for high-risk AI systems, and most enterprise governance policies expect the same for anything that will be published. The copilot therefore treats human approval as mandatory: it generates suggestions; humans approve them.

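Approval flow

The flow has four steps: the copilot generates a suggestion with citations, the editor reviews it, the editor accepts, modifies, or rejects it, and only accepted or modified text is written back to XM Cloud as a draft. Every step is recorded in the audit log.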
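
One way to enforce the approval rule in code is a small state machine that refuses to apply anything an editor has not accepted or modified. This is a sketch under assumptions: the `Status` names and the `transition` helper are hypothetical and only loosely mirror the audit actions used in this post.

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"      # Suggestion generated, awaiting editor review
    ACCEPTED = "accepted"    # Editor approved the suggestion as-is
    MODIFIED = "modified"    # Editor edited the suggestion before approving
    REJECTED = "rejected"    # Editor declined the suggestion
    APPLIED = "applied"      # Approved text written to XM Cloud as a draft


# Each status maps to the statuses it may move to
ALLOWED: dict[Status, set[Status]] = {
    Status.PENDING: {Status.ACCEPTED, Status.MODIFIED, Status.REJECTED},
    Status.ACCEPTED: {Status.APPLIED},
    Status.MODIFIED: {Status.APPLIED},
    Status.REJECTED: set(),
    Status.APPLIED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Because `PENDING` cannot move directly to `APPLIED`, a code path that skips the editor's decision fails loudly instead of writing unreviewed text.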

Audit logging

Every interaction must be logged for compliance:

from datetime import datetime
from typing import Optional

# Assumed driver: Motor (async MongoDB), which matches the awaitable insert_one below
from motor.motor_asyncio import AsyncIOMotorDatabase as Database
from pydantic import BaseModel

class AuditEntry(BaseModel):
    timestamp: datetime
    user_id: str
    action: str  # "generate", "accept", "reject", "modify"
    item_id: str
    field_name: str
    original_content: str
    suggested_content: str
    final_content: Optional[str] = None      # Set once the editor accepts or modifies
    citations: list[dict]
    model_version: str
    rejection_reason: Optional[str] = None   # Set only for "reject"

async def log_copilot_action(entry: AuditEntry, db: Database):
    """Record copilot interaction for audit trail."""
    await db.audit_log.insert_one(entry.model_dump())

Applying changes to XM Cloud

When editors approve suggestions, apply them as draft updates:

from datetime import datetime, timezone

# Assumed driver: Motor (async MongoDB)
from motor.motor_asyncio import AsyncIOMotorDatabase as Database

async def apply_suggestion(
    item_id: str,
    field_name: str,
    new_value: str,
    xm_client: "XMCloudManagementClient",  # Hypothetical wrapper around the XM Cloud authoring API
    audit_db: Database,
    applied_by: str
) -> dict:
    """Apply approved suggestion to XM Cloud as a new draft version."""

    # Create new draft version (never overwrite published)
    result = await xm_client.update_item_field(
        item_id=item_id,
        field_name=field_name,
        value=new_value,
        version="draft"
    )

    # Log the application with a timezone-aware timestamp
    await audit_db.applied_changes.insert_one({
        "item_id": item_id,
        "field_name": field_name,
        "applied_at": datetime.now(timezone.utc),
        "applied_by": applied_by,
        "draft_version": result["version"]
    })

    return result

Deployment options

Sidecar service (recommended for start)

Deploy the copilot as a standalone FastAPI service:

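from datetime import datetime, timezone
from typing import Optional

from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from pydantic import BaseModel

# settings, validate_token, and audit_db are application-specific and
# assumed to be defined elsewhere in the service

app = FastAPI(title="XM Cloud Editorial Copilot")
security = HTTPBearer()

class RewriteRequest(BaseModel):
    item_id: str
    field_name: str
    language: str = "en"
    product_line: Optional[str] = None

@app.post("/suggestions/rewrite")
async def suggest_rewrite(
    request: RewriteRequest,
    credentials: HTTPAuthorizationCredentials = Depends(security)
):
    """Generate rewrite suggestions for XM Cloud content."""
    # Validate user permissions (HTTPBearer yields a credentials object, not a string)
    user = await validate_token(credentials.credentials)
    if not user.can_use_copilot:
        raise HTTPException(403, "Copilot access not authorized")

    # Fetch current content from Experience Edge
    xm_client = XMCloudClient(settings.edge_url, settings.edge_api_key)
    item = await xm_client.fetch_item(request.item_id, request.language)
    # fetch_item returns fields as a list of {name, value}; flatten it first
    current_content = xm_client.extract_text_fields(item).get(request.field_name, "")
    if not current_content:
        raise HTTPException(404, "Field is empty or does not exist")

    # Generate grounded suggestion
    suggestion = await generate_suggestion(
        current_content=current_content,
        instruction=PROMPT_TEMPLATES["rewrite_clarity"].format(content=current_content),
        filters={"locale": request.language, "product_line": request.product_line}
    )

    # Log the generation
    await log_copilot_action(AuditEntry(
        timestamp=datetime.now(timezone.utc),
        user_id=user.id,
        action="generate",
        item_id=request.item_id,
        field_name=request.field_name,
        original_content=current_content,
        suggested_content=suggestion["suggestion"],
        final_content=None,
        citations=suggestion["citations"],
        model_version=suggestion["model"],
        rejection_reason=None
    ), audit_db)

    return suggestion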

Integration with Sitecore Stream

For organizations using Sitecore Stream (the native AI platform), the copilot can complement Stream’s capabilities:

Stream Brand Assistant: Use for general content generation and brand research
Custom copilot: Use for specialized editorial workflows with your own knowledge base

The custom copilot provides more control over retrieval sources, prompt engineering, and approval workflows than the out-of-box Stream features.

Measuring copilot effectiveness

Track both usage metrics and quality metrics:

Usage metrics

Suggestions generated per day/week
Acceptance rate (accepted vs. rejected)
Modification rate (how often editors change suggestions before accepting)
Time from suggestion to approval

Quality metrics

Citation accuracy (are sources correctly attributed?)
Brand voice consistency (manual review sample)
Error rate (suggestions that violate guidelines)

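# Assumed driver: Motor (async MongoDB), which provides aggregate(...).to_list()
from motor.motor_asyncio import AsyncIOMotorDatabase as Database

async def calculate_copilot_metrics(db: Database, date_range: tuple) -> dict: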
    """Calculate copilot effectiveness metrics."""
    start, end = date_range

    pipeline = [
        {"$match": {"timestamp": {"$gte": start, "$lte": end}}},
        {"$group": {
            "_id": "$action",
            "count": {"$sum": 1}
        }}
    ]

    results = await db.audit_log.aggregate(pipeline).to_list(None)
    counts = {r["_id"]: r["count"] for r in results}

    total_generated = counts.get("generate", 0)
    accepted = counts.get("accept", 0)
    modified = counts.get("modify", 0)
    rejected = counts.get("reject", 0)

    return {
        "total_suggestions": total_generated,
        "acceptance_rate": (accepted + modified) / total_generated if total_generated else 0,
        "modification_rate": modified / (accepted + modified) if (accepted + modified) else 0,
        "rejection_rate": rejected / total_generated if total_generated else 0
    }
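
The aggregation above does not cover time from suggestion to approval. One way to derive it from the same audit entries is sketched below; `median_time_to_decision` is a hypothetical helper, and it assumes each decision belongs to the most recent "generate" entry for the same item and field.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


def median_time_to_decision(entries: list[dict]) -> Optional[timedelta]:
    """Median delay between a 'generate' entry and the editor's decision on it."""
    pending: dict[tuple[str, str], datetime] = {}
    waits: list[float] = []
    for entry in sorted(entries, key=lambda e: e["timestamp"]):
        key = (entry["item_id"], entry["field_name"])
        if entry["action"] == "generate":
            pending[key] = entry["timestamp"]  # Latest suggestion for this field
        elif entry["action"] in {"accept", "modify", "reject"} and key in pending:
            waits.append((entry["timestamp"] - pending.pop(key)).total_seconds())
    return timedelta(seconds=median(waits)) if waits else None


# Illustrative entries only; real ones come from the audit_log collection
t0 = datetime(2024, 1, 1, 9, 0, 0)
sample_entries = [
    {"timestamp": t0, "action": "generate", "item_id": "a", "field_name": "Title"},
    {"timestamp": t0 + timedelta(seconds=60), "action": "accept", "item_id": "a", "field_name": "Title"},
    {"timestamp": t0, "action": "generate", "item_id": "b", "field_name": "Body"},
    {"timestamp": t0 + timedelta(seconds=120), "action": "reject", "item_id": "b", "field_name": "Body"},
]
typical_wait = median_time_to_decision(sample_entries)
```

The median is used rather than the mean so that a few suggestions left open overnight do not dominate the number.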

Troubleshooting common issues

Suggestions ignore brand guidelines

Check indexing: Verify brand guideline documents are in the search index
Reduce strictness: Lower the strictness parameter to include more documents
Add filters: Use the category filter to restrict retrieval to brand guideline documents
Improve chunking: Ensure guidelines aren’t split across chunks

Model hallucinates product details

Set in_scope to true: Limit the model to answering from retrieved content
Increase strictness: Filter out tangentially related content
Add explicit constraints: Include “do not invent details” in system prompt
Upgrade model: GPT-4 follows grounding constraints better than GPT-3.5

Slow response times

Check index size: Large indexes slow retrieval
Reduce top_n_documents: Fewer documents = faster responses
Use caching: Cache repeated queries for same content
Optimize filters: Filter on fields marked filterable in the index rather than matching on content text
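
For the caching suggestion, the cache key should cover everything that influences the output: the content, the instruction, and the retrieval filters. The sketch below keeps results in memory with no expiry, which is an assumption made for brevity; `SuggestionCache` is a hypothetical class, and a real deployment would also need to invalidate entries when guidelines are re-indexed.

```python
import hashlib
import json
from typing import Callable, Optional


class SuggestionCache:
    """In-memory cache keyed by a hash of content, instruction, and filters."""

    def __init__(self) -> None:
        self._store: dict[str, dict] = {}

    @staticmethod
    def key(content: str, instruction: str, filters: Optional[dict] = None) -> str:
        # sort_keys makes the key independent of filter ordering
        payload = json.dumps(
            {"content": content, "instruction": instruction, "filters": filters or {}},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_create(
        self,
        content: str,
        instruction: str,
        filters: Optional[dict],
        generate: Callable[[], dict],
    ) -> dict:
        """Return the cached suggestion, calling generate() only on a miss."""
        cache_key = self.key(content, instruction, filters)
        if cache_key not in self._store:
            self._store[cache_key] = generate()
        return self._store[cache_key]
```

Hashing the inputs keeps keys short and avoids storing full page content as dictionary keys.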