#44 Part 7 2026-01-15 8 min

When RSS Parsing Ships: The Last Manual Step Disappears

How RSS-based auto-indexing will complete the fully automated publishing pipeline

There’s one manual step left in the pipeline: clicking “Sync” in the AutoRAG dashboard. When RSS parsing ships, even that goes away.


The Current State

Right now, the flow is:

/publish-post
      ↓
Git push → Cloudflare Pages deploys → Sitemap updates
      ↓
[MANUAL] Click "Sync" in AI Search dashboard
      ↓
AutoRAG crawls → Indexes new content → Chatbot knows about it

That manual sync step is the last friction point. It takes 10 seconds, but it still requires someone to open the dashboard after every publish.

For true “publish and forget,” this needs to be automatic.


What’s Coming: RSS Auto-Indexing

Cloudflare is adding RSS feed support to AutoRAG. Instead of scheduled sitemap crawls, it will:

  1. Subscribe to your RSS feed
  2. Detect new entries automatically
  3. Index new content immediately
  4. Require no manual trigger
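
How AutoRAG will detect new entries hasn’t been published, so this is only a minimal sketch of the general technique: the consumer remembers which item links it has already seen and treats everything else in the feed as new. The function name `findNewEntries`, the item shape, and the example.com URLs are all hypothetical.

```javascript
// Hypothetical sketch of RSS-based change detection; not Cloudflare's implementation.
// `items` is the parsed feed; `seenLinks` is a Set of links indexed on earlier polls.
function findNewEntries(items, seenLinks) {
  const fresh = items.filter((item) => !seenLinks.has(item.link));
  // Record the new links so the next poll does not report them again.
  for (const item of fresh) seenLinks.add(item.link);
  return fresh;
}

// Example: one post is already indexed, one was just published.
const seen = new Set(['https://example.com/posts/old-post/']);
const feed = [
  { title: 'New Post Title', link: 'https://example.com/posts/new-post/' },
  { title: 'Old Post', link: 'https://example.com/posts/old-post/' },
];
const firstPoll = findNewEntries(feed, seen);  // only the new post
const secondPoll = findNewEntries(feed, seen); // empty: nothing changed since
```

Keying on the link (or the item’s guid) rather than the publication date means a post with a backdated `pubDate` still gets picked up.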

The future flow:

/publish-post
      ↓
Git push → Cloudflare Pages deploys
    ├─→ Sitemap updates
    └─→ RSS feed updates with new entry
              ↓
    AutoRAG detects RSS update
              ↓
    Auto-indexes new content
              ↓
    Chatbot immediately knows about new post

Zero manual steps. Publish and forget.


Why RSS Over Sitemap

Sitemap limitations:

RSS advantages:

The sitemap says “here’s everything.” The RSS feed says “here’s what’s new.”
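
To make that contrast concrete, here is a rough sketch of what “find the new content” looks like from each source. Both helpers and their input shapes are assumptions for illustration, and the sitemap case assumes there is no reliable `lastmod` to lean on.

```javascript
// Illustrative only: both helpers and their input shapes are assumptions.

// Sitemap: every crawl returns the full URL list, so finding what is new
// means comparing against a snapshot saved from the previous crawl.
function newUrlsFromSitemap(previousUrls, currentUrls) {
  const previous = new Set(previousUrls);
  return currentUrls.filter((url) => !previous.has(url));
}

// RSS: each item carries a publication date, so "new" is a date comparison.
function newItemsFromFeed(items, lastSync) {
  return items.filter((item) => new Date(item.pubDate) > lastSync);
}

const added = newUrlsFromSitemap(
  ['/posts/a/', '/posts/b/'],
  ['/posts/a/', '/posts/b/', '/posts/c/']
);

const recent = newItemsFromFeed(
  [
    { link: '/posts/b/', pubDate: '2026-01-10T09:00:00Z' },
    { link: '/posts/c/', pubDate: '2026-01-15T09:00:00Z' },
  ],
  new Date('2026-01-12T00:00:00Z')
);
```

The sitemap version needs stored state (the previous URL list) before it can answer anything; the feed version only needs a timestamp.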


Astro RSS Setup

Astro has built-in RSS support:

// src/pages/rss.xml.js
import rss from '@astrojs/rss';
import { getCollection } from 'astro:content';

export async function GET(context) {
  const posts = await getCollection('posts');

  return rss({
    title: 'Operational Semantics',
    description: 'Building AI infrastructure in public',
    site: context.site,
    items: posts.map((post) => ({
      title: post.data.title,
      pubDate: new Date(post.data.date),
      description: post.data.description,
      link: `/posts/${post.slug}/`,
    })),
  });
}

This generates /rss.xml on every build. New posts automatically appear in the feed.
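
One thing worth handling before the `items` array reaches `rss()`: as far as I know, `@astrojs/rss` emits items in the order it is given and `getCollection` does not guarantee an order, so sorting newest-first is up to you. The helper below is a sketch under that assumption; `toFeedItems` is a hypothetical name and `data.draft` is a hypothetical frontmatter field that may not exist in your schema.

```javascript
// Sketch: prepare collection entries for the `items` array passed to rss().
// Assumes entries shaped like the ones above (slug, data.title, data.date,
// data.description). `data.draft` is a hypothetical frontmatter flag.
function toFeedItems(posts) {
  return posts
    .filter((post) => !post.data.draft) // filter() copies, so sort() below is safe
    .sort((a, b) => new Date(b.data.date) - new Date(a.data.date)) // newest first
    .map((post) => ({
      title: post.data.title,
      pubDate: new Date(post.data.date),
      description: post.data.description,
      link: `/posts/${post.slug}/`,
    }));
}

// Stand-in data so the helper can be tried outside Astro.
const samplePosts = [
  { slug: 'first', data: { title: 'First', date: '2026-01-01', description: 'a' } },
  { slug: 'wip', data: { title: 'WIP', date: '2026-01-20', description: 'b', draft: true } },
  { slug: 'second', data: { title: 'Second', date: '2026-01-15', description: 'c' } },
];
const feedItems = toFeedItems(samplePosts);
```

Inside the endpoint this would replace the inline `posts.map(...)` with `items: toFeedItems(posts)`.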


The Complete Automated Flow

Once RSS parsing ships:

┌─────────────────────────────────────────────────────────────┐
│                    WRITE & PUBLISH                           │
│                                                              │
│  You: /publish-post "New Post Title"                        │
│                                                              │
│  PAI: ✓ Post created                                        │
│       ✓ Frontmatter valid                                   │
│       ✓ Build succeeds                                      │
│       ✓ Git pushed                                          │
│       ✓ Substack published                                  │
│                                                              │
│  Done. Go do something else.                                │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│              AUTOMATIC (NO HUMAN INVOLVED)                   │
│                                                              │
│  1. Cloudflare Pages detects push                           │
│  2. Build runs (~45 seconds)                                │
│  3. Site deploys to edge                                    │
│  4. RSS feed regenerates with new entry                     │
│  5. AutoRAG detects RSS update                              │
│  6. New post fetched and chunked                            │
│  7. Embeddings generated                                    │
│  8. Vectors stored in Vectorize                             │
│  9. Chatbot can answer questions about new post             │
│                                                              │
│  Time: ~2-3 minutes                                         │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                    VERIFICATION                              │
│                                                              │
│  Reader: "What's the latest post about?"                    │
│                                                              │
│  Chatbot: "The most recent post is 'New Post Title'         │
│            published today. It covers..."                    │
│                                                              │
└─────────────────────────────────────────────────────────────┘

From “publish” to “chatbot knows”: completely hands-off.
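
Steps 6 to 8 happen inside AutoRAG and I don’t know how it chunks content internally. To give a feel for what “chunked” means, here is a minimal sketch of one common approach: fixed-size windows with some overlap so that a sentence cut at a boundary still appears whole in a neighboring chunk. The function name and the default sizes are hypothetical.

```javascript
// Hypothetical chunker, not AutoRAG's: splits text into windows of `size`
// words, each sharing `overlap` words with the previous window.
// Real pipelines usually count tokens rather than words.
function chunkWords(text, size = 200, overlap = 40) {
  if (overlap >= size) throw new RangeError('overlap must be smaller than size');
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(' '));
    if (start + size >= words.length) break; // this window reached the end
  }
  return chunks;
}

// Small sizes to make the overlap visible:
const demoChunks = chunkWords('a b c d e f g h i j', 4, 1);
// → ['a b c d', 'd e f g', 'g h i j']
```

Each chunk would then be embedded and stored as one vector, which is why a single post ends up as several entries in the index.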


What This Enables

1. True “Publish and Forget”

Write in the morning. Publish before lunch. By afternoon, the chatbot knows about it. No dashboard visits required.

2. Faster Knowledge Updates

Currently: scheduled sync every few hours, or a manual trigger.
With RSS: updates within minutes of publishing.

3. Multi-Platform Sync

RSS isn’t just for AutoRAG. The same feed can be read by any other RSS consumer.

4. Audit Trail

The RSS feed is a timestamped record of what was published when. Useful for debugging “why doesn’t the chatbot know about X?”
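
A sketch of how that debugging might go, assuming you already have the feed parsed into `{ link, pubDate }` objects and know roughly when the index last synced. The helper name, the input shape, and the wording of the verdicts are all hypothetical.

```javascript
// Hypothetical triage helper. `items` is the parsed feed ({ link, pubDate }),
// `lastIndexedAt` is a Date for the last index run, however you obtain it.
function explainMissing(items, url, lastIndexedAt) {
  const entry = items.find((item) => item.link === url);
  if (!entry) {
    return 'not in feed: check the build and the collection entry';
  }
  if (new Date(entry.pubDate) > lastIndexedAt) {
    return 'published after the last index run: wait for or trigger a sync';
  }
  return 'in feed before the last index run: investigate the indexer';
}

const auditItems = [
  { link: 'https://example.com/posts/x/', pubDate: '2026-01-15T10:00:00Z' },
  { link: 'https://example.com/posts/y/', pubDate: '2026-01-10T10:00:00Z' },
];
const lastRun = new Date('2026-01-12T00:00:00Z');

const verdictX = explainMissing(auditItems, 'https://example.com/posts/x/', lastRun);
const verdictY = explainMissing(auditItems, 'https://example.com/posts/y/', lastRun);
const verdictZ = explainMissing(auditItems, 'https://example.com/posts/z/', lastRun);
```

The three outcomes point at three different places to look: the site build, the sync timing, or the indexer itself.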


Preparing for RSS Parsing

Even before the feature ships, you can prepare:

1. Set up the RSS endpoint

Add @astrojs/rss and create src/pages/rss.xml.js. The feed generates on every build.

2. Verify the feed

curl https://operationalsemantics.dev/rss.xml | head -20

Make sure it includes all posts with correct dates and descriptions.
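
If eyeballing `curl` output gets tedious, the same check can be scripted. This is a minimal sketch that assumes a simple, well-formed RSS 2.0 feed with plain (non-CDATA) fields; the regex-based reader is a shortcut for a quick check, and anything more serious should use a real XML parser. `parseItems` and `validateItems` are hypothetical names.

```javascript
// Minimal regex-based reader for a simple, well-formed RSS 2.0 feed.
// A shortcut for quick checks only; it does not handle CDATA or attributes.
function parseItems(xml) {
  const field = (block, tag) => {
    const m = block.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
    return m ? m[1].trim() : null;
  };
  return [...xml.matchAll(/<item>([\s\S]*?)<\/item>/g)].map(([, block]) => ({
    title: field(block, 'title'),
    link: field(block, 'link'),
    pubDate: field(block, 'pubDate'),
    description: field(block, 'description'),
  }));
}

// Returns human-readable problems; an empty array means the feed looks fine.
function validateItems(items) {
  const problems = [];
  items.forEach((item, i) => {
    for (const key of ['title', 'link', 'pubDate', 'description']) {
      if (!item[key]) problems.push(`item ${i}: missing ${key}`);
    }
    if (item.pubDate && Number.isNaN(Date.parse(item.pubDate))) {
      problems.push(`item ${i}: unparseable pubDate`);
    }
  });
  return problems;
}

// Stand-in feed: the first item is fine, the second has two problems.
const sampleXml = `<rss><channel>
<item><title>Good</title><link>https://example.com/posts/good/</link>
<pubDate>Thu, 15 Jan 2026 00:00:00 GMT</pubDate><description>ok</description></item>
<item><title>Bad</title><link>https://example.com/posts/bad/</link>
<pubDate>not a date</pubDate></item>
</channel></rss>`;
const problems = validateItems(parseItems(sampleXml));
```

Against the live site, the XML would come from something like `await (await fetch(feedUrl)).text()` on Node 18 or later.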

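3. Add to robots.txt (optional)

Sitemap: https://operationalsemantics.dev/sitemap-index.xml
RSS: https://operationalsemantics.dev/rss.xml

Only the Sitemap line is a recognized robots.txt directive. RSS is not, and parsers skip lines they don’t recognize, so treat that line as a note for humans. Feed readers discover the feed through a <link rel="alternate" type="application/rss+xml" href="/rss.xml"> tag in the page head.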

4. Wait for Cloudflare

When RSS support ships in AI Search, you’ll be ready to enable it immediately.


The Vision: Reactive Publishing

The end state is reactive publishing:

| Trigger      | Response       |
|--------------|----------------|
| Git push     | Site deploys   |
| Site deploys | RSS updates    |
| RSS updates  | AI indexes     |
| AI indexes   | Chatbot learns |

No cron jobs. No scheduled tasks. No manual syncs.

Events propagate through the system automatically. You publish; everything else reacts.


Fallback: Scheduled Sync

If RSS parsing doesn’t ship, or you prefer the current approach, scheduled sitemap sync still works:

  1. AI Search Settings → Indexing
  2. Set sync schedule (hourly, daily, etc.)
  3. System automatically re-crawls on schedule

It’s slightly less real-time, but still hands-off.


Summary

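| State                           | Manual Steps   | Time to Indexed |
|---------------------------------|----------------|-----------------|
| Current (sitemap + manual sync) | 1 (click sync) | 5-10 minutes    |
| Scheduled sync                  | 0              | Up to X hours   |
| RSS auto-index                  | 0              | 2-3 minutes     |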

The goal is zero friction between “I wrote something” and “readers can ask about it.”

RSS parsing is the last piece.


Next up: Lessons and Gotchas - Everything that broke along the way.